Custom expressions 自定义表达式
When you should use custom expressions and why you should take advantage of them in Metabase's notebook editor. 何时应使用自定义表达式,以及为何应在 Metabase 的笔记本编辑器中利用它们。
In mathematics, expressions are collections of symbols that together express a value. If you’ve ever worked with spreadsheet software before, expressions are formulas, like =SUM(A1, B1)
. 在数学中,表达式是符号的集合,这些符号共同表达一个值。如果您以前使用过电子表格软件,那么表达式就是公式,例如 =SUM(A1, B1)
。
Custom expressions in Metabase’s query builder are powerful tools that can cover the vast majority of analytics use cases without the need to take SQL out of the toolbox. In fact, there are big advantages to using the query builder that you don’t get when you use SQL Metabase 的查询构建器中的自定义表达式是强大的工具,可以涵盖绝大多数分析用例,而无需将 SQL 从工具箱中取出。事实上,使用查询构建器比使用 SQL 有很大的优势
- Extensibility: Using the query builder to build queries allows people to learn from and build on your questions without needing to know any SQL. 可扩展性:使用查询构建器构建查询,可以让人们从您的问题中学习并在此基础上构建,而无需了解任何 SQL。
- Drill-through allows people to break out records by category, zoom in, drill down to unaggregated records, or x-ray the data on click. Questions built with the Metabase query builder will give users the full power of drill-through, but questions built using SQL will only show limited options like filtering by a value. 下钻允许人们按类别细分记录、放大、向下钻取到未聚合的记录,或单击时 X 射线检查数据。使用 Metabase 查询构建器构建的问题将为用户提供下钻的全部功能,但使用 SQL 构建的问题仅显示有限的选项,例如按值筛选。
And you can always switch to SQL at any point during your development by converting an existing question to a native SQL question. 并且您始终可以在开发过程中的任何时候切换到 SQL,方法是将现有问题转换为原生 SQL 问题。
There are three places in the query builder where we can use custom expressions 在查询构建器中有三个地方可以使用自定义表达式
- Custom columns use functions to compute values (like
+
to add numbers) or manipulate text (likelower
to change text to lowercase). 自定义列使用函数来计算值(例如+
加数字)或操作文本(例如lower
将文本更改为小写)。 - Custom filters use functions like
contains
that evaluate to either true or false. 自定义筛选器使用诸如contains
之类的函数,这些函数计算结果为 true 或 false。 - Custom summaries use functions like
count
orsum
to aggregate records. 自定义汇总使用诸如count
或sum
之类的函数来聚合记录。
Custom columns 自定义列
We can add a custom column to our data using an expression to compute a new column. Let’s see an expression in action. Here is the Orders table from the Sample database included with Metabase. 我们可以使用表达式向数据添加自定义列,以计算新列。让我们看看表达式的实际应用。这是 Metabase 附带的示例数据库中的 Orders 表。
Let’s say we want to know the discount percentage applied to orders based on the pre-tax subtotal. For example, if we gave a $1 discount on a $10 order, we’d want to see a column show that we had discounted that order by 10%. 假设我们想知道基于税前小计应用于订单的折扣百分比。例如,如果我们在 10 美元的订单上给予 1 美元的折扣,我们希望看到一列显示我们已将该订单折扣了 10%。
Unfortunately, a quick scan of the columns in the preview tells us that the database does not store that computation (i.e. there is no “discount percentage” column). We only have the subtotal of the order, and the discount total. 遗憾的是,快速扫描预览中的列告诉我们,数据库未存储该计算(即,没有“折扣百分比”列)。我们只有订单的小计和折扣总额。
Thanks to math, however, we can use the discount total and order subtotal to compute the percentage. Here’s where expressions come into play: we can use an expression to compute the discount percentage for each row, and store that computed value in a new column. 然而,得益于数学,我们可以使用折扣总额和订单小计来计算百分比。这就是表达式发挥作用的地方:我们可以使用表达式计算每行的折扣百分比,并将计算出的值存储在新列中。
Let’s walk through how to create a custom column. 让我们逐步了解如何创建自定义列。
When in the query builder, we select Custom column in the Data section. 在查询构建器中时,我们在数据部分中选择自定义列。
To calculate the discount percentage, we’ll need we’ll need to divide the discount by the original total (the Subtotal
) to get the discount percentage. 为了计算折扣百分比,我们需要将折扣除以原始总额(Subtotal
)以获得折扣百分比。
In expressions, we reference columns using brackets. For example, we can refer to the Discount column in the Orders table as [Discount]
. If we need to reference a column from another table linked by a foreign key, we can use a .
between the table and the column, as in [Table.Column]
(alternatively you can select [Table → Column]
from the dropdown menu that appears when you type an open bracket ([
). For example, we could enter [Products.Category]
which will resolve to: [Products → Category]
. 在表达式中,我们使用方括号引用列。例如,我们可以将 Orders 表中的 Discount 列引用为 [Discount]
。如果我们需要引用通过外键链接的另一个表中的列,我们可以使用表和列之间的 .
,如 [Table.Column]
(或者,您可以从输入左方括号([
)时出现的下拉菜单中选择 [Table → Column]
)。例如,我们可以输入 [Products.Category]
,它将解析为:[Products → Category]
。
For now, we’re just interested in columns in the Orders table, so there’s no need to reference another table. Here’s the expression (or formula), we’ll use to compute our custom discount percentage column 目前,我们只对 Orders 表中的列感兴趣,因此无需引用另一个表。这是我们将用来计算自定义折扣百分比列的表达式(或公式)
= [Discount] / [Subtotal]
Enter that expression in the Expression field, then give the new column a name: Discount percentage
. 在表达式字段中输入该表达式,然后为新列命名:Discount percentage
。
Click done, then click the Visualization button to see your new column. 单击“完成”,然后单击可视化按钮以查看您的新列。
Since the value in our new Discount percentage
column concerns discounts, let’s move the column next to the Discount
column. You can move columns around on tables by clicking on the column header and dragging the column to your target location, like so 由于我们新的 Discount percentage
列中的值与折扣有关,因此让我们将该列移动到 Discount
列旁边。您可以通过单击列标题并将列拖动到目标位置来在表格上移动列,如下所示
Since we’re computing a percentage, let’s fix the formatting so it’s easier to read. Click on the Discount percentage
heading to bring up the Action Menu for the column, then click on the gears icon to format the column. 由于我们正在计算百分比,因此让我们修复格式,使其更易于阅读。单击 Discount percentage
标题以调出列的操作菜单,然后单击齿轮图标以格式化列。
Metabase will slide out a Formatting sidebar with options. Let’s change the style to Percent, and bump up the number of decimal places to 2. And since the title Discount percentage
takes up a lot of space, let’s rename the column to Discount %
. Metabase 将滑出一个带有选项的 Formatting sidebar。让我们将样式更改为百分比,并将小数位数增加到 2。并且由于标题 Discount percentage
占用大量空间,因此让我们将列重命名为 Discount %
。
There’s an option to add a mini bar chart as well. This bar chart won’t show the percentage with respect to 100%; instead the mini bar chart will show us the discount percentage relative to the discount percentage given to other orders. Let’s leave the mini bar chart off for now. 还有一个选项可以添加迷你条形图。此条形图不会显示相对于 100% 的百分比;相反,迷你条形图将向我们显示相对于其他订单的折扣百分比的折扣百分比。现在让我们关闭迷你条形图。
Here’s the finished question with the added Discount %
column 这是完成的问题,其中添加了 Discount %
列
Custom filters 自定义筛选器
Metabase comes with a lot of filtering options out of the box, but you can design more sophisticated filters using custom filter expressions. These are particularly useful for creating filters that use OR
statements, and that’s what we’ll be covering here. Metabase 提供了许多开箱即用的筛选选项,但您可以使用自定义筛选器表达式设计更复杂的筛选器。这些对于创建使用 OR
语句的筛选器特别有用,而这正是我们在这里将要介绍的内容。
Normally in the query builder, when we add multiple filters to our question, Metabase implicitly combines the filters with an AND
operator. For example, if we add a filter for products that start with Enormous
and a filter for products that ends with Computer
, our question will only return products that both start with Enormous
AND end with Computer
, which do not exist in Metabase’s Sample Database. 通常在查询构建器中,当我们向问题添加多个筛选器时,Metabase 会隐式地将筛选器与 AND
运算符组合在一起。例如,如果我们添加一个筛选器,用于筛选以 Enormous
开头的产品和一个筛选器,用于筛选以 Computer
结尾的产品,则我们的问题将仅返回以 Enormous
开头且以 Computer
结尾的产品,而这在 Metabase 的示例数据库中不存在。
To filter for products that either start with Enormous
OR end with Computer
, we’ll select Custom Expression from the Filter dropdown, and use the startsWith
and endsWith
functions 要筛选以 Enormous
开头或以 Computer
结尾的产品,我们将从筛选器下拉列表中选择自定义表达式,并使用 startsWith
和 endsWith
函数
startsWith(string1, string2)
endsWith(string1, string2)
Functions startsWith
and endsWith
check to see if string1
starts with/ends with string2
. So string1
is the string to check (the haystack), and string2
is the text to look for (the needle). And since we want to look for products that either start with Enormous
OR end with Computer
, we can use startsWith
and endsWith
expressions with an OR operator in between 函数 startsWith
和 endsWith
检查 string1
是否以 string2
开头/结尾。因此 string1
是要检查的字符串(大海捞针),string2
是要查找的文本(针)。由于我们要查找以 Enormous
开头或以 Computer
结尾的产品,因此我们可以将 startsWith
和 endsWith
表达式与中间的 OR 运算符一起使用
= startsWith([Title], "Enormous") OR endsWith([Title], "Computer")
The resulting data set will contain products that either start with Enormous
or end with Computer
生成的数据集将包含以 Enormous
开头或以 Computer
结尾的产品
Note that custom filter expressions must always resolve to either true or false. You can, however, nest expressions that do not resolve to true or false within statements, like 请注意,自定义筛选器表达式必须始终解析为 true 或 false。但是,您可以将不解析为 true 或 false 的表达式嵌套在语句中,例如
= contains(concat([First Name], [Last Name]), "Wizard")
because the outermost function (contains
) resolves to either true or false. Whereas you couldn’t use concat([First Name], [Last Name])
as a filter, as it would resolve to a string of text (though you could use concat to create a custom column like Full Name
). 因为最外层的函数 (contains
) 解析为 true 或 false。但是,您不能将 concat([First Name], [Last Name])
用作筛选器,因为它会解析为文本字符串(尽管您可以使用 concat 创建自定义列,例如 Full Name
)。
Custom summaries 自定义汇总
Custom expressions unlock many different ways to aggregate our data. Let’s consider the Share
function, which returns the percent of rows in the data that match the condition, as a decimal. For example, say we want to know the total percentage of paper products in our product line, i.e. what share of our product line is composed of paper products? 自定义表达式解锁了许多不同的数据聚合方式。让我们考虑 Share
函数,该函数以十进制形式返回数据中与条件匹配的行百分比。例如,假设我们想知道我们的产品线中纸制品的总百分比,即我们的产品线中纸制品占多少份额?
To start, we’ll select the Products table from the Sample Database. Next, we’ll click the Summarize button in the query builder and select Custom Expression. Then, we’ll select Share
from the dropdown menu, which will prompt us for a condition. In this case, we want to know which products have “Paper” in their title, so we’ll use the contains
function to search through Title
. 首先,我们将从示例数据库中选择 Products 表。接下来,我们将单击查询构建器中的汇总按钮,然后选择自定义表达式。然后,我们将从下拉菜单中选择 Share
,这将提示我们输入条件。在本例中,我们想知道哪些产品的标题中包含“Paper”,因此我们将使用 contains
函数来搜索 Title
。
= Share(contains([Title], "Paper"))
Then we name our expression (e.g., Percentage of paper products
) and click Done. Click the Visualize button and Metabase will compute the share of paper products. 然后,我们命名我们的表达式(例如,Percentage of paper products
)并单击完成。单击可视化按钮,Metabase 将计算纸制品的份额。
To change the formatting, select the Settings button in the bottom-left to bring up the Settings sidebar and change Number options → Style to Percent. 要更改格式,请选择左下角的设置按钮以调出 Settings sidebar,并将数字选项 → 样式更改为百分比。
Putting it all together 综合运用
Let’s create a fairly complex (contrived) question using expressions. Say we’ve been tasked to find the average net inflow for wool and cotton products by month in 2019, with net inflow being the selling price minus the cost we paid for the product. In other words: for each wool and cotton product unit sold, how much money on average did we make (or lose) per unit each month in 2019? 让我们使用表达式创建一个相当复杂(人为)的问题。假设我们已受命查找 2019 年每月羊毛和棉花产品的平均净流入量,其中净流入量为售价减去我们为产品支付的成本。换句话说:对于售出的每单位羊毛和棉花产品,我们在 2019 年每个月每单位平均赚取(或损失)多少钱?
To the get these fascinating numbers, we’ll need to use expressions to 为了获得这些引人入胜的数字,我们需要使用表达式来
- Compute the selling price per unit (custom column). 计算每单位的售价(自定义列)。
- Filter results to only include wool or cotton products (custom filter), and limit those results to 2019. 筛选结果以仅包含羊毛或棉花产品(自定义筛选器),并将这些结果限制为 2019 年。
- Compute the average net inflow (custom summary), and group by month. 计算平均净流入量(自定义汇总),并按月分组。
Let’s go 开始吧
-
我们创建一个名为
Unit price
的自定义列。为了计算Unit price
,我们将使用表达式将小计除以售出的单位数量 (Quantity
)= [Subtotal] / [Quantity]
-
接下来,我们将使用自定义筛选器表达式来筛选
Wool
和Cotton
产品的订单(即,产品标题Product.Title
中某处包含 “Wool” 或 “Cotton” 的产品)。= contains([Products → Title], "Wool") OR contains([Products → Title], "Cotton")
-
我们还将筛选
01/01/2019
到12/31/2019
之间的订单。 -
我们将使用自定义表达式来创建自定义摘要。假设标准零售加价为 50%(基石加价)。因此,如果
Product.Price
为 2 美元,我们将假设每个产品的采购成本为 1 美元。鉴于此假设,我们可以简单地将每单位售出的净流入定义为Unit price
减去Product.Price
的一半。然后,我们将通过取每个订单这些数字的平均值来汇总数据。= Average([Unit price] - [Products → Price] / 2)
-
最后,我们将按月份按
Orders.Created_At
对这些订单进行分组。
这是我们的 notebook
我们将选择将我们的数据可视化为折线图,我们可以点击折线图来深入查看我们的数据
延伸阅读
- 文档:查询构建器中的自定义表达式。表达式中还有很多我们没有介绍过的函数,请查看文档以获取函数完整列表。
- 时间序列比较 展示了如何使用自定义表达式来可视化和比较时间序列。
- 清理和格式化文本 使用自定义表达式来处理混乱和缺失的数据。
下一步:在 Metabase 中进行 Join 操作
如何在 Metabase 中使用 notebook 编辑器在简单和自定义问题中连接表。