Custom expressions 自定义表达式

When you should use custom expressions and why you should take advantage of them in Metabase's notebook editor. 何时应使用自定义表达式,以及为何应在 Metabase 的笔记本编辑器中利用它们。

In mathematics, expressions are collections of symbols that together express a value. If you’ve ever worked with spreadsheet software before, expressions are formulas, like =SUM(A1, B1). 在数学中,表达式是符号的集合,这些符号共同表达一个值。如果您以前使用过电子表格软件,那么表达式就是公式,例如 =SUM(A1, B1)

Custom expressions in Metabase’s query builder are powerful tools that can cover the vast majority of analytics use cases without the need to take SQL out of the toolbox. In fact, there are big advantages to using the query builder that you don’t get when you use SQL Metabase 的查询构建器中的自定义表达式是强大的工具,可以涵盖绝大多数分析用例,而无需将 SQL 从工具箱中取出。事实上,使用查询构建器比使用 SQL 有很大的优势

  • Extensibility: Using the query builder to build queries allows people to learn from and build on your questions without needing to know any SQL. 可扩展性:使用查询构建器构建查询,可以让人们从您的问题中学习并在此基础上构建,而无需了解任何 SQL。
  • Drill-through allows people to break out records by category, zoom in, drill down to unaggregated records, or x-ray the data on click. Questions built with the Metabase query builder will give users the full power of drill-through, but questions built using SQL will only show limited options like filtering by a value. 下钻允许人们按类别细分记录、放大、向下钻取到未聚合的记录,或单击时 X 射线检查数据。使用 Metabase 查询构建器构建的问题将为用户提供下钻的全部功能,但使用 SQL 构建的问题仅显示有限的选项,例如按值筛选。

And you can always switch to SQL at any point during your development by converting an existing question to a native SQL question. 并且您始终可以在开发过程中的任何时候切换到 SQL,方法是将现有问题转换为原生 SQL 问题

There are three places in the query builder where we can use custom expressions 在查询构建器中有三个地方可以使用自定义表达式

  • Custom columns use functions to compute values (like + to add numbers) or manipulate text (like lower to change text to lowercase). 自定义列使用函数来计算值(例如 + 加数字)或操作文本(例如 lower 将文本更改为小写)。
  • Custom filters use functions like contains that evaluate to either true or false. 自定义筛选器使用诸如 contains 之类的函数,这些函数计算结果为 true 或 false。
  • Custom summaries use functions like count or sum to aggregate records. 自定义汇总使用诸如 countsum 之类的函数来聚合记录。

Custom columns 自定义列

We can add a custom column to our data using an expression to compute a new column. Let’s see an expression in action. Here is the Orders table from the Sample database included with Metabase. 我们可以使用表达式向数据添加自定义列,以计算新列。让我们看看表达式的实际应用。这是 Metabase 附带的示例数据库中的 Orders 表。

The Orders table in the Sample Database included with Metabase.

Let’s say we want to know the discount percentage applied to orders based on the pre-tax subtotal. For example, if we gave a $1 discount on a $10 order, we’d want to see a column show that we had discounted that order by 10%. 假设我们想知道基于税前小计应用于订单的折扣百分比。例如,如果我们在 10 美元的订单上给予 1 美元的折扣,我们希望看到一列显示我们已将该订单折扣了 10%。

Unfortunately, a quick scan of the columns in the preview tells us that the database does not store that computation (i.e. there is no “discount percentage” column). We only have the subtotal of the order, and the discount total. 遗憾的是,快速扫描预览中的列告诉我们,数据库未存储该计算(即,没有“折扣百分比”列)。我们只有订单的小计和折扣总额。

Thanks to math, however, we can use the discount total and order subtotal to compute the percentage. Here’s where expressions come into play: we can use an expression to compute the discount percentage for each row, and store that computed value in a new column. 然而,得益于数学,我们可以使用折扣总额和订单小计来计算百分比。这就是表达式发挥作用的地方:我们可以使用表达式计算每行的折扣百分比,并将计算出的值存储在新列中。

Let’s walk through how to create a custom column. 让我们逐步了解如何创建自定义列。

When in the query builder, we select Custom column in the Data section. 在查询构建器中时,我们在数据部分中选择自定义列

To calculate the discount percentage, we’ll need we’ll need to divide the discount by the original total (the Subtotal) to get the discount percentage. 为了计算折扣百分比,我们需要将折扣除以原始总额(Subtotal)以获得折扣百分比。

In expressions, we reference columns using brackets. For example, we can refer to the Discount column in the Orders table as [Discount]. If we need to reference a column from another table linked by a foreign key, we can use a . between the table and the column, as in [Table.Column] (alternatively you can select [Table → Column] from the dropdown menu that appears when you type an open bracket ([). For example, we could enter [Products.Category] which will resolve to: [Products → Category]. 在表达式中,我们使用方括号引用列。例如,我们可以将 Orders 表中的 Discount 列引用为 [Discount]。如果我们需要引用通过外键链接的另一个表中的列,我们可以使用表和列之间的 .,如 [Table.Column](或者,您可以从输入左方括号([)时出现的下拉菜单中选择 [Table → Column])。例如,我们可以输入 [Products.Category],它将解析为:[Products → Category]

For now, we’re just interested in columns in the Orders table, so there’s no need to reference another table. Here’s the expression (or formula), we’ll use to compute our custom discount percentage column 目前,我们只对 Orders 表中的列感兴趣,因此无需引用另一个表。这是我们将用来计算自定义折扣百分比列的表达式(或公式)

= [Discount] / [Subtotal]

Enter that expression in the Expression field, then give the new column a name: Discount percentage. 在表达式字段中输入该表达式,然后为新列命名:Discount percentage

Entering a field formula to create a custom column.

Click done, then click the Visualization button to see your new column. 单击“完成”,然后单击可视化按钮以查看您的新列。

Since the value in our new Discount percentage column concerns discounts, let’s move the column next to the Discount column. You can move columns around on tables by clicking on the column header and dragging the column to your target location, like so 由于我们新的 Discount percentage 列中的值与折扣有关,因此让我们将该列移动到 Discount 列旁边。您可以通过单击列标题并将列拖动到目标位置来在表格上移动列,如下所示

Dragging a column to change its position in the table visualization.

Since we’re computing a percentage, let’s fix the formatting so it’s easier to read. Click on the Discount percentage heading to bring up the Action Menu for the column, then click on the gears icon to format the column. 由于我们正在计算百分比,因此让我们修复格式,使其更易于阅读。单击 Discount percentage 标题以调出列的操作菜单,然后单击齿轮图标以格式化列。

Metabase will slide out a Formatting sidebar with options. Let’s change the style to Percent, and bump up the number of decimal places to 2. And since the title Discount percentage takes up a lot of space, let’s rename the column to Discount %. Metabase 将滑出一个带有选项的 Formatting sidebar。让我们将样式更改为百分比,并将小数位数增加到 2。并且由于标题 Discount percentage 占用大量空间,因此让我们将列重命名为 Discount %

There’s an option to add a mini bar chart as well. This bar chart won’t show the percentage with respect to 100%; instead the mini bar chart will show us the discount percentage relative to the discount percentage given to other orders. Let’s leave the mini bar chart off for now. 还有一个选项可以添加迷你条形图。此条形图不会显示相对于 100% 的百分比;相反,迷你条形图将向我们显示相对于其他订单的折扣百分比的折扣百分比。现在让我们关闭迷你条形图。

Here’s the finished question with the added Discount % column 这是完成的问题,其中添加了 Discount %

Our finished Discount % column.

Custom filters 自定义筛选器

Metabase comes with a lot of filtering options out of the box, but you can design more sophisticated filters using custom filter expressions. These are particularly useful for creating filters that use OR statements, and that’s what we’ll be covering here. Metabase 提供了许多开箱即用的筛选选项,但您可以使用自定义筛选器表达式设计更复杂的筛选器。这些对于创建使用 OR 语句的筛选器特别有用,而这正是我们在这里将要介绍的内容。

Normally in the query builder, when we add multiple filters to our question, Metabase implicitly combines the filters with an AND operator. For example, if we add a filter for products that start with Enormous and a filter for products that ends with Computer, our question will only return products that both start with Enormous AND end with Computer, which do not exist in Metabase’s Sample Database. 通常在查询构建器中,当我们向问题添加多个筛选器时,Metabase 会隐式地将筛选器与 AND 运算符组合在一起。例如,如果我们添加一个筛选器,用于筛选以 Enormous 开头的产品和一个筛选器,用于筛选以 Computer 结尾的产品,则我们的问题将仅返回以 Enormous 开头且以 Computer 结尾的产品,而这在 Metabase 的示例数据库中不存在。

To filter for products that either start with Enormous OR end with Computer, we’ll select Custom Expression from the Filter dropdown, and use the startsWith and endsWith functions 要筛选以 Enormous 开头或以 Computer 结尾的产品,我们将从筛选器下拉列表中选择自定义表达式,并使用 startsWithendsWith 函数

startsWith(string1, string2)
endsWith(string1, string2)

Functions startsWith and endsWith check to see if string1 starts with/ends with string2. So string1 is the string to check (the haystack), and string2 is the text to look for (the needle). And since we want to look for products that either start with Enormous OR end with Computer, we can use startsWith and endsWith expressions with an OR operator in between 函数 startsWithendsWith 检查 string1 是否以 string2 开头/结尾。因此 string1 是要检查的字符串(大海捞针),string2 是要查找的文本(针)。由于我们要查找以 Enormous 开头或以 Computer 结尾的产品,因此我们可以将 startsWithendsWith 表达式与中间的 OR 运算符一起使用

= startsWith([Title], "Enormous") OR endsWith([Title], "Computer")

The resulting data set will contain products that either start with Enormous or end with Computer 生成的数据集将包含以 Enormous 开头或以 Computer 结尾的产品

Products that are either enormous or aerodynamic.

Note that custom filter expressions must always resolve to either true or false. You can, however, nest expressions that do not resolve to true or false within statements, like 请注意,自定义筛选器表达式必须始终解析为 true 或 false。但是,您可以将不解析为 true 或 false 的表达式嵌套在语句中,例如

= contains(concat([First Name], [Last Name]), "Wizard")

because the outermost function (contains) resolves to either true or false. Whereas you couldn’t use concat([First Name], [Last Name]) as a filter, as it would resolve to a string of text (though you could use concat to create a custom column like Full Name). 因为最外层的函数 (contains) 解析为 true 或 false。但是,您不能将 concat([First Name], [Last Name]) 用作筛选器,因为它会解析为文本字符串(尽管您可以使用 concat 创建自定义列,例如 Full Name)。

Custom summaries 自定义汇总

Custom expressions unlock many different ways to aggregate our data. Let’s consider the Share function, which returns the percent of rows in the data that match the condition, as a decimal. For example, say we want to know the total percentage of paper products in our product line, i.e. what share of our product line is composed of paper products? 自定义表达式解锁了许多不同的数据聚合方式。让我们考虑 Share 函数,该函数以十进制形式返回数据中与条件匹配的行百分比。例如,假设我们想知道我们的产品线中纸制品的总百分比,即我们的产品线中纸制品占多少份额?

To start, we’ll select the Products table from the Sample Database. Next, we’ll click the Summarize button in the query builder and select Custom Expression. Then, we’ll select Share from the dropdown menu, which will prompt us for a condition. In this case, we want to know which products have “Paper” in their title, so we’ll use the contains function to search through Title. 首先,我们将从示例数据库中选择 Products 表。接下来,我们将单击查询构建器中的汇总按钮,然后选择自定义表达式。然后,我们将从下拉菜单中选择 Share,这将提示我们输入条件。在本例中,我们想知道哪些产品的标题中包含“Paper”,因此我们将使用 contains 函数来搜索 Title

= Share(contains([Title], "Paper"))

Calculating the share of paper products.

Then we name our expression (e.g., Percentage of paper products) and click Done. Click the Visualize button and Metabase will compute the share of paper products. 然后,我们命名我们的表达式(例如,Percentage of paper products)并单击完成。单击可视化按钮,Metabase 将计算纸制品的份额。

To change the formatting, select the Settings button in the bottom-left to bring up the Settings sidebar and change Number options → Style to Percent. 要更改格式,请选择左下角的设置按钮以调出 Settings sidebar,并将数字选项 → 样式更改为百分比

The share of paper products, formatted as a percentage.

Putting it all together 综合运用

Let’s create a fairly complex (contrived) question using expressions. Say we’ve been tasked to find the average net inflow for wool and cotton products by month in 2019, with net inflow being the selling price minus the cost we paid for the product. In other words: for each wool and cotton product unit sold, how much money on average did we make (or lose) per unit each month in 2019? 让我们使用表达式创建一个相当复杂(人为)的问题。假设我们已受命查找 2019 年每月羊毛和棉花产品的平均净流入量,其中净流入量为售价减去我们为产品支付的成本。换句话说:对于售出的每单位羊毛和棉花产品,我们在 2019 年每个月每单位平均赚取(或损失)多少钱?

To the get these fascinating numbers, we’ll need to use expressions to 为了获得这些引人入胜的数字,我们需要使用表达式来

  • Compute the selling price per unit (custom column). 计算每单位的售价(自定义列)。
  • Filter results to only include wool or cotton products (custom filter), and limit those results to 2019. 筛选结果以仅包含羊毛或棉花产品(自定义筛选器),并将这些结果限制为 2019 年。
  • Compute the average net inflow (custom summary), and group by month. 计算平均净流入量(自定义汇总),并按月分组。

Let’s go 开始吧

  1. 我们创建一个名为 Unit price 的自定义列。为了计算 Unit price,我们将使用表达式将小计除以售出的单位数量 (Quantity)

    = [Subtotal] / [Quantity]
    
  2. 接下来,我们将使用自定义筛选器表达式来筛选 WoolCotton 产品的订单(即,产品标题 Product.Title 中某处包含 “Wool” 或 “Cotton” 的产品)。

    = contains([Products → Title], "Wool") OR contains([Products → Title], "Cotton")
    
  3. 我们还将筛选 01/01/201912/31/2019 之间的订单。

  4. 我们将使用自定义表达式来创建自定义摘要。假设标准零售加价为 50%(基石加价)。因此,如果 Product.Price 为 2 美元,我们将假设每个产品的采购成本为 1 美元。鉴于此假设,我们可以简单地将每单位售出的净流入定义为 Unit price 减去 Product.Price 的一半。然后,我们将通过取每个订单这些数字的平均值来汇总数据。

    = Average([Unit price] - [Products → Price] / 2)
    
  5. 最后,我们将按月份按 Orders.Created_At 对这些订单进行分组。

这是我们的 notebook

Our wool and cotton notebook.

我们将选择将我们的数据可视化为折线图,我们可以点击折线图来深入查看我们的数据

Drilling through fabrics to view individual orders.

延伸阅读

下一步:在 Metabase 中进行 Join 操作

如何在 Metabase 中使用 notebook 编辑器在简单和自定义问题中连接表。

下一篇文章