> ## Documentation Index
> Fetch the complete documentation index at: https://axiom.co/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# summarize

> This page explains how to use the summarize operator function in APL.

## Introduction

The `summarize` operator in APL enables you to perform data aggregation and create summary tables from large datasets. You can use it to group data by specified fields and apply aggregation functions such as `count()`, `sum()`, `avg()`, `min()`, `max()`, and many others. This is particularly useful when analyzing logs, tracing OpenTelemetry data, or reviewing security events. The `summarize` operator is helpful when you want to reduce the granularity of a dataset to extract insights or trends.

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

<AccordionGroup>
  <Accordion title="Splunk SPL users">
    In Splunk SPL, the `stats` command performs a similar function to APL’s `summarize` operator. Both operators are used to group data and apply aggregation functions. In APL, `summarize` is more explicit about the fields to group by and the aggregation functions to apply.

    <CodeGroup>
      ```sql Splunk example theme={null}
      index="sample-http-logs" | stats count by method
      ```

      ```kusto APL equivalent theme={null}
      ['sample-http-logs']
      | summarize count() by method
      ```
    </CodeGroup>
  </Accordion>

  <Accordion title="ANSI SQL users">
    The `summarize` operator in APL is conceptually similar to SQL’s `GROUP BY` clause with aggregation functions. In APL, you explicitly specify the aggregation function (like `count()`, `sum()`) and the fields to group by.

    <CodeGroup>
      ```sql SQL example theme={null}
      SELECT method, COUNT(*) 
      FROM sample_http_logs 
      GROUP BY method
      ```

      ```kusto APL equivalent theme={null}
      ['sample-http-logs']
      | summarize count() by method
      ```
    </CodeGroup>
  </Accordion>
</AccordionGroup>

## Usage

### Syntax

```kusto  theme={null}
| summarize [[Field1 =] AggregationFunction [, ...]] [by [Field2 =] GroupExpression [, ...]]
```

### Parameters

* `Field1`: A field name.
* `AggregationFunction`: The aggregation function to apply. Examples include `count()`, `sum()`, `avg()`, `min()`, and `max()`.
* `GroupExpression`: A scalar expression that can reference the dataset.

### Returns

The `summarize` operator returns a table where:

* The input rows are arranged into groups having the same values of the `by` expressions.
* The specified aggregation functions are computed over each group, producing a row for each group.
* The result contains the `by` fields and also at least one field for each computed aggregate. Some aggregation functions return multiple fields.

## Use case examples

<Tabs>
  <Tab title="Log analysis">
    In log analysis, you can use `summarize` to count the number of HTTP requests grouped by method, or to compute the average request duration.

    **Query**

    ```kusto  theme={null}
    ['sample-http-logs']
    | summarize count() by method
    ```

    [Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20summarize%20count\(\)%20by%20method%22%7D)

    **Output**

    | method | count\_ |
    | ------ | ------- |
    | GET    | 1000    |
    | POST   | 450     |

    This query groups the HTTP requests by the `method` field and counts how many times each method is used.
  </Tab>

  <Tab title="OpenTelemetry traces">
    You can use `summarize` to analyze OpenTelemetry traces by calculating the average span duration for each service.

    **Query**

    ```kusto  theme={null}
    ['otel-demo-traces']
    | summarize avg(duration) by ['service.name']
    ```

    [Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27otel-demo-traces%27%5D%20%7C%20summarize%20avg\(duration\)%20by%20%5B%27service.name%27%5D%22%7D)

    **Output**

    | service.name | avg\_duration |
    | ------------ | ------------- |
    | frontend     | 50ms          |
    | cartservice  | 75ms          |

    This query calculates the average duration of traces for each service in the dataset.
  </Tab>

  <Tab title="Security logs">
    In security log analysis, `summarize` can help group events by status codes and see the distribution of HTTP responses.

    **Query**

    ```kusto  theme={null}
    ['sample-http-logs']
    | summarize count() by status
    ```

    [Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20summarize%20count\(\)%20by%20status%22%7D)

    **Output**

    | status | count\_ |
    | ------ | ------- |
    | 200    | 1200    |
    | 404    | 300     |

    This query summarizes HTTP status codes, giving insight into the distribution of responses in your logs.
  </Tab>
</Tabs>

## Other examples

```kusto  theme={null}
['sample-http-logs']
| summarize topk(content_type, 20)
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20summarize%20topk\(content_type%2C%2020\)%22%2C%22queryOptions%22%3A%7B%22quickRange%22%3A%2230d%22%7D%7D)

```kusto  theme={null}
['github-push-event']
| summarize topk(repo, 20) by bin(_time, 24h)
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27github-push-event%27%5D%7C%20summarize%20topk\(repo%2C%2020\)%20by%20bin\(_time%2C%2024h\)%22%2C%22queryOptions%22%3A%7B%22quickRange%22%3A%2230d%22%7D%7D)

Returns a table that shows the heatmap in each interval \[0, 30], \[30, 20, 10], and so on. This example has a cell for `HISTOGRAM(req_duration_ms)`.

```kusto  theme={null}
['sample-http-logs']
| summarize histogram(req_duration_ms, 30)
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20summarize%20histogram\(req_duration_ms%2C%2030\)%22%2C%22queryOptions%22%3A%7B%22quickRange%22%3A%2230d%22%7D%7D)

```kusto  theme={null}
['github-push-event']
| where _time > ago(7d)
| where repo contains "axiom"
| summarize count(), numCommits=sum(size) by _time=bin(_time, 3h), repo
| take 100
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27github-push-event%27%5D%20%7C%20where%20_time%20%3E%20ago\(7d\)%20%7C%20where%20repo%20contains%20%5C%22axiom%5C%22%20%7C%20summarize%20count\(\)%2C%20numCommits%3Dsum\(size\)%20by%20_time%3Dbin\(_time%2C%203h\)%2C%20repo%20%7C%20take%20100%22%2C%22queryOptions%22%3A%7B%22quickRange%22%3A%2230d%22%7D%7D)

## Limit behavior with time-binning

When using `limit` or `take` with `summarize` where the first grouping expression is a time bin, Axiom applies the following limit behavior:

1. Compute the global top **N** groups across all time buckets, disregarding the time dimension.
2. Limit each time bucket to only include those groups that are in the global top **N**.

This means the total number of output rows can be more than **N** rows because each time bucket may contain up to **N** groups. For example, if you have 10 time buckets and limit to 5 groups, you can get up to 50 rows.

<Accordion title="Workarounds">
  To limit the result set to exactly **N** rows:

  * Apply a second `summarize` statement after the first to aggregate further and limit the results. For example:

    ```kusto  theme={null}
    ['sample-http-logs']
    | summarize count() by _time=bin(_time, 1h), status
    | summarize make_list(count_), make_list(_time) by status
    | limit 10
    ```

  * If you don’t need time as the first grouping expression, reorder your groups so time is not first. For example:

    ```kusto  theme={null}
    ['sample-http-logs']
    | summarize count() by status, _time=bin(_time, 1h)
    | limit 10
    ```

  * Convert time to an integer and bin on that instead:

    ```kusto  theme={null}
    ['sample-http-logs']
    | extend intTime = toint(_time)
    | summarize count() by bin(intTime, 3600), status
    | limit 10
    ```
</Accordion>

## List of related operators

* [count](/apl/tabular-operators/count-operator): Use when you only need to count rows without grouping by specific fields.
* [extend](/apl/tabular-operators/extend-operator): Use to add new calculated fields to a dataset.
* [project](/apl/tabular-operators/project-operator): Use to select specific fields or create new calculated fields, often in combination with `summarize`.


Built with [Mintlify](https://mintlify.com).