> ## Documentation Index
> Fetch the complete documentation index at: https://axiom.co/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# union

> This page explains how to use the union operator in APL.

The `union` operator in APL allows you to combine the results of two or more queries into a single output. The operator is useful when you need to analyze or compare data from different datasets or tables in a unified manner. By using `union`, you can merge multiple sets of records, keeping all data from the source tables without applying any aggregation or filtering.

The `union` operator is particularly helpful in scenarios like log analysis, tracing OpenTelemetry events, or correlating security logs across multiple sources. You can use it to perform comprehensive investigations by bringing together information from different datasets into one query.

## Union of two datasets

To understand how the `union` operator works, consider these datasets:

**Server requests**

| \_time | status | method | trace\_id |
| ------ | ------ | ------ | --------- |
| 12:10  | 200    | GET    | 1         |
| 12:15  | 200    | POST   | 2         |
| 12:20  | 503    | POST   | 3         |
| 12:25  | 200    | POST   | 4         |

**App logs**

| \_time | trace\_id | message |
| ------ | --------- | ------- |
| 12:12  | 1         | foo     |
| 12:21  | 3         | bar     |
| 13:35  | 27        | baz     |

Performing a union on `Server requests` and `Application logs` would result in a new dataset with all the rows from both `DatasetA` and `DatasetB`.

A union of **requests** and **logs** would produce the following result set:

| \_time | status | method | trace\_id | message |
| ------ | ------ | ------ | --------- | ------- |
| 12:10  | 200    | GET    | 1         |         |
| 12:12  |        |        | 1         | foo     |
| 12:15  | 200    | POST   | 2         |         |
| 12:20  | 503    | POST   | 3         |         |
| 12:21  |        |        | 3         | bar     |
| 12:25  | 200    | POST   | 4         |         |
| 13:35  |        |        | 27        | baz     |

This result combines the rows and merges types for overlapping fields.

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

<AccordionGroup>
  <Accordion title="Splunk SPL users">
    In Splunk SPL, the `append` command works similarly to the `union` operator in APL. Both operators are used to combine multiple datasets. However, while `append` in Splunk typically adds one dataset to the end of another, APL’s `union` merges datasets while preserving all records.

    <CodeGroup>
      ```splunk Splunk example theme={null}
      index=web OR index=security
      ```

      ```kusto APL equivalent theme={null}
      ['sample-http-logs']
      | union ['security-logs']
      ```
    </CodeGroup>
  </Accordion>

  <Accordion title="ANSI SQL users">
    In ANSI SQL, the `UNION` operator performs a similar function to the APL `union` operator. Both are used to combine the results of two or more queries. However, SQL’s `UNION` removes duplicates by default, whereas APL’s `union` keeps all rows unless you use `union with=kind=unique`.

    <CodeGroup>
      ```sql SQL example theme={null}
      SELECT * FROM web_logs
      UNION
      SELECT * FROM security_logs;
      ```

      ```kusto APL equivalent theme={null}
      ['sample-http-logs']
      | union ['security-logs']
      ```
    </CodeGroup>
  </Accordion>
</AccordionGroup>

## Usage

### Syntax

```kusto theme={null}
T1 | union [withsource=FieldName] [T2], [T3], ...
```

### Parameters

* `T1, T2, T3, ...`: Tables or query results you want to combine into a single output.
* `withsource`: Optional, adds a field to the output where each value specifies the source dataset of the row. Specify the name of this additional field in `FieldName`.

### Returns

The `union` operator returns all rows from the specified tables or queries. If fields overlap, they are merged. Non-overlapping fields are retained in their original form.

## Use case examples

<Tabs>
  <Tab title="Log analysis">
    In log analysis, you can use the `union` operator to combine HTTP logs from different sources, such as web servers and security systems, to analyze trends or detect anomalies.

    **Query**

    ```kusto theme={null}
    ['sample-http-logs']
    | union ['security-logs']
    | where status == '500'
    ```

    **Output**

    | \_time              | id      | status | uri                 | method | geo.city | geo.country | req\_duration\_ms |
    | ------------------- | ------- | ------ | ------------------- | ------ | -------- | ----------- | ----------------- |
    | 2024-10-17 12:34:56 | user123 | 500    | /api/login          | GET    | London   | UK          | 345               |
    | 2024-10-17 12:35:10 | user456 | 500    | /api/update-profile | POST   | Berlin   | Germany     | 123               |

    This query combines two datasets (HTTP logs and security logs) and filters the combined data to show only those entries where the HTTP status code is 500.
  </Tab>

  <Tab title="OpenTelemetry traces">
    When working with OpenTelemetry traces, you can use the `union` operator to combine tracing information from different services for a unified view of system performance.

    **Query**

    ```kusto theme={null}
    ['otel-demo-traces']
    | union ['otel-backend-traces']
    | where ['service.name'] == 'frontend' and status_code == 'error'
    ```

    **Output**

    | \_time              | trace\_id  | span\_id | \['service.name'] | kind   | status\_code |
    | ------------------- | ---------- | -------- | ----------------- | ------ | ------------ |
    | 2024-10-17 12:36:10 | trace-1234 | span-567 | frontend          | server | error        |
    | 2024-10-17 12:38:20 | trace-7890 | span-345 | frontend          | client | error        |

    This query combines traces from two different datasets and filters them to show only errors occurring in the `frontend` service.
  </Tab>

  <Tab title="Security logs">
    For security logs, the `union` operator is useful to combine logs from different sources, such as intrusion detection systems (IDS) and firewall logs.

    **Query**

    ```kusto theme={null}
    ['sample-http-logs']
    | union ['security-logs']
    | where ['geo.country'] == 'Germany'
    ```

    **Output**

    | \_time              | id      | status | uri              | method | geo.city | geo.country | req\_duration\_ms |
    | ------------------- | ------- | ------ | ---------------- | ------ | -------- | ----------- | ----------------- |
    | 2024-10-17 12:34:56 | user789 | 200    | /api/login       | GET    | Berlin   | Germany     | 245               |
    | 2024-10-17 12:40:22 | user456 | 404    | /api/nonexistent | GET    | Munich   | Germany     | 532               |

    This query combines web and security logs, then filters the results to show only those records where the request originated from Germany.
  </Tab>
</Tabs>

## Other examples

### Basic union

This example combines all rows from `github-push-event` and `github-pull-request-event` without any transformation or filtering.

```kusto theme={null}
['github-push-event']
| union ['github-pull-request-event']
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27github-push-event%27%5D%5Cn%7C%20union%20%5B%27github-pull-request-event%27%5D%22%7D)

### Filter after union

This example combines the datasets, and then filters the data to only include rows where the `method` is `GET`.

```kusto theme={null}
['sample-http-logs']
| union ['github-issues-event']
| where method == "GET"
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27sample-http-logs%27%5D%5Cn%7C%20union%20%5B%27github-issues-event%27%5D%5Cn%7C%20where%20method%20%3D%3D%20%5C%22GET%5C%22%22%7D)

### Aggregate after union

This example combines the datasets and summarizes the data, counting the occurrences of each combination of `content_type` and `actor`.

```kusto theme={null}
['sample-http-logs']
| union ['github-pull-request-event']
| summarize Count = count() by content_type, actor
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27sample-http-logs%27%5D%5Cn%7C%20union%20%5B%27github-pull-request-event%27%5D%5Cn%7C%20summarize%20Count%20%3D%20count%28%29%20by%20content_type%2C%20actor%22%7D)

### Filter and project specific data from combined log sources

This query combines GitHub pull request event logs and GitHub push events, filters by actions made by `github-actions[bot]`, and displays key event details such as `time`, `repository`,  `commits`, `head` , `id`.

```kusto theme={null}
['github-pull-request-event']
| union ['github-push-event']
| where actor == "github-actions[bot]"
| project _time, repo, ['id'], commits, head
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27github-pull-request-event%27%5D%5Cn%7C%20union%20%5B%27github-push-event%27%5D%5Cn%7C%20where%20actor%20%3D%3D%20%5C%22github-actions%5Bbot%5D%5C%22%5Cn%7C%20project%20_time%2C%20repo%2C%20%5B%27id%27%5D%2C%20commits%2C%20head%22%7D)

### Union with field removing

This example removes the `content_type` and `commits` field in the datasets `sample-http-logs` and `github-push-event` before combining the datasets.

```kusto theme={null}
['sample-http-logs']
| union ['github-push-event']
| project-away content_type, commits
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27sample-http-logs%27%5D%5Cn%7C%20union%20%5B%27github-push-event%27%5D%5Cn%7C%20project-away%20content_type%2C%20commits%22%7D)

### Filter after union

This example performs a union and then filters the resulting set to only include rows where the `method` is `GET`.

```kusto theme={null}
['sample-http-logs']
| union ['github-issues-event']
| where method == "GET"
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27sample-http-logs%27%5D%5Cn%7C%20union%20%5B%27github-issues-event%27%5D%5Cn%7C%20where%20method%20%3D%3D%20%5C%22GET%5C%22%22%7D)

### Union with order by

After the union, the result is ordered by the `type` field.

```kusto theme={null}
['sample-http-logs']
| union hn
| order by type
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27sample-http-logs%27%5D%5Cn%7C%20union%20hn%5Cn%7C%20order%20by%20type%22%7D)

### Union with joint conditions

This example performs a union and then filters the resulting dataset for rows where `content_type` contains the letter `a` and `city` is `seattle`.

```kusto theme={null}
['sample-http-logs']
| union ['github-pull-request-event']
| where content_type contains "a" and ['geo.city']  == "Seattle"
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27sample-http-logs%27%5D%5Cn%7C%20union%20%5B%27github-pull-request-event%27%5D%5Cn%7C%20where%20content_type%20contains%20%5C%22a%5C%22%20and%20%5B%27geo.city%27%5D%20%20%3D%3D%20%5C%22Seattle%5C%22%22%7D)

### Union and count unique values

After the union, the query calculates the number of unique `geo.city`  and `repo` entries in the combined dataset.

```kusto theme={null}
['sample-http-logs']
| union ['github-push-event']
| summarize UniqueNames = dcount(['geo.city']), UniqueData = dcount(repo)
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%20%22%5B%27sample-http-logs%27%5D%5Cn%7C%20union%20%5B%27github-push-event%27%5D%5Cn%7C%20summarize%20UniqueNames%20%3D%20dcount%28%5B%27geo.city%27%5D%29%2C%20UniqueData%20%3D%20dcount%28repo%29%22%7D)

### Union using withsource

The example below returns the union of all datasets that match the pattern `github*` and counts the number of events in each.

```kusto theme={null}
union withsource=dataset github*
| summarize count() by dataset
```

[Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22union%20withsource%3Ddataset%20github*%20%7C%20summarize%20count\(\)%20by%20dataset%22%7D)

### Union with wildcards

The `union` operator supports wildcards to combine multiple datasets matching a pattern. Use `*` to match all datasets, or a suffix pattern like `github*` to match datasets starting with a prefix.

<Warning>
  The wildcard `*` is useful to match multiple datasets, but it increases query complexity and decreases performance.

  Using `union *` to query all datasets is very expensive. Avoid it in production. Use specific dataset names or prefix patterns instead.
</Warning>

**Match all datasets:**

```kusto theme={null}
union *
| summarize count() by dataset
```

**Match datasets with a prefix:**

```kusto theme={null}
union withsource=dataset github*
| summarize count() by dataset
```

## Best practices for the union operator

To maximize the effectiveness of the union operator in APL, here are some best practices to consider:

* Before using the `union` operator, ensure that the fields being merged have compatible data types.
* Use `project` or `project-away` to include or exclude specific fields. This can improve performance and the clarity of your results, especially when you only need a subset of the available data.
