> ## Documentation Index
> Fetch the complete documentation index at: https://axiom.co/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# sample

> This page explains how to use the sample operator function in APL.

The `sample` operator in APL psuedo-randomly selects rows from the input dataset at a rate specified by a parameter. This operator is useful when you want to analyze a subset of data, reduce the dataset size for testing, or quickly explore patterns without processing the entire dataset. The sampling algorithm isn’t statistically rigorous but provides a way to explore and understand a dataset. For statistically rigorous analysis, use `summarize` instead.

You can find the `sample` operator useful when working with large datasets, where processing the entire dataset is resource-intensive or unnecessary. It’s ideal for scenarios like log analysis, performance monitoring, or sampling for data quality checks.

## For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

<AccordionGroup>
  <Accordion title="Splunk SPL users">
    In Splunk SPL, the `sample` command works similarly, returning a subset of data rows randomly. However, the APL `sample` operator requires a simpler syntax without additional arguments for biasing the randomness.

    <CodeGroup>
      ```sql Splunk example theme={null}
      | sample 10
      ```

      ```kusto APL equivalent theme={null}
      ['sample-http-logs'] 
      | sample 0.1
      ```
    </CodeGroup>
  </Accordion>

  <Accordion title="ANSI SQL users">
    In ANSI SQL, there is no direct equivalent to the `sample` operator, but you can achieve similar results using the `TABLESAMPLE` clause. In APL, `sample` operates independently and is more flexible, as it’s not tied to a table scan.

    <CodeGroup>
      ```sql SQL example theme={null}
      SELECT * FROM table TABLESAMPLE (10 ROWS);
      ```

      ```kusto APL equivalent theme={null}
      ['sample-http-logs'] 
      | sample 0.1
      ```
    </CodeGroup>
  </Accordion>
</AccordionGroup>

## Usage

### Syntax

```kusto theme={null}
| sample ProportionOfRows
```

### Parameters

* `ProportionOfRows`: A float greater than 0 and less than 1 which specifies the proportion of rows to return from the dataset. The rows are selected randomly.

### Returns

The operator returns a table containing the specified number of rows, selected randomly from the input dataset.

## Use case examples

<Tabs>
  <Tab title="Log analysis">
    In this use case, you sample a small number of rows from your HTTP logs to quickly analyze trends without working through the entire dataset.

    **Query**

    ```kusto theme={null}
    ['sample-http-logs']
    | sample 0.05
    ```

    [Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20sample%200.05%22%7D)

    **Output**

    | \_time              | req\_duration\_ms | id    | status | uri       | method | geo.city | geo.country |
    | ------------------- | ----------------- | ----- | ------ | --------- | ------ | -------- | ----------- |
    | 2023-10-16 12:45:00 | 234               | user1 | 200    | /index    | GET    | New York | US          |
    | 2023-10-16 12:47:00 | 120               | user2 | 404    | /login    | POST   | Paris    | FR          |
    | 2023-10-16 12:48:00 | 543               | user3 | 500    | /checkout | POST   | Tokyo    | JP          |

    This query returns a random subset of 5 % of all rows from the HTTP logs, helping you quickly identify any potential issues or patterns without analyzing the entire dataset.
  </Tab>

  <Tab title="OpenTelemetry traces">
    In this use case, you sample traces to investigate performance metrics for a particular service across different spans.

    **Query**

    ```kusto theme={null}
    ['otel-demo-traces']
    | where ['service.name'] == 'checkoutservice'
    | sample 0.05
    ```

    [Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27otel-demo-traces%27%5D%20%7C%20where%20%5B%27service.name%27%5D%20%3D%3D%20%27checkoutservice%27%20%7C%20sample%200.05%22%7D)

    **Output**

    | \_time              | duration | span\_id | trace\_id | service.name    | kind   | status\_code |
    | ------------------- | -------- | -------- | --------- | --------------- | ------ | ------------ |
    | 2023-10-16 14:05:00 | 1.34s    | span5678 | trace123  | checkoutservice | client | 200          |
    | 2023-10-16 14:06:00 | 0.89s    | span3456 | trace456  | checkoutservice | server | 500          |

    This query returns 5 % of all traces for the `checkoutservice` to identify potential performance bottlenecks.
  </Tab>

  <Tab title="Security logs">
    In this use case, you sample security log data to spot irregular activity in requests, such as 500-level HTTP responses.

    **Query**

    ```kusto theme={null}
    ['sample-http-logs']
    | where status == '500'
    | sample 0.03
    ```

    [Run in Playground](https://play.axiom.co/axiom-play-qf1k/query?initForm=%7B%22apl%22%3A%22%5B%27sample-http-logs%27%5D%20%7C%20where%20status%20%3D%3D%20%27500%27%20%7C%20sample%200.03%22%7D)

    **Output**

    | \_time              | req\_duration\_ms | id    | status | uri      | method | geo.city | geo.country |
    | ------------------- | ----------------- | ----- | ------ | -------- | ------ | -------- | ----------- |
    | 2023-10-16 14:30:00 | 543               | user4 | 500    | /payment | POST   | Berlin   | DE          |
    | 2023-10-16 14:32:00 | 876               | user5 | 500    | /order   | POST   | London   | GB          |

    This query helps you quickly spot failed requests (HTTP 500 responses) and investigate any potential causes of these errors.
  </Tab>
</Tabs>

## List of related operators

* [take](/apl/tabular-operators/take-operator): Use `take` when you want to return the first N rows in the dataset rather than a random subset.
* [where](/apl/tabular-operators/where-operator): Use `where` to filter rows based on conditions rather than sampling randomly.
* [top](/apl/tabular-operators/top-operator): Use `top` to return the highest N rows based on a sorting criterion.
