The variance aggregation function in APL calculates the variance of a numeric expression across a set of records. Variance measures the spread of a dataset in terms of the squared deviations of its data points from the mean, so a larger value means the data is more widely scattered. In scenarios such as performance analysis, network traffic monitoring, or anomaly detection, it helps you identify unstable behavior and outliers by showing how strongly data points deviate from the mean.
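
For reference, the textbook definition can be written out as follows. The symbols x_i (the individual values), n (the number of records), and \mu (the mean) are notation introduced here for illustration. This page does not state whether APL divides by n (population variance) or by n - 1 (sample variance), so both forms are shown and neither should be read as a statement about the implementation.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma^2_{\text{population}} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2,
\qquad
s^2_{\text{sample}} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \mu)^2
```

For large record counts the two forms give nearly identical results.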

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

Usage

Syntax

summarize variance(Expression)

Parameters

  • Expression: A numeric expression or field for which you want to compute the variance. The expression should evaluate to a numeric data type.

Returns

The function returns the variance (a numeric value) of the specified expression across the records.
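
As a minimal sketch of passing an expression rather than a bare field, the query below converts milliseconds to seconds before aggregating. The alias variance_req_duration_s is a hypothetical name chosen for this illustration; the dataset and field are the ones used in the use case example on this page.

```kusto
// Sketch only: variance_req_duration_s is an illustrative alias, not a built-in name.
['sample-http-logs']
| summarize variance_req_duration_s = variance(req_duration_ms / 1000.0)
```

Because variance scales with the square of the unit, the result equals the millisecond-based variance divided by 1,000,000.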

Use case examples

You can use the variance function to measure the variability of request durations, which helps in identifying performance bottlenecks or anomalies in web services.

Query

['sample-http-logs'] 
| summarize variance(req_duration_ms)


Output

variance_req_duration_ms
1024.5

This query calculates the variance of request durations across the HTTP logs dataset. Because variance is expressed in squared units (here, ms²), the value is best read comparatively: a higher variance means request durations are less consistent, which can signal performance issues.
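
To see when durations become erratic, rather than only how erratic they are overall, the same aggregation can be grouped by time. This is a sketch that assumes the dataset carries a _time field and that a one-hour bucket suits the data volume; both are assumptions and not part of the original example.

```kusto
// Sketch: variance of request duration per one-hour bucket.
// Assumes the dataset has a _time field; the 1h bucket size is arbitrary.
['sample-http-logs']
| summarize variance(req_duration_ms) by bin(_time, 1h)
```

Buckets whose variance is far above the others are natural starting points for investigating intermittent slowdowns.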

List of related aggregations

  • stdev: Computes the standard deviation, which is the square root of the variance. Use stdev when you need the spread of data in the same units as the original dataset.
  • avg: Computes the average of a numeric field. Combine avg with variance to analyze both the central tendency and the spread of data.
  • count: Counts the number of records. Use count alongside variance to check that enough records were aggregated for the variance to be meaningful.
  • percentile: Returns a value below which a given percentage of observations fall. Use percentile for a more detailed distribution analysis.
  • max: Returns the maximum value. Use max alongside variance to find the extreme values that contribute to a large spread.
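
The aggregations listed above can be combined in a single summarize to view data size, central tendency, spread, and tail behavior together. This is a sketch against the same dataset and field as the use case example; the 95th percentile is an arbitrary choice for illustration.

```kusto
// Sketch: several aggregations over the same field in one pass.
// The 95 in percentile() is an illustrative choice, not a recommendation.
['sample-http-logs']
| summarize
    count(),
    avg(req_duration_ms),
    variance(req_duration_ms),
    stdev(req_duration_ms),
    percentile(req_duration_ms, 95),
    max(req_duration_ms)
```

Reading stdev next to avg is usually easier than reading variance alone, because both are in milliseconds.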