Migrating from SQL to APL: A Beginner's Guide

Introduction

As data grows exponentially, organizations are continuously seeking more efficient and powerful tools to manage and analyze their data. Axiom Data Explorer, which utilizes the Axiom Processing Language (APL), is one such service that offers fast, scalable, and interactive data exploration capabilities. If you are an SQL user looking to migrate to APL, this guide will provide a gentle introduction to help you make the transition smoothly.

This tutorial walks you through migrating from SQL to APL, explains the key differences between the two, and provides side-by-side query examples.

Introduction to Axiom Processing Language (APL)

Axiom Processing Language (APL) is the language used by Axiom Data Explorer, a fast and highly scalable data exploration service. APL is optimized for real-time and historical data analytics, making it a suitable choice for various data analysis tasks.

Tabular operators: In APL, there are several tabular operators that help you manipulate and filter data, similar to SQL's SELECT, FROM, WHERE, GROUP BY, and ORDER BY clauses. Some of the commonly used tabular operators are:

  • extend: Adds new columns to the result set.
  • project: Selects specific columns from the result set.
  • where: Filters rows based on a condition.
  • summarize: Groups and aggregates data similar to the GROUP BY clause in SQL.
  • sort: Sorts the result set based on one or more columns, similar to ORDER BY in SQL.
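As a rough sketch of how these operators combine, the query below uses each of them once against the sample-http-logs dataset that appears later in this guide. The field names are taken from the examples in this guide; the derived column name req_duration_s and the alias AvgDurationSeconds are illustrative only.

APL:

['sample-http-logs']
| where method == 'GET'
| extend req_duration_s = req_duration_ms / 1000.0
| project status, req_duration_s
| summarize AvgDurationSeconds = avg(req_duration_s) by status
| sort by AvgDurationSeconds desc

Each step receives the rows produced by the step before it: filter first, add a computed column, keep only the needed columns, aggregate, and finally sort.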

Key differences between SQL and APL

While SQL and APL are both query languages, there are some key differences to consider:

  • APL is designed for querying large volumes of structured, semi-structured, and unstructured data.
  • APL is a pipe-based language, meaning you can chain multiple operations using the pipe operator (|) to create a data transformation flow.
  • APL does not use SELECT and FROM clauses as SQL does. Instead, a query starts with the dataset name and uses operators such as summarize, extend, where, and project.
  • APL is case-sensitive, whereas SQL is not.
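To illustrate the case-sensitivity point, here is a minimal sketch. It assumes that APL offers == for case-sensitive string equality and =~ for case-insensitive equality; check the string operator reference before relying on either. The first query matches only values spelled exactly GET, while the second should also match get or Get.

APL:

['sample-http-logs']
| where method == 'GET'

APL:

['sample-http-logs']
| where method =~ 'get'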

Benefits of migrating from SQL to APL:

  • Time Series Analysis: APL is particularly strong when it comes to analyzing time-series data (logs, telemetry data, etc.). It has a rich set of operators designed specifically for such scenarios, making it much easier to handle time-based analysis.

  • Pipelining: APL uses a pipelining model, much like the UNIX command line. You can chain commands together using the pipe (|) symbol, with each command operating on the results of the previous command. This makes it very easy to write complex queries.

  • Easy to Learn: APL is designed to be simple and easy to learn, especially for those already familiar with SQL. You do not need detailed knowledge of database schemas, and you do not need to specify joins.

  • Scalability: APL is backed by a highly scalable data exploration service, so it is designed to handle large amounts of data.

  • Flexibility: APL can query structured, semi-structured, and unstructured data, so one language covers several kinds of data.

  • Features: APL includes capabilities aimed at real-time analytics and time-based analysis.
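As an example of the time-based analysis mentioned above, the following sketch counts GET requests per hour. It assumes that each event carries a timestamp field named _time and that the bin() function is available to round timestamps down to a fixed interval; neither is shown elsewhere in this guide, so treat both as assumptions.

APL:

['sample-http-logs']
| where method == 'GET'
| summarize RequestCount = count() by bin(_time, 1h)

Changing the second argument of bin() (for example to 5m or 1d) would change the size of each time bucket.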

Basic APL Syntax

A basic APL query follows this structure:

<DatasetName>
| <FilteringOperation>
| <ProjectionOperation>
| <AggregationOperation>
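As a concrete instance of this structure, the sketch below names a dataset, filters it, keeps two fields, and then aggregates. It reuses the sample-http-logs dataset and field names from the examples in this guide; the alias AverageRequest is illustrative.

APL:

['sample-http-logs']
| where method == 'GET'
| project ['geo.country'], req_duration_ms
| summarize AverageRequest = avg(req_duration_ms) by ['geo.country']

Note that the dataset name comes first with no leading pipe, and every later step starts with the pipe operator.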

Query Examples

Let's see some examples of how to convert SQL queries to APL.

SELECT with a simple filter

SQL:

SELECT *
FROM [Sample-http-logs]
WHERE method = 'GET';

APL:

['sample-http-logs']
| where method == 'GET'

COUNT with GROUP BY

SQL:

SELECT method, COUNT(*)
FROM [Sample-http-logs]
GROUP BY method;

APL:

['sample-http-logs']
| summarize count() by method

Top N results

SQL:

SELECT TOP 10 Status, Method
FROM [Sample-http-logs]
ORDER BY Method DESC;

APL:

['sample-http-logs']
| top 10 by method desc
| project status, method

Simple filtering and projection

SQL:

SELECT method, status, [geo.country]
FROM [Sample-http-logs]
WHERE resp_header_size_bytes >= 18;

APL:

['sample-http-logs']
| where resp_header_size_bytes >= 18
| project method, status, ['geo.country']

COUNT with a HAVING clause

SQL:

SELECT [geo.country]
FROM [Sample-http-logs]
GROUP BY [geo.country]
HAVING COUNT(*) > 100;

APL:

['sample-http-logs']
| summarize count() by ['geo.country']
| where count_ > 100

Multiple Aggregations

SQL:

SELECT [geo.country],
       COUNT(*) AS TotalRequests,
       AVG(req_duration_ms) AS AverageRequest,
       MIN(req_duration_ms) AS MinRequest,
       MAX(req_duration_ms) AS MaxRequest
FROM [Sample-http-logs]
GROUP BY [geo.country];

APL:

['sample-http-logs']
| summarize TotalRequests = count(),
            AverageRequest = avg(req_duration_ms),
            MinRequest = min(req_duration_ms),
            MaxRequest = max(req_duration_ms) by ['geo.country']

Sum of a column

SQL:

SELECT SUM(resp_body_size_bytes) AS TotalBytes
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| summarize TotalBytes = sum(resp_body_size_bytes)

Average of a column

SQL:

SELECT AVG(req_duration_ms) AS AverageRequest
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| summarize AverageRequest = avg(req_duration_ms)

Minimum and Maximum Values of a column

SQL:

SELECT MIN(req_duration_ms) AS MinRequest, MAX(req_duration_ms) AS MaxRequest
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| summarize MinRequest = min(req_duration_ms), MaxRequest = max(req_duration_ms)

Count distinct values

SQL:

SELECT COUNT(DISTINCT method) AS UniqueMethods
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| summarize UniqueMethods = dcount(method)

Standard deviation of a column

SQL:

SELECT STDDEV(req_duration_ms) AS StdDevRequest
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| summarize StdDevRequest = stdev(req_duration_ms)

Variance of a column

SQL:

SELECT VAR(req_duration_ms) AS VarRequest
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| summarize VarRequest = variance(req_duration_ms)

Multiple aggregation functions

SQL:

SELECT COUNT(*) AS TotalRequests, SUM(req_duration_ms) AS TotalDuration, AVG(req_duration_ms) AS AverageDuration
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| summarize TotalRequests = count(), TotalDuration = sum(req_duration_ms), AverageDuration = avg(req_duration_ms)

Aggregation with GROUP BY and ORDER BY

SQL:

SELECT status, COUNT(*) AS TotalStatus, SUM(resp_header_size_bytes) AS TotalRequest
FROM [Sample-http-logs]
GROUP BY status
ORDER BY TotalRequest DESC;

APL:

['sample-http-logs']
| summarize TotalStatus = count(), TotalRequest = sum(resp_header_size_bytes) by status
| order by TotalRequest desc

Count with a condition

SQL:

SELECT COUNT(*) AS HighContentStatus
FROM [Sample-http-logs]
WHERE resp_header_size_bytes > 1;

APL:

['sample-http-logs']
| where resp_header_size_bytes > 1
| summarize HighContentStatus = count()

Aggregation with HAVING

SQL:

SELECT Status
FROM [Sample-http-logs]
GROUP BY Status
HAVING COUNT(*) > 10;

APL:

['sample-http-logs']
| summarize RequestCount = count() by status
| where RequestCount > 10

Count occurrences of a value in a field

SQL:

SELECT COUNT(*) AS RequestCount
FROM [Sample-http-logs]
WHERE content_type = 'text/csv';

APL:

['sample-http-logs']
| where content_type == 'text/csv'
| summarize RequestCount = count()

String Functions:

Length of a string

SQL:

SELECT LEN(Status) AS NameLength
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend NameLength = strlen(status)

Concatenation

SQL:

SELECT CONCAT(content_type, ' ', method) AS FullLength
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend FullLength = strcat(content_type, ' ', method)

Substring

SQL:

SELECT SUBSTRING(content_type, 1, 10) AS ShortDescription
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend ShortDescription = substring(content_type, 0, 10)

Note that the start positions differ: SQL's SUBSTRING counts characters from 1, while the APL example counts from 0, so both queries return the first 10 characters.

Left and Right

SQL:

SELECT LEFT(content_type, 3) AS LeftTitle, RIGHT(content_type, 3) AS RightTitle
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend LeftTitle = substring(content_type, 0, 3), RightTitle = substring(content_type, strlen(content_type) - 3, 3)

Replace

SQL:

SELECT REPLACE(Status, 'old', 'new') AS UpdatedStatus
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend UpdatedStatus = replace('old', 'new', status)

Upper and Lower

SQL:

SELECT UPPER(content_type) AS UpperFirstName, LOWER(status) AS LowerLastName
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| project UpperFirstName = toupper(content_type), LowerLastName = tolower(status)

LTrim and RTrim

SQL:

SELECT LTRIM(content_type) AS LeftTrimmedFirstName, RTRIM(content_type) AS RightTrimmedLastName
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend LeftTrimmedFirstName = trim_start(' ', content_type), RightTrimmedLastName = trim_end(' ', content_type)

Trim

SQL:

SELECT TRIM(content_type) AS TrimmedFirstName
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend TrimmedFirstName = trim(' ', content_type)

Reverse

SQL:

SELECT REVERSE(Method) AS ReversedFirstName
FROM [Sample-http-logs];

APL:

['sample-http-logs']
| extend ReversedFirstName = reverse(method)

Case-insensitive search

SQL:

SELECT Status, Method
FROM [Sample-http-logs]
WHERE LOWER(Method) LIKE '%get%';

APL:

['sample-http-logs']
| where tolower(method) contains 'get'
| project status, method

Take the First Step Today: Dive into APL

The journey from SQL to APL might seem daunting at first, but with the right approach, it can become an empowering transition. It is about expanding your data query capabilities to leverage the advanced, versatile, and fast querying infrastructure that APL provides. In the end, the goal is to enable you to draw more value from your data, make faster decisions, and ultimately propel your business forward.

Try converting some of your existing SQL queries to APL and observe the performance difference. Explore the Axiom Processing Language and start experimenting with its unique features.

Happy querying!
