Online evaluations for AI engineering

This week we've added support for online evaluations in AI engineering workflows, shipped major upgrades to metrics dashboards and the query canvas, and made API token management significantly easier.

Online evaluations for AI engineering

Previously, evaluations only ran offline against curated test cases with expected outputs. You could catch regressions before deploying, but you had no way to continuously score quality once your capability was live.

Online evaluations let you score your AI capability's outputs on live production traffic. Import onlineEval from the Axiom AI SDK and call it inside withSpan. Each evaluation creates OTel spans linked to the originating generation span, so you can trace scores back to the request that produced them.

Because there's no ground truth for live traffic, online evaluations use reference-free scorers that only see input and output. Write boolean scorers for pass/fail checks, numeric scorers for graded scoring, or async LLM-as-judge scorers for deeper quality assessment.
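To make the three scorer styles concrete, here is a minimal sketch in TypeScript. It doesn't call the Axiom AI SDK: the ScorerArgs, Scorer, and Judge types, the scorer names, and the scoreAll helper are all hypothetical and exist only for illustration. See the Online evaluation documentation for the actual onlineEval signature.

```typescript
// Hypothetical shapes for illustration only; these are not the Axiom AI SDK's types.
type ScorerArgs = { input: string; output: string };
type Score = boolean | number;
type Scorer = (args: ScorerArgs) => Score | Promise<Score>;

// Boolean scorer: pass/fail check that the output is non-empty and at most 500 characters.
const withinLength: Scorer = ({ output }) =>
  output.trim().length > 0 && output.length <= 500;

// Numeric scorer: fraction (0 to 1) of input keywords that also appear in the output.
// A "keyword" here is any run of four or more letters, an arbitrary choice for this sketch.
const keywordCoverage: Scorer = ({ input, output }) => {
  const words: string[] = input.toLowerCase().match(/[a-z]{4,}/g) ?? [];
  const keywords = Array.from(new Set(words));
  if (keywords.length === 0) return 1;
  const text = output.toLowerCase();
  const hits = keywords.filter((k) => text.includes(k)).length;
  return hits / keywords.length;
};

// Async LLM-as-judge scorer. The judge function is injected, so any model client
// can be plugged in; it is expected to reply with a single digit from 1 to 5.
type Judge = (prompt: string) => Promise<string>;

const makeJudgeScorer = (judge: Judge): Scorer => async ({ input, output }) => {
  const reply = await judge(
    "Rate from 1 to 5 how well the answer addresses the question.\n" +
      `Question: ${input}\nAnswer: ${output}\nReply with a single digit.`
  );
  const rating = Number.parseInt(reply.trim(), 10);
  // Map 1..5 onto 0..1; treat unparseable replies as 0.
  if (Number.isNaN(rating)) return 0;
  return (Math.min(5, Math.max(1, rating)) - 1) / 4;
};

// Runs every scorer against a single input/output pair and collects the results by name.
async function scoreAll(
  args: ScorerArgs,
  scorers: Record<string, Scorer>
): Promise<Record<string, Score>> {
  const results: Record<string, Score> = {};
  for (const name of Object.keys(scorers)) {
    results[name] = await scorers[name](args);
  }
  return results;
}
```

None of these scorers needs an expected output, which is what makes them usable on live traffic. Injecting the judge keeps the LLM-based scorer testable with a stub and independent of any particular model provider.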

For more information, see the Online evaluation documentation and the blog post.

API token presets

Creating API tokens with the right permissions used to mean toggling individual capabilities one by one. Token presets let you pick a predefined permission set that matches common use cases, then customize from there. This cuts down the time it takes to set up tokens for ingestion, querying, or admin workflows.

GenAI dataset in Playground

We've added a GenAI dataset to the Playground. It contains a collection of GenAI traces that you can use to explore how GenAI data looks in Axiom.

You can find the new dataset in:

The Datasets tab of the Playground
The automatically created GenAI dashboard in the Dashboards tab

Placeholder configurator component

The documentation just got smarter. Previously, you had to manually rewrite placeholders such as AXIOM_DOMAIN in code snippets. We've now added a configurator component below code examples that you can use to replace placeholders and copy code that's tailored to your specific environment.

To try it out, open a documentation page with a code example and start replacing placeholders using the configurator component.

Smart features for blog and changelog

We now support RSS feeds for the blog and the changelog. You can subscribe to the feed by clicking the RSS icon in the top right corner of the blog or changelog overview pages.

You can also filter changelog items by labels such as New feature, Improvement, or Bug fix, so you can easily find changes related to a specific feature or bug fix.

See these smart features in action in the blog and changelog overview pages.

More of our favorite changes

Added fit-columns-to-width mode for the events table, so data columns fill exactly 100% of the viewport width without horizontal scroll
Ctrl+Click on a monitor list row now opens the monitor in a new tab
