Author: Dominic Chapman, Head of Product
- Rich AI telemetry: Capture prompts, completions, tool calls, and costs with minimal code changes
- AI-native trace visualization: Debug complex multi-step workflows with dedicated views
- Automatic cost tracking: Real-time LLM cost calculation using our open-source ModelDB API
- Ready-to-use dashboard: Pre-built observability dashboard appears with your first AI telemetry
- AI engineering: OTel-based observability provides the foundation for systematic evaluation and iteration
Building with generative AI can feel like navigating uncharted territory. Traditional software engineering best practices don’t quite fit. Outputs are probabilistic, not deterministic, which means a feature that worked perfectly yesterday might produce unexpected results today. This inherent uncertainty makes it challenging to ship with confidence.
At Axiom, we've seen this challenge firsthand. Frontier model providers trust Axiom to handle petabytes of machine data, from logs and traces to the critical annotations their domain experts provide to establish ground truth for model evaluation. This workflow highlighted a new, essential dimension to observability.
Conversations with dozens of engineering leaders from companies like incident.io, Pattern, and SchoolAI confirmed it: teams building with AI are grappling with the same fundamental problems. They need an extension to their traditional logging: a new toolkit built for the probabilistic world of generative AI.
That's why today, we're thrilled to announce our first step towards making confident AI engineering the standard: a new set of observability features purpose-built for AI development.
What's new today: A practical toolkit for AI observability
This announcement is the first milestone in a much larger initiative. We're starting with the foundational layer that every team needs: deep, actionable observability. Here’s a closer look at what you can use today and why it matters.
From generic logs to rich AI telemetry
Standard logging and tracing weren't designed for the nuances of AI. To truly understand a generative AI system, you need to capture a new class of data: the prompts, the completions, the tool calls, the token counts, and the user feedback.
Our new AI SDK at github.com/axiomhq/ai, built on the OpenTelemetry standard, makes this effortless.
npm i axiom
But we didn’t just implement the spec; we improved it. We learned from customer interviews that developers think in terms of features or workflows, but the current OTel semantic conventions for Gen AI lack a way to represent this. That’s why we’ve augmented the spec with two key attributes: gen_ai.capability.name and gen_ai.step.name.
Now, you can tie every piece of telemetry back to a specific business capability and a human-readable step, whether it’s a simple single-turn model interaction, a workflow, or a single- or multi-agent system.
As an example, here’s how easy it is to add rich tracing to a Gen AI model call in Axiom, using a lightweight wrapper around Vercel’s AI SDK:
import { withSpan, wrapAISDKModel } from 'axiom/ai';
import { generateText } from 'ai';
import { createOpenAI } from '@ai-sdk/openai';

// 1. Wrap your existing model client
const gpt4oMini = wrapAISDKModel(
  createOpenAI()('gpt-4o-mini')
);

// 2. Use withSpan to add business context
async function handleSupportRequest(userId: string, question: string) {
  const { text } = await withSpan(
    { capability: 'support_assistant', step: 'generate_answer' },
    async (span) => {
      span.setAttribute('user_id', userId);
      return generateText({
        model: gpt4oMini,
        messages: [{ role: 'user', content: question }],
      });
    }
  );
  return text;
}
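Calling the helper looks like any other async function; the tracing rides along invisibly. With illustrative values, the resulting span should carry the business context from the example above:

const answer = await handleSupportRequest('user_123', 'How do I reset my password?');
// The span for this call now includes:
//   gen_ai.capability.name = 'support_assistant'
//   gen_ai.step.name       = 'generate_answer'
//   user_id                = 'user_123'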
A trace view that understands AI
A standard distributed trace is useful, but it can't tell the full story of an AI interaction. Debugging often means reviewing the full interaction history in a single, coherent view: every message, every function call, and every response.
Our refreshed trace waterfall is purpose-built for this use case. It visualizes entire multi-step workflows, allowing you to:
- Inspect inputs and outputs at every stage of the chain.
- See exactly what arguments were passed to your tools and what data was returned.
- Quickly identify high-latency steps or unexpected errors in complex agentic loops.
This is the difference between having a simple log of events and having a complete, interactive narrative of how your AI capability reasoned its way to a specific output.
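To make those tool-call rows in the waterfall concrete, here is a sketch of a tool-using step, assuming Vercel AI SDK v4's tool() helper and the wrapped gpt4oMini model from the example above (the order-status tool is hypothetical and stubbed):

import { generateText, tool } from 'ai';
import { z } from 'zod';

// Hypothetical tool; the waterfall shows its arguments and return value.
const getOrderStatus = tool({
  description: 'Look up the status of a customer order',
  parameters: z.object({ orderId: z.string() }),
  execute: async ({ orderId }) => ({ orderId, status: 'shipped' }), // stubbed lookup
});

const { text } = await generateText({
  model: gpt4oMini, // wrapped model from the earlier example
  tools: { getOrderStatus },
  maxSteps: 3, // let the model call the tool, then compose an answer
  messages: [{ role: 'user', content: 'Where is order 8812?' }],
});

Each tool invocation should then appear as its own step in the trace, with the orderId argument and the returned status inspectable inline.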
Your mission control: The Generative AI Overview dashboard
Once the data is flowing, teams need a way to see the complete picture. As soon as you send AI telemetry to Axiom, a pre-built Generative AI Overview dashboard automatically appears. This dashboard provides at-a-glance answers to the most critical questions engineering leaders are asking:
- Cost: How much are we spending per capability, per model, or even per user?
- Performance: Which models or capabilities are the slowest on average, or in outlier cases?
- Usage: Which models are being used most frequently? Are we seeing a spike in errors?
- Tools: Which tools are being called most often, and which ones are failing?
This isn't just a collection of charts; it’s a command center that turns raw telemetry into strategic insights, helping you manage costs and optimize performance without manual configuration.
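If you want the same numbers outside the dashboard, you can also query the telemetry directly. A rough sketch using the @axiomhq/js client; the dataset name and cost field here are assumptions for illustration, not a documented schema:

import { Axiom } from '@axiomhq/js';

const axiom = new Axiom({ token: process.env.AXIOM_TOKEN! });

// Hypothetical APL: roll up spend per capability (dataset and field names are illustrative).
const result = await axiom.query(`
  ['ai-traces']
  | summarize total_cost = sum(['attributes.gen_ai.cost.total']) by ['attributes.gen_ai.capability.name']
  | sort by total_cost desc
`);
console.log(result);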
Effortless cost tracking with ModelDB
A frequent pain point we heard from teams is the difficulty of tracking LLM costs in real time. Calculating costs often involves manual lookups, custom code, or delayed batch processing.
We’ve addressed this with our open-source, free-to-use ModelDB: a REST API that maintains a comprehensive, up-to-date database of AI models and their associated costs. When your telemetry arrives, Axiom uses ModelDB to enrich spans with precise cost data based on the model used and the input/output token counts. This means every trace and every dashboard panel contains accurate cost information out of the box, giving you the immediate financial visibility needed to run AI features profitably.
curl "https://modeldb.axiom.co/api/v1/models/gpt-5?pretty"
{
"model_id": "gpt-5",
"model_name": "GPT 5",
"provider_id": "openai",
"max_input_tokens": 400000,
"max_output_tokens": 128000,
"input_cost_per_token": 0.00000125,
"output_cost_per_token": 0.00001,
"supports_function_calling": true,
"supports_vision": true,
"supports_reasoning": true,
"model_type": "chat"
}
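This mirrors the enrichment Axiom applies on ingest, and it makes any number easy to sanity-check yourself. A sketch against the response shape above, with illustrative token counts:

// Price a single call from ModelDB's published per-token rates.
const res = await fetch('https://modeldb.axiom.co/api/v1/models/gpt-5');
const model = await res.json();

const inputTokens = 1200; // illustrative counts from one call
const outputTokens = 350;

const costUSD =
  inputTokens * model.input_cost_per_token +   // 1200 × $0.00000125 = $0.0015
  outputTokens * model.output_cost_per_token;  // 350 × $0.00001 = $0.0035

console.log(`Estimated call cost: $${costUSD.toFixed(4)}`); // $0.0050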
Our approach: Start fast, build on open standards
Our initial release prioritizes TypeScript with seamless integration for Vercel’s AI SDK—the fastest path to get started. But the underlying OpenTelemetry schema is language-agnostic and designed to enable the advanced AI engineering features we’re building—like systematic evaluations and A/B testing.
If you're building with Python, Go, or another stack, you can start capturing AI telemetry today using manual instrumentation. We plan to expand our SDK support to make the experience as smooth as TypeScript. Your insights will be invaluable as we prioritize which languages and frameworks to support next. Please get in touch.
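Until then, the schema is just OpenTelemetry attributes, so any OTel SDK can emit it today. A minimal sketch of the idea using the raw OpenTelemetry API, shown in TypeScript for brevity; the equivalent calls exist in the Python and Go SDKs, and the attribute values are illustrative:

import { trace } from '@opentelemetry/api';

const tracer = trace.getTracer('my-ai-app');

// Manually emit a Gen AI span carrying Axiom's capability/step attributes.
await tracer.startActiveSpan('generate_answer', async (span) => {
  span.setAttribute('gen_ai.capability.name', 'support_assistant');
  span.setAttribute('gen_ai.step.name', 'generate_answer');
  span.setAttribute('gen_ai.request.model', 'gpt-4o-mini'); // standard OTel Gen AI attribute
  try {
    // ... call your model provider and record prompts/completions here ...
  } finally {
    span.end();
  }
});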
This is just the beginning
As experimental projects mature, the "vibe coding" approach has to change. With Axiom, you can build, ship, and iterate on AI capabilities with the same rigor and confidence we’ve come to expect from traditional software.
This release provides the critical observability foundation. Next, we’ll be launching a comprehensive evaluation framework to help you measure the quality of your capabilities and iterate faster. Ready to take the first step?