February 9, 2024

#company, #product, #engineering

Freedom from limits


Author
Neil Jagdish Patel

CEO / Co-founder

Axiom exists because what my co-founders and I wanted as engineers wasn’t possible with the datastores that already existed. Every one of them, open or closed source, was fraught with limitations that constrained our product vision. Instead of accepting that compromise, we shelved our previous product ideas and dove into the actual root cause of these limitations. We applied first-principles thinking to event data handling, and that finally let us build the product we had always dreamed of.


TL;DR

The way you handle event data today is constrained by limitations imposed by the applications you use. Limits on ingest, storage, query, and cost come from old thinking that prevailed when these applications were designed and built. Axiom, redesigned from the ground up for the cloud, frees you from limits via serverless architecture, compute-on-demand, and unparalleled compression. Axiom destroys the false premise that you can’t afford to have immediate access to all your event data.


The Problem - thinking molded to fit the limits

When I talk to customers, I try very early on to throw out everything that’s been done so far. Even the best products still carry constraints from the way they were built, because they were built with old thinking. A lot of teams developed their way of working around the database, and around what the database could do. Without realizing it, they were molding themselves to the limitations of what they were using.

For instance, they’re used to having to pre-prepare data before ingesting it, otherwise queries will be slow. But if you’re pre-preparing data to send to a database, you are essentially a human doing something that the database should have been doing automatically for you.

I try to communicate that the only goal of ingest is to be at the receiving end of a fire hose of whatever is happening, from any point. Whether it’s one event at a time, a thousand events, or a hundred thousand events at a time, it doesn’t matter. Its goal is to take that data in as quickly as possible, make it robust, and make it available for querying. Anything else takes you away from what you should be doing: any preparation, any worrying about budgets, any worrying about limitations.
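To make that concrete, here’s a minimal sketch of what pointing the fire hose at an ingest API can look like from a Go app. It’s illustrative only: the dataset name, token variable, field names, and the exact endpoint shape are assumptions, so treat the API docs as the source of truth.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// A minimal, illustrative ingest sketch: batch up events of whatever shape
// you already have and send them as one JSON array. The endpoint path,
// dataset name, and field names are assumptions for illustration only.
func ingest(events []map[string]any) error {
	body, err := json.Marshal(events)
	if err != nil {
		return err
	}

	url := "https://api.axiom.co/v1/datasets/my-dataset/ingest" // assumed shape
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("AXIOM_TOKEN"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	// One event or a hundred thousand: the contract is the same. A 2xx means
	// the batch has been accepted, made robust, and is available to query.
	if resp.StatusCode >= 300 {
		return fmt.Errorf("ingest failed: %s", resp.Status)
	}
	return nil
}

func main() {
	// No pre-preparation, no schema declared up front: just send what you have.
	events := []map[string]any{
		{"level": "info", "msg": "checkout complete", "order_id": 1234},
		{"level": "error", "msg": "payment timeout", "region": "eu-west-1"},
	}
	if err := ingest(events); err != nil {
		fmt.Println(err)
	}
}
```

The point isn’t the plumbing; it’s that the batch can be one event or a hundred thousand, in whatever shape it already has, and the contract stays the same.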

The datastore shouldn’t be a limiting factor. It should be an unlock on what’s possible. The entire world is moving towards AI: teams index their docs site or maybe their intranet, and now they have an LLM that can tell them things. Suddenly it hits them: I wish I also had the data I just disposed of, because I could train something, or find something, or do whatever is in line with my company. The problem is that everything is based on the minimal thing you can do. What’s the minimal license I can afford to do what I need to do? The minimal data I can ingest and keep for the minimal period? The minimal way I can query this minimal stuff? Everything is tied to cost or storage or maintenance.

Now imagine the flip side. Imagine you didn’t treat it as, “I just need to get this done,” or “I just need to push this much data for it to be valuable.” Imagine instead if your thinking were, “I’m not sure this is useful now, but it costs me so little and is so easy to keep that I’ll have it for later as the world evolves.” What would you do then with logging, or observability, or security, or whatever else?

Very few organizations, whether they’re startups or enterprises, can tell you how their infrastructure evolved over the calendar year that just passed. Because everyone’s minimizing retention, you learn nothing. You’re always on the remediation step after something’s gone wrong. You don’t know that your Postgres has been trending upward since the beginning of the year, because you didn’t keep the data points together in storage long enough to see it.

It’s like a healthcare system that only treats patients once they’re in hospice. Treatment starts too late and prevention never happens. You don’t have the time, you don’t have the data. Thinking ahead is something you just can’t do with what’s out there.

You can’t ingest all your events to look for patterns; you’d be out of budget by month three. And you don’t know how your events are going to be used in the future, so the same thinking applies to storage.

Finally, neither of those matters if you can’t search the data. Limits mean you have to plan how to search it: how big a warehouse do I need for the query I want to run? You have to provision your searching, because Snowflake is charging by the hour and might auto-stop at your spending limit before you realize it’s happening.

Limitation, limitation, limitation. There are blockers all the way down.

You would never build a business like this, right? If you were blocking the sales team, they’d tell you to get out of the way. But with these datastores, you accept it every time: the blocker on ingest, the blocker on retention, the blocker on what data is available, on how you can curate it, on how fast it will come back to you. It’s not the fault of the team, but those limitations poison everything around them. They lead to people having 15-day retention. They lead to not being able to answer a question like: hey, we just detected that X in Y IP space did something three months ago, can you tell us if it actually affected us? We don’t know — we don’t have the data.

Remember before the cloud? You had to provision servers. You had to know how much RAM, CPU, and disk those servers needed. You had to think before you built. Did you have access, and which team would provision it for you? Was it secure? What else? Now, you don’t even think about all this.

That moment is now for event datastores. You’ve been provisioning, you’ve been thinking, you’ve been pre-preparing, you’ve been scaling, you’ve been maintaining, you’ve been doing all these things.

Now, Axiom is just there. It’s ready to go. What do you want to do? (By the way, do you have an archive from three years ago? You can just send that data and Axiom will build it into your archive. You don’t have to ask us how to do it. You don’t have to tell us. It’s all ready to go.)

Once you’ve learned to think and work without the limits that constrained every idea, you can’t go back.


The Solution — redesign everything to remove the limits

Removing the limits for event datastores was like the cloud: it had to be reinvented from scratch. There are no off-the-shelf parts inside Axiom. We needed to understand each domain individually — ingest, storage, query — and then how they all fit together.

We didn’t try to amend something that was already out there, hacking it into existence from the pieces that we already had. We had to go further down, back to the primordial soup of what’s available in the cloud, and build it back up into something totally new.

The deep, deep efficiency and value you get out of our ingest pipeline comes from taking everything that used to surround ingest — the effort and thought and work of keeping a Kafka pipeline running so you feel safe, the worrying about how many nodes you need to ingest X amount of data — and compressing it into a container, using what’s out there in the cloud and new ways of thinking, including distributed systems thinking. Two containers do the work of six or ten nodes in the old days.

Forget about megabytes of RAM; we worried about kilobytes of RAM so we could pack as much density as possible into each container while you are ingesting, and still give you the robustness guarantees you expect from a service like this, without having to resort to off-the-shelf products like Kafka.

We wanted to make it so efficient that sampling is something you can eject from your mind. You’re not worrying about schemas, either. You’re not worrying about the shape of the data. You send us what you have as it comes out of whatever you’re using, whether it’s a file, your Golang app, or anything else. Axiom doesn’t care. Its job is to make sure you get the 200 status back, and that the data is robust and written to disk somewhere.

The other part is making that data available for querying straight away, so once you get that 200 back from the API, you’re ready to go. You don’t need to worry about what’s happening internally. There’s no syncing, there are no shards that have to balance out, no worrying that this node may go down, that thing may not scale, or this one’s running out of memory. Our work was to make the density of ingest per container as high as possible. You see the benefits in reduced cost, too.
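Picking up where the ingest sketch above left off, here’s what that read-after-write promise looks like in a sketch: the moment the 200 comes back, the same events should be visible to a query. Again, the endpoint path, request body shape, and dataset/field names are assumptions for illustration, not a reference for the real query API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// queryAPL sends an APL query and returns the raw JSON response. The endpoint
// path and body fields below are assumptions; check the API docs for the
// canonical query interface.
func queryAPL(apl string) ([]byte, error) {
	body, err := json.Marshal(map[string]any{
		"apl":       apl,
		"startTime": time.Now().Add(-5 * time.Minute).Format(time.RFC3339),
		"endTime":   time.Now().Format(time.RFC3339),
	})
	if err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost,
		"https://api.axiom.co/v1/datasets/_apl", bytes.NewReader(body)) // assumed path
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("AXIOM_TOKEN"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 300 {
		return nil, fmt.Errorf("query failed: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// Right after the ingest call above returns its 200, a query like this
	// should already see the new events (dataset/field names are hypothetical).
	res, err := queryAPL("['my-dataset'] | where order_id == 1234 | count")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(res))
}
```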

All that data can’t live inside a container; it has to land somewhere. If limitless is the goal of Axiom, that’s why we chose object storage: it’s essentially limitless for everyone. The off-the-shelf route would have been taking Parquet or other generic formats made for archiving, throwing them inside S3, and calling it a day.

That wouldn’t have given you the kind of retention you’re seeking, the density of data, or the compression we do around the data. Compression doesn’t only affect writing — for the same dollar, you’re storing much, much more data when the compression is higher. It also affects the read side: once you have 100% of your data, unsampled, in one store that is constantly hot, the next thing you want to do is search it.

On the read side, that compression has a massive effect. With limitless access you just tell us what you want and we will find it. We do what we need to make it come back to you, whether the question is about five years ago, five seconds ago, or a hundred milliseconds ago.

We went deep into every domain and turned each knob up to eleven. To do that you have to think about each one from scratch, and also about how they all fit together. I mentioned not preparing data as it comes in. Well, that means you need a query language that’s flexible beyond SQL, one that can manipulate data in flight as your query derives new columns. We have that in APL, which applies schemas as it reads the stored data and can pipe the output of one query into the input of another.
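As a hedged illustration of what that flexibility looks like, here’s an APL-style query, wrapped in a small Go program so it can be dropped into the read-side sketch above. The dataset, field names, and functions are hypothetical examples in Kusto-like syntax; the APL documentation is the reference for what’s actually supported.

```go
package main

import "fmt"

func main() {
	// An illustrative APL query in Kusto-like syntax. The dataset name, field
	// names, and functions here are hypothetical, not taken from a real schema.
	//
	// Each `|` pipes one stage's output into the next. The schema is applied
	// as the stored data is read, and `extend` derives a new column in flight
	// instead of requiring it to have existed at ingest time.
	query := `
['http-logs']
| where status >= 500
| extend duration_ms = todouble(resp_ns) / 1000000
| summarize p95 = percentile(duration_ms, 95) by bin(_time, 1h)
| order by p95 desc
`
	fmt.Println(query)
	// You would send this string as the APL body of a query request, as in the
	// earlier read-side sketch.
}
```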

On top of all this, it’s important that you also have a provisionless architecture. You shouldn’t need to worry about maxing out your team’s capacity, or running out of room for whatever you can think of to do with Axiom and your infrastructure’s events.

Limitless ingest, storage and query may be new to some ears, but they’ve been in production for more than two years. They handle 30,000 users’ worth of data. They’re used by different customers for all sorts of different workloads all the time.

You have your own use cases, but in any of them you’ll really feel the effect of lifting the limitations. The time to value is so quick — it’s a leap like when the cloud blew past old limits in the previous decade. Whether you’re using Axiom for volumetric observability or for long-term retention, on day one you’re doing things you couldn’t, and still can’t, do anywhere else.

That’s what we’re trying to do for the world. Hopefully you’ll test every part of Axiom, because you are to us what we (trust me) are to AWS. Axiom likewise gives you the ability to wake up every morning and decide where you’re going to go with it today. You’ve gone far by pushing your systems to their limits, but wait until you discover that this one doesn’t have any.


Interested in learning more about Axiom?

100% of your data for every possible need: o11y, security, analytics, and new insights.

Sign up for free or contact us at sales@axiom.co to talk with one of the team about our enterprise plans.

Get started with Axiom

Learn how to start ingesting, streaming, and querying data in Axiom in less than 10 minutes.