Manage datasets

This reference article explains how to manage datasets in Axiom, including creating datasets, importing data, trimming datasets, vacuuming fields, and deleting datasets.

What datasets are

In Axiom, an individual piece of data is an event, and a dataset is a collection of similar events. Datasets store the event data that you send to Axiom.

Dataset names are 1 to 128 characters in length. They can contain only ASCII alphanumeric characters and the hyphen (-) character.
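The naming rule above can be checked locally before you create a dataset. The following is a minimal sketch that encodes only the two constraints stated here (length and allowed characters). The function name is hypothetical, and Axiom may apply further checks that this sketch doesn't cover.

```python
import re

# Pattern derived from the rule above: 1 to 128 ASCII letters, digits, or hyphens.
DATASET_NAME_RE = re.compile(r"[A-Za-z0-9-]{1,128}")


def is_valid_dataset_name(name: str) -> bool:
    """Return True if name satisfies the documented length and character rules."""
    # fullmatch requires the whole string to match, so trailing newlines fail.
    return DATASET_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_dataset_name("http-logs")` passes, while a name that contains an underscore or is longer than 128 characters fails.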

Create dataset

To create a dataset, follow these steps:

  1. Click Settings > Datasets.
  2. Click New dataset.
  3. Name the dataset, and then click Add.

Import data

You can import data to your dataset in one of the following formats:

  • Newline delimited JSON (NDJSON)
  • Arrays of JSON objects
  • CSV

To import data to a dataset, follow these steps:

  1. Click Settings > Datasets.
  2. In the list, click the dataset where you want to import data.
  3. Click Import.
  4. Optional: Specify the timestamp field. This is necessary only if your data contains a timestamp field with a name other than _time.
  5. Upload the file, and then click Import.
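To illustrate the three supported formats, the sketch below serializes the same events as NDJSON, as an array of JSON objects, and as CSV using only the Python standard library. The sample events and their field names (other than _time) are hypothetical, and the helper names are illustrative. The CSV helper assumes a non-empty list in which every event has the same fields.

```python
import csv
import io
import json

# Hypothetical sample events; only the _time field name comes from the text above.
events = [
    {"_time": "2024-01-01T00:00:00Z", "status": 200, "path": "/"},
    {"_time": "2024-01-01T00:00:05Z", "status": 404, "path": "/missing"},
]


def to_ndjson(rows):
    """NDJSON: one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows) + "\n"


def to_json_array(rows):
    """A single JSON array that contains all objects."""
    return json.dumps(rows)


def to_csv(rows):
    """CSV: a header row followed by one line per event."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Because these events already use _time as the timestamp field, the optional timestamp field setting in step 4 wouldn't be needed for a file produced this way.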

Trim dataset

Trimming a dataset deletes all data in the dataset before a date you specify. This can be useful if your dataset contains too many fields or takes up too much storage space, and you want to reduce its size to ensure you stay within the allowed limits.

Warning

Trimming a dataset deletes all data before the specified date.

To trim a dataset, follow these steps:

  1. Click Settings > Datasets.
  2. In the list, click the dataset that you want to trim.
  3. Click Trim dataset.
  4. Specify the date before which you want to delete data.
  5. Enter the name of the dataset, and then click Trim.
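The effect of trimming can be pictured with a small local model. This is only an illustration of the semantics described above, not how Axiom implements trimming. The function name and sample events are hypothetical, and keeping events that fall exactly on the cutoff is an assumption based on the word "before".

```python
from datetime import datetime, timezone


def trim_events(events, cutoff):
    """Return the events that survive a trim: those not older than cutoff."""
    # Assumption: an event exactly at the cutoff is kept, because only data
    # before the specified date is deleted.
    return [e for e in events if datetime.fromisoformat(e["_time"]) >= cutoff]


# Hypothetical events with ISO 8601 timestamps in UTC.
events = [
    {"_time": "2024-01-01T00:00:00+00:00", "msg": "old"},
    {"_time": "2024-03-01T00:00:00+00:00", "msg": "recent"},
]
cutoff = datetime(2024, 2, 1, tzinfo=timezone.utc)
remaining = trim_events(events, cutoff)
```

Here, `remaining` holds only the event from March, because the January event is older than the cutoff date.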

Vacuum fields

The data schema of your dataset is defined on read. Axiom continuously creates and updates the data structures during the data ingestion process. At the same time, Axiom only retains data for the retention period defined by your pricing plan. This means that the data schema can contain fields that you ingested into the dataset in the past, but these fields are no longer present in the data currently associated with the dataset. This can be an issue if the number of fields in the dataset exceeds the allowed limits.

In this case, vacuuming fields in a dataset can help you reduce the number of fields associated with a dataset and stay within the allowed limits. Vacuuming fields resets the fields associated with a dataset to those that occur in events within your retention period. Technically, it wipes the data schema and rebuilds it from the data you currently have in the dataset, which is partly defined by the retention period. For example, assume that you ingested 500 distinct fields over the last year, but only 50 of them in the last 95 days, which is your retention period. Before vacuuming, your data schema contains 500 fields. After vacuuming, it contains only 50 fields.

Vacuuming fields doesn't delete any events from your dataset. To delete events, trim the dataset. You can use trimming and vacuuming in combination. For example, if you accidentally ingested events with fields you didn't want to send to Axiom, and these events are within your retention period, vacuuming alone doesn't solve your problem. In this case, first trim the dataset to delete the events with the unintended fields, and then vacuum the fields to rebuild the data schema.
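The interplay between retention, trimming, and vacuuming can be sketched with a toy in-memory model. This is a simplified illustration of the behavior described above, not Axiom's implementation. The class, its methods, the field names, and the use of integer day numbers for _time are all hypothetical.

```python
class DatasetModel:
    """Toy in-memory model of a dataset; not Axiom's implementation."""

    def __init__(self):
        self.events = []     # events currently stored
        self.schema = set()  # field names recorded in the data schema

    def ingest(self, event):
        self.events.append(event)
        self.schema.update(event.keys())  # the schema grows as new fields arrive

    def drop_before(self, day):
        # Models both retention expiry and trimming: events go, the schema stays.
        self.events = [e for e in self.events if e["_time"] >= day]

    def vacuum(self):
        # Rebuild the schema from the events that are still stored.
        self.schema = set().union(*(e.keys() for e in self.events))


ds = DatasetModel()
ds.ingest({"_time": 1, "legacy_field": "x"})    # will age out of retention
ds.ingest({"_time": 10, "debug_dump": "oops"})  # unintended field, still retained
ds.ingest({"_time": 11, "status": 200})

ds.drop_before(5)               # retention removes the day-1 event
stale_schema = set(ds.schema)   # still lists legacy_field

ds.vacuum()
after_vacuum = set(ds.schema)   # legacy_field is gone, debug_dump remains

ds.drop_before(11)              # trim away the event that carries debug_dump
ds.vacuum()
after_trim_and_vacuum = set(ds.schema)
```

In this model, the first vacuum removes only the field whose events have already expired. The unintended field disappears from the schema only after the event that carries it is trimmed and the fields are vacuumed again.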

Info

You can only vacuum fields once per day for each dataset.

To vacuum fields, follow these steps:

  1. Click Settings > Datasets.
  2. In the list, click the dataset where you want to vacuum fields.
  3. Click Vacuum fields.
  4. Select the checkbox, and then click Vacuum.

Delete dataset

Warning

Deleting a dataset deletes all data contained in the dataset.

To delete a dataset, follow these steps:

  1. Click Settings > Datasets.
  2. In the list, click the dataset that you want to delete.
  3. Click Delete dataset.
  4. Enter the name of the dataset, and then click Delete.
