The Axiom team is inordinately excited by datasets. While we have various ones crafted for stress-testing our product in every way, it's always more special to work with real data that has quirks and wrinkles.
In our search for such data, we came across the GitHub Archive. This is event data generated from GitHub that traces back around nine years to 2011. This means every commit, every push, every issue, every comment... everything.
The team was immediately excited at the prospect of loading that data into Axiom and being able to gain instant insights into how things have changed over the years, whether in the number of repos, repos per organisation, issues, or comments containing profanity. It is a treasure trove of data and we have had a lot of fun going through it.
Our plan is to blog about our work with the archive, sharing some of the insights gained - both obvious trends and maybe some less obvious ones! GitHub is such an important part of the software ecosystem that we're excited to be able to work with its data.
If you're interested in receiving this case study when it's available, as well as keeping up to date on other Axiom news, share your email below.