Telling the difference between time series data and events
Time series data is a collection of data points or measurements taken at different times (as opposed to data describing many objects at a single point in time). Structurally, it shares many characteristics with event streams.
In this article, we will go through some tips on telling the two apart.
Does it have a timestamp?
Time series data always has a timestamp; event data usually does as well. What makes this confusing is that many other data models also include timestamps, so a timestamp alone doesn’t settle the question and we’ll need to investigate further.
Is it ever updated?
Neither time series data nor event streams are ever updated. Once the data has been recorded, the records are immutable. Each time series data point is a measurement taken at a single point in time; each event (as the name implies) describes a single occurrence. If something else happens later, that’s a new data point in both cases.
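As a minimal sketch of this append-only behaviour, assuming a hypothetical `Reading` record whose field names are made up for illustration (they are not taken from any particular product), a frozen Python dataclass can model a record that cannot be changed after it is created. A later measurement becomes a new entry instead:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Reading:
    """Hypothetical immutable record; the field names are illustrative only."""
    timestamp: int       # seconds since the Unix epoch
    cpu_percent: float


# The stored history only ever grows.
history = [Reading(timestamp=1700000000, cpu_percent=41.5)]

# A later measurement is a new data point, not an update to the old one.
history.append(Reading(timestamp=1700000060, cpu_percent=43.0))

# Attempting to modify an existing record is rejected.
try:
    history[0].cpu_percent = 50.0
    update_rejected = False
except FrozenInstanceError:
    update_rejected = True
```

If your data needs in-place updates to stay correct (a user profile, an account balance), that is a hint that it is neither a time series nor an event stream.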
How many fields does it have?
Time series data typically has only one measurement in each data point. It may also carry a number of “labels” indicating which metric it is, which server and CPU it relates to, and so on.
Events usually have more fields, to provide the detail on the event that occurred. If a user logged in, we’ll get the user and perhaps the referrer information, not just a numeric measurement.
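To make the difference in shape concrete, here is a rough sketch of the two kinds of record. The class names, field names and label keys (`MetricPoint`, `LoginEvent`, `server`, `referrer` and so on) are assumptions chosen to mirror the examples above, not a schema used by any specific tool:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class MetricPoint:
    """Time series shape: one numeric measurement plus identifying labels."""
    timestamp: datetime
    value: float
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class LoginEvent:
    """Event shape: several descriptive fields about what happened."""
    timestamp: datetime
    user: str
    referrer: Optional[str] = None


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# One number, described by labels.
point = MetricPoint(now, 41.5, {"metric": "cpu_usage", "server": "web-1", "cpu": "0"})

# No single number to measure; the detail is the payload.
event = LoginEvent(now, user="alice", referrer="https://example.com/")
```

Both records carry a timestamp, but only the first has a single value that would be meaningful to average or plot.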
Would you graph this value over time?
Another feature of time series data is that it’s usually a point-in-time measurement repeated a very large number of times, both over time and across multiple measured items, which makes the raw data hard to work with directly. If the value graphs nicely over time, and especially if it makes sense to ask quantitative questions of it (such as averages or maximums over a period), then it can be characterised as time series data.
A stream of events carrying unrelated one-off messages doesn’t fit this model well, so this is another useful distinction.
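One way to test whether a quantitative question makes sense is to try summarising the raw points. The sketch below assumes the data is a list of `(timestamp_in_seconds, value)` pairs and uses a hypothetical helper, `downsample`, to average them into fixed-size time buckets, which is the kind of series you would then plot:

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds=60):
    """Average (timestamp, value) pairs into fixed-width time buckets.

    Returns a dict mapping each bucket's start time to the mean value
    of the points that fall inside it, ordered by start time.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Five raw readings spread over two minutes (made-up values).
raw = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 40.0), (90, 60.0)]

per_minute = downsample(raw)
# per_minute == {0: 20.0, 60: 50.0}
```

Averaging CPU readings per minute is meaningful; averaging a stream of login events is not, because there is no single numeric value to summarise.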
The right tools for the job
Understanding and modelling your data can really help identify the tools and features you want to deploy in your next application. More complex applications may use multiple data solutions to build the platform required, but whatever your needs, you can pick from the Aiven catalogue of open source products.
For the two examples in today’s article, try Apache Kafka for streaming events from one place to another, and M3 for your time series data needs.
Try it yourself
Aiven has both Kafka and M3 available on its platform and offers a free trial, so go ahead and give it a try: https://console.aiven.io/signup
Originally published at https://aiven.io.