Timeseries object - Gorilla or TSXor Compression #155
Replies: 3 comments
-
While this is quite interesting, is Garnet a good fit for timeseries data? Garnet excels as a cache, supporting random access to records. I would expect a SQL or dedicated timeseries database to be a better fit for timeseries data, since they excel at sequential reads and aggregations.
-
We haven't looked at time series management, but it might be a good fit for Garnet on a case-by-case basis.
-
Does Garnet natively support extending its data types, the way Redis does with its time series type?
-
I'm thinking of using Garnet to store a set of timeseries (power production curves) that hold the state of a scheduling system.
Since these timeseries have very regular time points (1 min, 5 min, 15 min or 1 h granularities) and often contain runs of identical consecutive values, I'm thinking of using a memory-efficient in-memory compression. In Garnet, the compressed representation of the timeseries could simply be stored directly as a string / binary value, but I'm thinking of a custom object that keeps the timeseries compressed internally and returns it uncompressed in responses to Get/Set requests.
For the compression, I'm thinking of basing it on the scheme used in the Gorilla paper https://www.vldb.org/pvldb/vol8/p1816-teller.pdf
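For readers unfamiliar with the paper, here is a minimal Python sketch of a bit-cost model for its XOR-based value encoding. It only counts bits and does not emit a bitstream. The function names `float_to_bits` and `value_bits` are made up for illustration, and clamping the leading-zero count to 31 (so it fits the 5-bit field) is an assumption borrowed from common implementations.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret an IEEE 754 double as a 64-bit unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def value_bits(values) -> int:
    """Estimate the encoded size in bits of a sequence of doubles."""
    if not values:
        return 0
    bits = 64  # the first value is stored uncompressed
    prev = float_to_bits(values[0])
    win_lead = win_trail = None  # no reusable bit window yet
    for v in values[1:]:
        cur = float_to_bits(v)
        xor = prev ^ cur
        prev = cur
        if xor == 0:
            bits += 1  # identical value: a single '0' bit
            continue
        lead = 64 - xor.bit_length()              # leading zero bits
        trail = (xor & -xor).bit_length() - 1     # trailing zero bits
        if win_lead is not None and lead >= win_lead and trail >= win_trail:
            # Control bits '10': reuse the previous window of meaningful bits.
            bits += 2 + (64 - win_lead - win_trail)
        else:
            # Control bits '11': 5 bits of leading-zero count,
            # 6 bits of length, then the meaningful bits themselves.
            lead = min(lead, 31)  # assumption: clamp to fit the 5-bit field
            bits += 2 + 5 + 6 + (64 - lead - trail)
            win_lead, win_trail = lead, trail
    return bits


if __name__ == "__main__":
    flat = [5.0] * 100
    print(value_bits(flat), "bits vs", 64 * len(flat), "bits raw")
```

In this model a run of 100 identical values costs 64 bits for the first value plus 1 bit for each repeat, which is where the benefit for long repeated sequences comes from.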
I re-implemented the timeseries compression in C++ a while ago as a test (without looking at performance), but I'll start looking for a C# implementation, or write my own, and benchmark it. I also plan to compare it with TSXor https://github.com/andybbruno/TSXor for performance and compressed size.
With such regular timeseries, the delta of deltas of the timestamps is zero, so the timestamp costs only 1 bit per timepoint, which is a big improvement in the storage cost of a (time, value) timeseries. Of course, I could store the start point and granularity as the base of the object, but Gorilla is generic, so with little effort I can also support non-regular timeseries. Most of the values change in small jumps and contain long repeated sequences, which should give a good compression ratio. (Side note: RLE might also be appropriate for some of my timeseries, so potentially a pluggable choice of compression based on knowledge of the TS, but most of my TS have enough variation that RLE is not ideal.)
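A back-of-the-envelope check of the 1-bit-per-timestamp estimate, as a Python sketch. It uses the delta-of-delta bucket sizes from the Gorilla paper. Treating the first timestamp as a raw 64-bit header followed by a 14-bit first delta is a simplification of the paper's block layout, and the function names `dod_bits` and `timestamp_bits` are hypothetical.

```python
def dod_bits(d: int) -> int:
    """Bits needed to encode one delta-of-delta value (prefix + payload)."""
    if d == 0:
        return 1            # '0'
    if -63 <= d <= 64:
        return 2 + 7        # '10' + 7-bit value
    if -255 <= d <= 256:
        return 3 + 9        # '110' + 9-bit value
    if -2047 <= d <= 2048:
        return 4 + 12       # '1110' + 12-bit value
    return 4 + 32           # '1111' + 32-bit value


def timestamp_bits(timestamps) -> int:
    """Estimate the encoded size in bits of a list of integer timestamps."""
    n = len(timestamps)
    if n == 0:
        return 0
    bits = 64               # simplification: first timestamp stored raw
    if n == 1:
        return bits
    bits += 14              # first delta, stored in 14 bits
    prev_delta = timestamps[1] - timestamps[0]
    for i in range(2, n):
        delta = timestamps[i] - timestamps[i - 1]
        bits += dod_bits(delta - prev_delta)
        prev_delta = delta
    return bits


if __name__ == "__main__":
    regular = list(range(0, 60_000, 60))   # 1000 points, 1-minute grid (seconds)
    print(timestamp_bits(regular), "bits vs", 64 * len(regular), "bits raw")
```

For a perfectly regular grid, every point after the second contributes exactly one bit, so the fixed header dominates only for very short series. A series with jitter falls into the 9-bit or larger buckets for the affected points.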
Has someone else looked at this?
Do you have a similar use case?
Is this also something of interest to the Garnet team?
Have you found a good C# implementation of Gorilla compression or TSXor?