
Uff, data downsampling can be a weird problem sometimes!
Averaging over regularly spaced data is fine, but short, sharp spikes in between lower-frequency samples can lead to very weird behaviour.

Picture no. 1 shows the awkward graph behaviour that results from averaging over the leftmost edge of a sharp spike.

TimescaleDB, luckily, has tools to fix this. Using its first and last functions to extract the edges of each time bucket (generated with time_bucket_gapfill), it’s possible to fill in buckets without any samples using the edge of an adjacent bucket. This ensures that the edges of sharp spikes remain preserved.
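The query ends up looking roughly like this (a minimal sketch; sensor_data, ts and value are placeholder names, and locf() carries the last seen edge into empty buckets):

-- Edge-preserving downsampling sketch; names are made up.
SELECT
    time_bucket_gapfill('1 second', ts) AS bucket,
    locf(first(value, ts)) AS first_value, -- leading edge of each bucket
    locf(last(value, ts))  AS last_value   -- trailing edge of each bucket
FROM sensor_data
WHERE ts > now() - interval '1 hour' AND ts < now()
GROUP BY bucket
ORDER BY bucket;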

Picture no. 2 shows the exact same data but with a bit of edge preservation done. The data looks much closer to the truth with almost no extra points plotted!

Something for the Timescale people maybe? ;)

@xaseiresh You may also want to look at the #TimescaleDB toolkit, which provides functions like lttb, time-weighted averages, and the asap smoothing algorithm 🙂
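A time-weighted average with the toolkit looks roughly like this, for example (just a sketch; metrics, ts and value are placeholder names):

-- Time-weighted average over the last day, linear weighting between samples.
SELECT average(time_weight('Linear', ts, value)) AS time_weighted_avg
FROM metrics
WHERE ts >= now() - interval '1 day';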

@noctarius2k Thanks! I think lttb would solve this issue very nicely; however, the server that Timescale is running on is already set up and has no direct internet connection (another plus for Timescale: no fussing with any DRM).

Does LTTB support GROUP BYs, anyway?
The documentation only shows it for a single stream of data that then needs to be unnest’ed, which wouldn’t quite fit our use case here.
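For reference, the documented single-stream form is roughly this (a sketch with placeholder names; 1000 is an arbitrary target resolution):

-- Downsample one stream to ~1000 points, then expand the result back to rows.
SELECT time, value
FROM unnest((
    SELECT lttb(ts, value, 1000)
    FROM metrics
    WHERE ts >= now() - interval '1 day'
));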

Adding the toolkit would be possible; I’d just have to contact the IT department to get them to sign off, and with the start of the official experiment phase they’re very busy ATM.
This was a quick but good-enough fix.

Actually, I should update how the project has been doing :D

@xaseiresh My colleague David has way more knowledge of the lttb algorithm; we were talking about it just yesterday. If you join our community Slack, I can redirect him to you. Unfortunately, I haven't had much time to play with it myself (just trying it out on my electric power consumption dataset) 😅

@noctarius2k I’m already on your Slack, it’s been a good experience so far and I enjoyed writing my Q&A blog post for your team!

One quick related question: Is there an easy way of switching between tables in a Grafana SQL query to utilize Continuous Aggregates for “zoomed out” data and the raw data table for more “zoomed in” shots?
Ideally I’d define a PostgreSQL view, but from what I gathered, an IF block or a UNION between two SELECTs with complementary WHERE clauses works just as well (rough sketch below).

It would make for a great optimization for plotting in Grafana: we have some graphs with 1 ms resolution in some areas, and we often start by viewing the last hour(s) of data before digging in, so an automatic switch between the CAGG and raw data would be super helpful!
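Roughly what I’m imagining (an untested sketch; data_1m_cagg, raw_data and the column names are made up, while $__timeFilter() and $__interval_ms are Grafana macros; the 60000 ms threshold matches the CAGG's 1 min buckets):

-- One panel query: only one branch of the UNION returns rows,
-- depending on the current zoom level Grafana reports.
SELECT bucket AS "time", avg_value AS value
FROM data_1m_cagg              -- continuous aggregate, 1 min buckets
WHERE $__timeFilter(bucket)
  AND $__interval_ms >= 60000  -- zoomed out: aggregated data is enough
UNION ALL
SELECT ts AS "time", value
FROM raw_data                  -- raw hypertable
WHERE $__timeFilter(ts)
  AND $__interval_ms < 60000   -- zoomed in: plot raw samples
ORDER BY 1;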

@xaseiresh I bet there's some magic in Grafana to make it happen, but I'm not a Grafana wizard myself.

I thought we had a blog post about it, but the one I was thinking about is for asap and lttb (timescale.com/blog/slow-grafan).

My colleagues Attila or Mathis may have an idea. If you drop the question in Slack, I'm happy to get them to look at it :)
