Homegrown monitoring for personal projects: Prometheus, TimescaleDB, and Grafana FTW
Learn how Tyler, one of our newest additions to the team, set up a "roll-your-own monitoring" solution with 99% storage savings, and how you can do the same, whether for your personal projects or a critical piece of business infrastructure.
Like many of you, I love to tinker. I overcomplicate the things in and around my home because, while I don’t NEED an enterprise-ready network, it’s fun to play with anyway. I’ve got an Owncloud instance I run for a friend, a pi-hole, a lab Postgres instance, a CIFS server for backing up the dashcam and sentry footage from my Tesla, and a Docker host for running various miscellany (including all the bits for this project!), plus other assorted things.
As a new addition to the Timescale team, I decided I needed a good dogfood project, and monitoring all of these things felt like a good way to get started. So, I did some reading and decided to set up Prometheus, TimescaleDB (+ the Timescale Prometheus Adapter), and Grafana to get acquainted with the various pieces.
I hoped to get a better sense of how our customers might be using our software, and to get more familiar with TimescaleDB, Postgres, and Grafana, and how they all fit together. Plus, it’d be good to know if my Owncloud server was running out of storage, or if that pi-hole was running out of RAM.
Prometheus was straightforward for me to get up and going; their focus on discrete parts, each doing one thing and doing it well, really shines.
- The Prometheus server and the exporters are both hassle-free: run the exporter on the host you want to monitor, point the Prometheus server at the right host and port, and you're done!
- The node-exporter is lightweight, and the settings for Prometheus itself are likewise minimal, making for a quick and easy solution.
If you’d like to tinker along with me, I’ve got an example of what I’m using for the monitoring bits on GitHub (PRs always welcome!). Did I mention I like to overcomplicate things? :)
In any event, I’m gathering Prometheus node-exporter data from 4 hosts, plus pi-hole-exporter data from the pi-hole.
Here’s what the setup looks like, with different parts at two locations (some at my office, and some at home).
This monitoring setup generates about 3.5 GB of raw data every day. That's not a lot in terms of monitoring for a business, but quite a lot for a side project! Since this is just for some messing around, I don’t want to have to put a bunch of disk storage behind it.
Here's the monitoring view, using a great dashboard by starsL.
What to do with my data?
I considered dropping data older than a few days, but before I did that, I thought I should check out TimescaleDB’s native compression.
My reasoning: the more data you can collect from your systems, the better you can see trends, plan for growth, and understand ideal service windows for maintenance. Not super important for home stuff, but in the interest of making this as practical as possible, I wanted the full picture of the insights I could glean. To do that, I needed to grab and retain as much data as I reasonably could.
Enabling compression? Just as straightforward as getting Prometheus up and running 🎉:
ALTER TABLE metrics_values SET (timescaledb.compress, timescaledb.compress_orderby = 'labels_id, time DESC');
Then, I set my policy to compress data older than 3 days:
SELECT add_compress_chunks_policy('metrics_values', INTERVAL '3 days');
I’ve let this run for a couple of weeks now, and the results have been astounding. I figured that I’d be able to keep roughly two weeks' worth of data before rolling it off, but the compression on my Prometheus data is so effective that I’ll be able to keep months' worth of data without using any significant storage (about 1.5 GB/month rather than about 105 GB/month). This lets me see much longer usage patterns.
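Here's a rough sketch of the arithmetic behind those retention numbers. The 3.5 GB/day and 1.5 GB/month figures come from my setup as described above; the 50 GB disk budget is a made-up number purely for illustration, and the math ignores the few days of data that stay uncompressed under the policy.

```python
# Back-of-the-envelope retention math.
DAILY_RAW_GB = 3.5              # raw data generated per day (figure from this post)
COMPRESSED_GB_PER_MONTH = 1.5   # observed monthly footprint with compression (from this post)
DISK_BUDGET_GB = 50.0           # hypothetical disk budget, for illustration only
DAYS_PER_MONTH = 30             # simplifying assumption

# Raw volume per month without compression.
raw_gb_per_month = DAILY_RAW_GB * DAYS_PER_MONTH


def months_of_retention(budget_gb, gb_per_month):
    """Return how many months of data fit in the given disk budget."""
    return budget_gb / gb_per_month


raw_months = months_of_retention(DISK_BUDGET_GB, raw_gb_per_month)
compressed_months = months_of_retention(DISK_BUDGET_GB, COMPRESSED_GB_PER_MONTH)

print(f"raw: {raw_gb_per_month:.0f} GB/month, "
      f"about {raw_months * DAYS_PER_MONTH:.0f} days of retention")
print(f"compressed: {COMPRESSED_GB_PER_MONTH} GB/month, "
      f"about {compressed_months:.0f} months of retention")
```

With that assumed budget, uncompressed data lasts about two weeks, while compressed data lasts a few years. Swap in your own daily volume and disk budget to get a feel for your retention window.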
See for yourself:
SELECT uncompressed_total_bytes, compressed_total_bytes FROM timescaledb_information.compressed_hypertable_stats;
uncompressed_total_bytes | compressed_total_bytes
58 GB | 637 MB
Admittedly, this is just a single sample with a relatively small amount of data, but I’m seeing roughly 99% space savings (approximately 58,000 MB → 637 MB) across various kinds of Prometheus data. No small feat!
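If you want to check the percentage for yourself, here's a minimal sketch of the calculation. It assumes decimal units (1 GB = 1,000 MB), matching the rounding used above, and the function name is just something I made up for this example.

```python
def space_savings(uncompressed_mb, compressed_mb):
    """Return the fraction of space saved by compression (0.0 to 1.0)."""
    if uncompressed_mb <= 0:
        raise ValueError("uncompressed size must be positive")
    return 1.0 - compressed_mb / uncompressed_mb


# Sizes reported by the stats query above, converted to MB (decimal units assumed).
uncompressed_mb = 58 * 1000
compressed_mb = 637

savings = space_savings(uncompressed_mb, compressed_mb)
ratio = uncompressed_mb / compressed_mb

print(f"space savings: {savings:.1%}")      # space savings: 98.9%
print(f"compression ratio: {ratio:.0f}x")   # compression ratio: 91x
```

The same function works on the raw byte counts if you'd rather not round to GB and MB first.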
If you want more information about how compression works and the benchmarks we saw during internal testing, check out this blog post.
If you want to try it out yourself and see how much your data compresses, our Quick Start documentation is a great place to start.
If you have questions, need help troubleshooting, or want to get some recommendations, reach out to us on Slack at any time. There’s always at least one Timescale engineer (myself included) “on duty,” and community members are keen to jump in and help too.