What is the right way to "vacuum" chunks after data is deleted?

The normal VACUUM command doesn’t seem to do anything when TimescaleDB is in use. Is there an equivalent command or procedure to shrink the database size after data has been deleted from various chunks?

Are chunks a kind of partition?

I’m also searching for information on this topic. I believe this official PostgreSQL documentation link could be useful:

  • If you are using manual VACUUM or ANALYZE commands, don’t forget that you need to run them on each child table individually. A command like:

ANALYZE measurement;

will only process the root table.

It seems to be the same result that the equivalent TimescaleDB function returns:

select * from show_chunks('series.devices');
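If you wanted to vacuum chunk by chunk, I suppose you could take the chunk names that query returns and vacuum them directly (the chunk name below is only an example, yours will differ):

VACUUM VERBOSE _timescaledb_internal._hyper_1_1_chunk;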

I’m only a newbie though,

Regards.

That’s really my question… how to properly vacuum the database, and how to do it in one go. I’m sure there must be a way using the extension.

:wave: @benneharli,

Vacuuming via the hypertable (parent table) should work as expected. If you do a VACUUM VERBOSE... as @nando showed, you should see the output through the console. Feel free to provide more detail about what you’re seeing.
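For example, running something like this (hypertable name is just a placeholder) should report the work done on each underlying chunk in the verbose output:

VACUUM VERBOSE hypertable;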

Depending on your use case, one of the advantages of using hypertables is the ability to set data retention policies that simply drop chunks as they get older. It sounds like you’re doing more targeted data deletion inside each chunk, but I wanted to mention it just in case. That is, if you were doing something like

DELETE FROM hypertable WHERE ts < now() - '1 month'::interval;

data retention would be the better approach:

select drop_chunks('hypertable', INTERVAL '1 month');
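If you’re on TimescaleDB 2.x, you can also let a background job handle this automatically with a retention policy, e.g. (same placeholder name):

SELECT add_retention_policy('hypertable', INTERVAL '1 month');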

Thanks, I can see that vacuum actually does seem to work and does take the underlying chunks into account. Thanks for the “verbose” hint.

We do use retention, and that’s great for overall storage limits… these are, however, as you write, more targeted deletes.

Cool. There are definitely times where targeted deletes are necessary and I get that.

Are you using TimescaleDB compression? Not only does it save space on disk and often speed up historical queries, it also makes more efficient deletes of historical data possible if the column you delete by is part of the segmentby setting.
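For reference, a minimal compression setup along those lines might look like this (table and column names are just placeholders):

ALTER TABLE hypertable SET (timescaledb.compress, timescaledb.compress_segmentby = 'account_id');
SELECT add_compression_policy('hypertable', INTERVAL '7 days');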

I could explain more if you’re interested.

Our next release (schema version) includes compression setup, but the issue we have right now is the lack of support for deletes/inserts in compressed chunks.

Yep, I understand that. For the record, you can insert into compressed chunks (starting with TimescaleDB 2.2 I think) - but you cannot UPDATE/DELETE yet… at least not directly.

My comment on DELETE is that if you segmentby the relevant column (say something like account_id), then the rows in the compressed chunks are all grouped by account_id. You’re guaranteed that a compressed row will never contain data points for another account_id.

This means it is possible to delete rows from a compressed chunk based on a segmentby column, if that would be helpful (you do have to query directly against the compressed hypertable name; you can’t do the delete through the hypertable itself). Consider something like GDPR, for example: if your data is all segmented by account_id and you need to remove historical, compressed data, you don’t have to decompress the chunks as long as it’s a global delete like that (i.e., independent of time, specifically).
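As a rough sketch (the compressed chunk table name here is purely illustrative; you’d have to look up the real one for your setup):

-- segmentby columns are stored as plain columns in the compressed chunks,
-- so a delete keyed only on the segmentby column works without decompressing
DELETE FROM _timescaledb_internal.compress_hyper_2_3_chunk WHERE account_id = 42;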

As soon as you try to delete compressed data based on both a segmentby column and another column that is compressed, it’s not possible without decompressing first.
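In that case you’d decompress the affected chunks first, do the delete through the hypertable, and recompress. A sketch (you’d probably want to limit which chunks via the show_chunks arguments):

SELECT decompress_chunk(c, true) FROM show_chunks('hypertable') c;
-- ...run the DELETE through the hypertable here...
SELECT compress_chunk(c, true) FROM show_chunks('hypertable') c;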

Hope that makes sense. Might not be helpful in your situation, but if we’re talking about VACUUM, that usually implies larger deletes and this might be a useful thing to know. :slight_smile: