Compression policy has left chunks 20x larger than before compression

This is a database running in Kubernetes on PostgreSQL 13.6 and TimescaleDB 2.6, with a compression policy using a 1 month threshold. In the affected time range we expected an uncompressed chunk size of <4GB, and a compressed chunk size of around 400MB was usual. At the time the db had about 10GB RAM, so less than ideal for the chunk size, and it was quite prone to OOM restarts. The problems observed were slow performance when querying certain time ranges, and continuous aggregate policies also failing in the same ranges.
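
For reference, the policy setup is roughly the following (a sketch of the calls, not the exact statements we ran; the segment_by/order_by settings are shown further down the thread):

ALTER TABLE vector_events SET (timescaledb.compress);
SELECT add_compression_policy('vector_events', compress_after => INTERVAL '1 month');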

Looking at the timeseries chunks in the affected ranges, I see that some of the chunk sizes are incredibly large, e.g. 50GB+ (there are about 10 of these).

I think the compression policy probably failed in such a way that chunks were left in this bloated state. I expect it was caused by our general server instability. Since we’ve improved in that regard and provided more resources to the server, I don’t think it will happen again; I just need to clean up and move on.

I will try to illustrate the problem, first by looking at details for the chunks on 3 consecutive days in July - all should have a similar number of rows, and all should have been compressed.

SELECT chunkInfo.chunk_name, chunkInfo.range_start, range_end, pg_size_pretty(sizeInfo.total_bytes) AS total_size
FROM chunks_detailed_size('vector_events') AS sizeInfo
INNER JOIN timescaledb_information.chunks AS chunkInfo ON sizeInfo.chunk_name = chunkInfo.chunk_name
WHERE range_start >= '20220710' AND range_start <= '20220712'
ORDER BY chunkInfo.range_start;

     chunk_name      |      range_start       |       range_end        | total_size
---------------------+------------------------+------------------------+------------
 _hyper_1_2010_chunk | 2022-07-10 00:00:00+00 | 2022-07-11 00:00:00+00 | 374 MB
 _hyper_1_2012_chunk | 2022-07-11 00:00:00+00 | 2022-07-12 00:00:00+00 | 66 GB
 _hyper_1_2014_chunk | 2022-07-12 00:00:00+00 | 2022-07-13 00:00:00+00 | 3191 MB

Note the 66GB ‘megachunk’ 🙂 but also the differences in size between the 3 chunks. Look also at the row counts:

SELECT count(*) from vector_events where event_time >= '20220710' and event_time < '20220711'; -- 61936536 (took 5 sec to count)
SELECT count(*) from vector_events where event_time >= '20220711' and event_time < '20220712'; -- 60385707 (took 4 minutes to count)
SELECT count(*) from vector_events where event_time >= '20220712' and event_time < '20220713'; -- 58562431 (took 1 sec)

All 3 have a similar number of rows, although it was far slower to execute a count query on the megachunk.

Something is clearly wrong here. Now look at the chunks_detailed_size output in more detail:

SELECT * FROM chunks_detailed_size('vector_events')
WHERE chunk_name in ('_hyper_1_2010_chunk','_hyper_1_2012_chunk','_hyper_1_2014_chunk')
ORDER BY chunk_name;

     chunk_schema      |     chunk_name      | table_bytes | index_bytes | toast_bytes | total_bytes | node_name
-----------------------+---------------------+-------------+-------------+-------------+-------------+-----------
 _timescaledb_internal | _hyper_1_2010_chunk |    11911168 |     1597440 |   378617856 |   392126464 |
 _timescaledb_internal | _hyper_1_2012_chunk | 68586569728 |  1904730112 |           0 | 70491299840 |
 _timescaledb_internal | _hyper_1_2014_chunk |  3068485632 |     1540096 |   275857408 |  3345883136 |
(3 rows)

So the toast bytes for our megachunk are 0, and both the index and table bytes are huge for this chunk. There’s also a big difference in the table bytes for the other 2 chunks, although their indexes are similar. Remember that _hyper_1_2014_chunk actually has fewer rows than _hyper_1_2010_chunk, yet _hyper_1_2014_chunk’s table size is about 250x that of _hyper_1_2010_chunk. There must be something wrong with compression here…

Examine the chunks using chunk_compression_stats():

SELECT chunk_name, compression_status,
       before_compression_table_bytes, before_compression_index_bytes, before_compression_toast_bytes, before_compression_total_bytes,
       after_compression_table_bytes, after_compression_index_bytes, after_compression_toast_bytes, after_compression_total_bytes
FROM chunk_compression_stats('vector_events')
WHERE chunk_name in ('_hyper_1_2010_chunk','_hyper_1_2012_chunk','_hyper_1_2014_chunk')
ORDER BY chunk_name;

     chunk_name      | compression_status | before_compression_table_bytes | before_compression_index_bytes | before_compression_toast_bytes | before_compression_total_bytes | after_compression_table_bytes | after_compression_index_bytes | after_compression_toast_bytes | after_compression_total_bytes 
---------------------+--------------------+--------------------------------+--------------------------------+--------------------------------+--------------------------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------
 _hyper_1_2010_chunk | Compressed         |                     6465093632 |                     1953816576 |                              0 |                     8418910208 |                      11902976 |                       1589248 |                     378601472 |                     392093696
 _hyper_1_2012_chunk | Uncompressed       |                                |                                |                                |                                |                               |                               |                               |                              
 _hyper_1_2014_chunk | Compressed         |                    70183092224 |                     1847230464 |                              0 |                    72030322688 |                      12009472 |                       1531904 |                     275841024 |                     289382400
(3 rows)

According to this, the megachunk _hyper_1_2012_chunk is uncompressed, while the other 2 chunks are supposedly compressed. The after-compression table bytes for _hyper_1_2014_chunk is apparently 12009472, which does not agree with the table_bytes reported by chunks_detailed_size.

So I suspect that there was some kind of crash during compression on these chunks, and that has left them in this weird state. There are 2 separate issues here:

  • Megachunks like _hyper_1_2012_chunk are not compressed, and are 20x larger than their expected uncompressed size.
  • Chunks like _hyper_1_2014_chunk which TimescaleDB thinks are compressed, but whose size stats suggest they are actually not compressed

I am a bit curious how this can have happened, especially how a chunk can get to be 20x larger than its uncompressed size. But I mostly would like to know how I can best fix it.

I have already tried taking a backup of the production db and restoring it in a test environment. We used pgBackRest to take the backup; interestingly, this backup/restore technique also carried over the weird chunk sizes to our test env, so I can’t just fix it with a simple backup/restore.

I’m considering a couple of approaches to clean this up:
“nuke it all and start over”
Back up the entire db’s raw event data to a CSV or similar, and restore that into a new db. I think this would work, but it’s a 1.5TB database, so the size and time considerations might be a problem.
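
Roughly like this (illustrative only; the path is made up, and in practice it would have to be batched by time range):

\copy (SELECT * FROM vector_events) TO '/backup/vector_events.csv' WITH (FORMAT csv)
-- ...then, in a freshly created db with the same schema, hypertable and policies:
\copy vector_events FROM '/backup/vector_events.csv' WITH (FORMAT csv)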

“identify and fix the broken bits”
Write a script to do the following (a sketch of the identification query follows this list):

  • Fix megachunks: identify the megachunks and their time ranges, copy their event data to a holding table, drop the megachunks, copy from the holding table back into the hypertable, then call compress_chunk().
  • Fix chunks that TimescaleDB thinks are compressed: identify chunks with compression status ‘Compressed’ and total_bytes > 1000000000, call decompress_chunk(), then call compress_chunk().
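
As a sketch, the identification step could be a query along these lines (the size thresholds are arbitrary placeholders):

SELECT d.chunk_schema, d.chunk_name, c.compression_status, pg_size_pretty(d.total_bytes) AS total_size
FROM chunks_detailed_size('vector_events') AS d
INNER JOIN chunk_compression_stats('vector_events') AS c ON c.chunk_name = d.chunk_name
WHERE (c.compression_status = 'Compressed' AND d.total_bytes > 1000000000)   -- "compressed" but still huge
   OR (c.compression_status = 'Uncompressed' AND d.total_bytes > 8000000000) -- megachunks
ORDER BY d.total_bytes DESC;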

So I was just wondering if there are any better ways of fixing this, and really, how can a 3GB table become 60GB?!

Thanks in advance, and wishing you a happy new year!

This sounds like an interesting one!

Before we go too far down the rabbit hole, could you please try cloning and compressing one of the megachunks? This will validate that the issue isn’t in any way related to the data in the chunks (I’m sure it’s not, but stranger things have happened), and let us focus on trying to work out how we got into this state.
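
Something along these lines should do for the clone (a rough, untested sketch; substitute your table’s actual compression settings in the ALTER TABLE):

CREATE TABLE vector_events_clone (LIKE public.vector_events INCLUDING DEFAULTS INCLUDING INDEXES);
SELECT create_hypertable('vector_events_clone', 'event_time', chunk_time_interval => INTERVAL '1 day');
INSERT INTO vector_events_clone SELECT * FROM _timescaledb_internal._hyper_1_2012_chunk;
-- enable compression with the same settings as the original hypertable, e.g.:
ALTER TABLE vector_events_clone SET (timescaledb.compress, timescaledb.compress_segmentby = 'vector_stream_id', timescaledb.compress_orderby = 'event_time DESC');
SELECT compress_chunk(c) FROM show_chunks('vector_events_clone') AS c;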

If you could also provide the timescaledb.compress_segmentby and timescaledb.compress_orderby settings for the table, that would help a lot.

-James

Actually it looks like this issue is documented here:

Try running VACUUM on the problem chunks; if that works, VACUUM the whole hypertable.
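
For example, using a chunk name from your output above:

VACUUM (VERBOSE, ANALYZE) _timescaledb_internal._hyper_1_2012_chunk;
-- and if that helps:
VACUUM ANALYZE vector_events;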

@James, yes, ‘interesting’ is one word for it!

I had some success with this yesterday, and found that I can indeed compress these ‘megachunks’ down to the kind of compressed size I would expect - so the chunk that was 66GB could be reduced to 320MB. Having done this, query performance within the chunk range was OK - it took about 4 seconds to count all events in the chunk range, rather than 4 minutes.
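
That fix was just a direct call per megachunk, e.g.:

SELECT compress_chunk('_timescaledb_internal._hyper_1_2012_chunk');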

Then there are the chunks which TimescaleDB thinks are compressed, but which still have a suspiciously large table size (a couple of these were also > 10GB; one was 40GB). Here I found that I could first call decompress_chunk, then call compress_chunk, and again get them down to the expected size of ~300MB.
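
i.e. for each affected chunk, something like:

SELECT decompress_chunk('_timescaledb_internal._hyper_1_1058_chunk');
SELECT compress_chunk('_timescaledb_internal._hyper_1_1058_chunk');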

I made some tooling to identify these types of chunk and ran it overnight on our devtest copy, and it seems to have been quite successful. It took about 9 hours to get through the most badly affected 40-day period and refresh the aggregates. I will still need to compress all subsequent chunks up until December, when our compression policy became blocked by the earlier chunk issues.
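
The aggregate refresh is just the usual per-cagg call over the repaired range, e.g. (the aggregate name here is a placeholder, not our real one):

CALL refresh_continuous_aggregate('vector_events_hourly', '2022-07-01', '2022-08-20');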

Here are some log extracts from the tooling, showing the before and after sizes for the 2 scenarios.

Log extract for compress megachunks:

About to compress _hyper_1_2012_chunk | 
Before compression: ChunkName:_hyper_1_2012_chunk StartDate:2022.07.11 TotalSize: 66 GB IsCompressed:False IndexSize:1904730112 TableSize:68586569728 ToastSize:0 | 
After compression: ChunkName:_hyper_1_2012_chunk StartDate:2022.07.11 TotalSize: 317 MB IsCompressed:True IndexSize:1572864 TableSize:12386304 ToastSize:317980672 | 
About to compress _hyper_1_2264_chunk | 
Before compression: ChunkName:_hyper_1_2264_chunk StartDate:2022.07.19 TotalSize: 8713 MB IsCompressed:False IndexSize:2120859648 TableSize:7015186432 ToastSize:0 | 
After compression: ChunkName:_hyper_1_2264_chunk StartDate:2022.07.19 TotalSize: 318 MB IsCompressed:True IndexSize:1728512 TableSize:13475840 ToastSize:318488576 | 

Log extract for decompress/recompress on falsely compressed chunks:

About to decompress _hyper_1_1058_chunk | 
Before decompression: ChunkName:_hyper_1_1058_chunk StartDate:2022.06.17 TotalSize: 11 GB IsCompressed:True IndexSize:1236992 TableSize:11932508160 ToastSize:261103616 | 
After decompression: ChunkName:_hyper_1_1058_chunk StartDate:2022.06.17 TotalSize: 15 GB IsCompressed:False IndexSize:1441095680 TableSize:14306648064 ToastSize:0 | 
About to compress _hyper_1_1058_chunk | 
After compression: ChunkName:_hyper_1_1058_chunk StartDate:2022.06.17 TotalSize: 260 MB IsCompressed:True IndexSize:1236992 TableSize:10313728 ToastSize:261095424 |

Here it decompressed from 11GB to 15GB, then compressed from 15GB down to 260MB!

So it seems to be curable, but the curiosity of what happened remains.

Now, to answer your question:

SELECT hypertable_name, attname, segmentby_column_index, orderby_column_index, orderby_asc, orderby_nullsfirst
FROM timescaledb_information.compression_settings
WHERE hypertable_name = 'vector_events';

 hypertable_name |     attname      | segmentby_column_index | orderby_column_index | orderby_asc | orderby_nullsfirst
-----------------+------------------+------------------------+----------------------+-------------+--------------------
 vector_events   | vector_stream_id |                      1 |                      |             |
 vector_events   | event_time       |                        |                    1 | f           | t

I’m not really familiar with this view, so I’m not sure if it explains enough. Here’s the create script and the create_hypertable call as well, in case it helps to get a picture of the layout:

-- Create script, the PRIMARY KEY gives us the index we want on vector_stream_id and event_time. Plus a uniqueness constraint on the same.
CREATE TABLE public.vector_events
(
    event_time timestamp with time zone NOT NULL,
    vector_stream_id integer NOT NULL,
    event_data0 double precision NOT NULL,
    CONSTRAINT vector_events_pkey PRIMARY KEY (vector_stream_id, event_time)
);
-- Create hypertable, space partitioning on vector stream id - one query is only ever for a single vector_stream_id
SELECT create_hypertable('vector_events', 'event_time', 'vector_stream_id', 1, chunk_time_interval => INTERVAL '1 day', create_default_indexes => FALSE, migrate_data => TRUE);
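
If I read the compression_settings output correctly, it corresponds to compression having been enabled with something like this (my reconstruction, not the original script; DESC implies NULLS FIRST, which matches the view):

ALTER TABLE public.vector_events SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'vector_stream_id',
    timescaledb.compress_orderby = 'event_time DESC'
);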

I just saw your comment; that sounds like a good match for what I was seeing. So this was actually the result of a failed attempt to decompress a chunk? (I thought failed compression was more likely here, but we do occasionally have to call decompress.)

Regarding the GitHub issue, I had tried running VACUUM (VERBOSE, ANALYZE) on the entire db, but not on the specific chunks. That may have made a difference.

Anyway, I am happy to be able to clean it up, and to understand a bit more about the cause of it. Thanks!