TimescaleDB only uses a single core in compression?

I have tested compressing chunks manually in an on-premise TimescaleDB installation using:

select compress_chunk(i, if_not_compressed => true) from show_chunks('table') i;

Question: the above compresses all chunks, but it runs sequentially and uses only a single CPU core.

Is there a way to improve compression performance on a multi-core machine?

Thank you.

Hi @asiayeah, you can run the compression from several connections:
send the compress_chunk call for each chunk from a different connection.

Hi @jonatasdp, I have tried your suggestion. However, it doesn’t compress in parallel.

I started a psql and tried to compress a chunk:
=> select compress_chunk('_timescaledb_internal._hyper_148_3030_chunk');

Then I started another psql and tried to compress another chunk:
=> select compress_chunk('_timescaledb_internal._hyper_148_3031_chunk');

I observed that only a single core is used at a time, and the 2nd compress_chunk only seems to start after the 1st one has completed. As a result, the 2nd compress_chunk took roughly twice as long to complete.

It looks like compressing chunks takes a table- or index-level lock in TimescaleDB (2.13.1, Postgres 15.3). Could you confirm whether this behavior is expected?
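
In case it helps anyone reproducing this, one way I would check whether the second session is actually waiting on a lock (rather than just competing for CPU) is a plain Postgres diagnostic query run while both compress_chunk calls are active. This is only a generic sketch using standard views, not anything TimescaleDB-specific:

```sql
-- Sketch: while both compress_chunk calls are running, list any backend that is
-- waiting on a lock together with the PIDs of the sessions blocking it.
select pid,
       pg_blocking_pids(pid) as blocked_by,
       wait_event_type,
       wait_event,
       left(query, 80)       as query
from pg_stat_activity
where cardinality(pg_blocking_pids(pid)) > 0;
```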

My current use case took 1.5 hours to compress a day of data. With a multi-core machine, I hope we could reduce this time significantly.

Hi @asiayeah, it seems that is the current behavior, but we're tracking an issue to change it.

I realized my confusion came from our new parallel refreshes on the same continuous aggregate, and I mixed up the subjects.

Thank you. I added my vote there.

There is another related issue, [Enhancement]: compress chunks in the same hypertable in parallel · Issue #6239 · timescale/timescaledb (github.com). This issue was marked as closed.

So I re-tested it with the latest TimescaleDB 2.14.1.

I can confirm that 2.14.1 can run compress_chunk('1_chunk') in parallel across 2 connections.

The caveat is that I can't run the following command in 2 connections to get two parallel compressions:

select compress_chunk(i, if_not_compressed => true) from show_chunks('table') i;

The reason is that if I run the above in both connections, they will both attempt to compress the same outstanding uncompressed chunk, which effectively serializes the work again.

My current thought is that we can implement parallel compression with multiple connections, but each connection needs to compress a different set of chunks (a rough sketch of this is below). Ideally, I hope TimescaleDB will make this easier.
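
As a sketch of that workaround (assuming a hypertable named 'table' and two worker sessions), each connection could take every other chunk so that the subsets never overlap. This is just one way to split the work, not a built-in feature:

```sql
-- Sketch of the workaround: each session compresses a disjoint subset of chunks.
-- Session 1 keeps rn % 2 = 0, session 2 keeps rn % 2 = 1 (extend for more workers).
select compress_chunk(c.chunk, if_not_compressed => true)
from (
    select chunk,
           row_number() over (order by chunk::text) as rn
    from show_chunks('table') as chunk
) c
where c.rn % 2 = 0;  -- change to 1 in the second session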

Can we add a skip_if_already_compressing or max_outstanding_compression parameter to compress_chunk()? What do you think?

Thanks for the details @asiayeah,

you can also get more info from chunk_compression_stats and probably use its status to skip what is already in progress…
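
For example, something along these lines could feed only the not-yet-compressed chunks into compress_chunk (assuming a hypertable named 'table'; the exact compression_status values can vary by version, so treat this as a sketch):

```sql
-- Sketch: compress only the chunks that chunk_compression_stats does not
-- already report as compressed for the hypertable 'table'.
select compress_chunk(format('%I.%I', chunk_schema, chunk_name)::regclass,
                      if_not_compressed => true)
from chunk_compression_stats('table')
where compression_status != 'Compressed';
```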