What (if anything) can be done to speed up migrate_data?

These days I am converting a lot of regular tables to hypertables. In some cases, we’ll want a retention policy so I delete old data before converting to a hypertable. But in other cases, we want to preserve all existing data.

Typically I use migrate_data => TRUE for an in-place migration. This can take many minutes, sometimes up to an hour. I wonder what faster alternatives exist, if any. Would a standalone index on the candidate hypertable key help, or not at all? What about a more involved approach, such as creating an empty hypertable and loading it from a CSV with timescaledb-parallel-copy?
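For reference, a minimal sketch of the in-place conversion I mean (the table name, time column, and chunk interval are just placeholders):

```sql
-- Hypothetical example: convert an existing, populated table to a hypertable
-- in place. "conditions" and "time" are placeholder names for this sketch.
SELECT create_hypertable(
    'conditions',                                 -- existing regular table with data
    'time',                                       -- time partitioning column
    chunk_time_interval => INTERVAL '1 day',      -- placeholder interval
    migrate_data        => TRUE                   -- move existing rows into chunks (the slow part)
);
```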

I’m somewhat presuming that your extension, written in C, is probably as efficient as it gets. But I’d love confirmation from the authors.

If you have more indexes, the migration also has to keep those indexes in sync while it moves the rows, so more indexes will slow down the inserts.
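As a quick check (placeholder table name, assuming the standard pg_indexes view), you can see which indexes would have to be maintained during the migration:

```sql
-- List the indexes on the table you're about to migrate; each one is
-- maintained while rows are moved into chunks. 'conditions' is a placeholder.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'conditions';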

I still think this is one of the most efficient ways. @JamesGuthrie is working on some Python scripts that just break the data into CSVs and call the parallel copy tool.
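For anyone who wants to try the CSV route by hand, a rough sketch of the target setup (all names are placeholders, and the scripts mentioned above may do this differently):

```sql
-- Hypothetical sketch of the CSV / parallel-copy route: create an empty
-- hypertable first, then load it from outside the database with
-- timescaledb-parallel-copy. Table and columns are placeholders.
CREATE TABLE conditions_new (
    "time"      TIMESTAMPTZ NOT NULL,
    device_id   TEXT,
    temperature DOUBLE PRECISION
);

SELECT create_hypertable('conditions_new', 'time');

-- Load the exported CSVs with timescaledb-parallel-copy, then add any
-- secondary indexes once the data is in.
```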

Yes! I’m not the author, but I talk with them very often :nerd_face:

So your suggestion is that I should drop all existing indexes before I migrate to a hypertable, and recreate them later?

Yes. Depending on your batch size, you end up with a lot of back and forth between the table and the indexes, and that slows things down.
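Roughly, the workflow would look like this (a sketch only, with placeholder index, table, and column names; constraint-backed indexes like primary keys would need ALTER TABLE instead of DROP INDEX):

```sql
-- Hypothetical sketch: drop secondary indexes before the migration so the
-- data move doesn't have to maintain them, then recreate them afterwards.
DROP INDEX IF EXISTS conditions_device_id_idx;

SELECT create_hypertable(
    'conditions',
    'time',
    migrate_data => TRUE
);

CREATE INDEX conditions_device_id_idx ON conditions (device_id, "time" DESC);
```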

This blog post may help you.