Space dimension on single node

TimescaleDB 2.11.1 is installed as single node (single computer) on PostgreSQL 15.3 on Red Hat 8.8.

In my biggest hypertable I have five companies data to manage. Currently we have one single table with the following table fields (simplified sample):

  • mytime
  • company
  • value

Data from individual company is never mixed with other company data (those companies are strictly separated). In SQL statements there is always WHERE company = 1 and ... condition (1 is for company 1, 2 is for company 2 etc).

Now I am thinking of somehow separate company data in separate categories.

Options:

  • create PostgreSQL instance data separated by company (advantage is I can quickly create new Linux and then separate data into several virtual machines),
  • create hypertable for each of the company,
  • create space dimension (data are partitioned by time and “space”).

What is bothering me is third option. In documentation it is written that primary intention for “space” dimension is for multinode (multiple computers) configuration and it is not recommended for single node installations. Space dimensions on single node should only be used in special cases.

I am wondering if my case is such special case when space dimensioning would be beneficial? The benefit of “space” dimension would be no applications changes required.

Is it smart to do a space dimensioning in my case?

That blog post[1] may help you. Have you’ve already seen it?

[1] https://www.timescale.com/blog/best-practices-for-time-series-data-modeling-single-or-multiple-partitioned-table-s-a-k-a-hypertables/

1 Like

@noctarius, thanks for the article. Article is about modeling single vs. multiple tables regarding some multi-tenant column (e.g. customer).

My question is more related to single table and partitioning because of performance reasons.

By default new chunks get separated by time dimension. I would like to know if someone has successfully partition tables by space dimension (additional to default time dimension) on single node (single computer)? If yes, did this improve performance?

@noctarius - What conclusion did you come to in the end? I have the same question for my data. The documentation does not make it very clear what the “specialized use cases” are.

The best answer is: you have to try with your dataset and query patterns. There is no general rule. Ingest a meaningful (large) amount of data, much larger than the time range you want to query, and run a good set of queries against the test dataset (or even better an actual clone of a real database). If you find performance benefits, you’re golden. If not, leave it be until you run into actual performance issues.

1 Like