Duplicate rows in continuous aggregate, watermark seems wrong

I’m trying to create a multi-day continuous aggregate with an offset/origin. The problem I’m having is that there are duplicate rows at the boundary of the materialized results and the real-time results.

CREATE MATERIALIZED VIEW daily_buckets_3_days_0_offset WITH (timescaledb.continuous) AS
SELECT time_bucket(‘3 day’, d.time, timezone => ‘UTC’, origin => ‘2000-01-01’) AS “time”,

FROM data_1day d
GROUP BY 1
WITH NO DATA;

SELECT add_continuous_aggregate_policy(‘daily_buckets_3_bars_0_offset’,
start_offset => INTERVAL ‘10 day’,
end_offset => INTERVAL ‘4 day’,
schedule_interval => INTERVAL ‘1 day’);

Today is 2022-12-12, and the output of a SELECT here is

2022-12-10 00:00:00+00
2022-12-07 00:00:00+00
2022-12-07 00:00:00+00

My data_1day table’s time column is a timestamptz, in case that matters.

When I create the same continuous aggregate without an offset, things look right. But I want the offset; I have 3 continuous aggregates, offset by 0, 1, and 2 days. One of the other offsets also gives duplicate values, and the third is normal.

When I do an EXPLAIN ANALYZE, the _timescaledb_internal.cagg_watermark is wrong, set to some value that’s not on the edge of a bucket boundary, e.g. 2022-12-09 10:57:25.094403+00

The time there is not related to the times I’m building the table or creating the policy. It’s always 10:57:25.

I’ve also tried experimental.time_bucket_ng(), and while the time values go to 00:00:00, the date on the watermark is still wrong.

I’m pretty sure I looked for limitations like this in the documentation, so I’m hoping this is just some parameter I’ve misunderstood.

As a workaround, I re-created the query underlying the view in a new view.
I found the definition by doing:
\d+ daily_buckets_3_days_0_offset

Basically, the new view is:

SELECT 
FROM data_1day
WHERE time >= time_bucket('3 day', time_bucket('3 days'::interval, COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(<underlying id>)), '-infinity'::timestamp with time zone), , timezone => ‘UTC’, origin => ‘2000-01-01’)
UNION ALL
SELECT 
FROM _timescaledb_internal._materialized_hypertable_<id>
WHERE time < time_bucket('3 day', time_bucket('3 days'::interval, COALESCE(_timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(<underlying id>)), '-infinity'::timestamp with time zone), , timezone => ‘UTC’, origin => ‘2000-01-01’)

The key difference here is that the cagg uses the watermark in its where clause:
time < COALESCE( _timescaledb_internal.to_timestamp(_timescaledb_internal.cagg_watermark(<underlying id>)), '-infinity'::timestamp with time zone)

I tried to dig into the code to figure out where cagg_watermark comes from and why it’s not equal to the time bucketed value, but got lost.

The new view is slightly less performant that the original cagg, but not enough that I’ve investigated yet.

Not exactly convenient, as I had to go get the id of each continuous aggregate, and hopefully I’ll remember why I did this in 3 years, and hopefully the next version of timescale doesn’t change how caggs work breaking my workaround.