In Polars / pandas I can do a rolling sum where row each row the window is (row - 10 minutes, row]
. For example:
import polars as pl
data = {
"timestamp": [
"2023-08-04 10:00:00",
"2023-08-04 10:05:00",
"2023-08-04 10:10:00",
"2023-08-04 10:10:00",
"2023-08-04 10:20:00",
"2023-08-04 10:20:00",
],
"value": [1, 2, 3, 4, 5, 6],
}
df = pl.DataFrame(data).with_columns(pl.col("timestamp").str.strptime(pl.Datetime))
print(
df.with_columns(pl.col("value").rolling_sum_by("timestamp", "10m", closed="right"))
)
This outputs
shape: (6, 2)
┌─────────────────────┬───────┐
│ timestamp ┆ value │
│ --- ┆ --- │
│ datetime[μs] ┆ i64 │
╞═════════════════════╪═══════╡
│ 2023-08-04 10:00:00 ┆ 1 │
│ 2023-08-04 10:05:00 ┆ 3 │
│ 2023-08-04 10:10:00 ┆ 9 │
│ 2023-08-04 10:10:00 ┆ 9 │
│ 2023-08-04 10:20:00 ┆ 11 │
│ 2023-08-04 10:20:00 ┆ 11 │
└─────────────────────┴───────┘
How can I do this in DuckDB? Closest I could come up with is:
rel = duckdb.sql("""
SELECT
timestamp,
value,
SUM(value) OVER roll AS rolling_sum
FROM df
WINDOW roll AS (
ORDER BY timestamp
RANGE BETWEEN INTERVAL '9 minutes 59 seconds' PRECEDING AND CURRENT ROW
)
ORDER BY timestamp;
""")
print(rel)
but that makes the window [row - 10 minutes, row]
, not (row - 10 minutes, row]
You need to sign in to view this answers
Leave feedback about this