Code:
import polars as pl

data = {
    "timestamp": [
        "2023-08-04 10:00:00",
        "2023-08-04 10:05:00",
        "2023-08-04 10:10:00",
        "2023-08-04 10:10:00",
        "2023-08-04 10:20:00",
        "2023-08-04 10:20:00",
    ],
    "value": [1, 2, 3, 4, 5, 6],
}
# Parse the timestamp strings into a Datetime column.
df = pl.DataFrame(data).with_columns(pl.col("timestamp").str.strptime(pl.Datetime))
# closed="right": each window is (t - 10m, t], so the left edge is excluded.
print(
    df.with_columns(pl.col("value").rolling_sum_by("timestamp", "10m", closed="right"))
)
Output:
shape: (6, 2)
┌─────────────────────┬───────┐
│ timestamp           ┆ value │
│ ---                 ┆ ---   │
│ datetime[μs]        ┆ i64   │
╞═════════════════════╪═══════╡
│ 2023-08-04 10:00:00 ┆ 1     │
│ 2023-08-04 10:05:00 ┆ 3     │
│ 2023-08-04 10:10:00 ┆ 9     │
│ 2023-08-04 10:10:00 ┆ 9     │
│ 2023-08-04 10:20:00 ┆ 11    │
│ 2023-08-04 10:20:00 ┆ 11    │
└─────────────────────┴───────┘
Code:
import duckdb

rel = duckdb.sql("""
    SELECT
        timestamp,
        value,
        SUM(value) OVER roll AS rolling_sum
    FROM df
    WINDOW roll AS (
        ORDER BY timestamp
        -- both frame bounds are inclusive: the frame covers [t - 10 min, t]
        RANGE BETWEEN INTERVAL 10 minutes PRECEDING AND CURRENT ROW
    )
    ORDER BY timestamp;
""")
print(rel)
Alternatively, I could subtract one microsecond from the interval, so that rows exactly 10 minutes back fall outside the frame:
Code:
rel = duckdb.sql("""
    SELECT
        timestamp,
        value,
        SUM(value) OVER roll AS rolling_sum
    FROM df
    WINDOW roll AS (
        ORDER BY timestamp
        RANGE BETWEEN INTERVAL '10 minutes' - INTERVAL '1 microsecond' PRECEDING AND CURRENT ROW
    )
    ORDER BY timestamp;
""")
More details here: https://stackoverflow.com/questions/790 ... -in-duckdb