OiO.lk Blog SQL How to Incrementally Append and Upsert Rows in Kedro directly to PostgreSQL DB (SQLTableORMDataset)?
SQL

How to Incrementally Append and Upsert Rows in Kedro directly to PostgreSQL DB (SQLTableORMDataset)?


I’m working on a Kedro project where I have a dataset defined in catalog.yml as follows:

daily_stats_dataset:
  type: SQLTableDataSet
  orm_model: my_proj.schemas.sqla_schemas.DailyStats
  credentials: my_database


monthly_stats_dataset:
  type: SQLTableDataSet
  orm_model: my_proj.schemas.sqla_schemas.MonthlyStats
  credentials: my_database

I need to incrementally append new rows from my daily_stats_dataset to my monthly_stats_dataset. However the daily stats are aggregated each time and added to the monthly stats. This means that, when the monthly stats already exist, I need to overwrite that line. I want to ensure that I do not insert duplicate entries based on a date column. I would thus like to implement upsert functionality for existing rows.

Is it possible to do this with Kedro, since the nodes look like this

    node(
            func=update_monthly_stats,
            inputs=daily_stats_dataset,
            outputs="monthly_stats_dataset",
        ),

but the monthly_stats will be inserted, resulting in unique constraints to be violated.



You need to sign in to view this answers

Exit mobile version