stouputils.collections.dataframe module

upsert_in_dataframe(
df: pl.DataFrame,
new_entry: dict[str, Any],
primary_keys: list[str] | dict[str, Any] | None = None,
) → pl.DataFrame

Insert a new row into a Polars DataFrame, or update the existing row whose primary-key values match the new entry.

Parameters:
  • df (pl.DataFrame) – The Polars DataFrame to update.

  • new_entry (dict[str, Any]) – The new entry to insert or update.

  • primary_keys (list[str] | dict[str, Any] | None) – The primary keys used to identify an existing row to update, given as a list of column names or as a dict. Defaults to None.

Returns:

The updated Polars DataFrame.

Return type:

pl.DataFrame

Examples

>>> import polars as pl
>>> df = pl.DataFrame({"id": [1, 2], "value": ["a", "b"]})
>>> new_entry = {"id": 2, "value": "updated"}
>>> updated_df = upsert_in_dataframe(df, new_entry, primary_keys=["id"])
>>> print(updated_df)
shape: (2, 2)
┌─────┬─────────┐
│ id  ┆ value   │
│ --- ┆ ---     │
│ i64 ┆ str     │
╞═════╪═════════╡
│ 1   ┆ a       │
│ 2   ┆ updated │
└─────┴─────────┘
>>> new_entry = {"id": 3, "value": "new"}
>>> updated_df = upsert_in_dataframe(updated_df, new_entry, primary_keys=["id"])
>>> print(updated_df)
shape: (3, 2)
┌─────┬─────────┐
│ id  ┆ value   │
│ --- ┆ ---     │
│ i64 ┆ str     │
╞═════╪═════════╡
│ 1   ┆ a       │
│ 2   ┆ updated │
│ 3   ┆ new     │
└─────┴─────────┘