I have code like the following where I split a dataframe into different groups. The "treatment" group is the one where I might want to delete and/or modify rows; for performance reasons I split it away from the group of rows that should survive unchanged.
It is guaranteed that all DFs have the same columns and dtypes (they all come from the original `df` parameter).
At the end of the treatment, I want to concat them back into a single DF. Now, I do not know in advance whether any of the DFs will be empty (and if `df` is empty, all DFs will be empty, which happens especially in testing; usually, though, `df` has ~500k rows).
See code:
```python
def some_fn(df: pd.DataFrame) -> pd.DataFrame:
    df_no_treatment, df_treatment = split_df(df)
    df_treatment = do_something_complex(df_treatment)

    assert (df.dtypes == df_treatment.dtypes).all()
    assert (df.dtypes == df_no_treatment.dtypes).all()

    result = pd.concat([df_no_treatment, df_treatment]).sort_index()
    assert (df.dtypes == result.dtypes).all()
    return result
```
Now, `concat` throws a `FutureWarning`:

> The behavior of array concatenation with empty entries is deprecated. In a future version, this will no longer exclude empty items when determining the result dtype. To retain the old behavior, exclude the empty entries before the concat operation.
Note the asserts in the code above: it seems to work as intended?
How do I fix the warning or opt into the new behavior? I don't want any dtype automatics; the dtypes already match, and `concat` should just concatenate and not do anything else.
I find code like

```python
if df_no_treatment.empty:
    return df_treatment
if df_treatment.empty:
    return df_no_treatment
return pd.concat([df_no_treatment, df_treatment]).sort_index()
```

absolutely over the top for what was previously a simple concat. What am I missing?
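The most compact variant I can think of filters the empties first, as the warning suggests. A sketch (the helper name `concat_nonempty` is made up; the both-empty case falls back to an empty slice of the original frame so the dtypes survive):

```python
import pandas as pd

def concat_nonempty(frames: list[pd.DataFrame], fallback: pd.DataFrame) -> pd.DataFrame:
    # Exclude empty entries before concat, per the FutureWarning's advice.
    nonempty = [f for f in frames if not f.empty]
    if not nonempty:
        # All inputs empty: return an empty frame with the right columns/dtypes.
        return fallback.iloc[0:0]
    return pd.concat(nonempty).sort_index()
```

Usage would then be `result = concat_nonempty([df_no_treatment, df_treatment], fallback=df)`, but that still feels like a lot of ceremony for a plain concat.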