Commit 56a6641

boris-kogan authored and aquemy committed
fix: update duplicates_pandas.py (#1427)
Fixes bug report #1384: a dataset with categorical features causes a memory error even when the dataset is tiny.
1 parent 36e2fa7 commit 56a6641

1 file changed

Lines changed: 1 addition & 1 deletion

src/ydata_profiling/model/pandas/duplicates_pandas.py

@@ -35,7 +35,7 @@ def pandas_get_duplicates(
 duplicated_rows = df.duplicated(subset=supported_columns, keep=False)
 duplicated_rows = (
     df[duplicated_rows]
-    .groupby(supported_columns, dropna=False)
+    .groupby(supported_columns, dropna=False, observed=True)
     .size()
     .reset_index(name=duplicates_key)
 )
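
Why `observed=True` helps: when the grouping keys are categorical, pandas `groupby` with `observed=False` emits one group per combination of declared categories, including combinations that never occur in the data. With several categorical columns that product can grow very quickly, which is the likely source of the memory error in #1384. The following is a minimal sketch, assuming pandas is installed; the column names and toy data are invented for illustration and are not taken from the bug report.

```python
import pandas as pd

# Hypothetical toy frame: two categorical columns, each declaring
# three categories, although only a few values actually appear.
df = pd.DataFrame(
    {
        "a": pd.Categorical(["x", "x", "y"], categories=["x", "y", "z"]),
        "b": pd.Categorical(["p", "p", "q"], categories=["p", "q", "r"]),
    }
)
columns = ["a", "b"]

# Same first step as in the patched function: keep only duplicated rows.
dup_mask = df.duplicated(subset=columns, keep=False)
dups = df[dup_mask]

# observed=False: groups are built from the declared categories,
# so unobserved combinations show up with a size of 0.
all_combos = dups.groupby(columns, dropna=False, observed=False).size()

# observed=True: only combinations present in the data are grouped.
observed_only = dups.groupby(columns, dropna=False, observed=True).size()

print(len(all_combos), len(observed_only))
```

In the sketch the observed-only result has a single group (the duplicated `("x", "p")` row pair), while the unobserved variant also carries the empty combinations. Passing `observed` explicitly also avoids relying on the pandas default, which has been changing across versions.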
