Add test to check numeric precision GH33234 #51753

Merged
mroeschke merged 1 commit into pandas-dev:main from the testnumericprecision branch on Mar 13, 2023

Conversation

liang3zy22
Contributor

This is a new PR to fix #33234, since the old PR was closed.

liang3zy22 force-pushed the testnumericprecision branch 2 times, most recently from 689ae3d to 23a03df, on March 4, 2023 00:40
else 2147483647
],
}
).astype({"key3": dtype})
Member

Can you define all those objects explicitly? e.g. no astype

Contributor Author

OK. Could I drop the astype part and keep just the DataFrame creation expression?

Contributor Author

I think I defined all those objects explicitly in the updated commit.

mroeschke added the Testing label (pandas testing functions or related to the test suite) on Mar 6, 2023
liang3zy22 force-pushed the testnumericprecision branch 2 times, most recently from 92a9b7b to cb38b93, on March 8, 2023 07:09
"Float32",
],
)
def test_groupby_agg_precision(dtype):
Member

Could you use the any_real_numeric_dtype fixture here?

"key1": ["a"],
"key2": ["b"],
"key3": [
pd.array([1583715738627261039], dtype=dtype)[0]
Member

Could you specify a max_value variable above this df assignment and then write this as

df = pd.DataFrame({..., "key3": pd.array([max_value], dtype=dtype)})

}
)
expected = df[["key3"]]
result = df.groupby(["key1", "key2"]).agg(lambda x: x).reset_index()[["key3"]]
Member

Could you construct expected such that result is just df.groupby(["key1", "key2"]).agg(lambda x: x)?
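
The review comments above (a `max_value` variable, `pd.array([max_value], dtype=dtype)` in the constructor, and an `expected` frame built so that `result` is the bare `groupby(...).agg(...)` call) could be combined roughly as follows. This is only a sketch, not the merged test: it assumes a single dtype, `"int64"`, in place of the `any_real_numeric_dtype` fixture, and it assumes `max_value` is taken from `np.iinfo`; neither choice is stated in the thread.

```python
import numpy as np
import pandas as pd

# Assumption: one concrete dtype stands in for the any_real_numeric_dtype fixture.
dtype = "int64"
# Assumption: max_value is the largest value the dtype can hold.
max_value = np.iinfo(dtype).max

df = pd.DataFrame(
    {
        "key1": ["a"],
        "key2": ["b"],
        "key3": pd.array([max_value], dtype=dtype),
    }
)

# Build expected with the grouping keys as a MultiIndex, so no
# reset_index() or column selection is needed on the result side.
index = pd.MultiIndex.from_arrays([["a"], ["b"]], names=["key1", "key2"])
expected = pd.DataFrame({"key3": pd.array([max_value], dtype=dtype)}, index=index)

result = df.groupby(["key1", "key2"]).agg(lambda x: x)
pd.testing.assert_frame_equal(result, expected)
```

Building `expected` with the grouping keys already in its index keeps the `result` line down to the operation under test.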

liang3zy22 force-pushed the testnumericprecision branch 2 times, most recently from 66bd297 to 68f8fb2, on March 13, 2023 04:42
Signed-off-by: Liang Yan <ckgppl_yan@sina.cn>
liang3zy22 force-pushed the testnumericprecision branch from 68f8fb2 to 5f7327f on March 13, 2023 06:26
mroeschke added this to the 2.1 milestone on Mar 13, 2023
mroeschke merged commit 7888cf4 into pandas-dev:main on Mar 13, 2023
mroeschke
Member

Thanks @liang3zy22

liang3zy22 deleted the testnumericprecision branch on March 13, 2023 22:35
Labels
Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Groupby multiple columns causes agg to have precision loss on int64
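
The issue title only says that precision is lost; the thread does not describe the mechanism. One common way a large int64 loses precision is a round trip through float64, which has a 53-bit significand. The standard-library sketch below illustrates that effect with the integer that appears in the diff context above; that this is what happened inside groupby is an assumption, not something the conversation confirms.

```python
# Integer taken from the diff context in this conversation.
big = 1583715738627261039

# Python floats are IEEE 754 doubles, the same format as float64.
round_tripped = int(float(big))

# The value lies between 2**60 and 2**61, where adjacent doubles are
# 256 apart, so an odd integer cannot be represented exactly.
lost = round_tripped != big
print(lost)  # True
```

Any int64 value above 2**53 that is not a multiple of the local float spacing is affected the same way, which is presumably why the test uses values near the top of each dtype's range.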
3 participants