BUG: Failing tests on pyarrow_nightly CI  #54650

Closed
@amithkk

Pandas version checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

When running CI against the code currently on the main branch, four tests in the pyarrow_nightly suite seem to be failing:

_______________ TestArrowArray.test_fillna_series[duration[s]] ________________
[gw1] linux -- Python 3.11.4 /home/runner/micromamba/envs/test/bin/python3.11

self = <pandas.tests.extension.test_arrow.TestArrowArray object at 0x7facf9479000>
data_missing = <ArrowExtensionArray>
[<NA>, Timedelta('1 days 00:00:00')]
Length: 2, dtype: duration[s][pyarrow]

    def test_fillna_series(self, data_missing):
        fill_value = data_missing[1]
        ser = pd.Series(data_missing)
    
        result = ser.fillna(fill_value)
        expected = pd.Series(
            data_missing._from_sequence(
                [fill_value, fill_value], dtype=data_missing.dtype
            )
        )
>       tm.assert_series_equal(result, expected)

pandas/tests/extension/base/missing.py:111: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_testing/asserters.py:777: in assert_extension_array_equal
    _testing.assert_almost_equal(
testing.pyx:55: in pandas._libs.testing.assert_almost_equal
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   AssertionError: Series are different
E   
E   Series values are different (50.0 %)
E   [index]: [0, 1]
E   [left]:  [0 days 00:00:00, 1 days 00:00:00]
E   [right]: [1 days 00:00:00, 1 days 00:00:00]
E   At positional index 0, first diff: 0 days 00:00:00 != 1 days 00:00:00

testing.pyx:173: AssertionError
_______________ TestArrowArray.test_fillna_series[duration[ms]] ________________
[gw1] linux -- Python 3.11.4 /home/runner/micromamba/envs/test/bin/python3.11

self = <pandas.tests.extension.test_arrow.TestArrowArray object at 0x7facf94791e0>
data_missing = <ArrowExtensionArray>
[<NA>, Timedelta('1 days 00:00:00')]
Length: 2, dtype: duration[ms][pyarrow]

    def test_fillna_series(self, data_missing):
        fill_value = data_missing[1]
        ser = pd.Series(data_missing)
    
        result = ser.fillna(fill_value)
        expected = pd.Series(
            data_missing._from_sequence(
                [fill_value, fill_value], dtype=data_missing.dtype
            )
        )
>       tm.assert_series_equal(result, expected)

pandas/tests/extension/base/missing.py:111: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_testing/asserters.py:777: in assert_extension_array_equal
    _testing.assert_almost_equal(
testing.pyx:55: in pandas._libs.testing.assert_almost_equal
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   AssertionError: Series are different
E   
E   Series values are different (50.0 %)
E   [index]: [0, 1]
E   [left]:  [0 days 00:00:00, 1 days 00:00:00]
E   [right]: [1 days 00:00:00, 1 days 00:00:00]
E   At positional index 0, first diff: 0 days 00:00:00 != 1 days 00:00:00

testing.pyx:173: AssertionError
________________ TestArrowArray.test_fillna_frame[duration[s]] _________________
[gw1] linux -- Python 3.11.4 /home/runner/micromamba/envs/test/bin/python3.11

self = <pandas.tests.extension.test_arrow.TestArrowArray object at 0x7facf94eb300>
data_missing = <ArrowExtensionArray>
[<NA>, Timedelta('1 days 00:00:00')]
Length: 2, dtype: duration[s][pyarrow]

    def test_fillna_frame(self, data_missing):
        fill_value = data_missing[1]
    
        result = pd.DataFrame({"A": data_missing, "B": [1, 2]}).fillna(fill_value)
    
        expected = pd.DataFrame(
            {
                "A": data_missing._from_sequence(
                    [fill_value, fill_value], dtype=data_missing.dtype
                ),
                "B": [1, 2],
            }
        )
    
>       tm.assert_frame_equal(result, expected)

pandas/tests/extension/base/missing.py:150: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_testing/asserters.py:777: in assert_extension_array_equal
    _testing.assert_almost_equal(
testing.pyx:55: in pandas._libs.testing.assert_almost_equal
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   AssertionError: DataFrame.iloc[:, 0] (column name="A") are different
E   
E   DataFrame.iloc[:, 0] (column name="A") values are different (50.0 %)
E   [index]: [0, 1]
E   [left]:  [0 days 00:00:00, 1 days 00:00:00]
E   [right]: [1 days 00:00:00, 1 days 00:00:00]
E   At positional index 0, first diff: 0 days 00:00:00 != 1 days 00:00:00

testing.pyx:173: AssertionError
________________ TestArrowArray.test_fillna_frame[duration[ms]] ________________
[gw1] linux -- Python 3.11.4 /home/runner/micromamba/envs/test/bin/python3.11

self = <pandas.tests.extension.test_arrow.TestArrowArray object at 0x7facf94eb4e0>
data_missing = <ArrowExtensionArray>
[<NA>, Timedelta('1 days 00:00:00')]
Length: 2, dtype: duration[ms][pyarrow]

    def test_fillna_frame(self, data_missing):
        fill_value = data_missing[1]
    
        result = pd.DataFrame({"A": data_missing, "B": [1, 2]}).fillna(fill_value)
    
        expected = pd.DataFrame(
            {
                "A": data_missing._from_sequence(
                    [fill_value, fill_value], dtype=data_missing.dtype
                ),
                "B": [1, 2],
            }
        )
    
>       tm.assert_frame_equal(result, expected)

pandas/tests/extension/base/missing.py:150: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/_testing/asserters.py:777: in assert_extension_array_equal
    _testing.assert_almost_equal(
testing.pyx:55: in pandas._libs.testing.assert_almost_equal
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   AssertionError: DataFrame.iloc[:, 0] (column name="A") are different
E   
E   DataFrame.iloc[:, 0] (column name="A") values are different (50.0 %)
E   [index]: [0, 1]
E   [left]:  [0 days 00:00:00, 1 days 00:00:00]
E   [right]: [1 days 00:00:00, 1 days 00:00:00]
E   At positional index 0, first diff: 0 days 00:00:00 != 1 days 00:00:00

testing.pyx:173: AssertionError

See logs from a fork I created to verify that this error exists on the main branch

Issue Description

I noticed this initially on a PR that I raised earlier (#54643). Since the PR doesn't touch any functionality that could cause these tests to break, I ran CI on a copy of the main branch and could reproduce the failures there.

These tests also seem to be failing on other PR CI builds that have run today:

Expected Behavior

The tests pass

Installed Versions

Version built from main (9d70a49)

Labels

Arrow (pyarrow functionality), CI (Continuous Integration), Upstream issue (Issue related to pandas dependency)
