BUG: test_str_encode[utf32] fails on big-endian machine #57373

The `test_str_encode` test here:
https://github.com/pandas-dev/pandas/blob/4e40afd9d9a6206a1cab83b4fdb0365cd739f576/pandas/tests/extension/test_arrow.py#L2171-L2183
appears to encode to _native_ byte order, but the expected value `b"\xff\xfe\x00\x00a\x00\x00\x00b\x00\x00\x00c\x00\x00\x00"` is given in _little-endian_ order.

This causes the test to fail on big-endian systems such as s390x:
```
E   AssertionError: Series are different
E   
E   Series values are different (100.0 %)
E   [index]: [0, 1]
E   [left]:  [b'\x00\x00\xfe\xff\x00\x00\x00a\x00\x00\x00b\x00\x00\x00c']
E   [right]: [b'\xff\xfe\x00\x00a\x00\x00\x00b\x00\x00\x00c\x00\x00\x00']
E   At positional index 0, first diff: b'\x00\x00\xfe\xff\x00\x00\x00a\x00\x00\x00b\x00\x00\x00c' != b'\xff\xfe\x00\x00a\x00\x00\x00b\x00\x00\x00c\x00\x00\x00'
testing.pyx:173: AssertionError
```
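The mismatch can be reproduced without pandas. The sketch below uses only Python's built-in codecs and assumes that `ser.str.encode("utf32")` ends up producing the same bytes as `str.encode("utf32")`, which is consistent with the output above. The helper name `expected_utf32` is hypothetical and is not part of pandas or its test suite.

```python
import codecs
import sys

text = "abc"

# "utf-32" (alias "utf32") writes a BOM, then code units in the machine's native byte order.
native = text.encode("utf-32")

# The explicit-endianness codecs write no BOM, so one is prepended here for comparison.
little = codecs.BOM_UTF32_LE + text.encode("utf-32-le")
big = codecs.BOM_UTF32_BE + text.encode("utf-32-be")


def expected_utf32(s: str) -> bytes:
    """Hypothetical helper: build the bytes "utf-32" yields on the running machine."""
    if sys.byteorder == "little":
        return codecs.BOM_UTF32_LE + s.encode("utf-32-le")
    return codecs.BOM_UTF32_BE + s.encode("utf-32-be")


# The native result equals the little-endian form on x86_64 and the
# big-endian form on s390x; the hard-coded test value only covers the former.
print(sys.byteorder, native == little, native == big)
```

One possible way to make the expected value portable would be to derive it from the platform byte order as in `expected_utf32`, or to parametrize the test with an explicit codec such as `utf-32-le`, whose output does not depend on the host.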

<details>

INSTALLED VERSIONS
------------------
commit                : f538741432edf55c6b9fb5d0d496d2dd1d7c2457
python                : 3.12.2.final.0
python-bits           : 64
OS                    : Linux
OS-release            : 6.6.11-200.fc39.x86_64
Version               : #1 SMP PREEMPT_DYNAMIC Wed Jan 10 19:25:59 UTC 2024
machine               : x86_64
processor             : 
byteorder             : little
LC_ALL                : None
LANG                  : C.UTF-8
LOCALE                : C.UTF-8

pandas                : 2.2.0
numpy                 : 1.26.2
pytz                  : 2024.1
dateutil              : 2.8.2
setuptools            : 69.0.3
pip                   : 23.3.2
Cython                : 3.0.8
pytest                : 7.4.3
hypothesis            : 6.96.1
sphinx                : 7.2.6
blosc                 : None
feather               : None
xlsxwriter            : 3.1.9
lxml.etree            : 5.1.0
html5lib              : 1.1
pymysql               : 1.4.6
psycopg2              : 2.9.9
jinja2                : 3.1.3
IPython               : 8.21.0
pandas_datareader     : 0.10.0
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : 4.12.3
bottleneck            : 1.3.7
dataframe-api-compat  : None
fastparquet           : None
fsspec                : 2024.2.0
gcsfs                 : 2023.6.0+1.g7cc53d9
matplotlib            : 3.8.2
numba                 : None
numexpr               : 2.8.5
odfpy                 : None
openpyxl              : 3.1.2
pandas_gbq            : None
pyarrow               : 15.0.0
pyreadstat            : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : 1.11.3
sqlalchemy            : 2.0.25
tables                : 3.9.2
tabulate              : 0.9.0
xarray                : 2023.8.0
xlrd                  : 2.0.1
zstandard             : 0.22.0
tzdata                : None
qtpy                  : 2.4.1
pyqt5                 : None
</details>


	@pytest.mark.parametrize("errors", ["ignore", "strict"])
	@pytest.mark.parametrize(
	    "encoding, exp",
	    [
	        ["utf8", b"abc"],
	        ["utf32", b"\xff\xfe\x00\x00a\x00\x00\x00b\x00\x00\x00c\x00\x00\x00"],
	    ],
	)
	def test_str_encode(errors, encoding, exp):
	    ser = pd.Series(["abc", None], dtype=ArrowDtype(pa.string()))
	    result = ser.str.encode(encoding, errors)
	    expected = pd.Series([exp, None], dtype=ArrowDtype(pa.binary()))
	    tm.assert_series_equal(result, expected)
