API: should values_for_factorize and _from_factorized round-trip missing values? #32673

Closed
@jbrockmendel

(fixtures make this so much harder to give a copy/pasteable example)

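from pandas.tests.extension.json.test_json import *

dtype = JSONDtype()

# Re-draw until the first two elements differ in length, as the test fixture does.
data = make_data()
while len(data[0]) == len(data[1]):
    data = make_data()

data = JSONArray(data)

# Round-trip: take the factorizable values, then rebuild an array from them.
values = data._values_for_factorize()[0]
result = type(data)._from_factorized(values, data)
assert len(values) == len(result)  # <-- nope!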

Am I wrong in thinking this assertion should hold? If we had a .equals method, I'd strengthen this assertion to assert result.equals(data).
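To make the question concrete without the test fixtures, here is a minimal pure-Python sketch of how a length mismatch like this could arise. It assumes the reconstruction step drops entries equal to the missing-value sentinel. `ToyJSONArray` and both `_from_factorized_*` variants are hypothetical stand-ins, not the pandas implementation.

```python
# Toy sketch, NOT the pandas implementation: ToyJSONArray and its methods
# are hypothetical stand-ins used only to illustrate the round-trip question.
from collections import UserDict


class ToyJSONArray:
    def __init__(self, values):
        self.data = [UserDict(v) for v in values]

    def __len__(self):
        return len(self.data)

    def _values_for_factorize(self):
        # Freeze each dict into a hashable tuple; an empty dict (treated as
        # "missing" in this sketch) becomes the sentinel ().
        frozen = [tuple(d.items()) for d in self.data]
        return frozen, ()

    @classmethod
    def _from_factorized_lossy(cls, values, original):
        # Assumed behavior: sentinel entries are dropped, so the result
        # can be shorter than `values`.
        return cls([dict(x) for x in values if x != ()])

    @classmethod
    def _from_factorized_roundtrip(cls, values, original):
        # Alternative: map the sentinel back to an empty dict, keeping length.
        return cls([dict(x) for x in values])


data = ToyJSONArray([{"a": 1}, {}, {"b": 2, "c": 3}])
values, na_value = data._values_for_factorize()

lossy = ToyJSONArray._from_factorized_lossy(values, data)
kept = ToyJSONArray._from_factorized_roundtrip(values, data)

print(len(values), len(lossy), len(kept))
```

In this sketch only the second variant satisfies `len(values) == len(result)`, and it also reproduces the original elements, which is the stronger property an `.equals` check would assert.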

cc @TomAugspurger @WillAyd

Labels: Enhancement, ExtensionArray, Testing