Should from_dict be annotated to allow sequence data values? #929

Closed
@Fruglemonkey

Describe the bug
It's not explicitly documented, but DataFrame.from_dict accepts a Sequence of values for data at runtime.

Is this something we want to annotate? It's fairly easy to demonstrate the failure case, but I'm not sure if this would be 'wider' than we'd want to allow for.

To Reproduce
Example code demonstrating the issue with mypy 1.10.0 and pandas 2.2.2:

import pandas as pd

# A tuple of row dicts; the second row has no 'B' key.
tuple_of_dicts = (
    {'A': 1, 'B': 0},
    {'A': 0},
)

# The same rows as a list.
list_of_dicts = [
    {'A': 1, 'B': 0},
    {'A': 0},
]

# Both calls work at runtime but are rejected by mypy.
tuple_df = pd.DataFrame.from_dict(data=tuple_of_dicts)
list_df = pd.DataFrame.from_dict(data=list_of_dicts)

print(tuple_df.to_string())
print()
print(list_df.to_string())

Running mypy on this, I receive the following errors:

error: No overload variant of "from_dict" of "DataFrame" matches argument type "tuple[dict[str, int], dict[str, int]]" [call-overload]

and

error: No overload variant of "from_dict" of "DataFrame" matches argument type "list[dict[str, int]]" [call-overload]

The code, however, runs as you'd expect:

python foo.py
   A    B
0  1  0.0
1  0  NaN

   A    B
0  1  0.0
1  0  NaN

Additional context
Happy to raise the PR to address this, just wanted to check what the intended behaviour is first before doing the work.
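
For discussion, here is a rough sketch of what a widened annotation could look like. It does not touch pandas or the actual stubs: `FrameSketch` and `DictData` are hypothetical names made up for illustration, and the real `from_dict` signature has more parameters and overloads than shown here. The only point is that a union with `Sequence[Mapping[Any, Any]]` would accept both the tuple and the list from the example above.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence
from typing import Any, Union

# Hypothetical widened annotation for the `data` parameter (an assumption,
# not the current pandas-stubs definition).
DictData = Union[Mapping[Any, Any], Sequence[Mapping[Any, Any]]]


class FrameSketch:
    """Hypothetical stand-in for DataFrame, used only to show the annotation."""

    def __init__(self, columns: list[Any], n_rows: int) -> None:
        self.columns = columns
        self.n_rows = n_rows

    @classmethod
    def from_dict(cls, data: DictData) -> FrameSketch:
        if isinstance(data, Mapping):
            # Column-oriented input: keys are column labels.
            columns = list(data)
            n_rows = max((len(v) for v in data.values()), default=0)
            return cls(columns, n_rows)
        # Sequence of row mappings: collect labels in first-seen order.
        columns = []
        for row in data:
            for key in row:
                if key not in columns:
                    columns.append(key)
        return cls(columns, len(data))


# Both sequence forms from the report satisfy the widened annotation.
tuple_frame = FrameSketch.from_dict(({"A": 1, "B": 0}, {"A": 0}))
list_frame = FrameSketch.from_dict([{"A": 1, "B": 0}, {"A": 0}])
print(tuple_frame.columns, tuple_frame.n_rows)
print(list_frame.columns, list_frame.n_rows)
```

Whether a union like this is too wide for the stubs is exactly the open question; a narrower alternative would be a separate overload for the sequence case only.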

Labels: pandas_docs (for issues where there is a conflict in behavior with pandas docs and stubs that needs resolution)
