DataFrame constructor is inconsistent when coercing values to strings with `dtype=str`.

When providing `dtype=str` to the DataFrame constructor, we're inconsistent about coercing values to strings.

When there's no overlap between the keys of `data` and `columns`, things are probably OK.

```python
In [4]: pd.DataFrame(index=[0, 1], columns=[0, 1], dtype=str)
Out[4]:
     0    1
0  NaN  NaN
1  NaN  NaN
```

(Those values are `np.nan`.)

But when there is an overlap between the keys of `data` and `columns`, the newly introduced values are coerced to strings.


```python
In [8]: pd.DataFrame({'A': [1, 2]}, index=[0, 1], columns=['A', 'B'], dtype=str)
Out[8]:
   A    B
0  1  nan
1  2  nan
```

(Everything in that DataFrame is a string, like `"1"` or `"nan"`.)
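
Because the printed output does not make the difference between `np.nan` and the string `"nan"` obvious, a small helper can make it visible. The sketch below is only a diagnostic aid: `cell_types` is a hypothetical name, not a pandas API, and the result for the frame above depends on the pandas version in use.

```python
import pandas as pd


def cell_types(df):
    """Return a frame holding the Python type name of every cell in ``df``.

    Hypothetical helper for inspection only; not part of pandas.
    """
    return df.apply(lambda col: col.map(lambda v: type(v).__name__))


# The frame from the example above. Compare the reported type for column B
# ('str' for the string "nan", 'float' for a real np.nan).
df = pd.DataFrame({'A': [1, 2]}, index=[0, 1], columns=['A', 'B'], dtype=str)
print(cell_types(df))
```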

That's because `init_dict` relies on `arrays_to_mgr` to coerce the *values* to the `dtype`, and `arrays_to_mgr` only gets a single `dtype`.
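
To illustrate why a single `dtype` turns the filler values into `"nan"`, here is a minimal NumPy sketch. It does not call pandas internals; it only assumes that the missing column is first filled with `np.nan` and then cast to `str` along with everything else.

```python
import numpy as np

# Stand-in for the all-missing column 'B': an object array filled with np.nan.
missing = np.full(2, np.nan, dtype=object)

# Casting with the same dtype used for the real data stringifies the NaNs too.
coerced = missing.astype(str)

print(coerced.tolist())
```

After the cast the elements are strings, so checks such as `pd.isna` no longer treat them as missing.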

DataFrame constructor is inconsistent when coercing values to strings with `dtype=str`. #24388

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

DataFrame constructor is inconsistent when coercing values to strings with dtype=str. #24388

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

DataFrame constructor is inconsistent when coercing values to strings with `dtype=str`. #24388