DOC: read_excel dtype parameter - str vs. object  #16655

Closed
@RobinFiveWords

Code Sample, a copy-pastable example if possible

In [16]: import pandas as pd
    ...: import numpy as np
    ...: pd.DataFrame(['a', 1, np.nan]).to_excel('test.xlsx')
    ...: df_str = pd.read_excel('test.xlsx', dtype=str, names=['col_str'])
    ...: df_str['type_str'] = df_str.col_str.map(type)
    ...: df_obj = pd.read_excel('test.xlsx', dtype=object, names=['col_obj'])
    ...: df_obj['type_obj'] = df_obj.col_obj.map(type)
    ...: pd.concat([df_str, df_obj], axis=1)
    ...: 
Out[16]: 
  col_str       type_str col_obj         type_obj
0       a  <class 'str'>       a    <class 'str'>
1       1  <class 'str'>       1    <class 'int'>
2     nan  <class 'str'>     NaN  <class 'float'>

Problem description

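In the output above, dtype=str converts every value to a string, including the missing value, which comes back as the string 'nan'. dtype=object leaves each value's original type intact (str, int, float). So I imagine read_excel's dtype parameter description should just read "Use object to preserve and not interpret dtype" and not "Use str or object".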
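
To illustrate the distinction without needing an Excel file, here is a minimal pandas-free sketch. It assumes that dtype=str behaves roughly like calling str() on each cell and that dtype=object leaves cells untouched, which is what the output above suggests. It is not a description of how read_excel is implemented, and the names cells, as_str and as_obj are made up for illustration.

```python
import math

# Stand-ins for the three cells in the example above:
# a text value, an integer, and a missing value.
cells = ["a", 1, math.nan]

# Assumption: dtype=str acts like calling str() on every cell.
as_str = [str(v) for v in cells]

# Assumption: dtype=object leaves each cell as it was read.
as_obj = list(cells)

# Print each value with its type, side by side.
for s, o in zip(as_str, as_obj):
    print(repr(s), type(s).__name__, "|", repr(o), type(o).__name__)
```

Under these assumptions the missing value is no longer recognisable as missing on the str side (it is the three-character string 'nan'), whereas on the object side it is still a float NaN.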
