Inconsistency on dtype and fillna #49251

Closed
@borisiskra

Python 3.9.7
Pandas 1.3.5

Same result with:
Python 3.9.12
Pandas 1.4.2

I create a DataFrame from the following dictionary. All numbers are integers, 'drop' is a boolean, and 'status' is a string:

dct = {
 'name1': {'grade1': {'drop': True,
   'q1': 955,
   'q2': 7633,
   'q3': 0,
   'q4': 1670,
   'q5': 5963,
   'q6': 53384,
   'e1': 1535065,
   'e2': 0,
   'e3': 432747,
   'e4': 1102318},
  'grade2': {'drop': True,
   'q1': 54,
   'e1': 507,
   'e2': 0,
   'e3': 37,
   'e4': 470,
   'status': 'bad'}},
 'name2': {'grade1': {'drop': False,
   'q1': 70,
   'e1': 21706,
   'q2': 730,
   'e1': 317792,
   'status': 'good'},
  'grade2': {'drop': True,
   'q1': 11,
   'e4': 6414,
   'e1': 0,
   'e2': 605,
   'e3': 5809}}
      }

Using the following code:

import pandas as pd
df = pd.concat({k: pd.DataFrame.from_dict(v, 'index', dtype=str).fillna('EMPTY') for k, v in dct.items()}, axis=0, sort=False).reset_index()

I get the following DataFrame:

	level_0	level_1	drop	q1	q2	q3	q4	q5	q6	e1	e2	e3	e4	status
0	name1	grade1	True	955	7633.0	0.0	1670.0	5963.0	53384.0	1535065	0	432747	1102318	EMPTY
1	name1	grade2	True	54	EMPTY	EMPTY	EMPTY	EMPTY	EMPTY	507	0	37	470	bad
2	name2	grade1	False	70	730.0	NaN	NaN	NaN	NaN	317792	EMPTY	EMPTY	EMPTY	good
3	name2	grade2	True	11	EMPTY	NaN	NaN	NaN	NaN	0	605.0	5809.0	6414.0	EMPTY

Row 0: q3 and e2 were both 0 in the input, but now one is 0 and the other is 0.0.
Row 0: q1 = 955 stayed an int, but q2 = 7633 became the float 7633.0 before being converted to a string.
Rows 2 and 3: some NaN values were filled with "EMPTY", others were not.
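
One thing visible in the code itself: fillna('EMPTY') runs on each per-name frame before pd.concat, so any column that a frame lacks entirely is only introduced as NaN by the concat afterwards and is never filled. The float formatting appears to come from columns with a missing value being inferred as float before the cast to str, although that part may depend on the pandas version. A minimal sketch of one possible workaround, which converts the values to strings in plain Python before pandas sees them, is below. The helper name flatten_to_strings and the trimmed sample dictionary are hypothetical, used for illustration only.

```python
import pandas as pd

# Trimmed, hypothetical sample in the same shape as `dct` above.
sample = {
    "name1": {
        "grade1": {"drop": True, "q1": 955, "q2": 7633, "q3": 0, "e2": 0},
        "grade2": {"drop": True, "q1": 54, "e2": 0, "status": "bad"},
    },
    "name2": {
        "grade1": {"drop": False, "q1": 70, "q2": 730, "status": "good"},
    },
}


def flatten_to_strings(nested, fill="EMPTY"):
    """Build one all-string DataFrame; keys missing from a record become `fill`."""
    # Collect every column name once, in first-seen order.
    columns = []
    for grades in nested.values():
        for record in grades.values():
            for key in record:
                if key not in columns:
                    columns.append(key)

    rows = []
    for name, grades in nested.items():
        for grade, record in grades.items():
            row = {"level_0": name, "level_1": grade}
            for col in columns:
                # str() is applied to the original Python value,
                # so the int 7633 becomes "7633", never "7633.0".
                row[col] = str(record[col]) if col in record else fill
            rows.append(row)
    return pd.DataFrame(rows, columns=["level_0", "level_1", *columns])


df = flatten_to_strings(sample)
print(df)
```

If only the leftover NaN values matter, moving .fillna('EMPTY') so that it is called once on the result of pd.concat(...) should fill them too, since the concat is what introduces them.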
