Description
I'm trying to replace np.nan with None so that I can query the parquet files from Presto with predicates like is null or is not null.
I ran:
df.column_name.replace(np.nan, None, inplace=True)
I expected it to replace NaN with None. Instead, it fills the NaN entries with the value from a row where the column is not NaN.
I don't understand why it fills these fields with other values at all, or why only some of the fields are filled rather than all of them (though neither would be the expected behaviour).
>>> import numpy as np
>>> import pandas as pd
>>> data = [
...     {'hello': 1, 'mad': 2, 'world': 3},
...     {'mad': 2, 'world': 3},
...     {'world': 1}
... ]
>>> df = pd.DataFrame(data)
>>> df
   hello  mad  world
0    1.0  2.0      3
1    NaN  2.0      3
2    NaN  NaN      1
>>> df.hello.dropna()
0    1.0
Name: hello, dtype: float64
>>> df.hello.replace(np.nan, None, inplace=True)
>>> df.hello
0    1.0
1    1.0
2    1.0
Name: hello, dtype: float64
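The output above looks like a forward fill: the last non-NaN value in hello (1.0) is carried down into the rows that were NaN. That would be consistent with replace treating value=None as "no value given" and falling back to its method parameter (pad by default), though I haven't confirmed that this is the cause here. As a possible workaround, here is a minimal sketch that avoids replace altogether. It assumes the goal is simply to have a real None in every missing cell, and the names cleaned and hello_values are only for illustration:

```python
import pandas as pd

data = [
    {'hello': 1, 'mad': 2, 'world': 3},
    {'mad': 2, 'world': 3},
    {'world': 1},
]
df = pd.DataFrame(data)

# A float64 column cannot hold None, so cast to object dtype first.
# where() keeps the cells for which the mask is True and puts None elsewhere.
cleaned = df.astype(object).where(pd.notnull(df), None)

# Pull one column out as plain Python objects to inspect the result.
hello_values = cleaned['hello'].tolist()
print(hello_values)
```

The cast to object is what allows None to be stored; without it the column stays float64 and the missing entries may remain NaN, depending on the pandas version.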
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-1032-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.26.1
numpy: 1.13.1
scipy: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: 0.1.2
pandas_gbq: None
pandas_datareader: None