Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import numpy as np
list_of_random_numbers = np.random.randint(0,100,size=10)
list_of_random_numbers_as_string = [str(x) for x in np.random.randint(0,100,size=10)]
list_of_random_numbers_as_string[0] = list_of_random_numbers_as_string[0] + "'"
df = pd.concat(
    [
        pd.DataFrame({'value': list_of_random_numbers}),
        pd.DataFrame({'value': list_of_random_numbers_as_string}),
    ],
    ignore_index=True,
)
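# Strip the stray apostrophe; the integer rows are silently turned into NaN here.
df['value'] = df['value'].str.replace("'", "")
# Only 10 of the 20 values are non-null afterwards.
df.notnull().sum()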
Issue Description
When str.replace is applied to a column that mixes string and non-string values, every non-string value is replaced with NaN. No warning or error is raised, so the data loss is silent.
See also: https://stackoverflow.com/q/43187436/3903778
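A minimal, deterministic sketch of the same behaviour, using fixed values instead of random ones. The values, the variable names, and the explicit `regex=False` are illustrative assumptions and not part of the original report; the `astype(str)` cast at the end is one possible workaround, not an official recommendation.

```python
import pandas as pd

# Object-dtype column mixing integers and strings (values chosen for illustration).
mixed = pd.Series([1, 2, "3'", "4"], dtype=object)

# regex=False is passed explicitly so the call behaves the same across pandas versions.
replaced = mixed.str.replace("'", "", regex=False)

# The two integer entries come back as NaN; nothing is raised or warned.
print(replaced.isna().sum())

# Possible workaround: cast every value to str before using the .str accessor.
cleaned = mixed.astype(str).str.replace("'", "", regex=False)
print(cleaned.tolist())
```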
Expected Behavior
The expected behaviour is an error or at least a warning, because that is what already happens when str.replace is applied to a column that contains only non-string data (see the example below).
import pandas as pd
import numpy as np
list_of_random_numbers = np.random.randint(0,100,size=10)
number_only_df = pd.DataFrame({'value': list_of_random_numbers})
number_only_df['value'] = number_only_df['value'].str.replace("'", "")
Fails with: AttributeError: Can only use .str accessor with string values!
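Until pandas raises or warns in the mixed case, the check can be done on the caller's side. The sketch below is a hypothetical helper (`str_replace_strict` is not a pandas function) that refuses to run when the column holds non-string, non-null values. It assumes a literal, non-regex replacement.

```python
import pandas as pd


def str_replace_strict(series, pat, repl):
    """Hypothetical helper: raise instead of silently producing NaN."""
    non_null = series.dropna()
    # Flag every non-null entry that is not a Python str.
    is_str = non_null.map(lambda v: isinstance(v, str))
    if not is_str.all():
        bad = non_null[~is_str]
        raise TypeError(
            f"{len(bad)} non-string value(s) in column, e.g. {bad.iloc[0]!r}"
        )
    # Literal (non-regex) replacement; this is an assumption of the sketch.
    return series.str.replace(pat, repl, regex=False)
```

Calling `str_replace_strict(df['value'], "'", "")` on the mixed column from the reproducible example would then raise a TypeError instead of returning NaN for the integer rows.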
Installed Versions
INSTALLED VERSIONS
commit : 73c6825
python : 3.8.11.final.0
python-bits : 64
OS : Darwin
OS-release : 21.1.0
Version : Darwin Kernel Version 21.1.0: Wed Oct 13 17:33:23 PDT 2021; root:xnu-8019.41.5~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.3.3
numpy : 1.20.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.0.1
setuptools : 52.0.0.post20210125
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.26.0
pandas_datareader: None
bs4 : None
bottleneck : 1.3.2
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None