API: Deprecate regex=True default in Series.str.replace #36695
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -262,25 +262,30 @@ i.e., from the end of the string to the beginning of the string: | |
'', np.nan, 'CABA', 'dog', 'cat'], | ||
dtype="string") | ||
s3 | ||
- s3.str.replace('^.a|dog', 'XX-XX ', case=False) | ||
+ s3.str.replace('^.a|dog', 'XX-XX ', case=False, regex=True) | ||
|
||
- Some caution must be taken to keep regular expressions in mind! For example, the | ||
- following code will cause trouble because of the regular expression meaning of | ||
- ``$``: | ||
+ Some caution must be taken when dealing with regular expressions! The current behavior | ||
+ is to treat single character patterns as literal strings, even when ``regex`` is set | ||
+ to ``True``. (This behavior is deprecated and will be removed in a future version so | ||
+ that the ``regex`` keyword is always respected.) For example, the following code will | ||
+ cause trouble because of the regular expression meaning of ``$``: | ||
Review comments on the added line "Some caution must be taken when dealing with regular expressions! The current behavior":
Review comment: you can make this a note / warning
Review comment: can you do this
Review comment: also add a versionchanged tag
Reply: Added a versionchanged tag. Do we also need one in the docstring? I'm not sure what's meant by note / warning since I've added a whatsnew note and warning. Do you mean put the text itself in the whatsnew?
Review comment: no i mean a
Review comment: we could also do a
|
||
.. ipython:: python | ||
|
||
# Consider the following badly formatted financial data | ||
dollars = pd.Series(['12', '-$10', '$10,000'], dtype="string") | ||
|
||
- # This does what you'd naively expect: | ||
- dollars.str.replace('$', '') | ||
+ # Here $ is treated as a literal character | ||
+ dollars.str.replace('$', '', regex=True) | ||
|
||
|
||
- # But this doesn't: | ||
- dollars.str.replace('-$', '-') | ||
+ # But here it is not | ||
+ dollars.str.replace('-$', '-', regex=True) | ||
|
||
# We need to escape the special character (for >1 len patterns) | ||
- dollars.str.replace(r'-\$', '-') | ||
+ dollars.str.replace(r'-\$', '-', regex=True) | ||
|
||
# Or set regex equal to False | ||
dollars.str.replace('-$', '-', regex=False) | ||
|
||
|
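The regular-expression meaning of ``$`` that this hunk warns about can be seen without pandas. The sketch below uses only the standard-library ``re`` module and plain ``str.replace`` as an analogue; the ``values`` list mirrors the ``dollars`` data above, and nothing here is the pandas implementation.

```python
import re

# Same badly formatted values as the ``dollars`` Series in the docs diff.
values = ["12", "-$10", "$10,000"]

# As a regex, "-$" means "a hyphen at the end of the string",
# so none of these values match and nothing is replaced.
as_regex = [re.sub("-$", "-", v) for v in values]

# Escaping the dollar sign makes it a literal character in the pattern.
escaped = [re.sub(r"-\$", "-", v) for v in values]

# Plain str.replace never interprets the pattern as a regex.
literal = [v.replace("-$", "-") for v in values]

print(as_regex)
print(escaped)
print(literal)
```

The escaped-regex and literal variants agree, which is the same equivalence the docs draw between ``regex=True`` with an escaped pattern and ``regex=False``.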
||
If you do want literal replacement of a string (equivalent to | ||
:meth:`str.replace`), you can set the optional ``regex`` parameter to | ||
|
@@ -290,7 +295,7 @@ and ``repl`` must be strings: | |
.. ipython:: python | ||
|
||
# These lines are equivalent | ||
- dollars.str.replace(r'-\$', '-') | ||
+ dollars.str.replace(r'-\$', '-', regex=True) | ||
dollars.str.replace('-$', '-', regex=False) | ||
|
||
The ``replace`` method can also take a callable as replacement. It is called | ||
|
@@ -306,7 +311,7 @@ positional argument (a regex object) and return a string. | |
return m.group(0)[::-1] | ||
|
||
pd.Series(['foo 123', 'bar baz', np.nan], | ||
- dtype="string").str.replace(pat, repl) | ||
+ dtype="string").str.replace(pat, repl, regex=True) | ||
|
||
# Using regex groups | ||
pat = r"(?P<one>\w+) (?P<two>\w+) (?P<three>\w+)" | ||
|
@@ -315,7 +320,7 @@ positional argument (a regex object) and return a string. | |
return m.group('two').swapcase() | ||
|
||
pd.Series(['Foo Bar Baz', np.nan], | ||
- dtype="string").str.replace(pat, repl) | ||
+ dtype="string").str.replace(pat, repl, regex=True) | ||
|
||
The ``replace`` method also accepts a compiled regular expression object | ||
from :func:`re.compile` as a pattern. All flags should be included in the | ||
|
@@ -325,7 +330,7 @@ compiled regular expression object. | |
|
||
import re | ||
regex_pat = re.compile(r'^.a|dog', flags=re.IGNORECASE) | ||
- s3.str.replace(regex_pat, 'XX-XX ') | ||
+ s3.str.replace(regex_pat, 'XX-XX ', regex=True) | ||
|
||
Including a ``flags`` argument when calling ``replace`` with a compiled | ||
regular expression object will raise a ``ValueError``. | ||
|
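The standard-library ``re`` module applies a similar rule to compiled patterns, which may help explain why flags belong in the compiled object. This is a stdlib analogue only, not the pandas code path; the input string ``"CABA"`` is taken from the ``s3`` example data.

```python
import re

regex_pat = re.compile(r"^.a|dog", flags=re.IGNORECASE)

# Flags baked into the compiled pattern are honoured by the pattern itself.
replaced = regex_pat.sub("XX-XX ", "CABA")

# Passing flags again alongside an already compiled pattern is rejected
# by the re module with a ValueError.
try:
    re.sub(regex_pat, "XX-XX ", "CABA", flags=re.IGNORECASE)
    raised = False
except ValueError:
    raised = True

print(replaced)
print(raised)
```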
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -449,3 +449,9 @@ def test_replace_with_compiled_regex(self): | |
        result = s.replace({regex: "z"}, regex=True) | ||
        expected = pd.Series(["z", "b", "c"]) | ||
        tm.assert_series_equal(result, expected) | ||
|
||
+    def test_str_replace_regex_default_raises_warning(self): | ||
+        # https://github.com/pandas-dev/pandas/pull/24809 | ||
+        s = pd.Series(["a", "b", "c"]) | ||
+        with tm.assert_produces_warning(FutureWarning, check_stacklevel=False): | ||
+            s.str.replace("^.$", "") |
Review comment on the added line `s = pd.Series(["a", "b", "c"])`: can you check the messages on this warning
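One way to check a warning's message, as the review comment asks, is to record the warning and inspect its text. The sketch below is standard-library only. The helper ``replace_with_default_warning``, the ``SPECIAL_CHARS`` set, and the message wording are hypothetical stand-ins for illustration; they are not the code or the message added by this PR.

```python
import re
import warnings

# Assumption: characters treated as "regex special" for this illustration.
SPECIAL_CHARS = set(".+*|^$?[](){}\\")


def replace_with_default_warning(value, pat, repl, regex=None):
    """Hypothetical helper: warn when ``regex`` is left unset and the
    pattern contains special characters, then fall back to regex matching."""
    if regex is None:
        if any(ch in SPECIAL_CHARS for ch in pat):
            warnings.warn(
                # Illustrative wording only, not the PR's actual message.
                "The default value of regex will change from True to False "
                "in a future version.",
                FutureWarning,
                stacklevel=2,
            )
        regex = True
    if regex:
        return re.sub(pat, repl, value)
    return value.replace(pat, repl)


# Record warnings so both the category and the message can be checked.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = replace_with_default_warning("a", "^.$", "")

messages = [str(w.message) for w in caught
            if issubclass(w.category, FutureWarning)]
print(result, messages)
```

In a pandas test, the equivalent check would go through the testing helpers already used in the diff; whether they accept a message pattern depends on the pandas version, so that part is left out here.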