Commit c7de1d7

miss-islington and csabella authored and committed
bpo-32614: Modify re examples to use a raw string to prevent warning (GH-5265) (#5499)
Modify RE examples in documentation to use raw strings to prevent DeprecationWarning. Add text to REGEX HOWTO to highlight the deprecation. Approved by Serhiy Storchaka.

(cherry picked from commit 6677142)

Co-authored-by: Cheryl Sabella <cheryl.sabella@gmail.com>
1 parent 29fd9ea commit c7de1d7
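
The warning this commit works around can be reproduced in isolation. The sketch below is not part of the commit; it assumes a Python version where an unrecognized escape such as `\d` in a non-raw literal still compiles with a warning (`DeprecationWarning` on the versions this commit targets, `SyntaxWarning` on Python 3.12 and later). The names `cooked_source`, `caught`, and `warning_names` are illustrative only.

```python
import re
import warnings

# Source text of a statement whose string literal is NOT raw.
# The outer literal is raw only so that this example itself stays warning-free.
cooked_source = r"pattern = '\d+'"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    try:
        # Compiling the source is enough to trigger the invalid-escape warning.
        compile(cooked_source, "<cooked-example>", "exec")
    except SyntaxError:
        # Assumption: a future Python may reject the escape outright.
        pass

# Collect the names of the warning categories that were recorded.
warning_names = sorted({w.category.__name__ for w in caught})
print(warning_names)

# The raw literal and the doubled-backslash literal are the same string,
# so both spellings compile to the same pattern without any warning.
text = "12 drummers drumming, 11 pipers piping, 10 lords a-leaping"
raw_pattern = re.compile(r"\d+")
escaped_pattern = re.compile("\\d+")
print(raw_pattern.findall(text))
print(escaped_pattern.findall(text))
```

Either spelling avoids the warning; the documentation changes below consistently choose the raw-string form.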

4 files changed: +24 -6 lines changed

Doc/howto/regex.rst

Lines changed: 19 additions & 4 deletions
@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.
 
 
+.. _the-backslash-plague:
+
 The Backslash Plague
 --------------------
 
@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.
 
+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
+
+
 +-------------------+------------------+
 | Regular String    | Raw string       |
 +===================+==================+
@@ -457,10 +466,16 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.Pattern.findall` returns a list of matching strings::
 
-   >>> p = re.compile('\d+')
+   >>> p = re.compile(r'\d+')
    >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
    ['12', '11', '10']
 
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
+
 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
 result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::
 
-   >>> re.split('[\W]+', 'Words, words, words.')
+   >>> re.split(r'[\W]+', 'Words, words, words.')
    ['Words', 'words', 'words', '']
-   >>> re.split('([\W]+)', 'Words, words, words.')
+   >>> re.split(r'([\W]+)', 'Words, words, words.')
    ['Words', ', ', 'words', ', ', 'words', '.', '']
-   >>> re.split('[\W]+', 'Words, words, words.', 1)
+   >>> re.split(r'[\W]+', 'Words, words, words.', 1)
    ['Words', 'words, words.']
 
 
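
The three updated `re.split` calls can also be checked as a standalone script rather than as doctests. This is a minimal sketch and not part of the commit: the variable names are illustrative, and the split limit is passed by keyword (`maxsplit=1`) instead of positionally as in the diff, which is a stylistic choice made here.

```python
import re

text = "Words, words, words."

# Split on runs of non-word characters; delimiters are discarded.
plain = re.split(r"[\W]+", text)

# With a capturing group, the matched delimiters are kept in the result.
with_delimiters = re.split(r"([\W]+)", text)

# Limit the number of splits to one; the remainder is returned unsplit.
limited = re.split(r"[\W]+", text, maxsplit=1)

print(plain)            # ['Words', 'words', 'words', '']
print(with_delimiters)  # ['Words', ', ', 'words', ', ', 'words', '.', '']
print(limited)          # ['Words', 'words, words.']
```

The expected results in the comments are the ones recorded in the diff above.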
Doc/howto/unicode.rst

Lines changed: 1 addition & 1 deletion
@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
 Arabic numerals::
 
    import re
-   p = re.compile('\d+')
+   p = re.compile(r'\d+')
 
    s = "Over \u0e55\u0e57 57 flavours"
    m = p.search(s)

Doc/library/re.rst

Lines changed: 1 addition & 1 deletion
@@ -345,7 +345,7 @@ The special characters are:
 
    This example looks for a word following a hyphen:
 
-      >>> m = re.search('(?<=-)\w+', 'spam-egg')
+      >>> m = re.search(r'(?<=-)\w+', 'spam-egg')
       >>> m.group(0)
       'egg'
 
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Modify RE examples in documentation to use raw strings to prevent
+:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
+deprecation.
