Commit fbf8e82

[3.6] bpo-32614: Modify re examples to use a raw string to prevent warning (GH-5265) (GH-5500)
Modify RE examples in documentation to use raw strings to prevent DeprecationWarning. Add text to REGEX HOWTO to highlight the deprecation. Approved by Serhiy Storchaka. (cherry picked from commit 6677142)
1 parent f61951b commit fbf8e82
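
The warning this commit avoids can be reproduced with a minimal sketch, assuming CPython 3.6 or later. The helper name `escape_warnings` is hypothetical and not part of the commit. It compiles source text containing a cooked and a raw version of the same pattern and records which warning categories the compiler emits. On the 3.6 branch targeted here the category is `DeprecationWarning`; later releases may report `SyntaxWarning` instead, so the sketch does not assume one or the other.

```python
import warnings


def escape_warnings(source):
    """Compile source text and return the categories of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "exec")
    return [w.category for w in caught]


# The doubled backslashes below are needed only because the source text is
# itself embedded in a Python string; the compiled code sees a single backslash.
cooked = escape_warnings("p = '\\d+'")   # compiled source: p = '\d+'
raw = escape_warnings("p = r'\\d+'")     # compiled source: p = r'\d+'

print("cooked literal:", [c.__name__ for c in cooked])
print("raw literal:   ", [c.__name__ for c in raw])
```

The cooked literal should produce at least one warning and the raw literal none, which is the reason every example in this commit gains an `r` prefix.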

4 files changed: 26 additions and 8 deletions

Doc/howto/regex.rst

Lines changed: 21 additions & 6 deletions
@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.


+.. _the-backslash-plague:
+
 The Backslash Plague
 --------------------

@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.

+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
+
+
 +-------------------+------------------+
 |  Regular String   |  Raw string      |
 +===================+==================+
@@ -457,12 +466,18 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.pattern.findall` returns a list of matching strings::

-   >>> p = re.compile('\d+')
+   >>> p = re.compile(r'\d+')
    >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
    ['12', '11', '10']

-:meth:`~re.pattern.findall` has to create the entire list before it can be returned as the
-result. The :meth:`~re.pattern.finditer` method returns a sequence of
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
+
+:meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
+result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::

    >>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::

-   >>> re.split('[\W]+', 'Words, words, words.')
+   >>> re.split(r'[\W]+', 'Words, words, words.')
    ['Words', 'words', 'words', '']
-   >>> re.split('([\W]+)', 'Words, words, words.')
+   >>> re.split(r'([\W]+)', 'Words, words, words.')
    ['Words', ', ', 'words', ', ', 'words', '.', '']
-   >>> re.split('[\W]+', 'Words, words, words.', 1)
+   >>> re.split(r'[\W]+', 'Words, words, words.', 1)
    ['Words', 'words, words.']

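
The doctests touched in Doc/howto/regex.rst can be exercised together as a quick check. This is a sketch rather than part of the commit. It uses only `re.compile`, `findall`, `finditer`, and `re.split` as they appear in the hunks for this file. One adjustment is made here: `maxsplit` is passed by keyword, since passing it positionally is deprecated in recent Python releases.

```python
import re

# A raw literal and an escaped cooked literal describe the same pattern text.
assert r'\d+' == '\\d+'

p = re.compile(r'\d+')
numbers = p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
print(numbers)  # ['12', '11', '10']

# finditer yields match objects one at a time instead of building a full list.
spans = [m.span() for m in p.finditer('12 drummers drumming, 11 ... 10 ...')]
print(spans)  # [(0, 2), (22, 24), (29, 31)]

# re.split with and without a capturing group, and with a split limit.
plain = re.split(r'[\W]+', 'Words, words, words.')
captured = re.split(r'([\W]+)', 'Words, words, words.')
limited = re.split(r'[\W]+', 'Words, words, words.', maxsplit=1)
print(plain)
print(captured)
print(limited)
```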

Doc/howto/unicode.rst

Lines changed: 1 addition & 1 deletion
@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
 Arabic numerals::

    import re
-   p = re.compile('\d+')
+   p = re.compile(r'\d+')

    s = "Over \u0e55\u0e57 57 flavours"
    m = p.search(s)
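
To see what the corrected pattern matches, the example can be run end to end. This sketch goes slightly beyond the hunk: the `re.ASCII` comparison and the `int()` conversion are additions for illustration only. For `str` patterns, `\d` matches any Unicode decimal digit, so the first match is the Thai numeral rather than the ASCII `57`.

```python
import re

p = re.compile(r'\d+')
s = "Over \u0e55\u0e57 57 flavours"

m = p.search(s)
thai = m.group()  # the two Thai digits, which come first in the string
print(repr(thai), int(thai))

# With re.ASCII, \d is restricted to [0-9], so the Thai digits are skipped.
ascii_match = re.compile(r'\d+', re.ASCII).search(s).group()
print(ascii_match)
```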

Doc/library/re.rst

Lines changed: 1 addition & 1 deletion
@@ -315,7 +315,7 @@ The special characters are:

 This example looks for a word following a hyphen:

->>> m = re.search('(?<=-)\w+', 'spam-egg')
+>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
 >>> m.group(0)
 'egg'
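
A short sketch, with illustrative variable names, suggesting that the raw-string form behaves the same as the escaped cooked form and that the lookbehind does not consume the hyphen:

```python
import re

m = re.search(r'(?<=-)\w+', 'spam-egg')
print(m.group(0), m.span())  # egg (5, 8)

# Escaping the backslash in a cooked string yields the identical pattern.
same = re.search('(?<=-)\\w+', 'spam-egg')
print(same.group(0) == m.group(0))
```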

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+Modify RE examples in documentation to use raw strings to prevent
+:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
+deprecation.
