@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.
 
 
+.. _the-backslash-plague:
+
 The Backslash Plague
 --------------------
 
@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.
 
+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
+
+
 +-------------------+------------------+
 | Regular String    | Raw string       |
 +===================+==================+
@@ -457,10 +466,16 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.Pattern.findall` returns a list of matching strings::
 
-   >>> p = re.compile('\d+')
+   >>> p = re.compile(r'\d+')
    >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
    ['12', '11', '10']
 
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
+
 :meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
 result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::
 
-   >>> re.split('[\W]+', 'Words, words, words.')
+   >>> re.split(r'[\W]+', 'Words, words, words.')
    ['Words', 'words', 'words', '']
-   >>> re.split('([\W]+)', 'Words, words, words.')
+   >>> re.split(r'([\W]+)', 'Words, words, words.')
    ['Words', ', ', 'words', ', ', 'words', '.', '']
-   >>> re.split('[\W]+', 'Words, words, words.', 1)
+   >>> re.split(r'[\W]+', 'Words, words, words.', 1)
    ['Words', 'words, words.']
 
 
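The motivation for the ``r`` prefixes added in this diff can be checked with a short script. The sketch below is illustrative and not part of the commit: the helper name ``warnings_for`` and the filename ``'<example>'`` are hypothetical. It assumes an interpreter that still warns on unrecognized escapes; the warning category depends on the Python version (the diff names :exc:`DeprecationWarning`, while newer releases may report :exc:`SyntaxWarning` or raise :exc:`SyntaxError`), so the helper only reports what it observed.

```python
import re
import warnings


def warnings_for(source):
    """Compile `source` as an expression and return the categories of any
    warnings raised. `warnings_for` is a hypothetical helper for this sketch."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        try:
            compile(source, '<example>', 'eval')
        except SyntaxError:
            # Assumption: a future interpreter may reject the literal outright.
            return [SyntaxError]
    return [w.category for w in caught]


# The raw literal and the doubled-backslash literal are the same string.
assert r'\d+' == '\\d+'

# Source text of a "cooked" literal containing \d, and of its raw equivalent.
cooked_source = "'\\d+'"    # the text: '\d+'
raw_source = "r'\\d+'"      # the text: r'\d+'

print(warnings_for(cooked_source))  # one or more categories; varies by version
print(warnings_for(raw_source))     # no warnings for the raw literal

# The pattern from the diff, written with the raw-string prefix.
found = re.compile(r'\d+').findall(
    '12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
print(found)  # ['12', '11', '10']
```

Both spellings compile to the same regular expression today; the raw form avoids depending on Python passing an unrecognized escape through unchanged.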