bpo-32614: Modify re examples to use a raw string to prevent warning #5265

Merged
merged 4 commits into from
Feb 2, 2018
23 changes: 19 additions & 4 deletions Doc/howto/regex.rst
@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
disadvantage which is the topic of the next section.


.. _the-backslash-plague:

The Backslash Plague
--------------------

@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
while ``"\n"`` is a one-character string containing a newline. Regular
expressions will often be written in Python code using this raw string notation.

In addition, special escape sequences that are valid in regular expressions,
but not valid as Python string literals, now result in a
:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
which means the sequences will be invalid if raw string notation or escaping
the backslashes isn't used.


+-------------------+------------------+
| Regular String | Raw string |
+===================+==================+
@@ -457,10 +466,16 @@ In actual programs, the most common style is to store the
Two pattern methods return all of the matches for a pattern.
:meth:`~re.Pattern.findall` returns a list of matching strings::

>>> p = re.compile('\d+')
>>> p = re.compile(r'\d+')
>>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
['12', '11', '10']

The ``r`` prefix, making the literal a raw string literal, is needed in this
example because escape sequences in a normal "cooked" string literal that are
not recognized by Python, as opposed to regular expressions, now result in a
:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
:ref:`the-backslash-plague`.

:meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
result. The :meth:`~re.Pattern.finditer` method returns a sequence of
:ref:`match object <match-objects>` instances as an :term:`iterator`::
@@ -1096,11 +1111,11 @@ following calls::
The module-level function :func:`re.split` adds the RE to be used as the first
argument, but is otherwise the same. ::

>>> re.split('[\W]+', 'Words, words, words.')
>>> re.split(r'[\W]+', 'Words, words, words.')
['Words', 'words', 'words', '']
>>> re.split('([\W]+)', 'Words, words, words.')
>>> re.split(r'([\W]+)', 'Words, words, words.')
['Words', ', ', 'words', ', ', 'words', '.', '']
>>> re.split('[\W]+', 'Words, words, words.', 1)
>>> re.split(r'[\W]+', 'Words, words, words.', 1)
['Words', 'words, words.']


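The warning described in the added HOWTO text can be observed directly. The following is a minimal sketch, not part of the diff: the helper name `warnings_for` is hypothetical, and it assumes an interpreter that still reports an unrecognized escape as a warning rather than an error. The diff names :exc:`DeprecationWarning`; newer interpreters report `SyntaxWarning` instead, so the sketch records whichever category is raised.

```python
import warnings


def warnings_for(source):
    """Compile `source` and return the categories of any warnings raised.

    Hypothetical helper for illustration only.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is filtered out
        compile(source, "<example>", "eval")
    return [w.category for w in caught]


# The source text being compiled is '\d+' (a cooked literal, no r prefix).
cooked = warnings_for(r"'\d+'")

# The source text being compiled is r'\d+' (a raw literal).
raw = warnings_for(r"r'\d+'")

print(cooked)  # one or more warning categories
print(raw)     # no warnings expected for the raw literal
```

Compiling the source text at run time keeps the example file itself free of the invalid escape, so it can be run without triggering the warning it demonstrates.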
2 changes: 1 addition & 1 deletion Doc/howto/unicode.rst
@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
Arabic numerals::

import re
p = re.compile('\d+')
p = re.compile(r'\d+')

s = "Over \u0e55\u0e57 57 flavours"
m = p.search(s)
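
As a quick check that this change is purely notational, the sketch below (not part of the diff) compares the raw literal with the explicitly escaped spelling and runs the search from the hunk. The `re.ASCII` contrast at the end is an addition for illustration only.

```python
import re

s = "Over \u0e55\u0e57 57 flavours"

# Both spellings denote the same three-character pattern: backslash, d, plus.
assert r'\d+' == '\\d+'

# For str patterns, \d matches any Unicode decimal digit, so the first
# match is the pair of Thai digits rather than the ASCII "57".
m = re.compile(r'\d+').search(s)
thai = m.group()

# Illustrative contrast (an assumption of this sketch, not in the diff):
# with re.ASCII, \d is limited to [0-9].
ascii_only = re.compile(r'\d+', re.ASCII).search(s).group()

print(repr(thai), repr(ascii_only))
```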
2 changes: 1 addition & 1 deletion Doc/library/re.rst
@@ -345,7 +345,7 @@ The special characters are:

This example looks for a word following a hyphen:

>>> m = re.search('(?<=-)\w+', 'spam-egg')
>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
>>> m.group(0)
'egg'

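For `\w` and `\d`, the raw prefix only silences the warning, because Python leaves an unrecognized escape in place. The sketch below, which is not part of the diff, contrasts that with `\b`, an escape Python does recognize. There the cooked and raw literals are different strings and the pattern silently changes meaning. The sample text and variable names are made up for illustration.

```python
import re

# '\w' is not a Python escape, so the escaped and raw spellings agree.
assert '(?<=-)\\w+' == r'(?<=-)\w+'

# '\b' IS a Python escape (backspace), so these two literals differ.
cooked = '\bclass\b'   # backspace + "class" + backspace: 7 characters
raw = r'\bclass\b'     # word boundary assertions around "class": 9 characters

text = 'no class at all'
raw_match = re.search(raw, text)        # matches the word "class"
cooked_match = re.search(cooked, text)  # looks for literal backspaces

print(len(cooked), len(raw))
print(raw_match.group(0), cooked_match)
```

No warning is raised for the cooked `\b` literal, which is why raw strings are a sensible default for patterns even where no warning prompts the change.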
@@ -0,0 +1,3 @@
Modify RE examples in documentation to use raw strings to prevent
:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
deprecation.