Python raw string usage examples

  • 2020-04-02 14:13:48
  • OfStack

This article uses examples to show how Python raw strings are used, and is shared here for your reference. The details are as follows:
 
Python's raw strings exist largely because of regular expressions. The reason is a conflict between ASCII escape characters and regular expression special characters. For example, the special symbol "\b" represents the backspace character in ASCII, but "\b" is also a regular expression special symbol meaning "match a word boundary."

For the RE compiler to see the two characters "\b" (a backslash followed by "b") as the string you intended, rather than a single backspace character, you need to escape the backslash with another backslash and write "\\b".

But doing so can complicate matters, especially if your regular expression string contains a lot of special characters. For this reason, raw strings are commonly used to keep regular expressions simple.

In fact, many Python programmers define regular expressions using only raw strings.
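As a rough illustration of why doubled backslashes become tedious, the sketch below writes one pattern both ways. The pattern and the sample sentence are made up for this example, and the code uses Python 3 syntax.

```python
import re

# The same pattern written twice: once with doubled backslashes, once raw.
escaped = '\\bcat\\b\\s+\\bsat\\b'
raw = r'\bcat\b\s+\bsat\b'

# Both literals produce the identical string, so the regex engine
# cannot tell which spelling was used in the source code.
print(escaped == raw)

# Hypothetical sample text, chosen only for this demonstration.
match = re.search(raw, 'the cat  sat down')
print(match.group())
```

The raw spelling needs half as many backslashes, and the gap widens as a pattern gains more special sequences.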

The following example illustrates the difference between the backspace character "\b" and the regular expression "\b" (with and without a raw string):

>>> import re
>>> m = re.match('\bblow', 'blow') # backspace, no match
>>> if m is not None: m.group()
...
>>> m = re.match('\\bblow', 'blow') # escaped \, now it works
>>> if m is not None: m.group()
...
'blow'
>>> m = re.match(r'\bblow', 'blow') # use raw string instead
>>> if m is not None: m.group()
...
'blow'

Note that "\d" can be used in a regular expression without a raw string and without any problems. That is because ASCII has no special character corresponding to "\d", so Python leaves the backslash in place and the regular expression compiler knows you are referring to a decimal digit.
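The difference between an escape that Python's string literals recognize and one that only the regular expression engine understands can be sketched as follows. This is Python 3 code, and the sample date string is just an assumption for the demonstration. Recent Python 3 releases also warn about unrecognized escapes such as "\d" in normal strings, which is another reason the pattern is written raw here.

```python
import re

# Escapes that string literals recognize collapse to a single character.
print(len('\b'), len('\t'))

# A raw string always keeps the backslash and the letter that follows it.
digits = r'\d+'
print(len(digits))

# The regex engine receives the backslash intact and reads \d as "a digit".
m = re.match(digits, '2020-04-02')
year = m.group()
print(year)
```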

This feature of raw strings makes tasks such as writing regular expressions much easier. Regular expressions are strings that define advanced search patterns and are typically composed of special symbols representing characters, grouping and matching information, variable names, and character classes. The regular expression syntax already contains plenty of symbols of its own. When you then have to insert extra symbols just to make special characters behave like normal characters, you end up in an unreadable soup of letters and backslashes. This is where raw strings come in handy.

Apart from the raw string marker (the letter "r" before the quotation marks), a raw string has almost exactly the same syntax as a normal string. The "r" can be lowercase or uppercase; the only requirement is that it comes immediately before the opening quote. In the first example, we want a backslash followed by an "n" rather than a newline character.

>>> '\n'
'\n'
>>> print '\n'


>>> r'\n'
'\\n'
>>> print r'\n'
\n
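The same comparison can be sketched in Python 3, where `print` is a function. Checking the lengths makes the difference explicit; the variable names are only illustrative.

```python
newline = '\n'        # one character: a line feed
raw_newline = r'\n'   # two characters: a backslash and the letter n

print(len(newline), len(raw_newline))
print(repr(raw_newline))

# The prefix may be upper- or lowercase, and a raw literal is equal to
# the ordinary literal that spells the backslash as '\\'.
print(R'\n' == r'\n' == '\\n')
```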

In the next example, we can't open our README file. Why? Because '\t' and '\r' are interpreted as special symbols (a tab and a carriage return) that are not part of our file name, when they are actually four separate characters in the file path.
>>> f = open('C:\windows\temp\readme.txt', 'r')
Traceback (most recent call last):
  File "", line 1, in ?
    f = open('C:\windows\temp\readme.txt', 'r')
IOError: [Errno 2] No such file or directory: 'C:\\windows\temp\readme.txt'
>>> f = open(r'C:\windows\temp\readme.txt', 'r')
>>> f.readline()
'Table of Contents (please check timestamps for last update!)\n'
>>> f.close()
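To see what goes wrong without opening any file, the sketch below compares the two spellings of the path from the example. It is written for Python 3 and uses the standard library's `pathlib.PureWindowsPath`, which only parses text and never touches the file system; the variable names are illustrative.

```python
from pathlib import PureWindowsPath

# Normal literal: '\t' becomes a tab and '\r' a carriage return.
# ('\\w' is spelled with a doubled backslash here only because '\w' is not
# a recognized escape and newer Python versions warn about it.)
mangled = 'C:\\windows\temp\readme.txt'

# Raw literal: every backslash stays a backslash.
intended = r'C:\windows\temp\readme.txt'

print('\t' in mangled, '\r' in mangled)     # control characters crept in
print('\t' in intended, '\r' in intended)   # the raw version has none
print(len(mangled), len(intended))          # two escapes each cost one character

# Parsing the raw path yields the components the author meant.
print(PureWindowsPath(intended).parts)
```

The mangled string is a perfectly valid Python string, which is why the mistake only surfaces later as a "No such file or directory" error.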

I hope this article has helped you with your Python programming.

