Python regular expressions tutorial, part 1: the basics

  • 2020-05-26 09:33:39
  • OfStack

Preface

Someone raised a requirement a while ago that I think is best solved with regular expressions. Every time I used regular expressions before, I had to cram at the last minute, so this time I studied them systematically while completing the task. My main reference is the video Regular Expressions from PyCon 2016.

I'll summarize regular expressions over several articles.

Here is part 1, the basics:

Basics

This is a summary of the most basic uses of regular expressions. Most of them are familiar to me (and to most programmers), so I'll go through them quickly and give only a few examples.

. any character except a newline

^ the beginning of a line

$ the end of a line

[abcd] one of the characters of abcd

[^abcd] any character other than abcd

[a-d] is equivalent to [abcd]

[a-dz] is equivalent to [abcdz]

\b word boundary

\w a letter, digit, or underscore; equivalent to [a-zA-Z0-9_] for ASCII text

\W is the opposite of \w

\d a digit; equivalent to [0-9] for ASCII text

\D is the opposite of \d

\s a whitespace character; equivalent to [ \t\n\r\f\v] for ASCII text

\S is the opposite of \s

{5} the preceding element (written ~ below) occurs exactly 5 times

{2,5} ~ occurs 2 to 5 times

{2,} ~ occurs 2 or more times

{,5} ~ occurs 0 to 5 times

* ~ occurs 0 or more times

? ~ occurs 0 or 1 times

+ ~ occurs once or more times

ABC|DEF matches ABC or DEF

\ escapes a special character, e.g. \* matches a literal * and \$ matches a literal $
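As a quick check of the list above, here is a minimal sketch that exercises a few of the entries with Python's re module. The variable names and sample strings are only illustrative, not part of the original notes.

```python
import re

# Anchors match positions, not characters.
starts_with_py = re.search(r'^Py', 'Python') is not None
ends_with_on = re.search(r'on$', 'Python') is not None

# Character classes: a set with a range, and a negated set.
in_set = re.findall(r'[a-dz]', 'abcxyz')
not_in_set = re.findall(r'[^abcd]', 'abcxyz')

# Shorthand classes.
digits = re.findall(r'\d', 'a1b22')
words = re.findall(r'\w+', 'foo_bar baz-9')

# Quantifiers apply to the element just before them.
exactly_five = re.fullmatch(r'\d{5}', '12345') is not None
too_few = re.fullmatch(r'a{2,5}', 'a') is not None
up_to_five = re.fullmatch(r'a{,5}', '') is not None
optional_u = re.findall(r'colou?r', 'color colour')

# Alternation.
either = re.findall(r'ABC|DEF', 'xABCyDEFz')

print(in_set, not_in_set, digits, words, optional_u, either)
```

re.fullmatch requires the whole string to match, which makes it convenient for checking quantifier counts.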

\b and \ are illustrated with the following examples:

\b:


>>> import re
>>> re.search(r'\bhello\b', 'hello')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search(r'\bhello\b', 'hello world')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search(r'\bhello\b', 'hello,world')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search(r'\bhello\b', 'hello_world') 
>>> 

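In fact, \b behaves roughly like \W, but \b is zero-width: it matches a position rather than a character, so it also matches at the start and end of the string, while \W must consume an actual non-word character and cannot. The last example finds nothing because _ is a word character, so there is no boundary between hello and _.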
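The difference between \b and \W can be seen in a small sketch like the following. The variable names and test strings are made up for illustration.

```python
import re

# \b matches the empty position at the start and end of the string.
boundary = re.search(r'\bhello\b', 'hello')

# \W needs a real non-word character on each side, so this finds nothing.
needs_chars = re.search(r'\Whello\W', 'hello')

# With surrounding characters, \W matches and consumes them.
with_chars = re.search(r'\Whello\W', ' hello,')

boundary_text = boundary.group()
with_chars_text = with_chars.group()
print(repr(boundary_text), needs_chars, repr(with_chars_text))
```

Note that the \W version includes the surrounding space and comma in the match, whereas \b leaves them out.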

\:


>>> re.search(r'\$100', '$100')
<_sre.SRE_Match object; span=(0, 4), match='$100'>
>>> re.search(r'$100', '$100') 
>>> 

To match characters that have special meanings in regular expressions, such as $, ^, *, etc., you need to escape with \.
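When the text to match comes from a variable, escaping by hand is error-prone. One option, not mentioned in the original notes, is the standard library function re.escape, which adds the backslashes for you. A minimal sketch, with illustrative names and sample text:

```python
import re

price = '$100'
text = 'cost: $100'

# re.escape puts a backslash in front of the special character $.
escaped_pattern = re.escape(price)
literal_match = re.search(escaped_pattern, text)

# Unescaped, $ means "end of line", so '$100' can never match.
unescaped_match = re.search(price, text)

# Escaping by hand works the same way for a single character.
star_match = re.search(r'\*', '2*3')

print(escaped_pattern, literal_match.group(), unescaped_match, star_match.group())
```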

raw string:

In addition, in the previous examples the pattern string was prefixed with r, which marks it as a raw string: the Python interpreter does not process backslash escapes inside it. The backslash has a special meaning both in Python string literals and in regular expressions, so without a raw string you need four backslashes to match one literal \. The Python interpreter first turns each \\ pair into a single \, leaving two backslashes, and the regular expression engine then reads those two as one escaped, literal \. For example:


>>> re.search(r'\bhello\b', 'hello')
<_sre.SRE_Match object; span=(0, 5), match='hello'>
>>> re.search('\bhello\b', 'hello') 
>>> re.search('\\bhello\\b', 'hello')
<_sre.SRE_Match object; span=(0, 5), match='hello'>

>>> re.search('\\\\hello\\\\', '\\hello\\') 
<_sre.SRE_Match object; span=(0, 7), match='\\hello\\'>
>>> re.search(r'\\hello\\', '\\hello\\') 
<_sre.SRE_Match object; span=(0, 7), match='\\hello\\'>
>>> print('\\hello\\')
\hello\
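To see why the non-raw '\bhello\b' pattern fails, it helps to look at what the strings actually contain. The following sketch compares lengths; the variable names are illustrative only.

```python
import re

plain = '\b'       # Python turns this into one character: backspace
raw = r'\b'        # two characters: a backslash and the letter b
doubled = '\\b'    # the same two characters, written without r

four_slashes = '\\\\'   # after Python's escaping: two backslashes
raw_two = r'\\'         # also two backslashes

# As a regex, two backslashes match one literal backslash.
single = re.search(four_slashes, 'a\\b')

print(len(plain), len(raw), raw == doubled, four_slashes == raw_two, len(single.group()))
```

So '\bhello\b' asks the regex engine to find a backspace character on each side of hello, which is why it returns nothing.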

Conclusion

That's all for the basics of Python regular expressions. Some special cases call for more advanced usage, which the following articles will cover. I hope this article is of some help in your study or work. If you have any questions, you can leave a message.

