How to match version information with Python regular expressions

  • 2021-08-21 20:49:37
  • OfStack

Problem description:

Use regular expressions to extract version numbers from text, for example: 10.1.1, 9.5, 10.10.11

Read the input from a text file (.txt) and write the matches to another text file (.txt).

First, construct the regular expression. Writing I for an integer, a version number has this shape:

I.(I.)*I

Written as a regular expression: r'\d+\.(?:\d+\.)*\d+'


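import re
# An integer, a dot, zero or more "integer + dot" pairs, then a final integer
pattern = r'\d+\.(?:\d+\.)*\d+'
# Read the whole input file; the with block closes it automatically
with open("F:\\xxxxxx\\banners.txt", "r") as f:
    data = f.read()
result = re.findall(pattern, data)
# Write one match per line to the output file
with open("F:\\xxxxxx\\test1.txt", "w") as f1:
    for i in result:
        f1.write(i + '\n')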
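The pattern can also be tried without any files. The following is a minimal sketch, assuming the sample string from the problem description; the names version_re, sample and versions are only illustrative. It spells the same pattern out piece by piece using re.VERBOSE.

```python
import re

# Same pattern as above, written with re.VERBOSE so each piece can be commented.
version_re = re.compile(r"""
    \d+          # first integer: one or more digits
    \.           # a literal dot
    (?:\d+\.)*   # zero or more further integer-plus-dot pairs (non-capturing)
    \d+          # final integer
""", re.VERBOSE)

# Sample text taken from the problem description (illustrative only).
sample = "for example: 10.1.1 9.5 10.10.11"

versions = version_re.findall(sample)
print(versions)
```

Because the group is non-capturing, findall returns each whole version string.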

(1) re.match only matches at the beginning of the string. If the beginning of the string does not conform to the regular expression, the match fails and the function returns None; re.search scans the whole string and returns the first match it finds.

(2) re.findall returns all non-overlapping matches as a list.

(3) group() can be given a number, for example group(1), to retrieve the text captured by a specific pair of parentheses.

(4) \d matches a single digit; + means one or more repetitions, so \d+ matches an integer.

(5) * means zero or more repetitions, so (I.)* means zero or more occurrences of "integer followed by a dot".

(6) '(?:...)' is a non-capturing group.
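Notes (1) to (3) can be checked with a short sketch. The sample string and the variable names below are made up for illustration and are not part of the original script.

```python
import re

pattern = r'\d+\.(?:\d+\.)*\d+'
text = "ios version 12.2 and 10.1.1"   # illustrative sample

# (1) match only looks at the start of the string, so it fails here.
no_match = re.match(pattern, text)
# (1) search scans forward and stops at the first match.
first = re.search(pattern, text).group()
# (2) findall collects every non-overlapping match.
everything = re.findall(pattern, text)

# (3) group(n) returns the text captured by the n-th pair of parentheses;
# group(0) is the whole match.
m = re.search(r'(\d+)\.(\d+)', text)
whole, major, minor = m.group(0), m.group(1), m.group(2)

print(no_match, first, everything, whole, major, minor)
```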

When you want to treat part of a pattern as a single unit, for example to specify how many times it repeats, surround that part with '(?:' and ')' rather than a plain pair of parentheses. With plain parentheses, re.findall returns results you probably do not expect.

Example: match words made up only of repeated 'ab' in a string


>>> s='ababab abbabb aabaab'

>>> re.findall( r'\b(?:ab)+\b' , s )

Results: ['ababab']

If you use only a plain pair of parentheses, see what happens:


>>> re.findall( r'\b(ab)+\b' , s )

Results: ['ab']

This is because a plain pair of parentheses creates a capturing group. When a pattern contains a capturing group, re.findall returns the text captured by the group (here 'ab', the last repetition) instead of the whole match.
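The same effect shows up with the version pattern. This is a small sketch on an invented sample string: it compares a capturing group with a non-capturing one, and shows that re.finditer with group(0) still yields whole matches even when the pattern contains a capturing group.

```python
import re

text = "versions: 10.1.1 and 9.5"   # illustrative sample

# Capturing group: findall returns only what the group captured
# (the last repetition, or an empty string if the group never matched).
captured = re.findall(r'\d+\.(\d+\.)*\d+', text)

# Non-capturing group: findall returns the whole match.
whole = re.findall(r'\d+\.(?:\d+\.)*\d+', text)

# Alternative: keep the capturing group but ask each match object for group(0).
via_iter = [m.group(0) for m in re.finditer(r'\d+\.(\d+\.)*\d+', text)]

print(captured)
print(whole)
print(via_iter)
```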

Debugging process:


import re
#pattern = r'.*?(\d.*\d).*'
#pattern = r'\d\.\d\.\d'
#pattern = r'\d\.(?:\d\.)*\d'
#pattern = r'\d*\.(?:\d\.)*\d*'
#pattern = r'\d\.(\d\.)*\d'
pattern = r'\d+\.(?:\d+\.)*\d+'
f=open("F:\\shovat\\banners.txt","r")
data=f.read()
##data=f.readline()
f.close()
#for line in data:
result=re.findall(pattern,data)
##print(result)
 # print(result)
 # print(result.group())
#t=(result.group())
 #t=(result.group(1))

f1=open("F:\\shovat\\test1.txt","w")
for i in result:
 f1.write(i+'\n')
f1.close()
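To see why the earlier attempts were dropped, some of the commented-out patterns can be compared on one short string. This is only a sketch: the sample text and the dictionary labels are invented for illustration.

```python
import re

sample = "2.2.2 12.2 10.1.1 a.b"   # illustrative sample

candidates = {
    "three single digits": r'\d\.\d\.\d',
    "single digits, any length": r'\d\.(?:\d\.)*\d',
    "digits optional": r'\d*\.(?:\d\.)*\d*',
    "final pattern": r'\d+\.(?:\d+\.)*\d+',
}

# Run every candidate against the same text and keep the results by label.
results = {label: re.findall(pat, sample) for label, pat in candidates.items()}

for label, found in results.items():
    print(label, found)
```

The single-digit patterns cut multi-digit numbers short (10.1.1 becomes 0.1.1), and making the digits optional lets a lone dot count as a match. Only the final pattern, with \d+ everywhere, returns the three version numbers intact.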

banners.txt


ddd 1.1.1cisco ios software , c3750 software (c3750-ipbase-m),version
ddd 2.2.2 12.2(53)se,release softeware(fc2) 10.1.1 
ddd 3.3.3 technical support:http://www.cisco.com/techsupport
ddd 4.4.4 copyright (c) 1986-2009 by cisco systems,inc.
ddd 5.5.5 comiled sun 13-dec-09 16:25 by prod_rel_team
9.5

Recognition result:

test1.txt

1.1.1
2.2.2
12.2
10.1.1
3.3.3
4.4.4
5.5.5
9.5
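The result above can be reproduced without touching the file system. The sketch below assumes the banner text is held in a string (here called banners, an illustrative name) and runs the pattern line by line, which also shows which input line each match came from.

```python
import re

pattern = re.compile(r'\d+\.(?:\d+\.)*\d+')

# Contents of banners.txt as listed above, kept in memory for this sketch.
banners = """\
ddd 1.1.1cisco ios software , c3750 software (c3750-ipbase-m),version
ddd 2.2.2 12.2(53)se,release softeware(fc2) 10.1.1
ddd 3.3.3 technical support:http://www.cisco.com/techsupport
ddd 4.4.4 copyright (c) 1986-2009 by cisco systems,inc.
ddd 5.5.5 comiled sun 13-dec-09 16:25 by prod_rel_team
9.5
"""

# One list of matches per input line.
per_line = [pattern.findall(line) for line in banners.splitlines()]
for number, found in enumerate(per_line, start=1):
    print(number, found)

# Flattened, this is the same sequence that the script writes to test1.txt.
flat = [v for found in per_line for v in found]
```

Note that 12.2 comes from the IOS release string 12.2(53)se on the second line: the pattern stops at the opening parenthesis.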

