Example Code for Filtering Chinese and English Punctuation Marks in Python

  • 2021-07-18 08:26:52
  • OfStack

As shown below:


import re
 
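# r1 also strips ASCII letters and digits. It cannot filter backslashes
# (the [\\] below reaches the regex as an escaped ]), full-width parentheses （）, or Chinese dashes.
# Users can also customize the filtered characters here.
r1 = '[a-zA-Z0-9’!"#$%&\'()*+,-./:;<=>?@，。?★、…【】《》？“”‘’！[\\]^_`{|}~]+'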
 
 
# This rule does not filter everything either (\u2014 is the Chinese dash)
r2 = r"[\s+\.\!\/_,$%^*(+\"\']+|[+\u2014！，。？、~@#￥%…&*（）]+"
 
 
# The escaped backslash (\\) removes single and double backslashes; / removes single and double forward slashes.
# English symbols go in the first bracket and Chinese symbols in the second.
# The | between the two brackets must not be omitted, otherwise the filtering is incomplete.
# The hyphen is escaped so that #-| is not read as a range that would also match letters and digits.
r3 = r"[.!/_,$&%^*()<>+\"'?@#\-|:~{}]+|[\u2014！\\，。=？、：“”‘’《》【】￥…（）]+"
 
 
# Remove brackets such as 【】 and 《》 (and #...# spans) together with everything inside them
r4 = r"【.*?】+|《.*?》+|#.*?#+|[.!/_,$&%^*()<>+\"'?@|:~{}#]+|[\u2014！\\，。=？、：“”‘’￥…（）《》【】]"
 
 
# Sample text mixing Chinese words with ASCII and full-width punctuation
text = r"\崔芸，\\我爱=+你！“我//”“他们” ~结/婚'对：：！这.！！_#？？（）a‘’“￥$主|义（）不错……！"
 
 
print(re.sub(r1, '', text))
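
The hand-written character classes above have to list every mark explicitly, so it is easy to miss one. As an alternative that is not part of the original article, characters can be classified by their Unicode general category using the standard-library unicodedata module. The following is only a minimal sketch: the function name strip_punctuation and its keep_symbols parameter are hypothetical, and treating the symbol categories (currency, math signs) as punctuation is an assumption that may not suit every use case.

```python
import unicodedata


def strip_punctuation(text: str, keep_symbols: bool = False) -> str:
    """Drop characters whose Unicode general category is punctuation (P*).

    Unless keep_symbols is True, symbol categories (S*) such as $, +, =, ~
    and the full-width yen sign are dropped as well.
    Hypothetical helper for illustration; not taken from the article.
    """
    dropped = ("P",) if keep_symbols else ("P", "S")
    return "".join(
        ch for ch in text
        if not unicodedata.category(ch).startswith(dropped)
    )


# Whitespace, letters, digits and CJK characters are kept.
print(strip_punctuation("你好，世界！Hello, world..."))
```

Unlike r1, this keeps ASCII letters and digits and covers both half-width and full-width marks without listing them one by one. The trade-off is less control: to preserve a particular mark, it would need to be whitelisted before the category check.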
