Example Python Code for Filtering Chinese and English Punctuation Marks
- 2021-07-18 08:26:52
- OfStack
As shown below:
import re
# r1 does not filter backslashes, the full-width parentheses （）, or the dash ——; it also strips ASCII letters and digits
r1 = '[a-zA-Z0-9’!"#$%&\'()*+,-./:;<=>?@，。?★、…【】《》？“”‘’！[\\]^_`{|}~]+'  # users can add their own characters to filter here
# This rule does not filter everything either (note that \s also strips whitespace)
r2 = r"[\s+\.\!\/_,$%^*(+\"\']+|[+——！，。？、~@#￥%……&*（）]+"
# The first bracket holds the English symbols and the second holds the Chinese symbols; the | before the second bracket is required, otherwise filtering is incomplete.
# \\ in the second bracket matches single and repeated backslashes, and / matches single and repeated forward slashes.
# The hyphen is placed last in the first bracket so that it is a literal; written as #-| it forms a range that also matches ASCII letters and digits.
r3 = r"[.!/_,$&%^*()<>+\"'?@#|:~{}-]+|[——！\\，。=？、：“”‘’《》【】￥……（）]+"
# Remove 【】, 《》 and #...# pairs together with everything inside them, then the remaining punctuation
r4 = r"【.*?】+|《.*?》+|#.*?#+|[.!/_,$&%^*()<>+\"'?@|:~{}#]+|[——！\\，。=？、：“”‘’￥……（）《》【】]"
# Chinese sample sentence mixing English and Chinese punctuation
text = r"\崔芸，\\我爱=+你！【我//“”们】~结/婚‘吧：:！这.!！_#？？（）个‘’“”￥$主|意（）不错......！"
print(re.sub(r1, '', text))