Python example: splitting sentences on all punctuation marks
- 2021-07-18 08:24:49
- OfStack
Problem
Given a paragraph made up of short sentences that may be separated by arbitrary punctuation marks, extract all of the short sentences.
Solution
Use the re.split function with a regular expression that matches every punctuation mark, so that all short sentences are split out in a single pass.
import re
# Alternation of every separator: ASCII punctuation first, then full-width (Chinese) punctuation and the space.
pattern = r',|\.|/|;|\'|`|\[|\]|<|>|\?|:|"|\{|\}|\~|!|@|#|\$|%|\^|&|\(|\)|-|=|\_|\+|，|。|、|；|‘|’|“|”|·|！| |…|（|）'
# A single b sits between each pair of separators, so a correct split yields only b's.
test_text = 'b,b.b/b;b\'b`b[b]b<b>b?b:b"b{b}b~b!b@b#b$b%b^b&b(b)b-b=b_b+b，b。b、b；b‘b’b“b”b·b！b b…b（b）b'
result_list = re.split(pattern, test_text)
print(result_list)
The output is:
['b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b', 'b']
As the output shows, every b has been extracted.
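In real text, punctuation marks often appear back to back (for example a comma followed by a space), and re.split then returns empty strings between them. The following is a minimal sketch of one way to handle that, not part of the original example: the helper name split_sentences and the PUNCTUATION set are assumptions chosen for illustration. It builds a character class with re.escape, treats a run of separators as a single split point, and drops empty pieces.

```python
import re

# Assumed separator set (ASCII and full-width punctuation plus the space);
# extend it to suit your own text.
PUNCTUATION = ',./;\'`[]<>?:"{}~!@#$%^&()-=_+，。、；‘’“”·！…（） '

# re.escape makes every character literal, so the set is safe inside [...].
# The trailing + treats a run of separators as one split point.
SPLIT_RE = re.compile('[' + re.escape(PUNCTUATION) + ']+')


def split_sentences(text):
    """Split text on runs of punctuation and drop empty pieces (hypothetical helper)."""
    return [piece for piece in SPLIT_RE.split(text) if piece]


print(split_sentences('Hello, world! 你好，世界。。。done'))
# ['Hello', 'world', '你好', '世界', 'done']
```

A character class avoids escaping each mark by hand in a long alternation, and filtering the result keeps leading or trailing punctuation from producing empty entries.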