Python method for extracting fields from a string with varying delimiters

  • 2021-01-06 00:40:11
  • OfStack

Problem: we need to extract fields from data like the sample below, which mixes more than one kind of delimiter with many extra spaces.

The original string is as follows:


'asd ff gg; asd , foo| og '

We need to remove the , ; | separators and the extra spaces, and extract:


['asd', 'ff', 'gg', 'asd', 'foo', 'og']

This kind of processing is commonly used when extracting data from logs or web pages. Generally speaking, the data of interest in such sources does not follow a strict pattern and is relatively scattered.

The processing code is as follows:


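import re

line = 'asd ff gg; asd , foo| og '
# Split on one delimiter (; , | or whitespace) plus any whitespace after it
data = re.split(r'[;,|\s]\s*', line)
# A space directly before a delimiter, or a trailing space, leaves an
# empty string. Build a new list instead of removing while iterating.
data = [item for item in data if item != '']
print(data)  # ['asd', 'ff', 'gg', 'asd', 'foo', 'og']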
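
The same result can also be obtained by matching the fields themselves rather than splitting on the separators, so no empty strings are produced in the first place. The following is a minimal sketch, assuming the only separators are ; , | and whitespace. The helper name extract_tokens and its separators parameter are hypothetical and introduced here only for illustration:

```python
import re


def extract_tokens(line, separators=';,|'):
    """Return the runs of characters that are neither separators nor whitespace.

    Hypothetical helper: `separators` lists the delimiter characters
    assumed to appear in the data, in addition to whitespace.
    """
    # Negated character class: anything except the separators and whitespace.
    # re.escape keeps characters such as | or - literal inside the class.
    pattern = '[^' + re.escape(separators) + r'\s]+'
    return re.findall(pattern, line)


print(extract_tokens('asd ff gg; asd , foo| og '))
```

Because re.findall returns only non-empty matches, leading, trailing, or repeated separators need no extra clean-up step.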
