Using Python to append a specific character to the end of each line of a file that contains Chinese characters

  • 2020-05-17 05:54:20
  • OfStack

Recently, the handoff between my programs and their data file formats got so messy that it was painful to look at: pandas could not open one of the files and kept raising an IO error. On closer inspection, many lines of the data file ended with a ", while other lines were missing it. The requirement was clear: check whether each line ends with a ", and if not, append one.

After all, what most people need is a quick solution, not the reasons behind it. The solution is as follows:


 # Read a_file.txt and write the repaired lines to b_file.txt
 with open('a_file.txt', 'r') as a, open('b_file.txt', 'w') as b:
     for line in a:
         # Remove the line ending (and surrounding whitespace) before checking
         line = line.strip()
         if not line.endswith('"'):
             line += '"'
         b.write(line + '\n')

The key to the whole process is


line = line.strip()

I had been lazy and simply left out the line above, and as a result the check for whether each line ends with a " went wrong:


if not line.endswith(r'"')
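The reason the check fails can be seen in a minimal sketch. The sample string below is hypothetical; it only assumes that a line read from a file still carries its trailing newline:

```python
# A line read from a file keeps its line ending.
raw_line = 'some text"\n'  # hypothetical line that already ends with a quote

print(raw_line.endswith('"'))          # False: the newline is the last character
print(raw_line.strip().endswith('"'))  # True: strip() removed the newline first
```

Without strip(), every line looks as if it lacks the closing quote, so a second " would be appended after the newline.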

So I tried rewriting it like this:


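for line in open(data_path+'heheda.txt', 'r'):
 # line[-1] is assumed to be the newline, so line[-2] is the last real character
 if not line[-2] == r'"':
  print(line)
  # Insert the quote just before the final character of the line
  line = line[:-1] + r'"' + line[-1:]
  print(line)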

Now the condition is if not line[-2] == r'"' , which gives the correct result for every line except the last one. As is well known, on Windows the line ending in a file is "\r\n". So when the line ending has not been removed with strip(), you have to step back one byte from the end of each line by hand to find its real last character. The last line of the file, however, usually has no line ending at all, since there is nothing left to wrap to. On that line, line[-2] lands in the middle of the last Chinese character, and the bytes \xx\xx are overwritten as \xx"\xx, so the last character is displayed incorrectly.
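The corruption of the last line can be reproduced with a short sketch. It is written for Python 3 and uses bytes to mimic byte-level indexing; the sample text and the UTF-8 encoding are assumptions, since the original file's encoding is not stated:

```python
# Hypothetical last line of the file: Chinese text with no trailing newline.
last_line = '数据'.encode('utf-8')

# Same slicing as the rewritten loop: insert a quote before the final byte.
broken = last_line[:-1] + b'"' + last_line[-1:]
print(broken)  # the quote now sits inside the last multi-byte character

try:
    broken.decode('utf-8')
except UnicodeDecodeError as exc:
    print('corrupted:', exc)

# Removing any line ending first and appending at the very end is safe.
fixed = last_line.rstrip(b'\r\n') + b'"'
print(fixed.decode('utf-8'))
```

This is why the strip()-based version is more robust: it never relies on a fixed offset from the end of the line, so it behaves the same whether or not the line has a line ending.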

