Handling Chinese text when reading and writing JSON in Python

  • 2020-05-10 18:28:46
  • OfStack

Today, while helping the front end prepare some data, I needed to convert the data to JSON. To be honest, dealing with Chinese text can be painful: unless you have a good understanding of how Python handles encodings, it is easy to get stuck.

Overall logic

What we need to do is take some articles, generate an HTML file for each one, and then produce a JSON file that lists each article's image, summary, and title.

Approach

To allow for future extension, the data has to live in a database. My plan was to write a simple web page as an input form that posts each entry to the back end, which saves it to the database. I then wrote a page that displays an article and, once it looked right, a script that uses requests to fetch every article page and save each one as an HTML file. For the JSON data at the end, I simply pull the rows out of the database and generate the entries from them.

The front end

The front end is actually very simple. I have been writing web pages a lot recently, so it only took a few minutes. The code is as follows:


urls.py

from django.conf.urls import url, include
from . import views


# urlpatterns must be an ordered sequence (a list), not a set
urlpatterns = [
  url(r'^$', views.index, name='index'),
  url(r'add_article/', views.add_article, name='add_article'),
  url(r'^article/(?P<main_id>\S+)/$', views.article, name='article'),
]

views.py

# coding=utf-8
from django.shortcuts import render
from .models import Tzxy

# Create your views here.


def index(request):
  return render(request, 'index.html')


def add_article(request):
  error = 'error'
  if request.method == 'POST':
    # Read the fields submitted by the front-end form
    main_id = request.POST['main_id']
    img_url = request.POST['img_url']
    title = request.POST['title']
    content = request.POST['content']
    abstract = content[:50]
    print main_id
    indb = Tzxy(
          main_id=main_id,
          img_url=img_url,
          title=title,
          content=content,
          abstract=abstract
          )
    indb.save()
    error = 'success'
    return render(request, 'index.html', {'error': error})
  return render(request, 'index.html')


def article(request, main_id):
  article_detail = Tzxy.objects.get(main_id=main_id)
  return render(request, 'views.html', {'content': article_detail})

models.py

from __future__ import unicode_literals
from django.db import models
from django.contrib import admin


class Tzxy(models.Model):
  main_id = models.CharField(max_length=10)
  img_url = models.CharField(max_length=50, null=True)
  title = models.CharField(max_length=50)
  content = models.TextField()
  abstract = models.CharField(max_length=200)

admin.site.register(Tzxy)

For the template, I just wrote a simple form.

index.html


<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Title</title>
  <link href="http://libs.baidu.com/bootstrap/3.0.3/css/bootstrap.min.css" rel="stylesheet">
  <script src="http://libs.baidu.com/jquery/2.0.0/jquery.min.js"></script>
  <script src="http://libs.baidu.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
</head>
<body>
<form method="post" action="/tzxy/add_article/">
{% csrf_token %}
main_id: <input type="text" name="main_id"><br>
img_url: <input type="text" name="img_url"><br>
title: <input type="text" name="title"><br>
{% if error == 'success' %}
  <div class="alert alert-success">{{ error }}</div>
{% endif %}
<textarea name="content" rows="25" style="width: 600px;"></textarea><br>
  <input type="submit" name="Submit">
</form>
</body>
</html>

Display page (views.html)


{% load custom_markdown %}
<!DOCTYPE html>
<html lang="zh-cn">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="initial-scale=1.0,maximum-scale=1.0,minimum-scale=1.0,user-scalable=no" />
  <meta name="apple-touch-fullscreen" content="yes" />
  <meta name="apple-mobile-web-app-capable" content="yes" />
  <meta name="format-detection" content="telephone=no">
  <meta http-equiv="Cache-Control" content="no-store" />
  <meta http-equiv="Pragma" content="no-cache" />
  <meta http-equiv="Expires" content="0" />
  <title>{{ content.title }}</title>
  <link rel="stylesheet" href="../../css/cssreset.min.css">
  <link rel="stylesheet" href="../../css/fx_tzxy_content.min.css">
</head>
<body>

  <div class="page">
    <h1>{{ content.title }}</h1>
    <div class="content">
      {{ content.content | custom_markdown | linebreaksbr }}
    </div>
  </div>

</body>
</html>

Of course, I used Markdown to process some of the content. For Markdown integration, see the article "Django development blog (6) -- add markdown support".
A small script that fetches the pages is shown below; it uses the requests module.


# coding=utf-8
import sys
import requests
reload(sys)
sys.setdefaultencoding('utf8')


def tohtml(file_name, startpos, endpos):
  """
  Request each article page and save the page source as an HTML file.
  Start the Django server before running this script.
  :param file_name: prefix of the generated file name; the number is appended to it
  :param startpos: starting number
  :param endpos: ending number (exclusive, because range() stops before it)
  :return: None
  """

  for x in range(startpos, endpos):
    r = requests.get('http://127.0.0.1:8000/tzxy/article/' + file_name + str(x))
    with open('/Users/SvenWeng/Desktop/test/' + file_name + str(x) + '.html', 'w') as f:
      f.write(r.text)
  print 'success'


if __name__ == '__main__':
  tzhtl_name = 'tzxy_tzhtl_h_'
  djjyy_name = 'tzxy_djjyy_h_'
  tohtml(djjyy_name, 1, 39)

The names used in the script can be changed as needed.
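
The script above relies on reload(sys) and sys.setdefaultencoding('utf8') so that writing r.text (a unicode string) to a file does not fail under Python 2. One possible alternative, sketched here, is to open the output file with an explicit encoding. The helper name save_html, the sample text, and the temporary directory are assumptions for illustration only; in the real script the text would come from r.text.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def save_html(text, path):
    """Write unicode text to path as UTF-8, regardless of the default encoding."""
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(text)


# Hypothetical usage with made-up content standing in for r.text.
demo_path = os.path.join(tempfile.mkdtemp(), 'tzxy_djjyy_h_1.html')
save_html(u'<h1>中文标题</h1>', demo_path)

# Read the file back to confirm the Chinese text survived the round trip.
with io.open(demo_path, encoding='utf-8') as f:
    saved = f.read()
```

With the encoding stated at the point where the file is opened, the result no longer depends on a process-wide default.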

Generate json

To be honest, JSON is easy to use and Python supports it very well, but things get a little painful once Chinese is involved. My code looks like this:


# coding=utf-8
import sqlite3
import json
import sys
reload(sys)
sys.setdefaultencoding('utf8')

list_json = []

conn = sqlite3.connect('db.sqlite3')
c = conn.cursor()
sql = 'select * from Tzxy_tzxy'
c.execute(sql)
all_thing = c.fetchall()

for x in all_thing:
  dic_member = {'id': x[1].split('_')[3],
         'img': x[2],
         'title': x[3],
         'abstract': ''}
  list_json.append(dic_member)
conn.close()

final_json = json.dumps(list_json, sort_keys=True, indent=4)
with open('test.json', 'w') as f:
  f.write(final_json)

The logic of the code is: define an empty list to hold the generated dictionaries, then read all the previously saved rows from SQLite. Loop over the rows, build a dictionary in the desired format for each one, and append it to the list. Finally, use json.dumps to convert the list to a JSON string and write it to a file.
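
The same steps can be tried without the real database. The sketch below is written for Python 3 and uses an in-memory SQLite database in place of db.sqlite3; the column layout is assumed from the model shown earlier, and the sample row is made up. It also uses sqlite3.Row so that columns are read by name rather than by position, which is a variation on the original code, not what it does.

```python
import sqlite3

# In-memory database standing in for db.sqlite3 (assumed schema, made-up row).
conn = sqlite3.connect(':memory:')
conn.row_factory = sqlite3.Row  # allows row['title'] instead of row[3]
conn.execute(
    'create table Tzxy_tzxy ('
    'id integer primary key, main_id text, img_url text, '
    'title text, content text, abstract text)'
)
conn.execute(
    'insert into Tzxy_tzxy (main_id, img_url, title, content, abstract) '
    'values (?, ?, ?, ?, ?)',
    ('tzxy_djjyy_h_1', 'img/1.png', '示例标题', '正文', '摘要'),
)

list_json = []
for row in conn.execute('select * from Tzxy_tzxy'):
    list_json.append({
        # 'tzxy_djjyy_h_1' -> ['tzxy', 'djjyy', 'h', '1'] -> '1'
        'id': row['main_id'].split('_')[3],
        'img': row['img_url'],
        'title': row['title'],
        'abstract': '',
    })
conn.close()
```

Reading columns by name keeps the loop working even if the column order of the table changes.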
The logic seemed fine and the implementation ran without problems, but when I opened the JSON file to check, all the Chinese text had been turned into \uXXXX Unicode escape sequences. That was a disaster.
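
The behavior is easy to reproduce in isolation. In this minimal sketch (made-up sample data), json.dumps is called with its default settings, and every non-ASCII character comes out as an escape sequence. The output is still valid JSON and parses back to the same text, so the problem is readability of the file, not data loss.

```python
# -*- coding: utf-8 -*-
import json

data = [{'title': u'中文'}]  # made-up sample data

# By default ensure_ascii is True, so non-ASCII characters are escaped.
escaped = json.dumps(data)
# escaped is the ASCII-only string [{"title": "\u4e2d\u6587"}]

# Parsing the escaped string restores the original characters.
restored = json.loads(escaped)
```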

I searched around a bit, but the material online was not detailed, and the examples were all very simple ones that just passed Chinese in directly, which was not what I needed. In the end I had to bite the bullet and read the official documentation, where I finally found the ensure_ascii=False parameter. Pass it when converting Python data to JSON, like this:


final_json = json.dumps(list_json, sort_keys=True, indent=4, ensure_ascii=False)

With this change, the Chinese text is written to the file as normal characters.
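
For completeness, here is a small end-to-end sketch of writing and then reading the file back. It is written for Python 3 (unlike the Python 2 code in this article), the sample data and the temporary path are made up, and the file is opened with an explicit UTF-8 encoding so the result does not depend on the system default.

```python
# -*- coding: utf-8 -*-
import io
import json
import os
import tempfile

list_json = [{'id': '1', 'title': '中文标题'}]  # made-up sample data

# ensure_ascii=False keeps the Chinese characters as they are.
final_json = json.dumps(list_json, sort_keys=True, indent=4, ensure_ascii=False)

path = os.path.join(tempfile.mkdtemp(), 'test.json')
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(final_json)

# Read the raw text back and parse it again.
with io.open(path, encoding='utf-8') as f:
    raw_text = f.read()
loaded = json.loads(raw_text)
```

Reading works the same way whether or not the file was written with escapes; json.loads accepts both forms.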

