Summary of JSON serialization for special characters

  • 2020-06-19 10:32:02
  • OfStack

Preface

Strings in JSON data may contain special characters that the serializer has to handle when the data is passed between systems. This article compares how Go, Rust, Java, and Python serialize such characters to JSON.

Let's start with Go:


package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Three of the four values contain HTML-sensitive characters.
	data := map[string]string{
		"str0": "Hello, world",
		"str1": "<",
		"str2": ">",
		"str3": "&",
	}
	jsonStr, err := json.Marshal(data)
	if err != nil {
		panic(err)
	}

	fmt.Println(string(jsonStr))
}

Output:

{"str0":"Hello, world","str1":"\u003c","str2":"\u003e","str3":"\u0026"}

Next, Rust:


extern crate rustc_serialize;
use rustc_serialize::json;
use std::collections::HashMap;

fn main() {
    // Same four key/value pairs as in the Go example.
    let mut data = HashMap::new();
    data.insert("str0", "Hello, world");
    data.insert("str1", "<");
    data.insert("str2", ">");
    data.insert("str3", "&");
    println!("{}", json::encode(&data).unwrap());
}

Output:

{"str0":"Hello, world","str2":">","str1":"<","str3":"&"}

Now Python:


import json

data = dict(str0='Hello, world',str1='<',str2='>',str3='&')

print(json.dumps(data))

Output:

{"str0": "Hello, world", "str1": "<", "str2": ">", "str3": "&"}

And Java:


import org.json.simple.JSONObject;

class JsonDemo
{
    public static void main(String[] args)
    {
        JSONObject obj = new JSONObject();

        obj.put("str0", "Hello, world");
        obj.put("str1", "<");
        obj.put("str2", ">");
        obj.put("str3", "&");

        // Printing the object serializes it to a JSON string.
        System.out.println(obj);
    }
}

Output:

{"str3":"&","str1":"<","str2":">","str0":"Hello, world"}

Conclusion

As you can see, Python, Rust, and Java serialize the four strings almost identically (apart from the key order in the Rust and Java output), leaving <, > and & as literal characters. Only Go escapes <, > and & to \u003c, \u003e and \u0026. So what does the documentation say?

// String values encode as JSON strings coerced to valid UTF-8,
// replacing invalid bytes with the Unicode replacement rune.
// The angle brackets "<" and ">" are escaped to "\u003c" and "\u003e"
// to keep some browsers from misinterpreting JSON output as HTML.
// Ampersand "&" is also escaped to "\u0026" for the same reason.

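So <, > and & are all escaped for the same reason: to keep some browsers from misinterpreting the JSON output as HTML.

The "invalid bytes" sentence in the documentation is a separate rule: bytes that are not valid UTF-8 are replaced with the Unicode replacement rune. It does not apply to <, > or &. The escaped and unescaped forms are both valid JSON and decode to the same strings.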
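If a consumer expects the literal characters, the escaping can be switched off on the Go side. Below is a minimal sketch using json.Encoder and its SetEscapeHTML method from encoding/json; the helper name marshalNoHTMLEscape is made up for this example and is not part of the standard library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// marshalNoHTMLEscape is a hypothetical helper (not part of encoding/json).
// It encodes v with HTML escaping switched off, so <, > and & stay literal.
func marshalNoHTMLEscape(v interface{}) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	// Encode appends a trailing newline; trim it to match json.Marshal.
	return string(bytes.TrimRight(buf.Bytes(), "\n")), nil
}

func main() {
	data := map[string]string{
		"str0": "Hello, world",
		"str1": "<",
		"str2": ">",
		"str3": "&",
	}
	out, err := marshalNoHTMLEscape(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"str0":"Hello, world","str1":"<","str2":">","str3":"&"}
}
```

Only do this when the JSON is not going to be embedded in an HTML page, since that is the case the default escaping protects against.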

This is not a problem if the whole technology stack is Go, but in cross-language or cross-team collaboration you need to keep this difference in mind.
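One practical consequence for cross-language work: compare JSON documents by their decoded values, not by their raw bytes. The sketch below assumes both sides exchange plain JSON objects; sameJSON is a hypothetical helper written for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// sameJSON is a hypothetical helper: it reports whether two JSON documents
// decode to equal values, ignoring escaping style and key order.
func sameJSON(a, b []byte) (bool, error) {
	var va, vb interface{}
	if err := json.Unmarshal(a, &va); err != nil {
		return false, err
	}
	if err := json.Unmarshal(b, &vb); err != nil {
		return false, err
	}
	return reflect.DeepEqual(va, vb), nil
}

func main() {
	// Escaped form, as produced by Go's json.Marshal.
	escaped := []byte(`{"str0":"Hello, world","str1":"\u003c","str2":"\u003e","str3":"\u0026"}`)
	// Literal form with a different key order, as another language might emit.
	literal := []byte(`{"str3":"&","str1":"<","str2":">","str0":"Hello, world"}`)

	equal, err := sameJSON(escaped, literal)
	if err != nil {
		panic(err)
	}
	fmt.Println(equal) // true
}
```

A byte-for-byte comparison, a checksum, or a signature computed over the serialized text would treat these two documents as different, which is where the mismatch between Go and the other languages tends to surface.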

