Android: fixing garbled UTF-8 text in POST data

  • 2020-06-01 10:57:16
  • OfStack

Today I ran into a bug: a piece of data POSTed from the client to the server caused an unknown exception on the server side. Checking on the server showed that it was an encoding conversion error. I therefore captured the network packets for analysis and found that the JSON data POSTed by the client contained the following bytes (in hex):

... 61 64 20 b7 20 52 69 63 ...

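The problem is the b7. According to the Unicode code charts, U+00B7 is MIDDLE DOT, and its UTF-8 encoding should be the two bytes c2 b7. A lone b7 is a UTF-8 continuation byte with no lead byte in front of it, so it is not valid UTF-8 on its own. Why, then, did the character become a single b7 in the data sent by the client?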
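The observation above can be reproduced with the JDK alone. The following is a minimal sketch, not the actual server code: the class name StrictUtf8Check and the helper isValidUtf8 are hypothetical, and it assumes the server decodes strictly (reporting malformed input instead of silently replacing it).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8Check {
    // Hypothetical helper: returns true only if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The bytes seen in the packet capture: 61 64 20 b7 20 52 69 63
        byte[] captured = {0x61, 0x64, 0x20, (byte) 0xb7, 0x20, 0x52, 0x69, 0x63};
        // The same text with MIDDLE DOT encoded as c2 b7
        byte[] expected = {0x61, 0x64, 0x20, (byte) 0xc2, (byte) 0xb7, 0x20, 0x52, 0x69, 0x63};

        System.out.println(isValidUtf8(captured)); // false
        System.out.println(isValidUtf8(expected)); // true
    }
}
```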

Since the system uses the ormlite, gson, and async-http libraries, I checked them one by one. It turned out that no text encoding was specified when the data was sent to the server, so async-http (actually the Apache HttpClient underneath it) sent the data as ISO-8859-1. In that encoding U+00B7 is the single byte b7, and the error occurred when the server tried to decode the data as UTF-8.
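The difference between the two encodings can be seen without any HTTP library. The sketch below is only an illustration: the class name MiddleDotBytes and the helper hex are hypothetical, and the sample string "ad · Ric" is simply the text that the captured bytes spell out.

```java
import java.nio.charset.StandardCharsets;

public class MiddleDotBytes {
    // Hypothetical helper: formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "ad \u00b7 Ric"; // contains U+00B7 MIDDLE DOT

        // ISO-8859-1 maps U+00B7 to the single byte b7 (what the client sent).
        System.out.println(hex(text.getBytes(StandardCharsets.ISO_8859_1)));
        // prints: 61 64 20 b7 20 52 69 63

        // UTF-8 maps U+00B7 to the two bytes c2 b7 (what the server expected).
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));
        // prints: 61 64 20 c2 b7 20 52 69 63
    }
}
```

Passing the charset explicitly, as in getBytes(StandardCharsets.UTF_8), avoids relying on any library or platform default.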

The code snippet that went wrong is as follows:


Gson gson = new Gson();
String json = gson.toJson(data);
StringEntity entity = new StringEntity(json); // no charset given, so the body is encoded as ISO-8859-1
httpClient.post(context, url, entity, "application/json", new TextHttpResponseHandler() ... );

Line 3, new StringEntity(json), does not specify an encoding, which causes the error. The corrected code passes the charset to StringEntity and also declares it in the Content-Type:

Gson gson = new Gson();
String json = gson.toJson(data);
StringEntity entity = new StringEntity(json, "utf-8"); // encode the body as UTF-8 explicitly
httpClient.post(context, url, entity, "application/json;charset=utf-8", new TextHttpResponseHandler() ... );

