The ultimate solution to the garbled Chinese text problem on Linux

  • 2020-05-12 06:44:35
  • OfStack

Programmers who are new to Linux often run into garbled Chinese text, and many people give up on Linux because of it. To get to the point, let's look at some specific solutions to the garbled text problem on Linux.

Method 1: modify the /root/.bash_profile file and add export LANG=zh_CN.GB18030

This file lives in each user's home directory, so it must be modified for other users as well.

With this method, PuTTY displays Chinese, but the desktop stays in English and Chinese web pages are still garbled.

Method 2:
Modify the /etc/sysconfig/i18n file:


#LANG="en_US.UTF-8"
#SUPPORTED="en_US.UTF-8:en_US:en"
#SYSFONT="latarcyrheb-sun16"

Change it to:


LANG="zh_CN.GB18030"
LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN"
SUPPORTED="zh_CN.GB18030:zh_CN:zh"
SYSFONT="lat0-sun16"
SYSFONTACM="8859-15"

Reference:

The garbled Chinese text problem on Linux

Recently, my company ran into garbled Chinese text when transferring data between a Windows XP system and Linux!

First, character sets.

Chinese character encodings:

* GB2312 is a Simplified Chinese character set, also written GB2312(80), containing 6763 Simplified Chinese characters.
* BIG5 is the Traditional Chinese character set used in Taiwan, containing 13053 Traditional Chinese characters.
* GBK is a Simplified Chinese character set that includes the GB character set, the BIG5 character set, and some symbols. It contains 21003 characters in total.
* GB18030 is a mandatory large character set standard issued by the state, whose full name is GB 18030-2000. The introduction of GB 18030-2000 gave Chinese character sets a single "grand unified" standard.

ASCII:

American Standard Code for Information Interchange. It is the most widely used character set and encoding in computers today. It was developed by the American National Standards Institute (ANSI) and has been adopted as an international standard by the International Organization for Standardization (ISO), known as ISO 646. The ASCII character set consists of control characters and graphic characters. In memory, one ASCII code occupies one byte (eight binary bits), and its highest bit (b7) can be used as a parity bit. A parity check is a method for detecting errors that occur while a code is being transmitted. There are two kinds: odd parity and even parity.

Odd parity: a correct code must contain an odd number of 1s in its byte. If the count is not odd, the highest bit, b7, is set to 1.

Even parity: a correct code must contain an even number of 1s in its byte. If the count is not even, the highest bit, b7, is set to 1.
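
The two parity rules can be made concrete with a little shell arithmetic. This is only a sketch for illustration: the function name parity_bit is made up for this example, and it assumes the input is a 7-bit ASCII value (0 to 127).

```shell
# parity_bit CODE MODE
# CODE is a 7-bit ASCII value (0-127); MODE is "odd" or "even".
# Prints the value (0 or 1) that b7 must take so that the whole byte
# contains an odd (or even) number of 1 bits.
# (parity_bit is a hypothetical helper, not a standard command.)
parity_bit() {
    code=$1
    mode=$2
    ones=0
    n=$code
    # Count the 1 bits in the 7 data bits
    while [ "$n" -gt 0 ]; do
        ones=$(( ones + (n & 1) ))
        n=$(( n >> 1 ))
    done
    if [ "$mode" = "odd" ]; then
        # Count already odd: b7 stays 0. Count even: b7 becomes 1.
        echo $(( (ones + 1) % 2 ))
    else
        # Count already even: b7 stays 0. Count odd: b7 becomes 1.
        echo $(( ones % 2 ))
    fi
}

# 'A' is 65, binary 1000001, which has two 1 bits
parity_bit 65 odd    # prints 1
parity_bit 65 even   # prints 0
```

For 'C' (67, binary 1000011, three 1 bits) the results are reversed: odd parity leaves b7 at 0 and even parity sets it to 1.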

UTF:
How Unicode assigns codes and how those codes are stored are two different things. The Unicode code of a character is fixed, but in actual transmission the way that code is stored differs, because system platforms are designed differently and because of the wish to save space. These storage schemes are called Unicode Transformation Formats (UTF for short).

* UTF-8: a variable-length encoding built from 8-bit units. The most commonly used characters (the ASCII characters, 0 to 127) take a single byte, while other common characters (notably Korean and Chinese characters) take 3 bytes.
* UTF-16: an encoding built from 16-bit units. It is also variable-length and covers values from 0 to 0x10FFFF, roughly a 20-bit range. It is essentially a direct implementation of the Unicode codes, and its byte layout depends on the CPU's byte order.
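
The byte counts mentioned for UTF-8 and UTF-16 can be checked from the shell. The sketch below assumes an iconv command that knows UTF-8 and UTF-16BE is installed. The characters are written as octal byte escapes so the result does not depend on the encoding of the script file, and count_bytes is a helper name invented for this example.

```shell
# Helper: count the bytes arriving on standard input
# (tr strips the padding that some wc implementations add)
count_bytes() { wc -c | tr -d ' '; }

# 'A' (U+0041): a single byte in UTF-8
ascii_utf8=$(printf 'A' | count_bytes)

# U+4E2D, a common Chinese character, given as its UTF-8 bytes E4 B8 AD
han_utf8=$(printf '\344\270\255' | count_bytes)
han_utf16=$(printf '\344\270\255' | iconv -f UTF-8 -t UTF-16BE | count_bytes)

# U+20000, a rare character beyond the 16-bit range (UTF-8 bytes F0 A0 80 80)
rare_utf8=$(printf '\360\240\200\200' | count_bytes)
rare_utf16=$(printf '\360\240\200\200' | iconv -f UTF-8 -t UTF-16BE | count_bytes)

echo "A:       UTF-8 $ascii_utf8 byte"
echo "U+4E2D:  UTF-8 $han_utf8 bytes, UTF-16 $han_utf16 bytes"
echo "U+20000: UTF-8 $rare_utf8 bytes, UTF-16 $rare_utf16 bytes"
```

The last line is where the variable length of UTF-16 shows: a character above 0xFFFF needs two 16-bit units, so it takes 4 bytes in both encodings.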

Note: ASCII char (2); UTF-8 wide characters wchar 4 times. UTF-8 is the most compatible encoding. GBK and GB2312 are, after all, Chinese domestic standards, whereas UTF-8 is the common language of the coding world, which matters once you use a lot of foreign open source software.

Linux uses locales to define the language environment a program runs in. Locales are supported by ANSI C. A locale is named <language>_<region>.<character set encoding>. In zh_CN.UTF-8, for example, zh stands for Chinese, CN for mainland China, and UTF-8 for the character set.
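
The naming rule can be taken apart with plain shell parameter expansion. This is a minimal sketch: split_locale is a name invented here, and it only handles the full <language>_<region>.<character set> form, not short names such as C or zh_CN.

```shell
# split_locale NAME
# Prints the language, region and character set of a full locale name.
# (split_locale is a hypothetical helper for illustration.)
split_locale() {
    name=$1
    language=${name%%_*}     # everything before the first "_"
    rest=${name#*_}          # everything after the first "_"
    region=${rest%%.*}       # up to the first "."
    charset=${name#*.}       # everything after the first "."
    echo "$language $region $charset"
}

split_locale zh_CN.UTF-8     # prints: zh CN UTF-8
split_locale zh_CN.GB18030   # prints: zh CN GB18030
```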

In the locale environment, a set of variables represents the different settings of the internationalization environment:

1. LC_COLLATE
Define the sorting and comparison rules for the environment

2. LC_CTYPE
Used for character classification and string processing. It controls how all characters are handled, including the character encoding, whether a character is single-byte or multi-byte, how it is printed, and so on. It is one of the most important of these variables.

3. LC_MONETARY
Currency format

4. LC_NUMERIC
Display format for non-monetary numbers

5. LC_TIME
Time and date format

6. LC_MESSAGES
The language of messages. There is also a LANGUAGE parameter, which is similar to LC_MESSAGES, but once LANGUAGE is set, LC_MESSAGES no longer takes effect. LANGUAGE can list several languages at once, for example LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN".

7. LANG
The default value for the LC_* variables. It is the lowest-priority setting and is used for any LC_* variable that is not set. In that respect it is similar to LC_ALL.

8. LC_ALL
It is a macro: if it is set, its value overrides every LC_* setting. Note that the value of LANG itself is not affected by it.
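
The priority among LC_ALL, the individual LC_* variables, and LANG can be written as a small lookup. The sketch below is an illustration, not how the C library is implemented: resolve_locale is a hypothetical helper that takes the three values as arguments, and it falls back to POSIX when none of them is set.

```shell
# resolve_locale LC_ALL_VALUE LC_X_VALUE LANG_VALUE
# Prints the locale that ends up in effect for one category:
# LC_ALL wins, then the category's own LC_* variable, then LANG.
# (resolve_locale is a hypothetical helper for illustration.)
resolve_locale() {
    lc_all_value=$1
    lc_x_value=$2
    lang_value=$3
    if [ -n "$lc_all_value" ]; then
        echo "$lc_all_value"
    elif [ -n "$lc_x_value" ]; then
        echo "$lc_x_value"
    elif [ -n "$lang_value" ]; then
        echo "$lang_value"
    else
        echo "POSIX"
    fi
}

# What is in effect for LC_CTYPE in the current shell?
resolve_locale "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"
```

For example, resolve_locale "zh_CN.GBK" "en_US.UTF-8" "zh_CN.UTF-8" prints zh_CN.GBK, because LC_ALL overrides the other two.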

Example:

Before setting, use the default locale:

Code example:


[root@ahlinux ~]# locale
LANG="POSIX"
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

After setting, use the zh_CN.GBK Chinese locale:

Code example:


[root@ahlinux ~]# export LC_ALL=zh_CN.GBK
[root@ahlinux ~]# locale
LANG=zh_CN.UTF-8
LC_CTYPE="zh_CN.GBK"
LC_NUMERIC="zh_CN.GBK"
LC_TIME="zh_CN.GBK"
LC_COLLATE="zh_CN.GBK"
LC_MONETARY="zh_CN.GBK"
LC_MESSAGES="zh_CN.GBK"
LC_PAPER="zh_CN.GBK"
LC_NAME="zh_CN.GBK"
LC_ADDRESS="zh_CN.GBK"
LC_TELEPHONE="zh_CN.GBK"
LC_MEASUREMENT="zh_CN.GBK"
LC_IDENTIFICATION="zh_CN.GBK"
LC_ALL=zh_CN.GBK

"C" is the system default locale, and "POSIX" is the alias for "C". So when we install a new system, the default locale is either C or POSIX.
The way to install locales in Debian is as follows:

Install the locales package with the apt-get install locales command. After the package is installed, the system configures locales automatically: you only need to select the locales you want (you can choose more than one) and finally specify one of them as the system default. The system then generates the selected locales. Adding a new locale later is just as easy: reconfigure with dpkg-reconfigure locales. You can also add a locale by hand: add the new locale to the /etc/locale.gen file and run the locale-gen command to generate it. The system locale itself is set through the LC_* variables described above. Below is a sample locale.gen file.

Code example:


# This file lists locales that you wish to have built. You can find a list
# of valid supported locales at /usr/share/i18n/SUPPORTED. Other
# combinations are possible, but may not be well tested. If you change
# this file, you need to rerun locale-gen.
#
zh_CN.GBK GBK
zh_CN.UTF-8 UTF-8
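
Adding a locale by hand comes down to appending one line to locale.gen and regenerating. The sketch below shows only the first half and works on a scratch copy, so it can be tried without root. add_locale_entry is a helper name made up for this example.

```shell
# add_locale_entry FILE "NAME CHARSET"
# Appends the entry to FILE unless that exact line is already present.
# (add_locale_entry is a hypothetical helper, not part of the locales package.)
add_locale_entry() {
    file=$1
    entry=$2
    if ! grep -qxF "$entry" "$file"; then
        printf '%s\n' "$entry" >> "$file"
    fi
}

# Try it on a scratch file rather than the real /etc/locale.gen
scratch=$(mktemp)
printf '%s\n' 'zh_CN.GBK GBK' 'zh_CN.UTF-8 UTF-8' > "$scratch"

add_locale_entry "$scratch" 'zh_CN.GB18030 GB18030'   # new line is appended
add_locale_entry "$scratch" 'zh_CN.GBK GBK'           # already present, no change

cat "$scratch"
```

On a real system the same call would target /etc/locale.gen as root, followed by locale-gen to build the new locale.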

In my opinion, it is enough to understand LANG and SUPPORTED. The other variables are rarely needed.

Here's how to set environment variables.

Modify the /etc/sysconfig/i18n file, for example:

Code example:


LANG="en_US.UTF-8"    # X Window will display an English interface
LANG="zh_CN.GB18030"  # X Window will display a Chinese interface

Another method: cp /etc/sysconfig/i18n $HOME/.i18n

Modify the $HOME/.i18n file, for example:

Code example:


LANG="en_US.UTF-8"    # X Window will display an English interface
LANG="zh_CN.GB18030"  # X Window will display a Chinese interface

This allows you to change your own interface language without affecting other users.

The modified /etc/sysconfig/i18n file is as follows:

Code example:


LANG="en_US.UTF-8"
SUPPORTED="zh_CN.GB18030:zh_CN:zh:en_US.UTF-8:en_US:en"
SYSFONT="latarcyrheb-sun16"
LC_ALL="en_US.UTF-8"
export LC_ALL

Reboot after making the change, or apply it through rc.local.

Alternatively, modify the .bash_profile file of the logged-in user:

Code example:


export LANG=zh_CN.GB18030
export LANGUAGE=zh_CN.GB18030:zh_CN.GB2312:zh_CN

Be aware that Windows XP uses the GB2312 encoding. If your server's character set is different, text will probably be garbled, so adjust it.

Some people report that after they changed the system environment variables, existing user content started to display as garbled text. There are only two ways to solve this problem:

1. Convert the content to the current encoding with iconv

2. Go back to your old encoding

Either way, you first have to figure out what your original character encoding was. So it comes down to LANG, SUPPORTED, and your original character set :)

Of course, locale -a lists the locales currently supported on your system, and if the one you need is missing, install it.
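
Checking the locale -a listing by eye is error-prone, because the listing may spell the character set differently (for example utf8 instead of UTF-8). The sketch below compares names after lower-casing them and dropping hyphens. That normalization is an assumption of this example, and normalize and has_locale are made-up helper names.

```shell
# normalize NAME: lower-case it and drop hyphens, so that
# zh_CN.UTF-8 and zh_CN.utf8 compare equal
normalize() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr -d '-'
}

# has_locale NAME: succeeds if NAME appears in the output of locale -a
# (hypothetical helper for illustration)
has_locale() {
    wanted=$(normalize "$1")
    locale -a 2>/dev/null | tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF "$wanted"
}

for name in C zh_CN.GB18030 zh_CN.UTF-8; do
    if has_locale "$name"; then
        echo "$name: installed"
    else
        echo "$name: missing"
    fi
done
```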

The first two methods are very practical and I have tried them. Other methods are found on the Internet, hehe...

****************************

That is, when you read data out of the database and write it to a file on Linux, write it through a character stream with an explicit encoding. The code is as follows:

Code example:


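import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Append s to the file as UTF-8, whatever the platform default encoding is.
// try-with-resources flushes and closes the writer, which also closes the underlying stream.
try (Writer out = new OutputStreamWriter(new FileOutputStream(new File(filePath), true), "UTF-8")) {
    out.write(s);
    out.write("\n");
}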

**********************

vi .bash_profile
export LANG=zh_CN
vi /etc/sysconfig/i18n
LANG="en_US.UTF-8"
SUPPORTED="en_US.UTF-8:en_US:en:zh_CN.GB18030:zh_CN:zh:zh_TW.big5:zh_TW:zh:ja_JP.UTF-8:ja_JP:ja:ko_KR.eucKR:ko_KR:ko"
SYSFONT="latarcyrheb-sun16"

Changing only the first one does not work. The second one seems to be particularly important and must be changed.

1. Garbled text on the console terminal

Add the following as the last line of the /etc/profile file:

Code example:


LANG="zh_CN.GB18030"
LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN"
SUPPORTED="zh_CN.GB18030:zh_CN:zh"
SYSFONT="lat0-sun16"
SYSFONTACM="8859-15"

2. Garbled text in X Window terminals

Add the following as the last line of the /etc/sysconfig/i18n file:

Code example:


export LC_ALL="zh_CN.GB18030"

There are two kinds of garbled text:

1. Garbled text in the terminal (pure shell interface)

Code example:


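vi /etc/profile
# LC_ALL takes a single locale name; a colon-separated fallback list is only valid for LANGUAGE
export LC_ALL="zh_CN.GB18030"
export LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN.GBK:zh_CN:en_US.UTF-8:en_US:en:zh:zh_TW:zh_CN.BIG5"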

Save and exit, then reboot the system.

2. X Window (graphical interface)

Code example:


LANG="zh_CN.GB18030"
LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN"
SUPPORTED="zh_CN.GB18030:zh_CN:zh"
SYSFONT="lat0-sun16"
SYSFONTACM="8859-15"

Save and reboot.

I set up a new Linux virtual machine, and Chinese text came out garbled in VIM. I searched around and found the solution:


vi /etc/sysconfig/i18n

Change the content to

Code example:


LANG="zh_CN.GB18030"
LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN"
SUPPORTED="zh_CN.GB18030:zh_CN:zh"
SYSFONT="lat0-sun16"
SYSFONTACM="8859-15"

In this way, Chinese is displayed normally in SSH and telnet terminals.

The main change here is zh_CN.GB18030. Note that vi's personalized settings are kept under the root directory, so pay attention to permissions.

Every time you install Linux and connect to it over SSH, Chinese characters are displayed as garbled text.

Solution: edit /etc/sysconfig/i18n and change LANG="zh_CN.UTF-8" to LANG="zh_CN.GB2312".

Disconnect and reconnect.

Attachment 1: solving the garbled Chinese text problem on Linux.

A file copied from Windows to Linux shows up garbled. We want it to display Chinese on Linux, so what should we do? First test whether Chinese text created on Linux itself displays normally. Answer: yes. So the problem is fairly obvious: the copy from Windows cannot be displayed, which means Windows and Linux are using different encodings.
Linux generally uses UTF-8, while the file was edited on Windows in GB2312, so the Chinese text gets scrambled. The fix is actually quite simple: convert the file to UTF-8 and then import it.

Then use the following command to convert:


iconv -f gb2312 -t utf-8 test.txt > testutf8.txt

(-f is the source encoding, -t is the target encoding, test.txt is the source file, and testutf8.txt is the converted output file.)
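
The conversion can be tried end to end without touching real files. The sketch below assumes an iconv that supports GB2312, works in a temporary directory, and writes the GB2312 bytes with octal escapes so the result does not depend on the terminal's encoding. The file names follow the ones used above.

```shell
# Work in a throwaway directory so no real files are touched
workdir=$(mktemp -d)

# Bytes D6 D0 CE C4 are the GB2312 encoding of the two-character word for "Chinese"
printf '\326\320\316\304\n' > "$workdir/test.txt"

# -f names the source encoding, -t the target encoding
iconv -f GB2312 -t UTF-8 "$workdir/test.txt" > "$workdir/testutf8.txt"

# Each of the two characters takes 2 bytes in GB2312 and 3 bytes in UTF-8
gb_bytes=$(wc -c < "$workdir/test.txt" | tr -d ' ')
utf8_bytes=$(wc -c < "$workdir/testutf8.txt" | tr -d ' ')
echo "GB2312: $gb_bytes bytes, UTF-8: $utf8_bytes bytes (newline included)"
```

On a UTF-8 terminal, cat on testutf8.txt shows the two characters correctly, while cat on test.txt shows garbled text.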

Note: use iconv -l to view the encodings the system supports. Of course, you can also change the system encoding itself.

The default is UTF-8. If you want to use another encoding, such as GBK, change the configuration file manually:


shell> vi /etc/sysconfig/i18n

Change LANG="zh_CN.UTF-8" to:


LANG="zh_CN.GBK"

Save and close, and run the following command to enable the configuration:


shell> source /etc/sysconfig/i18n

To make the terminal use a Simplified Chinese character encoding:


shell> vi /etc/profile.d/chinese.sh

Add the following line:

Code example:


export LC_ALL=zh_CN.GBK

Then load it:

shell> source /etc/profile.d/chinese.sh

Attachment 2: solving garbled Chinese text in Java on Linux.

From JDK 1.5 onward, just create a fallback directory under ~/jre/lib/fonts/ and copy the fonts you want Java to use into it.

The following method was tested on FC6, assuming the JRE path is /usr/java/jdk1.6.0_03/jre/

Code example:


cd /usr/java/jdk1.6.0_03/jre/lib/fonts
sudo mkdir fallback

Copy C:\WINDOWS\Fonts\simsun.ttc into the /usr/java/jdk1.6.0_03/jre/lib/fonts/fallback folder.
Setting export LC_ALL=zh_CN.GB2312; export LANG=zh_CN.GB2312 is the most effective.

1. No matter which SSH client is used, its font must be set to one that can display Chinese.

2. The remote locale must be set to LANG=zh_CN.UTF-8

Modify /etc/profile

and add this line:


export LC_ALL=zh_CN.GBK

Attachment 3: Chinese text is garbled over SSH

1) Open /etc/sysconfig/i18n

Set to:

Code example:


LANG="zh_CN.GB2312"
LANGUAGE="zh_CN.GB18030:zh_CN.GB2312:zh_CN"
SUPPORTED="zh_CN.GB18030:zh_CN.GB2312:zh_CN.UTF-8:zh:en_US.UTF-8:en_US:en:ja_JP.UTF-8:ja_JP:ja"
SYSFONT="lat0-sun16"
SYSFONTACM="8859-15"

LANG="zh_CN.GB2312" is required (if you don't want Chinese characters to be garbled!).

Others can be changed according to their own needs.

2) Open smb.conf

Add:

Code example:


display charset=cp936
unix charset=cp936
doc charset=cp936

