Java example: matching Chinese characters with a regular expression

2020-05-19 04:50:15
OfStack

Chinese only


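// Imports needed by the test method below
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.junit.Test;

/**
 * 22. Verify Chinese characters
 * Expression: ^[\u4e00-\u9fa5]{0,}$
 * Description: Chinese characters only
 * Matching example: 中文
 */
@Test
public void a1() {
    // Read one line from standard input
    Scanner sc = new Scanner(System.in);
    String input = sc.nextLine();
    // The doubled backslashes leave the escapes for the regex engine to decode
    String regex = "^[\\u4e00-\\u9fa5]*$";
    Matcher m = Pattern.compile(regex).matcher(input);
    // matches() requires the whole input to match; prints true or false
    System.out.println(m.matches());
    sc.close();
}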
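The test above reads from the console, which makes it awkward to run automatically. A minimal sketch of the same check as a reusable method follows; the class name `ChineseOnlyCheck` and the method `isChineseOnly` are illustrative names, not part of the original code. The string literals use Unicode escapes so the file compiles under any source encoding.

```java
import java.util.regex.Pattern;

class ChineseOnlyCheck {
    // Same range as the test above, compiled once and reused.
    private static final Pattern CHINESE_ONLY = Pattern.compile("^[\\u4e00-\\u9fa5]*$");

    // Returns true when every character of input lies in U+4E00..U+9FA5.
    // Because of the * quantifier, an empty string also returns true.
    static boolean isChineseOnly(String input) {
        return CHINESE_ONLY.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isChineseOnly("\u4e2d\u6587"));    // two Chinese characters: true
        System.out.println(isChineseOnly("\u4e2d\u6587abc")); // mixed with Latin letters: false
        System.out.println(isChineseOnly(""));                // empty input: true
    }
}
```

If empty input should be rejected, replacing `*` with `+` in the pattern is one option.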

PS: there are two ways to write a regular expression in Java that matches Chinese characters: (1) use Unicode escapes; (2) use the Chinese characters directly.

Example:

(1) String str = "晴";

String regexStr = "[\u4E00-\u9FA5]";
str.matches(regexStr);

(2) String str = "晴";

String regexStr = "[一-龥]";
str.matches(regexStr);
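One detail worth seeing in code is the difference between a single and a double backslash in the pattern string. With a single backslash the Java compiler replaces the escape with the character itself, which gives the same string as typing the two Chinese characters directly; with a double backslash the escape text reaches the regex engine, which decodes it. The sketch below assumes illustrative names (`TwoNotations`, `matchesBoth`) and shows that both forms accept the same input.

```java
import java.util.regex.Pattern;

class TwoNotations {
    // One backslash: javac replaces each escape with the character itself,
    // so this string holds 5 chars: '[', U+4E00, '-', U+9FA5, ']'.
    static final String COMPILER_ESCAPED = "[\u4E00-\u9FA5]";

    // Two backslashes: the string keeps the escape text (15 chars)
    // and the regex engine decodes it when the pattern is compiled.
    static final String REGEX_ESCAPED = "[\\u4E00-\\u9FA5]";

    // True only when the whole input is accepted by both patterns.
    static boolean matchesBoth(String s) {
        return Pattern.matches(COMPILER_ESCAPED, s) && Pattern.matches(REGEX_ESCAPED, s);
    }

    public static void main(String[] args) {
        System.out.println(COMPILER_ESCAPED.length()); // 5
        System.out.println(REGEX_ESCAPED.length());    // 15
        System.out.println(matchesBoth("\u4e2d"));     // one Chinese character: true
        System.out.println(matchesBoth("a"));          // Latin letter: false
    }
}
```

Note that `String.matches` and `Pattern.matches` test the whole string, so a pattern with a single character class only accepts a one-character input.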

Description:

(1) Most code found online tests for Chinese characters with \u4E00-\u9FA5, which covers only the basic "CJK Unified Ideographs" (Chinese, Japanese and Korean ideographs) range. That is not the whole range: full coverage also needs the extension sets, radicals, pictographic characters, phonetic letters and so on. The exact ranges can be looked up in the Unicode code charts for Chinese.
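To illustrate the gap described in (1), the sketch below compares the basic range with the `\p{IsHan}` script class that `java.util.regex` has supported since Java 7. This is a swapped-in alternative, not the method used in the original code, and the names `HanScriptCheck`, `inBasicRange` and `isHanScript` are illustrative. It covers Han ideographs, including the extension blocks, but not non-Han items such as phonetic letters.

```java
import java.util.regex.Pattern;

class HanScriptCheck {
    // The basic range used throughout this article.
    private static final Pattern BASIC_RANGE = Pattern.compile("[\\u4e00-\\u9fa5]+");
    // Any code point whose Unicode script is Han, extensions included.
    private static final Pattern HAN_SCRIPT = Pattern.compile("\\p{IsHan}+");

    static boolean inBasicRange(String s) {
        return BASIC_RANGE.matcher(s).matches();
    }

    static boolean isHanScript(String s) {
        return HAN_SCRIPT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String basic = "\u4e2d\u6587";                        // two characters from the basic block
        String extA = "\u3400";                               // U+3400, CJK Extension A
        String extB = new String(Character.toChars(0x20000)); // U+20000, CJK Extension B

        System.out.println(inBasicRange(basic) + " " + isHanScript(basic)); // true true
        System.out.println(inBasicRange(extA) + " " + isHanScript(extA));   // false true
        System.out.println(inBasicRange(extB) + " " + isHanScript(extB));   // false true
    }
}
```

Whether the wider class is appropriate depends on the application: it also accepts ideographs used mainly in Japanese or Korean text.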

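(2) "[一-龥]" is the same range as \u4E00-\u9FA5, written with the characters themselves: 一 is U+4E00 and 龥 is U+9FA5. To find the code point of a specific character, use a Unicode-to-Chinese lookup table.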