Getting the pinyin of Chinese characters in Java

  • 2020-05-30 20:05:56
  • OfStack

To convert a Chinese string into its pinyin string, or into a string of pinyin initials, Java needs a third-party jar package:

Add the pinyin4j-2.5.0.jar package to the project.

Code implementation:


import java.util.regex.Matcher;
import java.util.regex.Pattern;

import net.sourceforge.pinyin4j.PinyinHelper;
import net.sourceforge.pinyin4j.format.HanyuPinyinOutputFormat;
import net.sourceforge.pinyin4j.format.HanyuPinyinToneType;

/**
 * Chinese character utilities.
 *
 * @author csharper
 * @since 2014.12.26
 */
public class ChineseCharacterUtil {

    /**
     * Converts the Chinese characters in a string to pinyin
     * (either the full spelling or only the first letter of each character).
     * Characters that are not Chinese are copied through unchanged.
     *
     * @param hanzi the source string
     * @param full  true for the full pinyin, false for the initials only
     * @return the converted string, or "" if the input is null or blank
     */
    public static String convertHanzi2Pinyin(String hanzi, boolean full)
    {
        /*
         * ^[\u2E80-\u9FFF]+$  matches East Asian (CJK) characters in general
         * ^[\u4E00-\u9FFF]+$  matches simplified and traditional Chinese characters
         * ^[\u4E00-\u9FA5]+$  matches the basic Chinese character set
         */
        String regExp = "^[\u4E00-\u9FFF]+$";
        StringBuffer sb = new StringBuffer();
        if (hanzi == null || "".equals(hanzi.trim()))
        {
            return "";
        }
        String pinyin = "";
        for (int i = 0; i < hanzi.length(); i++)
        {
            char unit = hanzi.charAt(i);
            if (match(String.valueOf(unit), regExp)) // a Chinese character: convert it to pinyin
            {
                pinyin = convertSingleHanzi2Pinyin(unit);
                if (full)
                {
                    sb.append(pinyin);
                }
                else if (!pinyin.isEmpty()) // skip characters for which no pinyin was found
                {
                    sb.append(pinyin.charAt(0));
                }
            }
            else
            {
                sb.append(unit);
            }
        }
        return sb.toString();
    }

    /**
     * Converts a single Chinese character to pinyin without tone marks.
     *
     * @param hanzi the character to convert
     * @return the pinyin, or "" if the character has no known pinyin
     */
    private static String convertSingleHanzi2Pinyin(char hanzi)
    {
        HanyuPinyinOutputFormat outputFormat = new HanyuPinyinOutputFormat();
        outputFormat.setToneType(HanyuPinyinToneType.WITHOUT_TONE);
        try {
            String[] res = PinyinHelper.toHanyuPinyinStringArray(hanzi, outputFormat);
            if (res == null || res.length == 0) {
                return ""; // no pinyin available for this character
            }
            return res[0]; // for polyphonic characters, take only the first pinyin
        } catch (Exception e) {
            e.printStackTrace();
            return "";
        }
    }

    /**
     * @param str   the source string
     * @param regex the regular expression
     * @return true if the regular expression matches the string
     */
    public static boolean match(String str, String regex)
    {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        return matcher.find();
    }

    public static void main(String[] args) {
        System.out.println(convertHanzi2Pinyin("我是中国人123abc", true));  // full pinyin
        System.out.println(convertHanzi2Pinyin("我是中国人123abc", false)); // initials only
    }
}

Output:

(1) Full pinyin (full = true):

woshizhongguoren123abc

(2) Initials only (full = false):

wszgr123abc
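
The listing decides whether a character is Chinese by compiling the regular expression again for every character it checks. As a possible alternative, here is a minimal sketch that uses only the JDK and does not depend on pinyin4j. The class name HanCheck and its method names are illustrative and not part of the original code. It shows a range check with a pattern compiled once, next to a script-based check with Character.UnicodeScript, which also accepts Han characters outside U+4E00 to U+9FFF.

```java
import java.util.regex.Pattern;

public class HanCheck {
    // Compiled once and reused, instead of once per character.
    private static final Pattern HAN_RANGE = Pattern.compile("[\\u4E00-\\u9FFF]");

    /** Range check for a single character, using the same range as the listing. */
    public static boolean isHanByRange(char c) {
        return HAN_RANGE.matcher(String.valueOf(c)).matches();
    }

    /** Script-based check from the JDK; not limited to one code point range. */
    public static boolean isHanByScript(char c) {
        return Character.UnicodeScript.of(c) == Character.UnicodeScript.HAN;
    }

    /** Counts the Han characters in a string using the script-based check. */
    public static int countHan(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isHanByScript(s.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countHan("我是中国人123abc")); // prints 5
    }
}
```

Which check is preferable depends on the input: the range check mirrors the listing exactly, while the script-based check is broader, so characters it accepts may still have no pinyin in pinyin4j and need the empty-result case handled.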

Conclusion

That is all for this article. I hope it is of some help in your study or work; if you have questions, feel free to leave a comment.

