Implementing Image Text Recognition (OCR) on Android

  • 2021-12-13 16:50:25
  • OfStack

Introduction

OCR, tess-two, and OpenCV are easy to confuse, so first a quick distinction. OCR (optical character recognition) is the task of extracting text from pictures, and tess-two is an Android wrapper around the Tesseract OCR engine; OpenCV, by contrast, is used for image recognition and comparison. For more complex text recognition requirements, the General Text Recognition SDK from Baidu AI Cloud can be used and gives higher accuracy.

Implementation steps

1. Add dependencies


implementation 'com.rmtheis:tess-two:8.0.0'

2. Download the trained language data files (chi_sim.traineddata for Simplified Chinese, chi_tra.traineddata for Traditional Chinese, eng.traineddata for English)

3. With APK size in mind, the trained data file is kept on the SD card: tess-two loads it from a tessdata directory under the data path, so the file has to be copied there. For example, to copy eng.traineddata:


// Imports needed by this fragment (it lives inside an Activity):
// android.os.Environment, java.io.File, java.io.FileOutputStream,
// java.io.IOException, java.io.InputStream, java.io.OutputStream

// tess-two expects the file at <mDataPath>/tessdata/<language>.traineddata
private String mDataPath = Environment.getExternalStorageDirectory().getAbsolutePath() + File.separator;
private String mFilePath = mDataPath + "tessdata" + File.separator + "eng.traineddata";

// Copies eng.traineddata from the app's assets into the tessdata directory.
private void copyFile() {
    File file = new File(mFilePath);
    File parent = file.getParentFile();
    if (parent != null && !parent.exists()) {
        parent.mkdirs();
    }
    // FileOutputStream overwrites any existing file, so no delete/create is needed.
    // try-with-resources closes both streams even if the copy fails.
    try (InputStream is = getAssets().open("eng.traineddata");
         OutputStream os = new FileOutputStream(file)) {
        byte[] buffer = new byte[1024];
        int len;
        while ((len = is.read(buffer)) != -1) {
            os.write(buffer, 0, len);
        }
        os.flush();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
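The copy above runs in full on every call. If the trained data only needs to be installed once, the copy can be skipped when the file is already in place. The following is a minimal sketch in plain Java under that assumption; `TrainedDataInstaller` and `copyIfMissing` are hypothetical names for illustration and are not part of tess-two or the Android SDK.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper for illustration; not part of tess-two or the Android SDK. */
final class TrainedDataInstaller {
    private TrainedDataInstaller() {}

    /**
     * Copies source to target only when target is absent or empty.
     * Returns true if a copy was made, false if the existing file was kept.
     * The caller stays responsible for closing source.
     */
    static boolean copyIfMissing(InputStream source, File target) throws IOException {
        if (target.isFile() && target.length() > 0) {
            return false; // already installed, nothing to do
        }
        File parent = target.getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Cannot create directory " + parent);
        }
        try (OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int len;
            while ((len = source.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
        }
        return true;
    }
}
```

Inside an Activity, the source stream would presumably come from getAssets().open("eng.traineddata") and the target would be new File(mFilePath). A size check is a cheap heuristic only: it does not detect a truncated or outdated file.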

4. Initialize tess-two


// mDataPath is the directory that contains the tessdata folder; "eng" selects eng.traineddata
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.init(mDataPath, "eng");
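Initialization depends on the directory layout from step 3: the data path passed to init must contain a tessdata folder holding the file for the requested language. A small pre-check along these lines can make a missing file easier to diagnose. This is a sketch in plain Java; `TessDataCheck` and its methods are hypothetical names, not tess-two API.

```java
import java.io.File;

/** Hypothetical pre-check for illustration; not part of tess-two. */
final class TessDataCheck {
    private TessDataCheck() {}

    /** Builds <dataPath>/tessdata/<language>.traineddata. */
    static File trainedDataFile(String dataPath, String language) {
        return new File(new File(dataPath, "tessdata"), language + ".traineddata");
    }

    /** True when the trained data file exists and is not empty. */
    static boolean isInstalled(String dataPath, String language) {
        File f = trainedDataFile(dataPath, language);
        return f.isFile() && f.length() > 0;
    }
}
```

A caller could test TessDataCheck.isInstalled(mDataPath, "eng") before calling baseApi.init(mDataPath, "eng") and run the copy step first when it returns false.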

5. Pass the bitmap to tess-two and read the recognized text


// Run OCR on the bitmap, then strip spaces and lowercase the recognized text
baseApi.setImage(bitmap);
String result = baseApi.getUTF8Text().replace(" ", "").toLowerCase();
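The post-processing in step 5 removes only space characters, so line breaks or tabs in the recognized text would remain. If the goal is a compact, case-insensitive string for comparison, a slightly stricter normalization might look like the sketch below. `OcrText.normalize` is a hypothetical helper in plain Java, and whether all whitespace should be dropped is an assumption about the use case.

```java
import java.util.Locale;

/** Hypothetical post-processing helper for illustration. */
final class OcrText {
    private OcrText() {}

    /**
     * Removes all whitespace (spaces, tabs, line breaks) and lowercases the text.
     * Locale.ROOT keeps the result independent of the device language.
     */
    static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
    }
}
```

With this helper, the last line of step 5 would become String result = OcrText.normalize(baseApi.getUTF8Text());. Passing an explicit locale matters because toLowerCase() without arguments follows the device locale, which changes how some letters (for example "I" under a Turkish locale) are lowercased.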

6. Any further processing depends on your own requirements. That concludes this article.

