Recognizing Text with Tesseract on Android

Joe.Ye • 2023-03-12 • Android-NDK

The usual choice for text recognition is tesseract-ocr.
GitHub: https://github.com/tesseract-ocr/tesseract
Trained data: https://github.com/tesseract-ocr/tessdata

The recommended Tesseract wrapper for Android is tess-two.
GitHub: https://github.com/rmtheis/tess-two

About Tesseract and tess-two

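Tesseract is an open-source OCR (Optical Character Recognition) engine that was maintained first by HP and later by Google. It has supported Chinese recognition since version 3.0. tess-two is a project for using Tesseract on Android, and its repository has three main directories: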

eyes-two: additional image-processing code
tess-two: the Java API and build files for Tesseract and Leptonica
tess-two-test: OCR test code

Getting tess-two

git clone https://github.com/rmtheis/tess-two.git

or, in build.gradle:

dependencies {
    implementation 'com.rmtheis:tess-two:9.0.0'  // prebuilt library, no NDK build required
}

Building tess-two and eyes-two with the NDK

cd tess-two
ndk-build

cd eyes-two
ndk-build

A simple usage example

Create a new Module by importing the tess-two directory from the tess-two repository, then add that Library as a dependency of your app.

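import android.os.Bundle;
import android.os.Environment;
import com.googlecode.tesseract.android.TessBaseAPI;
import java.io.File;

// Root of external storage (the "SD card"); it must contain a tessdata folder
private static final String SD_PATH = Environment.getExternalStorageDirectory().getAbsolutePath();
// Name of the trained data to load, without path or file extension
private static final String DICTIONARY = "custom";

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);
    TessBaseAPI baseApi = new TessBaseAPI();
    // The matching data file must be in the tessdata folder on the SD card, here tessdata/custom.traineddata
    baseApi.init(SD_PATH, DICTIONARY);
    baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO);
    // The image to recognize must also be in place, here test.png in the SD card root
    baseApi.setImage(new File(SD_PATH + "/test.png"));
    final String result = baseApi.getUTF8Text();
    // Assign result to your TextView, for example
    baseApi.end();
    // Release the native resources held by the API object
}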
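
The example depends on the data directory having the right layout: the path passed to init must contain a tessdata folder, and that folder must hold one <language>.traineddata file per language. Below is a minimal sketch of a helper that checks this layout and copies a data file into place. TessDataLayout and its methods are hypothetical names, not part of tess-two, and the code uses only plain Java file APIs (java.nio.file is assumed to be available, which on Android means API level 26 or later).

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of tess-two) for preparing the tessdata folder. */
final class TessDataLayout {
    private TessDataLayout() {}

    /** Returns the file expected for one language: dataPath/tessdata/<language>.traineddata. */
    static File trainedDataFile(File dataPath, String language) {
        return new File(new File(dataPath, "tessdata"), language + ".traineddata");
    }

    /** Lists the .traineddata files that are missing for a spec such as "eng+chi_sim". */
    static List<File> missingFiles(File dataPath, String languages) {
        List<File> missing = new ArrayList<>();
        for (String language : languages.split("\\+")) {
            File file = trainedDataFile(dataPath, language);
            if (!file.isFile()) {
                missing.add(file);
            }
        }
        return missing;
    }

    /** Copies a trained-data stream into dataPath/tessdata, creating the folder if needed. */
    static File install(InputStream source, File dataPath, String language) throws IOException {
        File target = trainedDataFile(dataPath, language);
        Files.createDirectories(target.getParentFile().toPath());
        Files.copy(source, target.toPath(), StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

In an app, the InputStream could come from a data file bundled in the assets folder, and missingFiles(new File(SD_PATH), DICTIONARY) could be called before init to report a missing file with a clear message.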

Copyright notice:
Author: Joe.Ye
Link: https://www.appblog.cn/index.php/2023/03/12/use-tesseract-to-recognize-text-on-android/
Source: APP全栈技术分享
Copyright belongs to the author. Do not reproduce without permission.
