Overview

Tesseract is a widely used optical character recognition (OCR) engine that is now developed primarily as an open-source project.

Hewlett-Packard (HP) developed Tesseract between 1984 and 1994 as a better alternative to the commercial OCR engines of the time, which “failed miserably”. In 1995, it ranked among the top three OCR engines for character accuracy. Google sponsored Tesseract's development from 2006 until 2019. Version 4 added a neural network recognition engine based on long short-term memory (LSTM). Version 5 was released in 2021.

Tesseract initially supported only English. It now supports over 100 languages natively and can be trained to recognize more.
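
As a rough illustration of the LSTM engine and multi-language support mentioned above, the sketch below calls the Tesseract command-line tool from Python. It assumes the `tesseract` executable and the relevant language data are installed locally. The helper names `build_tesseract_command` and `run_ocr` are hypothetical and not part of Tesseract itself. The `-l`, `--oem`, and `--psm` options are standard Tesseract CLI flags.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, languages=("eng",), oem=1, psm=3):
    """Build the argument list for a Tesseract CLI call (hypothetical helper).

    languages: language data codes, joined with '+' for multi-language OCR.
    oem: OCR engine mode; 1 selects the LSTM engine added in version 4.
    psm: page segmentation mode; 3 is fully automatic segmentation.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    return [
        "tesseract",
        str(image_path),
        "stdout",  # write recognized text to standard output instead of a file
        "-l", "+".join(languages),
        "--oem", str(oem),
        "--psm", str(psm),
    ]


def run_ocr(image_path, **options):
    """Run Tesseract on an image and return the recognized text (hypothetical helper)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("the tesseract executable was not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract exits with an error
    )
    return result.stdout
```

For example, `run_ocr("scan.png", languages=("eng", "deu"))` would attempt to read English and German text from a placeholder image named `scan.png`, provided both language data files are installed.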

Tesseract License

Tesseract is licensed under the Apache-2.0 license.

Deploy a Tesseract API

You can use Roboflow Inference to deploy a Tesseract API on your own hardware. The model can run on CPU devices (e.g. Raspberry Pi, AI PCs) and GPU devices (e.g. NVIDIA Jetson, NVIDIA T4).

Label Data Automatically with Tesseract

You can automatically label a dataset with Tesseract using Autodistill, an open source package for training computer vision models. Labeling a folder of images takes only a few lines of code.
