Commit 353d03f

Add tesseract OCR sample
1 parent 1967945 commit 353d03f

4 files changed

Lines changed: 129 additions & 0 deletions

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
# OCR-Enhanced Barcode Validation

This project demonstrates how to use **Tesseract OCR** alongside Dynamsoft Barcode Reader to improve barcode reading accuracy, especially for damaged or low-quality barcode images.

## Features

- **Dual Recognition**: Scans for barcodes first, then runs OCR on the same image so the printed digits can serve as a fallback
- **Multiple Format Support**: Handles various 1D barcode formats (Codabar, Code 128, etc.)
- **Damage Resilience**: Extracts the human-readable text from damaged or low-quality barcode images
- **Detailed Output**: Prints each barcode's format and the corner coordinates of its location
- **Digit Extraction**: Filters OCR results to keep only numeric characters

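The dual-recognition and digit-extraction features can be sketched in plain Python. This is a minimal illustration rather than the code in `app.py`: the helper names `extract_digits` and `pick_value` are hypothetical, and the rule "prefer the decoded barcode text, otherwise use the OCR digits" is an assumption about how a fallback could be wired up.

```python
from typing import List, Optional


def extract_digits(ocr_text: str) -> str:
    """Keep only the ASCII digits 0-9 from raw OCR output."""
    return "".join(ch for ch in ocr_text if "0" <= ch <= "9")


def pick_value(barcode_texts: List[str], ocr_text: str) -> Optional[str]:
    """Hypothetical fallback rule: decoded barcode first, OCR digits second."""
    if barcode_texts:
        # The barcode reader succeeded, so its first result wins.
        return barcode_texts[0]
    # No barcode decoded: fall back to whatever digits OCR found.
    digits = extract_digits(ocr_text)
    return digits or None


if __name__ == "__main__":
    # Simulated inputs; no barcode was decoded, so the OCR digits are used.
    print(pick_value([], "SN 0123-4567\n"))
```

Keeping the filter separate from the selection rule makes it easy to swap in a stricter policy later, for example requiring a minimum number of digits before trusting the OCR result.
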
## Prerequisites
- Obtain a [30-day free trial license](https://www.dynamsoft.com/customer/license/trialLicense/?product=dcv&package=cross-platform) for Dynamsoft Barcode Reader.
- Python 3.6 or higher
- Tesseract OCR installation
    - Windows: Install [Tesseract](https://github.com/UB-Mannheim/tesseract/wiki)
    - macOS:

        ```bash
        brew install tesseract
        ```

    - Linux (Debian/Ubuntu):
        ```bash
        sudo apt update
        sudo apt install tesseract-ocr -y
        sudo apt install libtesseract-dev -y
        ```

- Python dependencies

    ```bash
    pip install dynamsoft-capture-vision-bundle pytesseract pillow
    ```

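Before running the sample, it can help to confirm that the `tesseract` executable is reachable, since `pytesseract` only wraps the command-line tool. The check below is a sketch using the standard library only; the function name `tesseract_version` is hypothetical and not part of this project.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version() -> Optional[str]:
    """Return the first line of `tesseract --version`, or None if unavailable."""
    exe = shutil.which("tesseract")
    if exe is None:
        # The executable is not on PATH.
        return None
    try:
        proc = subprocess.run(
            [exe, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some builds print the version to stderr
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = proc.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version if version else "Tesseract was not found on PATH.")
```

If the check fails on Windows even though Tesseract is installed, the install directory is probably missing from `PATH`; `pytesseract` also accepts an explicit location through `pytesseract.pytesseract.tesseract_cmd`.
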
## Quick Start
1. Set the license key in `app.py`:
    ```python
    error_code, error_message = LicenseManager.init_license("YOUR_LICENSE_KEY_HERE")
    ```

2. Run the application:
    ```bash
    python app.py
    ```

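As an alternative to editing the key into `app.py`, the key could be read from the environment. This is a sketch under assumptions: the variable name `DYNAMSOFT_LICENSE_KEY` and the helper `load_license_key` are hypothetical and are not defined by the SDK or by this project.

```python
import os


def load_license_key(env_var: str = "DYNAMSOFT_LICENSE_KEY",
                     default: str = "YOUR_LICENSE_KEY_HERE") -> str:
    """Return the key from the environment, or the placeholder if it is unset."""
    key = os.environ.get(env_var, "").strip()
    return key or default


if __name__ == "__main__":
    license_key = load_license_key()
    # In app.py the value would then be passed on unchanged:
    # error_code, error_message = LicenseManager.init_license(license_key)
    if license_key == "YOUR_LICENSE_KEY_HERE":
        print("No key found in the environment; using the placeholder.")
    else:
        print("Using the key from the environment.")
```

Reading the key at startup keeps it out of version control, which matters once a trial key is replaced by a purchased one.
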
## Sample Images
The project includes two test images:
- `codabar.jpg` - A clear Codabar barcode image
- `damaged.png` - A damaged barcode image to demonstrate OCR fallback

examples/official/tesseract/app.py

Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
1+
from PIL import Image
2+
import pytesseract
3+
import os
4+
import sys
5+
from dynamsoft_capture_vision_bundle import LicenseManager, EnumErrorCode, CaptureVisionRouter, EnumPresetTemplate
6+
def main():
7+
8+
print("**********************************************************")
9+
print("Welcome to Dynamsoft Barcode Reader")
10+
print("**********************************************************")
11+
12+
error_code, error_message = LicenseManager.init_license(
13+
"DLS2eyJoYW5kc2hha2VDb2RlIjoiMjAwMDAxLTE2NDk4Mjk3OTI2MzUiLCJvcmdhbml6YXRpb25JRCI6IjIwMDAwMSIsInNlc3Npb25QYXNzd29yZCI6IndTcGR6Vm05WDJrcEQ5YUoifQ==")
14+
if error_code != EnumErrorCode.EC_OK and error_code != EnumErrorCode.EC_LICENSE_CACHE_USED:
15+
print("License initialization failed: ErrorCode:",
16+
error_code, ", ErrorString:", error_message)
17+
else:
18+
cvr_instance = CaptureVisionRouter()
19+
while (True):
20+
image_path = input(
21+
">> Input your image full path:\n"
22+
">> 'Enter' for sample image or 'Q'/'q' to quit\n"
23+
).strip('\'"')
24+
25+
if image_path.lower() == "q":
26+
sys.exit(0)
27+
28+
if image_path == "":
29+
image_path = "codabar.jpg"
30+
31+
if not os.path.exists(image_path):
32+
print("The image path does not exist.")
33+
continue
34+
result = cvr_instance.capture(
35+
image_path, EnumPresetTemplate.PT_READ_BARCODES.value)
36+
if result.get_error_code() != EnumErrorCode.EC_OK:
37+
print("Error:", result.get_error_code(),
38+
result.get_error_string())
39+
else:
40+
41+
items = result.get_items()
42+
print('Found {} barcodes.'.format(len(items)))
43+
for item in items:
44+
format_type = item.get_format_string()
45+
text = item.get_text()
46+
print("Barcode Format:", format_type)
47+
print("Barcode Text:", text)
48+
49+
location = item.get_location()
50+
x1 = location.points[0].x
51+
y1 = location.points[0].y
52+
x2 = location.points[1].x
53+
y2 = location.points[1].y
54+
x3 = location.points[2].x
55+
y3 = location.points[2].y
56+
x4 = location.points[3].x
57+
y4 = location.points[3].y
58+
print("Location Points:")
59+
print("({}, {})".format(x1, y1))
60+
print("({}, {})".format(x2, y2))
61+
print("({}, {})".format(x3, y3))
62+
print("({}, {})".format(x4, y4))
63+
print("-------------------------------------------------")
64+
65+
66+
## Inovke Tesseract OCR
67+
result = pytesseract.image_to_string(Image.open(image_path))
68+
digits = ''
69+
for i in result:
70+
if ord(i) >= 48 and ord(i) <= 57:
71+
digits += i
72+
73+
print(f'OCR Result: {digits}')
74+
75+
if __name__ == "__main__":
76+
main()
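
The script prints the decoded barcode text and the OCR digits separately and leaves the comparison to the reader. One way the two results could be cross-checked is sketched below. The helper names `digits_only` and `ocr_confirms_barcode` are hypothetical, and the containment rule is an assumption: OCR may pick up other digits printed on the image, so the check asks whether the barcode's digits appear somewhere in the OCR digits rather than requiring an exact match.

```python
def digits_only(text: str) -> str:
    """Keep only the ASCII digits 0-9."""
    return "".join(ch for ch in text if "0" <= ch <= "9")


def ocr_confirms_barcode(barcode_text: str, ocr_text: str) -> bool:
    """Return True if the barcode's digits appear, in order, in the OCR digits."""
    expected = digits_only(barcode_text)
    if not expected:
        # Nothing numeric to compare, so OCR cannot confirm anything.
        return False
    return expected in digits_only(ocr_text)


if __name__ == "__main__":
    # Simulated values; real ones would come from item.get_text() and
    # pytesseract.image_to_string(...) in the script above.
    print(ocr_confirms_barcode("A40156B", "A 4 0 1 5 6 B\n"))
    print(ocr_confirms_barcode("A40156B", "40157"))
```

Comparing digits only sidesteps non-numeric characters such as Codabar start and stop letters (A to D), which may or may not appear in either result.
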