files_fulltextsearch_tesseract
OCR your documents before indexing
Installation / Setup
- install Tesseract
- download language files from: https://github.com/tesseract-ocr/tessdata
- copy language files into /usr/share/tessdata/ (or /usr/share/tesseract-ocr/tessdata/, depending on your distribution)
- configure this app in the Full text search Admin panel
- report bugs
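
To check that the language files ended up in a directory Tesseract is likely to read, a small script along these lines may help. It is only a sketch: the two directories are the ones named in the steps above, the `find_languages` helper is a hypothetical name and not part of this app, and it assumes language files use the `.traineddata` extension as in the tessdata repository.

```python
from pathlib import Path

# Candidate directories taken from the setup steps above; other
# distributions may use a different location (assumption).
CANDIDATE_DIRS = ("/usr/share/tessdata", "/usr/share/tesseract-ocr/tessdata")


def find_languages(candidates=CANDIDATE_DIRS):
    """Return (directory, sorted language codes) for the first candidate
    directory that contains *.traineddata files, or (None, []) if none does."""
    for candidate in candidates:
        directory = Path(candidate)
        if not directory.is_dir():
            continue
        # "eng.traineddata" -> "eng"
        languages = sorted(p.stem for p in directory.glob("*.traineddata"))
        if languages:
            return directory, languages
    return None, []


if __name__ == "__main__":
    directory, languages = find_languages()
    if directory is None:
        print("No language files found in:", ", ".join(CANDIDATE_DIRS))
    else:
        print(f"{directory}: {', '.join(languages)}")
```

If the script reports no language files, the copy step probably targeted a directory other than the two listed here.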
more
devblog about PDF and OCR: https://daita.github.io/files-fulltextsearch-tesseract-ocr-pdf/