Difference between revisions of "Elastic search and OCR"

Revision as of 23:13, 28 November 2016

Understanding integrated search

The integrated full-text search using Elastic search takes an internal, active approach to indexing content. Content is added to an indexing queue every time it is updated. This keeps the indexed content always up to date, but consumes CPU resources on the indexing server.

Because file indexing is very CPU intensive, the file indexing functionality is separated into a service that can run on a server separate from the main application server. In either setup, the file indexer works from a database queue.
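
The queue-driven flow described above can be sketched roughly as follows. This is a minimal illustration and not the TS file indexing service itself: it uses SQLite from the Python standard library as a stand-in for the application database, and the table name `index_queue`, its columns, and the functions `open_queue`, `enqueue`, and `process_next` are all hypothetical.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS index_queue (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    content_id TEXT NOT NULL UNIQUE,
    queued_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def open_queue(path=":memory:"):
    """Open (or create) the queue database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def enqueue(conn, content_id):
    """Called whenever content is updated.

    The UNIQUE constraint plus INSERT OR IGNORE collapses repeated updates
    of the same content into a single pending queue entry.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO index_queue (content_id) VALUES (?)",
            (content_id,),
        )


def process_next(conn, index_fn):
    """Index the oldest queued entry and remove it from the queue.

    index_fn stands in for the CPU-heavy indexing work. Returns the
    processed content id, or None when the queue is empty.
    """
    row = conn.execute(
        "SELECT id, content_id FROM index_queue ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    row_id, content_id = row
    index_fn(content_id)
    with conn:
        conn.execute("DELETE FROM index_queue WHERE id = ?", (row_id,))
    return content_id


if __name__ == "__main__":
    conn = open_queue()
    for cid in ("doc-1", "doc-2", "doc-1"):  # doc-1 is updated twice
        enqueue(conn, cid)
    indexed = []
    while process_next(conn, indexed.append) is not None:
        pass
    print(indexed)
```

Because the worker only reads from the queue table, it can run on any machine that can reach the database, which is what allows the CPU load to be moved off the main application server.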

The basic search service requires:

TS file indexing service
Elastic search server

If PDF OCR functionality is needed, the following components must be installed as well:

Ghostscript (PDF to TIFF conversion)
Tesseract (OCR library)

The above components for OCR must be installed on the file indexing server.
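
As a rough sketch of how the two OCR components fit together, the following Python code builds the Ghostscript command that renders a PDF to TIFF and the Tesseract command that extracts text from that TIFF. How the file indexing service actually invokes these tools is not documented here, so the function names, the 300 dpi resolution, the `tiffg4` output device, and the `eng` language code are assumptions for illustration. The binary names may also differ by platform (for example, Ghostscript on Windows is typically `gswin64c`).

```python
import shutil
import subprocess
from pathlib import Path


def ghostscript_command(pdf_path, tiff_path, dpi=300, gs_binary="gs"):
    """Build the Ghostscript call that renders a PDF into a multi-page TIFF."""
    return [
        gs_binary,
        "-q",                 # suppress startup messages
        "-dNOPAUSE",          # do not wait for a key press between pages
        "-dBATCH",            # exit when the input has been processed
        "-sDEVICE=tiffg4",    # black-and-white Group 4 TIFF (assumed choice)
        f"-r{dpi}",           # render resolution in dpi
        f"-sOutputFile={tiff_path}",
        str(pdf_path),
    ]


def tesseract_command(tiff_path, output_base, language="eng",
                      tesseract_binary="tesseract"):
    """Build the Tesseract call; Tesseract writes its text to <output_base>.txt."""
    return [tesseract_binary, str(tiff_path), str(output_base), "-l", language]


def ocr_pdf(pdf_path, work_dir):
    """Run both steps and return the recognized text.

    Raises RuntimeError if either binary is not installed on this machine.
    """
    for binary in ("gs", "tesseract"):
        if shutil.which(binary) is None:
            raise RuntimeError(f"{binary} is not installed or not on PATH")
    work_dir = Path(work_dir)
    tiff_path = work_dir / "page.tif"
    output_base = work_dir / "page"
    subprocess.run(ghostscript_command(pdf_path, tiff_path), check=True)
    subprocess.run(tesseract_command(tiff_path, output_base), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

Both steps are CPU intensive, which fits the requirement that these components live on the file indexing server rather than on the main application server.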

@@ Line 1: / Line 1: @@
 == Understanding integrated search ==
 The integrated full-text search using Elastic search takes an internal, active approach to indexing content.
-Content is added to an indexing queue every time it is updated. This keeps the indexed content always up to date, but causes a performance degradation.
+Content is added to an indexing queue every time it is updated. This keeps the indexed content always up to date, but consumes CPU resources on the indexing server.
 Because file indexing is very CPU intensive, the file indexing functionality is separated into a service that can run on a server separate from the main application server. In either setup, the file indexer works from a database queue.
+The basic search service requires:
+* TS file indexing service
+* Elastic search server
 If PDF OCR functionality is needed, the following components must be installed as well:
@@ Line 13: / Line 17: @@
 === Setting up basic search service ===
+==== Install: TS file indexing service ====
+==== Install: Elastic search server ====
 === Adding OCR capability ===
+==== Install: Ghostscript binaries ====
+==== Install: Tesseract binaries ====


Contents

Understanding integrated search

Setting up basic search service

Install: TS file indexing service

Install: Elastic search server

Adding OCR capability

Install: Ghostscript binaries

Install: Tesseract binaries
