Difference between revisions of "Constellio search/indexing"

Latest revision as of 11:51, 10 December 2021

Constellio search in Tempus Serva installations

Activate the search servlet in your installation

The search servlet is deactivated by default.

Edit the <tomcat>/webapps/<Tempus Serva>/WEB-INF/web.xml
Remove comments from the search servlet
Remove comments from the search filter

If you are using web container security, remove it from the search servlet. The servlet filter handles authentication of crawler robots using a specialized form of basic authentication (normal users are redirected to the main servlet instead).
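As a rough illustration of what a crawler request could carry, the sketch below builds a standard HTTP Basic Authorization header value. The filter's scheme is described only as a specialized form of basic authentication, so the standard header is an assumption here; the function name and the credentials are hypothetical.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build a standard HTTP Basic Authorization header value.

    Assumption: the search filter accepts standard Basic credentials.
    The page only says the scheme is a "specialized form" of it.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Hypothetical crawler credentials, for illustration only.
header = basic_auth_header("crawler", "secret")
print(header)
```

If the real filter deviates from standard Basic authentication, only the header construction would need to change.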

Option: Create a user for crawling

You will need at least one user for crawling the content in Tempus Serva, and possibly more if content restrictions apply to different search user groups.

The following applies to crawling users:

All groups and policies will be respected during indexing
No codeunits will be activated
No log entries will be created

You can test what the crawler will see by logging in with an extra parameter:

  /login?SearchIndexing=true
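
A minimal sketch of how the full test URL could be assembled, assuming the login path is relative to the application's base URL. The base URL in the example is a placeholder and the function name is hypothetical.

```python
from urllib.parse import urlencode


def crawler_preview_url(base_url: str) -> str:
    """Return the login URL with the SearchIndexing parameter appended."""
    query = urlencode({"SearchIndexing": "true"})
    return f"{base_url.rstrip('/')}/login?{query}"


# "http://localhost:8080/TempusServa" is a placeholder base URL.
print(crawler_preview_url("http://localhost:8080/TempusServa"))
```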

Prepare Constellio

Install Constellio

Download the Constellio 1.3 installer
Run the installer by double-clicking the .jar file
Install to a MySQL database
Run Start Constellio

Setting up a connector

Before setting up a connector, create or choose a valid search scope.

Choose connector type: auth-http-conector
Ensure that Use security is checked
Set start URL to: http://<server name>/TempusServa/search
Include the same URL in include patterns
Enter username for the crawler user (a valid TS user)
Enter password for the crawler user (a valid TS user)
After submitting the new connector, crawling/indexing will start by itself

No further actions are needed: the search servlet will automatically redirect real users after they click on a search result.
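
The connector checklist above can be sketched as a small self-check before submitting the form. This is not part of Constellio or Tempus Serva: the `ConnectorSettings` class, the `check_settings` function, the server name, and the credentials are all hypothetical, and the values simply mirror the form fields listed above.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectorSettings:
    """Hypothetical container mirroring the connector form fields."""
    connector_type: str
    use_security: bool
    start_url: str
    include_patterns: list = field(default_factory=list)
    username: str = ""
    password: str = ""


def check_settings(s: ConnectorSettings) -> list:
    """Return a list of problems; an empty list means the checklist passes."""
    problems = []
    if not s.use_security:
        problems.append("Use security must be checked")
    if s.start_url not in s.include_patterns:
        problems.append("start URL missing from include patterns")
    if not s.username or not s.password:
        problems.append("crawler username and password are required")
    return problems


start = "http://example.local/TempusServa/search"  # placeholder server name
settings = ConnectorSettings(
    connector_type="auth-http-conector",  # type name as given in the steps above
    use_security=True,
    start_url=start,
    include_patterns=[start],
    username="crawler",  # hypothetical TS user
    password="secret",
)
print(check_settings(settings))
```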

Option: Tweak search results

The search servlet will automatically deliver content in a crude form, without any extra HTML such as wrappers. It will also tell the crawler when each record was last updated, and the document title will be set to the current record's Resume value.

You might consider excluding the command=list pages for better (less redundant) search results.
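
To show which URLs such an exclusion would drop, here is a small sketch that filters out pages whose query string contains command=list. How the exclusion is actually entered in Constellio is not covered here; the function name, the server name, and the `command=show` value in the sample data are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def is_list_page(url: str) -> bool:
    """True when the URL's query string contains command=list."""
    query = parse_qs(urlparse(url).query)
    return "list" in query.get("command", [])


# Sample URLs; the server name and command=show are placeholders.
urls = [
    "http://example.local/TempusServa/search?command=list",
    "http://example.local/TempusServa/search?command=show",
    "http://example.local/TempusServa/search",
]
indexable = [u for u in urls if not is_list_page(u)]
print(indexable)
```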
