How to install Fulltextsearch in Nextcloud with Elasticsearch and Tesseract OCR

In the second of a series of tutorials on enhancing Nextcloud performance and functionality, we are going to implement full-text search with Elasticsearch and Tesseract OCR.

This tutorial has been tested on Nextcloud 13, 14 and 15; however, the Tesseract OCR extension is not yet available for Nextcloud 15.

Prerequisites

Please ensure that the following apps have been installed on your Nextcloud server instance:

  • Full text search – Bookmarks
  • Full text search – Elasticsearch Platform
  • Full text search – Files – Tesseract OCR
  • Full text search – Files
  • Full text search

Install Elasticsearch

Elasticsearch is a Java-based application, so we first need to install OpenJDK. This can be done by running the following command:

sudo apt-get install openjdk-8-jre

Once Java is installed, we can add the Elasticsearch repository with:

sudo apt install apt-transport-https 
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add - 
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch.list 
sudo apt update
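Because tee -a appends to the file, running the repository step a second time leaves a duplicate entry in elasticsearch.list. A minimal sketch of an idempotent alternative is shown here; add_line_once is a hypothetical helper name chosen for illustration, not part of apt or Elasticsearch.

```shell
# add_line_once (hypothetical helper): append LINE to FILE only if
# FILE does not already contain that exact line.
add_line_once() {
    line="$1"
    file="$2"
    # -x matches the whole line, -F treats the pattern as a fixed string,
    # -q suppresses output. A missing file counts as "not found".
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Writing to /etc/apt/sources.list.d requires root, so the function would have to be called from a script run with sudo, passing the deb line above as the first argument and the path to elasticsearch.list as the second.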

Once the repository has been added, we can go ahead and install Elasticsearch:

sudo apt install elasticsearch

In this instance we are going to bind Elasticsearch to localhost (127.0.0.1). Open the configuration file with the following command:

sudo nano /etc/elasticsearch/elasticsearch.yml

And edit or add the following line:

network.host: 127.0.0.1
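If you would rather not edit the file by hand, the same change can be scripted. The following is a rough sketch, assuming GNU sed (as shipped with Debian and Ubuntu); set_network_host is a hypothetical helper name, and the function replaces an existing network.host line, commented out or not, or appends one if none is present.

```shell
# set_network_host (hypothetical helper): set network.host in a config file.
#   $1 = path to elasticsearch.yml, $2 = address to bind to
set_network_host() {
    conf="$1"
    host="$2"
    if grep -Eq '^[[:space:]]*#?[[:space:]]*network\.host:' "$conf"; then
        # A network.host line exists (possibly commented out): rewrite it.
        sed -i -E "s|^[[:space:]]*#?[[:space:]]*network\.host:.*|network.host: $host|" "$conf"
    else
        # No such line yet: append one at the end of the file.
        printf 'network.host: %s\n' "$host" >> "$conf"
    fi
}
```

As the configuration file is owned by root, the call set_network_host /etc/elasticsearch/elasticsearch.yml 127.0.0.1 would need to run from a script started with sudo.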

To reload the systemd unit files, enable the service at boot and start it, run the following command:

sudo systemctl daemon-reload && sudo systemctl enable elasticsearch && sudo systemctl start elasticsearch

To check that the Elasticsearch server is running, enter the following:

curl -XGET '127.0.0.1:9200/?pretty'

All being well, you should see a JSON response that includes the node name, the cluster name and the Elasticsearch version.
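Elasticsearch can take a few seconds to begin answering requests after the service starts, so the check above may fail with a connection error at first. A small polling loop can help here; the sketch below is only an illustration, and wait_for_es is a hypothetical helper name.

```shell
# wait_for_es (hypothetical helper): poll a URL until it answers.
#   $1 = URL, $2 = maximum number of attempts, $3 = seconds between attempts
# Returns 0 as soon as a request succeeds, 1 if every attempt fails.
wait_for_es() {
    url="$1"
    attempts="$2"
    delay="$3"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s hides the progress meter, -f makes curl fail on HTTP errors.
        if curl -sf "$url" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

For example, wait_for_es 127.0.0.1:9200 30 2 && echo "Elasticsearch is up" would try for roughly a minute before giving up.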

Once we have verified that Elasticsearch is running, we need to install the ingest-attachment plugin, which lets Elasticsearch extract and index the text of files such as .pdf, .ppt and .xls:

sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-attachment

Finally, restart Elasticsearch with:

sudo systemctl restart elasticsearch
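After the restart, it is worth confirming that the plugin was actually loaded. Elasticsearch lists the plugins on each node at the _cat/plugins endpoint, with the plugin name in the second column. The sketch below checks that column; plugin_listed is a hypothetical helper name.

```shell
# plugin_listed (hypothetical helper): read a _cat/plugins listing on
# standard input and succeed if the second column equals the name in $1.
plugin_listed() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit !found }'
}
```

Once the server is answering again, it could be used like this: curl -s '127.0.0.1:9200/_cat/plugins' | plugin_listed ingest-attachment && echo "ingest-attachment is loaded"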