07 Mar

Solr probe to check the index update frequency and live longer

Are you sure that the Solr Index Server of your production environment is updating your content?

As a consequence, are you sure that your Alfresco search indexes are correctly (and regularly) updated?

From a theoretical point of view, the answer is ‘yes’, but in a real-life scenario something could go wrong, and a message from your users could arrive in your email box, saying:

Hey fellow, the Alfresco installation that you are maintaining cannot find the documents I uploaded a long time ago: this is not acceptable!

This post describes a project whose main goal is to avoid this awful issue. The project is called Solr probe (solr-probe) and it is available on GitHub, for Solr versions 1.4 or later (the version line used by many Alfresco releases).

Description

Name: Solr probe (solr-probe)

Link: Hosted on GitHub

The probe aims to monitor the health status of a Solr server. An automatic task sends a warning message to the configured email address if an abnormal condition is detected, so you can take countermeasures promptly. The probe checks the availability of a Solr >= 1.4 index server (used by many Alfresco versions, Enterprise and Community).

How it works

In more depth, the probe tests how frequently Solr is updated, using the standard report available in JSON format at the endpoint below.

/solr/admin/cores?action=SUMMARY&wt=json
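As a rough illustration of how such a report could be retrieved, here is a minimal Java sketch that appends the path above to a base URL and downloads the response body. The class name, method names, timeouts and the example host are assumptions made for illustration; they are not taken from the solr-probe source. The sketch also ignores TLS client certificates and authentication, which a real Alfresco Solr endpoint may well require.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical helper, not part of solr-probe.
class SolrSummaryFetcher {

    // Path of the summary report, as given in the post.
    static final String SUMMARY_PATH = "/solr/admin/cores?action=SUMMARY&wt=json";

    // Joins a base URL (scheme, host, port) with the summary path,
    // tolerating one trailing slash on the base URL.
    static String summaryUrl(String baseUrl) {
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return trimmed + SUMMARY_PATH;
    }

    // Downloads the raw JSON text of the summary report.
    // Timeout values are arbitrary choices for this sketch.
    static String fetchSummary(String baseUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(summaryUrl(baseUrl)).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(30_000);
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
            // "\\A" matches the start of input, so next() returns the whole body.
            scanner.useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `SolrSummaryFetcher.fetchSummary("https://solr.example.com:8443")` (a made-up host) would return the JSON text, which then has to be parsed with a JSON library of your choice.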

The probe reads that JSON summary and analyzes the openedAt and registeredAt index dates. It then computes the time elapsed since those dates and compares it with the configured threshold, expressed as a number of days. If the elapsed time is greater than the threshold, the probe concludes that the index generation or update is stuck and sends a warning email message.
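The comparison described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are hypothetical, the dates are assumed to be already parsed into `Instant` values, and taking the more recent of openedAt and registeredAt is a guess at a sensible rule, not necessarily what solr-probe does. It also uses `java.time`, which needs Java 8 or later.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of solr-probe.
class IndexStalenessCheck {

    // Returns the more recent of the two index dates.
    // Assumption: the freshest date is the one that matters.
    static Instant latest(Instant openedAt, Instant registeredAt) {
        return openedAt.isAfter(registeredAt) ? openedAt : registeredAt;
    }

    // True when more than thresholdDays have passed between
    // lastIndexEvent and now; exactly thresholdDays is not stale.
    static boolean isStale(Instant lastIndexEvent, Instant now, long thresholdDays) {
        Duration elapsed = Duration.between(lastIndexEvent, now);
        return elapsed.compareTo(Duration.ofDays(thresholdDays)) > 0;
    }
}
```

Passing `now` as a parameter, rather than calling `Instant.now()` inside the method, keeps the check easy to test with fixed dates.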

The probe:

  • Is developed as a stand-alone, self-contained Java process (in terms of databases and libraries);
  • Uses a local HSQLDB instance for persistence;
  • Requires a JVM 1.7 or higher;
  • Can be started as an operating system process;
  • Can be scheduled via a crontab expression.

How to build and schedule the probe

Building the probe is quite easy. The first step is to download it from GitHub with the command below.

git clone https://github.com/fsforna/solr-probe.git

Before launching it, you need to configure the probe by editing the conf.properties file. Then run the mvn package command to compile the source code. To execute the probe, simply use the command below.

java -Djsse.enableSNIExtension=false -Dconf=/conf.properties -cp /YOUR_PATH_LIB/lib/*:/solr-probe-1.0-SNAPSHOT.jar it.tai.solr.SolrM

Put the command above in a crontab task and the installation is complete.
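The -Dconf flag in the launch command passes the location of conf.properties as a JVM system property. As a hedged sketch of how a Java process might consume it, the snippet below reads the path from the `conf` system property and loads the file with `java.util.Properties`. The class name and the key `probe.threshold.days` are hypothetical placeholders; check the conf.properties shipped with the project for the real key names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of solr-probe.
class ProbeConfig {

    // Hypothetical key name, used only for this illustration.
    static final String THRESHOLD_KEY = "probe.threshold.days";

    // Loads properties from any Reader, wrapping I/O failures
    // in an unchecked exception.
    static Properties load(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Reads the file whose path was passed as -Dconf=<path>.
    static Properties loadFromSystemProperty() {
        String path = System.getProperty("conf");
        if (path == null) {
            throw new IllegalStateException("Missing -Dconf=<path to conf.properties>");
        }
        try (Reader reader = Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
            return load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the threshold in days, or defaultDays when the key is absent.
    static int thresholdDays(Properties props, int defaultDays) {
        String raw = props.getProperty(THRESHOLD_KEY, String.valueOf(defaultDays));
        return Integer.parseInt(raw.trim());
    }
}
```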

What to do if the automatic message arrives

So, what should you do if the automatic message arrives?

What you know for certain (and in good time) is that your Solr indexes have not changed for too long, which is very suspicious in an actively used environment. Even if your environment is fairly “static”, indexes that have not been updated for that long generally indicate an abnormal condition worth investigating.

For example, the Solr process may have gone down due to an out-of-memory failure (which load balancers normally fail to detect in a clustered environment). In our experience, it has happened that no errors (or warnings) were written to the log files even though the Solr process was no longer indexing any content.

If you are using Alfresco

If you are using Alfresco (and we all know that Alfresco uses Solr as its indexing engine), this issue could be a side effect of a complex architecture where the Solr servers are separated from the Alfresco servers (a recommended architecture in terms of scalability).

In those cases, when the Alfresco servers are not reachable while Solr is crawling them, the indexing task could stop and fail as described above.

The result is that content is no longer indexed and cannot be found by users… and this is definitely a sad situation to manage in terms of service.

Conclusion

This post described the project called Solr probe (solr-probe). The project, hosted on GitHub, provides a useful, automatic service to monitor a particular aspect of the health status of a Solr server. For any comments or feedback on the project, please leave a comment or contact Francesco Fornasari directly.


Francesco Fornasari
