07 Mar

Solr probe to check the index update frequency and live longer

Are you sure that the Solr Index Server of your production environment is updating your content?

As a consequence, are you sure that your Alfresco search indexes are correctly (and regularly) updated?

From a theoretical point of view the answer is ‘yes’, but in a real-life scenario something could go wrong, and a message from your users could arrive in your inbox, saying:

Hey fellow, the Alfresco installation that you are maintaining cannot find the documents I uploaded a long time ago: this is not acceptable!

This post describes a project whose main goal is to avoid this awful issue. The project is called Solr probe (solr-probe) and it is available on GitHub for Solr versions greater than 1.4 (the one used by many Alfresco versions).

Read More

15 Jun

Scaling Big Data with Hadoop and Solr – Review

In this post I would like to share my personal review of the recent book from Packt Publishing called Scaling Big Data with Hadoop and Solr (Second Edition) by Hrishikesh Vijay Karambelkar.

The goal of the book is clear from the title: to describe in practice how Apache Hadoop and Apache Solr help organizations solve the problem of extracting information from big data. Don’t you think this is a very interesting problem to face? I think so.

Read More

13 Oct

How to install Alfresco 5.0.a Community Edition on Ubuntu 14.04 LTS

After writing my most viewed posts on how to install Alfresco 4.2.c on Ubuntu 10.04 LTS (more than 35,000 views) and Windows Server 2008 R2 (more than 13,000 views), it’s time to update the tutorial to the newest major version of Alfresco: Alfresco Community Edition 5.0.a. Even though version 5 is quite different from version 4, the installation process is more or less the same, but let’s describe the differences exactly. The operating system chosen for the tutorial is Ubuntu 14.04.1 LTS.

Unlike the other posts, this tutorial is split into two parts: the installation of the dependencies and the installation of Alfresco. As we like and prefer, the installation is a step-by-step list of commands and tasks: simpler to understand, to do and to test. I hope you’ll agree.

Read More

23 Jul

Solr doesn’t return more than 1,000 objects in Alfresco.

Once upon a time Alfresco used Apache Lucene as its search engine…

This was great until you had particular needs like, for example, a long-running query or a query that retrieves a huge number of objects. More than a year ago I wrote a post about how Alfresco retrieves a maximum of 1,000 results or runs a query for at most a couple of minutes.

As you can read in that post, the most commonly suggested solution to the problem was to migrate the indexing engine to Apache Solr. At that time, Alfresco supported both engines and considered Solr its future.

Today both Lucene and Solr are still supported and Solr is probably the most used, but the same issue seems to be coming back.

>> https://issues.alfresco.com/jira/browse/ALF-20567(*) <<

As you can read in the JIRA issue, in Alfresco 4.2.e Solr also returns a maximum of 1,000 results; to solve the issue, it is suggested to set the parameters below in the alfresco-global.properties file.

solr.query.maximumResultsFromUnlimitedQuery=10000
system.acl.maxPermissionChecks=10000

This could have a high impact on “big” or “long” queries, so I would like to share this information with all of you to prevent problems or nights spent in the debugger. 😉
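To verify that both parameters are actually set, a small script can read the properties file and report the configured limits. The following is only a minimal sketch in Python: the helper names `parse_properties` and `result_limits` are hypothetical, and the parser is deliberately simplified (it handles `key=value` lines and comments only, not the full Java properties syntax).

```python
# Property names taken from the post; everything else here is illustrative.
LIMIT_KEYS = (
    "solr.query.maximumResultsFromUnlimitedQuery",
    "system.acl.maxPermissionChecks",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def result_limits(text):
    """Return each limit as an int, or None when the property is not set."""
    props = parse_properties(text)
    return {
        name: int(props[name]) if name in props else None
        for name in LIMIT_KEYS
    }
```

Calling `result_limits` on the content of alfresco-global.properties returns `None` for any parameter that is missing, which is a quick hint that the corresponding limit has not been overridden.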

I hope this will help you.

Francesco Corti

(*) Thanks to Francesco Fornasari and Christian Tiralosi for the hint.

23 Jul

New FAQ: CmisRuntimeException with an Internal Server Error running A.A.A.R. with Solr Indexing

Over the past months a few users emailed me to ask for support because they received an ‘Internal Server Error’ while running A.A.A.R. v1.3. I would like to thank them for the energy they put into solving the problem, and especially Gabriele Barbara, who wrote the post I share below. This post has been added to the A.A.A.R. FAQ in the hope that it will help someone else.

Gabriele Barbara says:

Using the latest version of A.A.A.R. (v1.3) on Alfresco Community 4.2.c, Windows platform and MySQL database, you could get an error message.

ERROR 19-05 13:33:54,593 - Cmis Input documents before last update - Unxpected Error
ERROR 19-05 13:33:54,593 - Cmis Input documents before last update - org.apache.chemistry.opencmis.commons.exceptions.CmisRuntimeException: Internal Server Error

After running:

kitchen.bat /rep:"AAAR_Kettle" /job:"Get all" /dir:/Alfresco /user:admin /pass:admin /param:get_parents=true /level:Basic

The problem appears when the job reaches the “get nodes” step and seems to be linked to the “SOLR SSL certificate”. If you instead run:

kitchen.bat /rep:"AAAR_Kettle" /job:"Get audit" /dir:/Alfresco /user:admin /pass:admin /level:Basic

the job ends properly.

If in the file ‘<Alfresco>/tomcat/shared/classes/alfresco-global.properties’ there is the line:

index.subsystem.name=solr

You can solve the problem by following these steps:

  • Shut down Tomcat
  • Comment out the line above and replace it with:
index.subsystem.name=lucene
index.recovery.mode=FULL
  • Save the file
  • Start Tomcat
  • Edit the file ‘<Alfresco>/tomcat/shared/classes/alfresco-global.properties’ again, changing the recovery mode to:
index.recovery.mode=AUTO
  • Save the changes and close the file
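
The two file edits in the steps above can also be scripted. What follows is a minimal sketch in Python, not part of A.A.A.R. or Alfresco: the function names `switch_to_lucene` and `set_recovery_mode` are hypothetical, the functions work on the text of the properties file only, and stopping and starting Tomcat still has to be done by hand between the two edits.

```python
def _key(line):
    """Return the property name of a line (text before the first '=')."""
    return line.split("=", 1)[0].strip()


def switch_to_lucene(text):
    """Comment out the active index settings and append the Lucene ones."""
    out = []
    for line in text.splitlines():
        if _key(line) in ("index.subsystem.name", "index.recovery.mode"):
            out.append("#" + line)  # keep the old value as a comment
        else:
            out.append(line)
    out.append("index.subsystem.name=lucene")
    out.append("index.recovery.mode=FULL")
    return "\n".join(out) + "\n"


def set_recovery_mode(text, mode):
    """Replace the value of the active index.recovery.mode line."""
    out = []
    for line in text.splitlines():
        if _key(line) == "index.recovery.mode":
            out.append("index.recovery.mode=" + mode)
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

With Tomcat stopped, write the result of `switch_to_lucene` back to alfresco-global.properties; after Tomcat has been started, apply `set_recovery_mode(text, "AUTO")` to the same file. Lines that are already commented out are left untouched.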

Now, if you run the following again from the command line:

kitchen.bat /rep:"AAAR_Kettle" /job:"Get all" /dir:/Alfresco /user:admin /pass:admin /param:get_parents=true /level:Basic

the job should complete properly.