A.A.A.R. – FAQ

Installation and first run

How to check the extraction tasks?
How to check that all the Alfresco data are available into the analytic platform?
Why, after the first run, only the first 50,000 of the actions have been extracted?
Why, after the first run, my audit_action report is empty?
I have a log saying ‘Driver class ‘org.gjt.mm.mysql.Driver’ could not be found’. How to solve it?
Why do I get an error “Unsupported major.minor version 51.0”?
Why do I get an error “Invalid column for cmis:document cmis:objectid”?
Why do I get a CmisRuntimeException with an “Internal Server Error”?
Looking to the reports result I see the data are partially shown, why?
How to reduce the execution time of the ‘Get all’ job?
How to reduce the execution time of the ‘Get audit’ job?
How to reduce the execution time of the ‘Get nodes’ job?

Customizations

How to upload a report in a different Alfresco’s space?
I would like to avoid one report to be generated, how to do it?
How to init the data mart to start again from scratch with data?
Can I extract audit data from an Alfresco installed on a remote server?
Could I develop my own report? How to do it?
How to add a new report, developed by me, to the ones generated by AAAR?
How to increase or decrease the maximum number of actions, extracted for each execution of the ‘Get audit’/‘Get all’ job?

Miscellaneous

What about the execution times in worst cases?
I would like to develop my own report but I don’t know how to use Pentaho Report Designer.
Is AAAR able to do something I need?


How to check the extraction tasks?

Starting from A.A.A.R. v4.4, a new feature lets you verify the results of the extraction tasks. It is a simple dashboard, available to the Administrator in the A.A.A.R. Wizard panel. To see how it works, take a look at the video here.

[Up]

How to check that all the Alfresco data are available into the analytic platform?

The Extraction Dashboard described here also includes a panel dedicated to Data Quality. In the A.A.A.R. solution, “Data Quality” means making sure that the actions (from the audit trail), nodes (from the repository) and workflows (from the BPM) are entirely available in the A.A.A.R. Data Marts. To check that this happened during the extraction tasks, the panel shows a brief summary of the number of entities in Alfresco and in the A.A.A.R. Data Marts. To see how it works, take a look at the video here.

[Up]

Why, after the first run, only 50,000 of the actions have been extracted?

By default, the ‘Get all’ job extracts a maximum of 100,000 actions per run. The extraction is incremental: the first run extracts from the first recorded action to the 100,000th, the second run from the 100,001st to the 200,000th, and so on. For that reason, if your audit repository contains more than 100,000 actions, you’ll need more than one run to extract them all.
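As a rough illustration of the incremental behaviour described above, the following Python sketch computes how many runs are needed and which actions each run would cover. It only models the description in this answer; the function names are hypothetical and the real job keeps track of the last extracted action by itself.

```python
DEFAULT_BATCH = 100_000  # default maximum number of actions per run, as stated above


def runs_needed(total_actions, batch_size=DEFAULT_BATCH):
    """Number of incremental runs needed to extract all the actions."""
    if total_actions < 0 or batch_size <= 0:
        raise ValueError("total_actions must be >= 0 and batch_size must be > 0")
    # Integer ceiling division, to avoid floating point rounding.
    return (total_actions + batch_size - 1) // batch_size


def run_ranges(total_actions, batch_size=DEFAULT_BATCH):
    """Yield the (first, last) action positions covered by each run, counting from 1."""
    for first in range(1, total_actions + 1, batch_size):
        yield first, min(first + batch_size - 1, total_actions)


if __name__ == "__main__":
    # Hypothetical audit repository with 250,000 recorded actions.
    print(runs_needed(250_000))
    for first, last in run_ranges(250_000):
        print(first, last)
```

With 250,000 recorded actions and the default maximum, the sketch reports three runs, the last one covering only the remaining 50,000 actions.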

[Up]

Why, after the first run, my audit_action report is empty?

Are you on Alfresco Community 4.0.x? Please check this JIRA: https://issues.alfresco.com/jira/browse/MNT-6206

[Up]

Why do I get an error “Unsupported major.minor version 51.0”?

The CMIS Input plugin is compiled with Java 7 (class file version 51.0), and you are probably running it on an older, unsupported version of Java. If you have this problem, contact me and I’ll solve it by re-compiling the plugin.
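If you want to check which Java release a compiled class requires, the version is stored in the first 8 bytes of every .class file. The Python sketch below decodes that header; the function names are hypothetical, and extracting a .class file from the plugin jar is left to you.

```python
import struct

JAVA_CLASS_MAGIC = 0xCAFEBABE  # every Java class file starts with this value


def class_file_version(header):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    if len(header) < 8:
        raise ValueError("need at least the first 8 bytes of the class file")
    # Layout: 4-byte magic, 2-byte minor version, 2-byte major version (big-endian).
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != JAVA_CLASS_MAGIC:
        raise ValueError("not a Java class file")
    return major, minor


def java_release(major):
    """Map a class file major version to the Java release that introduced it."""
    if major >= 49:
        return "Java %d" % (major - 44)  # 49 = Java 5, 50 = Java 6, 51 = Java 7, ...
    return "Java 1.%d" % (major - 44)    # 45..48 = Java 1.1..1.4


if __name__ == "__main__":
    # Header of a class compiled for Java 7: magic, minor 0, major 51.
    major, minor = class_file_version(bytes.fromhex("cafebabe00000033"))
    print(major, minor, java_release(major))
```

A class reporting major version 51 needs at least a Java 7 runtime, which is why an older Java version raises the error above.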

[Up]

Why do I get an error “Invalid column for cmis:document cmis:objectid”?

If you get the message below:

Failed to execute script 'classpath*:alfresco/template/webscripts/org/alfresco/cmis/queries.post.cmisquery.js’:
Invalid column for cmis:document cmis:objectid

Modify the query under ‘/Alfresco/Staging/Utility/Set folder’s children’ as described below:

select
cmis:objectId as cmis_objectid,
cmis:name as cmis_name
from
cmis:document
where
in_folder('${cmis_parentid}')
and cmis:lastModificationDate >= TIMESTAMP '${cmis_last_update}'
and cmis:contentStreamLength >= 0

In some environments, ‘cmis:objectId’ must be used instead of ‘cmis:objectid’ (pay attention to the camel case).

[Up]

Why do I get a CmisRuntimeException with an “Internal Server Error”?

During the command-line execution you could get a message like this: …Cmis Input documents before last update – org.apache.chemistry.opencmis.commons.exceptions.CmisRuntimeException: Internal Server Error. The problem is probably related to the Solr indexing of your Alfresco installation. If so, you can read this post to solve it.

[Up]

Looking to the reports result I see the data are partially shown, why?

By default, some of the provided reports limit their results to the first 5,000 rows. This configuration covers the average cases but should be reconsidered for the worst cases. If you consider yours a “worst case”, I suggest two solutions: 1) remove the limits from the queries in one or more reports, or 2) develop brand new reports, closer to your needs and business questions (and, of course, better tuned to your data and environment). If you plan to remove the limits from the queries, please read here. If you plan to develop new reports, please read here.

[Up]

How to reduce the execution time of the ‘Get all’ job?

The ‘Get all’ job invokes the ‘Get audit’ and ‘Get nodes’ jobs, so you can invoke only the one you need: ‘Get audit’ for the audit data and ‘Get nodes’ for the repository analysis.

[Up]

How to reduce the execution time of the ‘Get audit’ job?

You can change the maximum number of actions extracted at each execution of the ‘Get audit’/‘Get all’ job. To do it, follow the instructions described here.

[Up]

I have a log saying ‘Driver class ‘org.gjt.mm.mysql.Driver’ could not be found’. How to solve it?

The MySQL driver is missing from your Pentaho installation. To solve it, put the MySQL jar file (mysql-connector-java-5.1.23-bin.jar) in “/opt/data-integration/lib”. Read here for more details (thanks to bisana for the contribution).

[Up]

How to reduce the execution time of the ‘Get nodes’ job?

The ‘Get nodes’ job could need several minutes the first time you run it, maybe hours, depending on your repository structure. If you don’t want to wait, you can reduce the execution time by properly setting the ‘get_parents’ parameter. For instructions on how to do it, read here.

[Up]

How to upload a report in a different Alfresco’s space?

It’s enough to set the ‘alfresco_path’ field in the ‘dm_reports’ table of the ‘AAAR_DataMart’ database. By default, ‘alfresco_path’ is set to the root folder, but you can customize it report by report. To do it, you can simply use your database administration GUI. Below is the ‘dm_reports’ table with the default values.

[Image: the ‘dm_reports’ table with the default values]
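If you prefer a script to a GUI, the change is a single UPDATE statement. The Python sketch below runs it against an in-memory SQLite table used as a stand-in for the real ‘AAAR_DataMart’ database (MySQL or PostgreSQL). Only ‘dm_reports’ and ‘alfresco_path’ come from this answer; the ‘id’ and ‘name’ columns, the sample rows and the target path are hypothetical, so check the real table before adapting it.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in for AAAR_DataMart, with a simplified dm_reports table."""
    conn = sqlite3.connect(":memory:")
    # 'alfresco_path' is the documented field; 'id' and 'name' are hypothetical.
    conn.execute(
        "CREATE TABLE dm_reports (id INTEGER PRIMARY KEY, name TEXT, alfresco_path TEXT)"
    )
    conn.executemany(
        "INSERT INTO dm_reports (id, name, alfresco_path) VALUES (?, ?, ?)",
        [(1, "report_a", "/"), (2, "report_b", "/")],  # "/" is a placeholder for the root folder
    )
    return conn


def set_report_path(conn, report_id, alfresco_path):
    """Point one report to a different Alfresco space; returns the number of rows changed."""
    # SQLite uses '?' placeholders; MySQL and PostgreSQL drivers usually use '%s'.
    cur = conn.execute(
        "UPDATE dm_reports SET alfresco_path = ? WHERE id = ?",
        (alfresco_path, report_id),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = make_demo_db()
    print(set_report_path(conn, 2, "/Reports/Monthly"))  # hypothetical target space
    print(conn.execute("SELECT id, alfresco_path FROM dm_reports").fetchall())
```

Only the selected report is changed; all the other reports keep their current path.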

[Up]

I would like to avoid one report to be generated, how to do it?

It’s enough to set the ‘is_active’ field to ‘N’ in the ‘dm_reports’ table of the ‘AAAR_DataMart’ database. To do it, you can simply use your database administration GUI. Below is the ‘dm_reports’ table with the default values.

[Image: the ‘dm_reports’ table with the default values]
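The same change can be scripted. The Python sketch below disables one report and lists the ones still enabled, using an in-memory SQLite table as a stand-in for the real ‘AAAR_DataMart’ database. Only ‘dm_reports’, ‘is_active’ and the ‘N’ value come from this answer; the ‘id’ and ‘name’ columns, the sample rows and the ‘Y’ value for enabled reports are assumptions.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in for AAAR_DataMart, with a simplified dm_reports table."""
    conn = sqlite3.connect(":memory:")
    # 'is_active' is the documented field; 'id' and 'name' are hypothetical.
    conn.execute(
        "CREATE TABLE dm_reports (id INTEGER PRIMARY KEY, name TEXT, is_active TEXT)"
    )
    conn.executemany(
        "INSERT INTO dm_reports (id, name, is_active) VALUES (?, ?, ?)",
        [(1, "report_a", "Y"), (2, "report_b", "Y")],  # 'Y' is assumed to mean enabled
    )
    return conn


def deactivate_report(conn, report_id):
    """Set is_active to 'N' for one report; returns the number of rows changed."""
    cur = conn.execute("UPDATE dm_reports SET is_active = 'N' WHERE id = ?", (report_id,))
    conn.commit()
    return cur.rowcount


def active_reports(conn):
    """Names of the reports that are not disabled, ordered by id."""
    rows = conn.execute("SELECT name FROM dm_reports WHERE is_active <> 'N' ORDER BY id")
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = make_demo_db()
    deactivate_report(conn, 1)
    print(active_reports(conn))
```

To enable the report again, restore the value the field had before the change.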

[Up]

How to init the data mart to start again from scratch with data?

To initialize the AAAR_DataMart again, simply execute the SQL in the ‘AAAR_DataMart_init.sql’ file, which you can find in ‘/pentaho-solution/system/AAAR/endpoints/kettle/src/MySql’ or ‘/pentaho-solution/system/AAAR/endpoints/kettle/src/PostgreSql’ (depending on the database you adopted). To execute the SQL you can use your favorite database client.

[Up]

Can I extract audit data from an Alfresco installed on a remote server?

Yes, simply by changing the ‘url’ field of the ‘dm_dim_alfresco’ table in the ‘AAAR_DataMart’ database. To do this, you can use your database administration GUI. Below is the ‘dm_dim_alfresco’ table with the default values.

[Image: the ‘dm_dim_alfresco’ table with the default values]

[Up]

Could I develop my own report? How to do it?

Yes, you can develop your own report simply by using the Pentaho Report Designer. Once you have developed it, save the ‘prpt’ file in the ‘/Public/AAAR’ folder using the Pentaho Business Analytics Platform and follow the instructions on how to add a new report to the ones managed by AAAR. If you don’t know how to develop with Pentaho Report Designer, try here.

[Up]

How to add a new report, developed by me, to the ones generated by AAAR?

It’s enough to add a new row to the ‘dm_reports’ table of the ‘AAAR_DataMart’ database. To do it, you can simply use your database administration GUI. Below is the ‘dm_reports’ table with the default values, from which you can copy values and settings.

[Image: the ‘dm_reports’ table with the default values]
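Copying the settings of an existing report can also be done with a single INSERT ... SELECT statement. The Python sketch below shows the idea on an in-memory SQLite table used as a stand-in for the real ‘AAAR_DataMart’ database. Only ‘dm_reports’, ‘alfresco_path’ and ‘is_active’ are documented in this FAQ; the ‘id’ and ‘name’ columns and the sample row are hypothetical, and the real table may have more fields to fill in.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in for AAAR_DataMart, with a simplified dm_reports table."""
    conn = sqlite3.connect(":memory:")
    # 'alfresco_path' and 'is_active' are documented; 'id' and 'name' are hypothetical.
    conn.execute(
        "CREATE TABLE dm_reports ("
        "id INTEGER PRIMARY KEY, name TEXT, alfresco_path TEXT, is_active TEXT)"
    )
    conn.execute(
        "INSERT INTO dm_reports (id, name, alfresco_path, is_active) "
        "VALUES (1, 'report_a', '/', 'Y')"
    )
    return conn


def add_report_like(conn, template_id, new_name):
    """Add a row for a new report, copying the settings of an existing one."""
    cur = conn.execute(
        "INSERT INTO dm_reports (name, alfresco_path, is_active) "
        "SELECT ?, alfresco_path, is_active FROM dm_reports WHERE id = ?",
        (new_name, template_id),
    )
    conn.commit()
    return cur.rowcount  # 0 when the template row does not exist


if __name__ == "__main__":
    conn = make_demo_db()
    print(add_report_like(conn, 1, "my_report"))
    print(conn.execute("SELECT name, alfresco_path, is_active FROM dm_reports").fetchall())
```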

[Up]

How to increase or decrease the maximum number of actions, extracted for each execution of the ‘Get audit’/‘Get all’ job?

As described here, the ‘Get audit’ job extracts a maximum of 100,000 actions per run. In some cases, this maximum can be changed to reduce the execution time of the ‘Get audit’/‘Get all’ job or to get all the actions in one run. To do it, simply change the ‘url_audit_suffix’ field of the ‘dm_dim_alfresco’ table in the ‘AAAR_DataMart’ database, using your database administration GUI. Below is the ‘dm_dim_alfresco’ table with the default values.

[Image: the ‘dm_dim_alfresco’ table with the default values]
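The default value of ‘url_audit_suffix’ is not reproduced in this text, so the Python sketch below works on a hypothetical suffix. It assumes the suffix carries a ‘limit’ query parameter, as accepted by the Alfresco audit query web script, and rewrites only that parameter. Check the real value in your table before changing it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_audit_limit(url_suffix, new_limit):
    """Return url_suffix with its 'limit' query parameter set to new_limit."""
    if new_limit <= 0:
        raise ValueError("new_limit must be a positive number of actions")
    parts = urlsplit(url_suffix)
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Drop any existing 'limit' and append the new one, keeping the other parameters in order.
    params = [(key, value) for key, value in params if key != "limit"]
    params.append(("limit", str(new_limit)))
    # Note: urlencode may re-encode special characters differently from the original string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment))


if __name__ == "__main__":
    # Hypothetical suffix, NOT the documented default value.
    suffix = "/service/api/audit/query/alfresco-access?verbose=true&limit=100000"
    print(set_audit_limit(suffix, 20000))
```

A lower limit shortens each run but requires more runs to extract the whole audit trail; a higher limit does the opposite.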

[Up]

What about the execution times in worst cases?

Thanks to the support of some of you, we successfully tested the AAAR solution on more than 25,000 actions per day. The ‘Get audit’ job takes less than one hour to process 50,000 actions. The ‘Get nodes’ job takes less than one hour (the first time) to process a structured repository with thousands of documents. The ‘Report all’ job (on the 11 sample reports) takes less than 3 minutes to execute.

[Up]

I would like to develop my own report but I don’t know how to use Pentaho Report Designer.

Email me: I always look for interesting reports, features and needs. Try to explain your needs to me; I could be interested in developing the report for you.

[Up]

Is AAAR able to do something I need?

Email me: do you mean you have suggestions, feedback or simple requests for information? Try to explain your needs to me; I’m always interested in new features, particular cases and specific needs.

[Up]
