11 Dec

Uploading a mondrian schema to Pentaho using PDI

This post shares a solution for uploading a mondrian schema to the Pentaho BA Server, using the REST API from a PDI transformation. Judging from this thread on the Pentaho forum, this seems to be a common problem, so we thought it would be a good idea to share the solution with the community. I hope this post is helpful.

Development environment

The source code was developed and tested on Windows and on Linux Ubuntu 14.04 LTS. Pentaho BA Server and Pentaho Data Integration are both at version 5.2.

Use case

Starting from a file containing the mondrian schema (an XML file), our goal is to develop a PDI transformation that defines a Pentaho BA Server data source. Since we want the data source to be based on the mondrian schema, it has to be a so-called “Analysis Data Source”.

The strategy

Thanks to the Pentaho BA Server REST API, our strategy is to create the data source using the service below.

http://<pentahoURL>/pentaho/plugin/data-access/api/mondrian/postAnalysis

Creating (or replacing) an Analysis Data Source is easy: simply send a POST call to the REST service, using a multipart request. Of course this would be easy with a programming language, but we would like to use a Pentaho Data Integration (Kettle) transformation. Unfortunately, Kettle does not handle multipart requests well out of the box.
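To make "multipart request" concrete, here is a minimal sketch in plain Java of what such a request body looks like on the wire: one file part followed by text parts, separated by a boundary. The class name, the boundary string and the sample values are hypothetical, and the exact part headers written by a real HTTP library may differ in detail.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: it is not part of the PDI transformation.
final class MultipartSketch {

    /** Builds a multipart/form-data body with one file part and any number of text parts. */
    static byte[] buildBody(String boundary, String fileField, String fileName,
                            byte[] fileContent, Map<String, String> textFields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // File part: part headers, an empty line, then the raw file bytes.
        write(out, "--" + boundary + "\r\n");
        write(out, "Content-Disposition: form-data; name=\"" + fileField
                + "\"; filename=\"" + fileName + "\"\r\n");
        write(out, "Content-Type: application/octet-stream\r\n\r\n");
        out.write(fileContent, 0, fileContent.length);
        write(out, "\r\n");

        // Text parts: one per form field.
        for (Map.Entry<String, String> field : textFields.entrySet()) {
            write(out, "--" + boundary + "\r\n");
            write(out, "Content-Disposition: form-data; name=\"" + field.getKey() + "\"\r\n\r\n");
            write(out, field.getValue() + "\r\n");
        }

        // Closing delimiter marks the end of the body.
        write(out, "--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("catalogName", "SampleSchema");      // placeholder value
        fields.put("origCatalogName", "SampleSchema");  // placeholder value
        byte[] schema = "<Schema name=\"SampleSchema\"/>".getBytes(StandardCharsets.UTF_8);
        byte[] body = buildBody("sketchBoundary", "uploadAnalysis", "schema.xml", schema, fields);
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

The matching request header would be `Content-Type: multipart/form-data; boundary=sketchBoundary`, which is why the content type has to travel together with the body.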

Description of the solution

The Pentaho Data Integration transformation is described below.

Pentaho Upload Data Source

As you can imagine, the core of the solution is in the ‘Generate multipart entity’ step and in the ‘HTTP Post’ step. But before looking at these, let’s see what is in the ‘Generate rows’ step. There you will find the basic parameters that make everything work properly.

Pentaho Upload Data Source

  • uploadAnalysis contains the name of the file with the mondrian schema. In the ‘Add root file’ step, this file name is completed with the absolute path.
  • catalogName and origCatalogName contain the name of the mondrian schema (the same name that appears in the XML file).
  • parameters… ok, it’s clear! 😉
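As a rough illustration of the first two parameters, the sketch below resolves a schema file name against a root directory and reads the `name` attribute of the `<Schema>` root element, which is the value catalogName is expected to match. The class, the method names and the sample values are hypothetical and are not taken from the transformation.

```java
import java.io.StringReader;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class SchemaParameters {

    /** Completes a bare file name with an absolute path, similar in spirit to 'Add root file'. */
    static Path resolveSchemaFile(String rootDir, String fileName) {
        return Paths.get(rootDir).resolve(fileName).toAbsolutePath().normalize();
    }

    /** Returns the name attribute of the root element of a mondrian schema. */
    static String readSchemaName(String schemaXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(schemaXml)));
            return doc.getDocumentElement().getAttribute("name");
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable XML schema", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Schema name=\"SampleSchema\"><Cube name=\"Sales\"/></Schema>";
        System.out.println(readSchemaName(xml)); // prints "SampleSchema"
        System.out.println(resolveSchemaFile("schemas", "schema.xml"));
    }
}
```

Checking the name up front can be useful because a catalogName that differs from the schema name would create a data source under a different name than the one the schema declares.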

Below is the source code of the ‘Generate multipart entity’ step, which defines three output parameters.

  • requestEntityValue containing the multipart entity to post in the request.
  • contentType and contentLength containing the information about the request entity.

import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.commons.httpclient.methods.multipart.MultipartRequestEntity;
import org.apache.commons.httpclient.methods.multipart.FilePart;
import org.apache.commons.httpclient.methods.multipart.StringPart;
import org.apache.commons.httpclient.methods.multipart.Part;
import org.apache.commons.httpclient.params.HttpMethodParams;

public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException {

 // Read the next input row; a null row means there is no more input.
 Object[] r = getRow();
 if(r == null){
  setOutputDone();
  return false;
 }

 // Input fields coming from the previous steps.
 String uploadAnalysis = get(Fields.In,"uploadAnalysis").getString(r);
 String catalogName = get(Fields.In, "catalogName").getString(r);
 String origCatalogName = get(Fields.In, "origCatalogName").getString(r);
 String parameters = get(Fields.In, "parameters").getString(r);

 try {

  // One file part for the schema plus one string part per form field.
  File filePart = new File(uploadAnalysis);
  Part[] parts = {
   new FilePart("uploadAnalysis", filePart),
   new StringPart("catalogName", catalogName),
   new StringPart("origCatalogName", origCatalogName),
   new StringPart("parameters", parameters)
  };
  MultipartRequestEntity requestEntity = new MultipartRequestEntity(parts, new HttpMethodParams());

  // Serialize the entity in memory so it can be passed on as a field.
  // Note: the bytes are decoded with the platform default charset.
  ByteArrayOutputStream bOutput = new ByteArrayOutputStream();
  requestEntity.writeRequest(bOutput);
  String requestEntityValue = new String(bOutput.toByteArray());
  String contentType = requestEntity.getContentType();
  String contentLength = String.valueOf(requestEntity.getContentLength());

  // Append the three output fields to the row and send it to the next step.
  Object[] outputRow = createOutputRow(r, data.outputRowMeta.size());
  get(Fields.Out, "requestEntityValue").setValue(outputRow, requestEntityValue);
  get(Fields.Out, "contentType").setValue(outputRow, contentType);
  get(Fields.Out, "contentLength").setValue(outputRow, contentLength);
  putRow(data.outputRowMeta, outputRow);

  return true;

 } catch(FileNotFoundException ffNotFoundEx) {
  logError("File '" + uploadAnalysis + "' not found!!");
  throw new KettleException(ffNotFoundEx);
 } catch(IOException ioEx) {
  logError("Error generating the value of the multipart request!!");
  throw new KettleException(ioEx);
 }
}
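One detail of the step above deserves a note: `new String(bOutput.toByteArray())` turns the multipart bytes into text using the platform default charset. That is fine for a mondrian schema, which is plain XML text, but the conversion is not lossless for arbitrary bytes. The standalone sketch below (hypothetical class name and sample data) only demonstrates this general Java behaviour. How the ‘HTTP Post’ step re-encodes the field afterwards has not been verified here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
final class CharsetRoundTrip {

    /** Returns true if bytes -> String -> bytes gives back exactly the same bytes. */
    static boolean survivesRoundTrip(byte[] data, Charset charset) {
        byte[] back = new String(data, charset).getBytes(charset);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) {
        byte[] text = "<Schema name=\"Sample\"/>".getBytes(StandardCharsets.UTF_8);
        // Bytes that are not valid UTF-8, as found in most binary files.
        byte[] binary = { (byte) 0x89, 'P', 'N', 'G', (byte) 0xFF, 0x00 };

        System.out.println(survivesRoundTrip(text, StandardCharsets.UTF_8));         // true
        System.out.println(survivesRoundTrip(binary, StandardCharsets.UTF_8));       // false
        System.out.println(survivesRoundTrip(binary, StandardCharsets.ISO_8859_1));  // true
    }
}
```

ISO-8859-1 maps every byte value to exactly one character, which is why it round-trips, whereas UTF-8 replaces invalid sequences with a substitution character.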

Below is the ‘HTTP Post’ step that, finally, sends the POST request.

Pentaho Upload Data Source
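For comparison, outside of PDI the same request could be assembled in plain Java (11 or later) roughly as follows. This is only a sketch under stated assumptions: the class name, the base URL and the credentials are placeholders, and the use of HTTP Basic authentication is an assumption about how the server is configured. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class PostAnalysisRequest {

    /** Builds (but does not send) the POST request for the postAnalysis service. */
    static HttpRequest build(String baseUrl, String user, String password,
                             String contentType, byte[] multipartBody) {
        // Assumption: the server accepts HTTP Basic authentication.
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));

        return HttpRequest.newBuilder(
                        URI.create(baseUrl + "/pentaho/plugin/data-access/api/mondrian/postAnalysis"))
                // Must carry the same boundary that was used to build the body.
                .header("Content-Type", contentType)
                .header("Authorization", "Basic " + credentials)
                // Content-Length is derived from the body by the client.
                .POST(HttpRequest.BodyPublishers.ofByteArray(multipartBody))
                .build();
    }

    public static void main(String[] args) {
        byte[] body = "--b\r\n...\r\n--b--\r\n".getBytes(StandardCharsets.UTF_8); // placeholder body
        HttpRequest request = build("http://localhost:8080", "user", "secret",
                "multipart/form-data; boundary=b", body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would then be a call such as `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, after which the status code tells whether the data source was accepted.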

Conclusion

This post shares a solution, kindly developed by Stefano Massarini, for uploading a mondrian schema to the Pentaho BA Server, using the REST API from a PDI transformation. If you would like to use the solution and evaluate it, you can download it here.

2 thoughts on “Uploading a mondrian schema to Pentaho using PDI”

  1. Hi Francesco,
    Thank you for your good blog article!
    I tried to change your code for repository API “/pentaho/api/repo/files/import”.
    I could send a text file, but I could not send a binary file.
    Do you know how to send binary files?

    I posted to Pentaho Community Forums.
    http://forums.pentaho.com/showthread.php?157097
