Tag Archives: webservice

Launch Services – ETL Transformation as a WebService

Transformations on CloverETL Server can be run by users in a very simple way by using just a web browser and a correct link. When a user click on the link, a transformation is triggered on the Server and requested data are generated. These data can be in several different formats as Excel spreadsheet, CSV file, XML, or HTML. One can even place a form on the web page that serves as a source for parameters for the transformation.

In this blog post, I will demonstrate how Launch Services can be used from user point-of-view. The next part will show how Launch Services can be configured.

Each installation of CloverETL Server contains a sandbox named default that has many examples of Launch Services in. So you can immediately experiment with all the examples described below. If you do not have a licensed version of CloverETL Server, you can use an online server demo running at http://www.cloveretl.com/examples/server-demo

Example 1 – Glossary

There is a database with economic terms which is updated everyday on our online demo server. You can put a web form on a web page where user enters an unknown term and press the execute button. A request is sent to the server where CloverETL transformation is run, then matching and similar terms are returned and displayed as an HTML page.

Note: This example is using HTTP POST request. However, Launch Services can handle also GET requests. So you can place anywhere a normal URL that would link to an explanation of a specific term. Format of such link is: <server-url>/launch/<launch-service-name>?<paramenter-name>=<parameter-value>, e.g.: http://server-demo-ec2.cloveretl.com/clover/launch/glossary?term=trend

Example 2 – Mountains

There is a database with the highest mountains on Earth with their heights. User enters an elevation above sea-level and hits an enter key. Excel sheet is then displayed listing all mountains with the given minimal elevation.

Note: It is necessary to provide credentials to trigger Launch Services (using HTTP basic authentification). A transformation may depend on a logged user or a group, the user belongs to. It is thus possible to configure the service in such a way that it will run under one URL but returned data format will depend on a specific user (e.g. users will get results in an Excel sheet, but one specific user can receive results as CSV file).

Example 3 – Mountains Upload

Launch Service can accept also a file containing data as a parameter. These data can be inserted to the server database or can be used for a transformation run. As a variation to the previous example, it is possible to submit a file with a list of highest mountains that is used for a selection.

Example 4 – Everest Ascents

Though input fields are of text type, Launch Services can parse them, according to how they are configured, into other data types as integer, real, date, boolean etc. They can have specific formats and locale settings, even different for each user. If a user enters data in a wrong format, he is alerted and CloverETL transformation can then receive only correct, parsed data.

In our example, we have a list of mountain climbers that reached Mount Everest without using an oxygen mask. Launch Service enables you to select a climber by date of the first ascent, gender, number of successful ascents, and nationality.

Example 5 – Scripting

Up to now, we have been accessing Launch Services from a web browser. However, they can be easily used also from other systems. It is possible to access a given URL through any application supporting HTTP protocol as wget on Unix. You could write a script downloading all women that climbed Mount Everest and sending them to a given email address:

wget –user clover –password clover -q -O – http://localhost:8080/clover/launch/ascents-everest?sex=F | tail -n +2 | cut -f 1 | mail i@somwhe.re -s “Ladies on Everest”

Example 6 – Client Library

The last possibility how to call Launch Services is via a Java library cloverETL-server-client.zip. This package will come handy if you need to integrate Launch Service with your Java application or if you need to trigger Launch Service from a command line with more parameters than wget allows for.

If you need to run Launch Service called acents-everest running on a server http://localhost:8080/clover as a user clover and all results save to a folder acents-everest-result, you can use this command:
java -jar cloverETL-server-client.jar acents-everest -u http://localhost:8080/clover -l clover -p clover -o acents-everest-result

Read the part two: Launch Services – Configuration

Integration of CloverETL with PHP

To witness the power of Clover, consider the following scenario. Our customer required an application whose main purpose was generating reports in xls format. What would seem like a simple task was presented with difficult challenges:

  • Reports had  many different formats specified by many parameters.
  • Reports were generated from heterogeneous sources – xls files, database tables and IBM iSeries files (former AS/400).
  • Application users modified data (mainly added records) in a report before its final confirmation.
  • The confirmation of a report invoked updates in several database tables and generated log records.
  • Short response times were needed for the most of operations.

We knew right away we would approach this situation with a web application written in PHP because web application architecture brought us plenty of needed advantages. But how can we quickly and reliably integrate heterogeneous sources? This is exactly the type of task that ETL tools were developed for. That’s why we built the whole solution using CloverETL Server with the Launch Services (equivalent of web services) feature. The following schema should illustrate the architecture of our solution.

Users of the application communicate only with the web application layer (1) that uses its own database for storing temporary data, setup information, and a history of reported records. This part also manages access permissions based on user roles.

When a user wants to view some report, he or she sets the proper parameters and other restrictions (e.g. type of records to be reported or dates scope) in the web application using his/her preferred web browser (IE, Firefox …). Confirming the filled parameters invokes a call to a Clover Launch Service (2) and a transformation connected with this service. The transformation integrates data from heterogeneous sources and stores it into a database dedicated to the web application. All stored records are identified by runID (transformation execution’s unique identifier within Clover Server) and the service returns the same runID as a response from its call.

The web application allows users to modify only certain data (according to partial runID). Users can insert, change or delete records (3). These changes have an effect only within the database of the web application and play only a temporary role until the user confirms the report. The confirmation again invokes  a call to the Clover Launch Service that then generates the needed reports, logs records about processed changes, and propagates changes into a history of reported records (4).

To summarize this solution, we moved the entire application logic into a CloverETL Server which allows us to solve such problems with ease. Furthermore we took advantage of Clover Launch Services – a combination of ETL integration power and online processing.

In my opinion, there are many similar problems where the concept described above cuts the time and effort to reach a solution.

Loop Execution of Data Transformation

Case study description

Czech Insolvency Registry (http://isir.justice.cz) basically contains data about economic subjects that entered insolvency and have financial difficulties with paying off their debts. The registry allows everybody to download data using public SOAP Web Service. It can be done manually or automatically with the right software.
https://isir.justice.cz:8443/isir_ws/services/IsirPub001?wsdl

CloverETL can easily help with the automatically download that would save time and technical difficulties. CloverETL graph can get required data by calling the web service, processes data and store it in required format. Unfortunately the Registry’s web service is very poorly designed. The service doesn’t give you current status of each of the economic subjects, but provides the whole history of the required company. Therefore we have to download not only the current information we need but the whole information since the year 2008 (the registry foundation). That is a lot of data to process – actually thousands of log records for each company! Moreover the Registry’s Web service „GetIsirPub0012“ provides only maximum of 1000 records per one call. If one company has few thousands of records you have to undertake more calls.  So we have to download data in thousand-records bunches, but we don’t know in advance exactly how many of these bunches (records) there are for each company. That makes the whole process quiet difficult.

But solution with CloverETL is simple. CloverETL Server provides features “graph event listener” and “groovy task” that help us with all the above described challenges. Firstly, we will of course design a CloverETL graph that processes for the beginning just one thousand –record bunch of data (see picture bellow).

Transformation graph

This graph has a parameter „startID“ which has value “0” by default. If we want to process 1000 records starting let’s say from no. 2541, then  start ID will be startID=2541, and the first downloaded record will be identified by. If we run graph without parameters, it’ll download and process first thousand of records (no. 0 – 999).

Graph also contains couple of components to store ID of last downloaded record so that  the next bunch to download may use the last ID  as the startID. It will be automatically stored to graph parameters as lasted parameter. It can be done by in-line Java code in Reformat component:
String id = GetVal.getString(source[0],"id");
getGraph().getGraphProperties().setProperty("lastID", id );

The loop

The graph we designed must be executed n-times to download and process all records. At first we don’t know how many times, but we know, that we can stop the downloading process as soon as there are no more records to read. It means we can stop the process as soon as  “started” and “lasted” are equal.

How to achieve such loop?

Graph event listener

To achieve the automatic loop, for the graph that we designed and described previously, we’ll define graph event listener for “FINISHED_OK” graph event on CloverETL Server. So every time transformation finishes without error („FINISHED_OK“), listener will trigger task that we selected. We need to specify this tasks now. Since we want to  execute the same graph repeatedly, we have to specify “execute graph” task. This task will repeat executing the graph indefinitely. However we need to stop this loop at some point. We need to “break the loop” when the startID and the lastID parameters are equal. Therefore it is actually better to create “groovy task” instead of „execute graph“.

Groovy task

Groovy is scripting language with Java syntax. In addition, groovy scripts may access java objects and use java libraries. See Groovy project site http://groovy.codehaus.org/ for more details.

We’ll create a simple groovy script which decides whether execute the graph again or not. To decide it, we’ll need to get graph properties from the finished graph. These properties are accessible by calling method event.getProperties().

Then, we’ll need to execute graph using CloverETL Server Java API. It’s done by calling method serverFacade.executeGraph().

Script may return String value which is stored in „Task history log“.

// these variables are predefined:
// sessionToken
// event
// serverFacade

import com.cloveretl.server.persistent.RunRecord;
import org.apache.log4j.Logger;
import com.cloveretl.server.api.*;
import org.springframework.web.context.WebApplicationContext;
import org.springframework.web.context.support.WebApplicationContextUtils;

Logger log = Logger.getLogger("groovy-ISIR-graphEventListener");

Properties eventProps = event.getProperties();
log.info("event properties: " + eventProps);

// get lastID and startID from previous graph execution
String lastIDString = eventProps.getProperty("lastID");
String startIDString = eventProps.getProperty("startID");
long lastID = Long.valueOf(lastIDString);
long startID = Long.valueOf(startIDString);

// lastID and startID from last graph execution are equal – break the loop
if (lastID == startID)
return "no more records to download";

// prepare startID which will be passed to next graph execution
Properties properties = new Properties();
properties.setProperty("startID", lastIDString);

String SANDBOX = eventProps.getProperty("SANDBOX_CODE");
String GRAPH = eventProps.getProperty("GRAPH_FILE");
GraphExecutionCommand graphExecutionCommand = new GraphExecutionCommand(
null, SANDBOX, GRAPH, null, null, null, true, properties, null, null);
Response respExec = serverFacade.executeGraph(sessionToken, graphExecutionCommand);
String result = "graph "+SANDBOX+"/"+GRAPH+" executed: "+respExec.getBean();
log.info(result);
return result;

Graph results

All graph results for each bunch of data are stored to only one CSV file. They are always added, so don’t worry there is no danger that some of them will be overwritten :-) . So when the whole batch of transformations is finished, we have only one CSV file with all processed records. Or if somebody wishes we can consolidate records and store them directly into database where the data can be stored in more friendly and usable format.

Data Integration with Web Services – How to Set Up a WebServiceClient

WebServiceClient is a new CloverETL component that was released in 2.9 version. It provides an easy intuitive way to interact with web-service technology that has gained a huge popularity in the last few years. The following simple example shows the new component´s functionality. The WS component is user friendly and very easy to use.

Example for WS component usage:

Our goal is to get the actual weather information for specific ZIP codes from the web service that is running on http://ws.cdyne.com/WeatherWS/Weather.asmx?op=GetCityWeatherByZIP. The zip codes are stored in the flat file and results will be saved to Excel file.

I suppose that you are familiar with CloverETL so I will not describe the input and output setting up but will focus directly on WebService Client component.

Step 1 – WebService URL and operation

Firstly we have to specify the URL of web service WSDL file and select the operation you want to use from the list of all operation WS provides. WSDL URL is the basic attribute of the component and have to be set to http://ws.cdyne.com/WeatherWS/Weather.asmx?WSDL. The dialog for Operation name attribute allows you to select the operation you want to use. In our case it’s the operation GetCityWeatherByZIP from the group WeatherSoap (Port:WeatherSoap) as you can see on the figure bellow.

Step 2 – Generate request

The request is a way how to pass the arguments to webservice. It’s the XML file with pre-defined structure which depends on the used webservice and writting it by hand is very annoying. CloverETL provides a functionality that generates the required structure of request in one click. Simply open the dialog for Request structure attribute and click on „Generate request“ button. The skeleton of the request is genereted and we have to map the input port fields of the component to the request placeholders. In our case the only placeholder is string. By drag-and-drop we replace string by the field name from the right panel of the dialog.

Step 3 – Process response

Finally we define how to process the response from webservice. The response is XML file so we will create the mapping among elements and attributes of XML files and the output fields of the component. CloverETL makes it easy for us, it provides drag-and-drop dialog where we can select the item from „XML Fields“ part of dialog and drop them to „Clover Fields“.

And that’s all :-) . Communication with webservice is set-up. Now we can collect actual weather conditions from Internet.

CloverETL 2.9 Released: Infobright Data Writer, Web Services Component and Other New Features.

New CloverETL version 2.9. was just released. This version brings a new Infobright Data Writer component, enhances the connectivity by adding Web Services component and adds features that simplify common data transformation tasks.

New Features and Components:
Infobright Data Writer
In response to customer requests, this component writes data into Infobright software, a column-oriented relational database. Infobright is a provider of solutions designed to deliver a scalable data warehouse optimized for analytic queries.

Web Services component
The new component makes communication with Web Services easier than ever. It provides user friendly graphical interface for mapping your data into Web Service fields, automatically generates requests and process responses. It offers faster, easier and more comfortable way to interact with remote Web Services.

Reading formatted values from XLS
Additionally to reading plain data from MicrosoftTM ExcelTM sheets, the Excel component is now also capable of reading user-formatted values such as currencies, dates or numbers.

New tracking option
Customers can now see all absolute speed rates for finished data transformations, facilitating comparative analysis in pursuit of process improvements.

New Aspell Lookup table
Brand new implementation of this component brings better performance, improved configuration and better customization.

Improved treatment of empty (NULL) values
Developers can now specify special strings that should be treated as empty (NULL) when data is being parsed. This feature simplifies processing of typical application export files which often contain values insignificant for ETL processing. Additionally it may lead to improved processing throughput and lower memory consumption of data transformation.

More user friendly File URL dialog and improved LDAP functionality.

Customers can evaluate these new features along with CloverETL’s other leading capabilities with a free 30-day trial of the CloverETL Designer Pro evaluation, which is available at www.cloveretl.com Information management professionals can also evaluate the enterprise integration features of CloverETL Server via an online demo, which is also available at www.cloveretl.com.