CloverETL Designer User Interface Improvements in 3.4

Here at CloverETL we not only work to create new and exciting features, but also have a workgroup specifically dedicated to improving the “little things”. I’m talking about those little things that help you avoid aimlessly clicking around while trying to get work done, and that let you work in a quick, intuitive, and productive manner. Listed below are some of the most important tweaks and improvements in CloverETL 3.4. We think you will find them useful: they make working with the CloverETL Designer not only more comfortable, but also more effective.

Graph Editor Improvements:

  • Zoom into/out of the canvas by holding the Ctrl key and scrolling up and down.
  • Create a new note using a new keyboard shortcut “N”.
  • The Disable/Enable action on a note disables/enables the note and all of the components placed in it. Think of a note as a candy wrapper: it can be used to group components together.

Disable note

  • Insert a component between two existing ones by dropping it onto the edge between them. Whether the component is being created or simply moved around, the edge will split and input and output ports will then be automatically connected. 

Drop a component onto an edge

  • Any component can be removed from an edge: the edge itself remains after the component is deleted.
  • The Progress dialog is shown during graph validation (checkconfig). You can also make the validation run as a background job.
  • The Add Component dialog (opened via the toolbar or “Shift+Space”) can now also be used to add notes. Additionally, you can choose to view either all the components or just those available in the palette.

Add component dialog

  • The Add Component dialog (“Shift+Space”) supports camel-case search. Just type the initial uppercase letters of the words you are trying to find, e.g. type “UDW” to find “Universal Data Writer”.

Add Component dialog camel-case search
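
To make the camel-case idea concrete, here is a minimal Java sketch of how such a match could work. It is not the Designer's actual implementation; the class and method names are made up for illustration, and it simply assumes that a name matches when the typed letters are a prefix of the initials of its words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CloverETL.
class CamelCaseSearch {

    // Builds the upper-case initials of a name, e.g. "Universal Data Writer" -> "UDW".
    static String initials(String name) {
        StringBuilder sb = new StringBuilder();
        for (String word : name.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                sb.append(Character.toUpperCase(word.charAt(0)));
            }
        }
        return sb.toString();
    }

    // Assumption: a name matches when the query is a prefix of its initials.
    static boolean matches(String query, String name) {
        return !query.isEmpty() && initials(name).startsWith(query.toUpperCase());
    }

    // Returns all names that match the typed query.
    static List<String> search(String query, List<String> names) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (matches(query, name)) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> palette = Arrays.asList(
                "Universal Data Reader", "Universal Data Writer", "Reformat");
        System.out.println(search("UDW", palette));
    }
}
```

Typing only “UD” in this sketch would return both the reader and the writer, which is the behavior you would expect while still narrowing down the search.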

  • Rich Component Tooltips have been introduced. Hover over the component status icon in the top right corner to show a tooltip. It contains detailed information such as validation errors and warnings, custom-set attributes, connected components, and edge metadata.

Rich component tooltip

Outline View Improvements:

  • The Parameter files as well as the parameters themselves are sorted alphabetically, making it much easier for you to find what you’re looking for.

 Parameters sorting

  • A filter is available in the Parameters editor window.

Parameters filter

  • Ctrl + C, Ctrl + V, Ctrl + X shortcuts now work properly.
  • The Import Metadata from XSD action is now available in the context menu; previously it could only be found in the File->Import menu.

Import metadata from XSD

  • The Outline tree collapsed/expanded state is remembered when either switching between graph editors or simply editing and saving graphs.
  • The disabled components are written in gray to indicate their status.

Disabled components at Outline

Working with Metadata:

  • The Extract Metadata action is available on the readers’ and writers’ context menus. However, it is only available if it’s possible to extract metadata from the input/output. For example, it’s available when the UniversalDataReader is reading a CSV file, but not for the XMLExtract component.

Extract metadata from component input

  • The Join key editor displays a metadata tree that includes the metadata name, not just a flat list of metadata fields.

Metadata displayed in Join key editor

  • The Visual Transformation Editor shows a warning when an input or output port has no edge connected where an edge is required, e.g. when editing the Transform attribute of the Reformat component.

A warning in the Visual Transformation Editor when no input edge is connected

  • The Visual Transformation Editor allows the filtering of input and output metadata fields.

Filtering metadata fields

  • The Metadata menu opens automatically right after a new edge is created, so that you can easily create new metadata or select existing metadata. The Metadata menu can also be opened at any time by double-clicking the edge.

Some more tweaks:

  • We have added a “New Folder” action in the URL Dialog. It’s now possible to delete files and folders (Delete key), rename a file (F2 key), and create new folders on a local file system using a toolbar action (F7 key).

URL dialog improvements

  • Bulk selection of displayed columns is available in the View Data dialog.

Columns visible in the View Data dialog

  • The Web Service Client dialog allows you to select a subtype when generating a request body.

Subtype selection

 

Transforming Coordinate Reference Systems using CloverETL – A Use Case

A common challenge for ETL tools nowadays is data that contains some form of geographical information. Just like any other type of data (monetary values, times and dates, etc.), geographical data poses interesting challenges to an ETL developer. Working with different Coordinate Reference Systems (CRS) within a project is a common difficulty. Typically, two or more software systems need to exchange geographical data, but each one of them uses a different CRS.

The ETL process must make sure that the systems in question always get the data in the correct format and CRS. This is where a quick and simple solution comes in handy.

Let’s see how this problem can be easily handled in CloverETL by using a third-party conversion library plugged into a CloverETL transformation.

Note: You can find more information about the CRS used in North America here.

The Example

Let’s demonstrate a solution to a data exchange problem between Google Maps and an export for a New York tourist agency running their own map. The export has to be encoded in a different CRS than the one Google Maps uses.

A Google Maps Based System

Here, we’re running a web application that allows the user to pick a position on Google Maps and add a note to it. This information is then stored in a database and, since Google Maps uses CRS WGS 84, we get latitude-longitude pairs.

(In the picture below, you can see how to get the latitude-longitude pair from the map. In this case, latitude=40.6894 and longitude=-74.044239):

Export Data

Now let’s say that some New York officials ask us to provide them with our data. They’d like to display it on their own map and they plan to use the density of the notes on the map to define the best attractions for tourists. Unfortunately, their maps use an X-Y metric CRS NAD83 zone 18 and there is no simple conversion from WGS84.

Why is the conversion difficult? Well, the first CRS is a planar grid defined by a Transverse Mercator projection, while the second one describes positions as angles on an oblate spheroid. Converting between them leads to non-trivial math homework that goes beyond what an ETL developer should be concerned with. Fortunately, there are libraries such as GeoTools that can help simplify things.

Building a CloverETL Graph

Let’s build a transformation graph that will solve the problem using the GeoTools library. The core conversion will use only three components:

The reader in the beginning reads data from a file in the WGS84 format. In the example download, the first record points to the Statue of Liberty, and the rest is merely random data. Of course, you can replace the reader with anything like DBInputTable or WebServiceClient that will provide the input data instead.

Then, we’ll place a Reformat that will run the conversions for us using the external Java library. Finally, the writer at the end simply writes the converted data back to a file.

The Reformat is there to embed a piece of Java code that calls the external library conversion function for each input record. The full source code is available in the example download, but for now, let’s just look at two of the most important parts.

The initialization part runs only once and prepares the “conversion” object – a MathTransform instance from the library that we’ll use later on the data.

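CoordinateReferenceSystem crsFrom = CRS.decode("EPSG:4326");   // WGS 84 (latitude, longitude)
CoordinateReferenceSystem crsTo = CRS.decode("EPSG:26918");    // NAD83 / UTM zone 18N
// The last argument enables lenient mode (tolerates missing datum-shift information).
MathTransform conversion = CRS.findMathTransform(crsFrom, crsTo, true);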

The second snippet shows the actual transformation – getting the input data from an edge, passing it into the conversion, and finally, sending the result to an output edge:

// Read the coordinate pair from the first two fields of the input record.
double[] fromPoint = new double[] {
       (java.lang.Double) getInputRecord(0).getField(0).getValue(),
       (java.lang.Double) getInputRecord(0).getField(1).getValue()
};
double[] toPoint = new double[2];

// Convert one point, reading from and writing to offset 0 of each array.
conversion.transform(fromPoint, 0, toPoint, 0, 1);
getOutputRecord(0).getField(0).setValue(toPoint[0]);
getOutputRecord(0).getField(1).setValue(toPoint[1]);

Checking Your Results

With this set up we can test the results. The output should look like this:

To verify the data, we can use The World Coordinate Converter on the input data and compare the results.
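
If you prefer an offline sanity check, the sketch below computes UTM zone 18 coordinates directly with the textbook Transverse Mercator series (as given by Snyder). This is a swapped-in technique, not what the graph or GeoTools does internally: it assumes the GRS80 ellipsoid, ignores the small datum difference between WGS 84 and NAD83, and the class name is made up. Treat it as a rough cross-check only.

```java
// Hypothetical stand-alone cross-check, independent of GeoTools and CloverETL.
class UtmZone18Check {
    static final double SEMI_MAJOR = 6378137.0;            // GRS80 semi-major axis [m]
    static final double FLATTENING = 1.0 / 298.257222101;  // GRS80 flattening
    static final double K0 = 0.9996;                       // UTM scale factor
    static final double LON0 = Math.toRadians(-75.0);      // central meridian of zone 18
    static final double FALSE_EASTING = 500000.0;

    // Returns {easting, northing} in metres for latitude/longitude given in degrees.
    static double[] toUtm18(double latDeg, double lonDeg) {
        double phi = Math.toRadians(latDeg);
        double lam = Math.toRadians(lonDeg);
        double e2 = FLATTENING * (2 - FLATTENING);   // first eccentricity squared
        double ep2 = e2 / (1 - e2);                  // second eccentricity squared
        double e4 = e2 * e2;
        double e6 = e4 * e2;
        double sin = Math.sin(phi);
        double cos = Math.cos(phi);
        double tan = Math.tan(phi);
        double n = SEMI_MAJOR / Math.sqrt(1 - e2 * sin * sin);
        double t = tan * tan;
        double c = ep2 * cos * cos;
        double a = (lam - LON0) * cos;
        // Meridian arc length from the equator to latitude phi.
        double m = SEMI_MAJOR * ((1 - e2 / 4 - 3 * e4 / 64 - 5 * e6 / 256) * phi
                - (3 * e2 / 8 + 3 * e4 / 32 + 45 * e6 / 1024) * Math.sin(2 * phi)
                + (15 * e4 / 256 + 45 * e6 / 1024) * Math.sin(4 * phi)
                - (35 * e6 / 3072) * Math.sin(6 * phi));
        double x = FALSE_EASTING + K0 * n * (a
                + (1 - t + c) * Math.pow(a, 3) / 6
                + (5 - 18 * t + t * t + 72 * c - 58 * ep2) * Math.pow(a, 5) / 120);
        double y = K0 * (m + n * tan * (a * a / 2
                + (5 - t + 9 * c + 4 * c * c) * Math.pow(a, 4) / 24
                + (61 - 58 * t + t * t + 600 * c - 330 * ep2) * Math.pow(a, 6) / 720));
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        // The Statue of Liberty coordinates used earlier in this post.
        double[] en = toUtm18(40.6894, -74.044239);
        System.out.printf("easting=%.1f northing=%.1f%n", en[0], en[1]);
    }
}
```

The numbers printed here should be close to the first record of the graph's output; a large difference usually points to swapped latitude/longitude fields rather than to the library.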

Using External Libraries in CloverETL Graphs

Because the transformation uses an external Java library, we need to put it on the classpath of our CloverETL project. To do so, follow these steps:

  1. Download the latest binary package “geotools-X.X-bin.zip” from GeoTools.
  2. Unzip it into any directory. We suggest keeping the library as part of your project, e.g. in “trans/lib”.
  3. In the Designer, right-click your project, select Properties, then “Java Build Path”, and finally the Libraries tab.
  4. There, add these two JAR files: gt-epsg-hsql-8.6.jar (the local database with EPSG definitions) and gt-referencing-8.6.jar (the transformations themselves).

When this is set, you’re ready to use the Java library in your project’s graphs. This way, you can use any number of third-party tools.

Download Demo

You can download the demo project here:

Java Vulnerabilities: No Impact on CloverETL Products

The recent discoveries of Java’s vulnerabilities have caused concerns for many organizations and individuals alike. With recent questions from multiple customers, we’d like to reassure you that our products are not impacted by these vulnerabilities.

The security holes are related to the Java browser plug-in, which hackers can exploit to take over a visitor’s computer or silently install malware on it (more details can be found in this article: http://krebsonsecurity.com/2013/01/what-you-need-to-know-about-the-java-exploit/).

We do not use Java in the frontend (browser plug-in) in any of our products. Even in our Server application, the Server Console UI is built on top of RichFaces, which only uses HTML and basic JavaScript in the browser.

However, we recommend that you update your Java package for your own protection in other applications. Please also consider upgrading to our latest stable build: CloverETL version 3.3.1 – http://www.cloveretl.com/resources/releases/3-3-1; the Designer installer package for Windows comes with an updated Java bundle that contains the latest patches.

Connection hangs with MSSQL driver 4.0 and Java 6 update 29 – solution

While playing around with MS SQL Server 2012 today, I got into trouble running queries against the database from CloverETL. The connection initialized, I was able to browse the structure, and the graph started, but no results would come out of DBInputTable or DBExecute.

Looking at the code, I found out that the connection got stuck in the setAutoCommit(false) call during initialization in CloverETL. Even a simple test program froze on the first call to executeQuery().

StackOverflow had the answer (http://stackoverflow.com/questions/8986350/jdbc-connection-hangs-with-no-response-from-sql-server-2008-r2/14101687#14101687): the issue was caused by Oracle JDK 6.0 update 29, where SSL would not initialize correctly.

After I updated my Java to JDK 6.0 update 37, everything started to work fine. I hope those of you experiencing this annoying behavior will find this helpful.
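
If you are not sure which Java runtime your graph is actually running on, a tiny check like the following can help. It is only a sketch: the class name is made up, and it flags just the one update mentioned above (Java 6 update 29, which reports itself as 1.6.0_29); other updates are not evaluated here.

```java
// Hypothetical helper for spotting the runtime discussed in this post.
class JdkUpdateCheck {

    // True only for the exact version string of Java 6 update 29.
    static boolean isAffected(String javaVersion) {
        return "1.6.0_29".equals(javaVersion);
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println("java.version = " + version
                + (isAffected(version) ? " (affected, consider upgrading the JDK)" : ""));
    }
}
```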

Calling Web Services on Windows using NTLM authentication

NTLM is a client-server authentication protocol designed by Microsoft that’s often used in its server software products. A server that’s part of an NTLM domain uses this protocol to grant access to secured resources to clients who are able to present a username and a password valid in said domain.

If you are calling web services provided by Microsoft products, it’s likely that such a call will require the NTLM authentication. Microsoft SharePoint Web Services is a prominent example. But there are many other such services, like Microsoft Analysis Services or Reporting Services provided by the Microsoft SQL Server as well.

CloverETL and NTLM

CloverETL 3.3 now supports these NTLM-authenticated web services. Both versions of the protocol, NTLMv1 and NTLMv2, are supported. You can use the WebServiceClient component to call such a service. All you’ll need to do to enable the NTLM authentication is to set the Domain Name, Username, and Password. Notice that the Domain Name parameter is required for the NTLM authentication to work.

Altogether now, there are 3 types of authentication supported by WebServiceClient: NTLM, Digest, and Basic. NTLM is obviously the most secure of these three schemes, while Basic is the least secure. When the WebServiceClient component initiates a call to a secured Web Service, the server first indicates which authentication protocols it supports. The WebServiceClient then automatically chooses the most secure one, and, using the user-provided credentials, performs the actual authenticated web service call.
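
The selection step can be pictured with a short Java sketch. It is an illustration of the preference order described above, not the WebServiceClient code: the class name is hypothetical, and it assumes the server's offers arrive as WWW-Authenticate header values whose first word is the scheme name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical model of choosing an authentication scheme.
class AuthSchemeChooser {

    // Preference order taken from the text: NTLM first, Basic last.
    static final List<String> PREFERENCE = Arrays.asList("NTLM", "DIGEST", "BASIC");

    // challenges: the values of the server's WWW-Authenticate headers.
    static Optional<String> choose(List<String> challenges) {
        for (String preferred : PREFERENCE) {
            for (String challenge : challenges) {
                // The scheme name is the first whitespace-separated word.
                String scheme = challenge.trim().split("\\s+", 2)[0].toUpperCase(Locale.ROOT);
                if (scheme.equals(preferred)) {
                    return Optional.of(preferred);
                }
            }
        }
        return Optional.empty(); // the server offered nothing we know
    }

    public static void main(String[] args) {
        List<String> offered = Arrays.asList(
                "Basic realm=\"ws\"", "Digest realm=\"ws\", nonce=\"abc\"", "NTLM");
        System.out.println(choose(offered).orElse("none"));
    }
}
```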

CloverETL Jobflows – Build, Monitor, and Manage Complex Workflows

Lately, I have been confronted with a data integration task of extracting customer data from an Oracle database to an XML file, uploading it to FTP, and finally encrypting that file. If you use a single CloverETL graph to do this, the overall logic might get lost in the complexity of the graph. You will need many components executed in different phases and, most importantly, your solution will be vulnerable to all kinds of errors, causing the graph to fail for not-so-obvious reasons.

Now what would a mighty tool for such complex tasks be? The answer is the new jobflow component in CloverETL 3.3.

Jobflow is a high-level approach to managing ETL processes. It allows you to combine ETL with other activities to form complex business processes, providing orchestration, conditional job execution, and error handling. Actions participating in a jobflow include scripts, executables, web services (REST/SOAP) and common file operations (both local and remote).

Jobflow example

First of all, let’s take a look at the picture below. It shows the principal logic of the example I will be discussing in this article.

(yellow blocks – jobflow *.jbf, green blocks – ETL graphs *.grf)

Step One – Extract from DB

Let us say that the first step in our complex set of tasks will be responsible for extracting rows from the Oracle DB, converting the data to XML and uploading it to an FTP server. It will do nothing more and nothing less, and after doing so, it will report its results to the next step in the jobflow pipeline (Did the job fail? How long did it run? When was it started? What was the error message?). This step will be performed by an underlying ETL graph, whose execution is managed by the ExecuteGraph component.

If you are scratching your head to figure out how jobs communicate with each other, keep in mind that ExecuteGraph is a component like any other, i.e. it is connected to other components by edges. Jobflow components can use pre-defined (template) metadata containing fields that allow you to send job results through the upstream chain (try right-clicking an edge → New metadata from template).

Step Two – Send XML to FTP

In this example, the DB extraction step is connected to a File Operation component called CopyFiles that uploads the result to an FTP server:

In fact, the metadata describes a token – a piece of information that flows on jobflow edges – similarly to a record in an ETL graph.
Tokens are responsible for triggering events in a jobflow; you can make components react to incoming tokens in various ways (read more).

Input mapping

Back to the example: How would I let the CopyFiles component know the name and path of the XML file to be uploaded to FTP? I added an extra metadata field to the output edge of the ExecuteGraph component. As soon as the ExecuteGraph component finishes, it populates the field with the file name the executed graph has created (see the Dictionary and Output Mapping section). I used the Output mapping of the ExecuteGraph to map the file name to the output edge. Correspondingly, the Input mapping on the CopyFiles component lets you read source/target file paths from the input edge. The affected attribute in this case is sourceURL, so you would write the Input mapping like this:

$out.0.sourceURL = "${DATAOUT_DIR}" + "/" + $in.0.xmlName;

Notice the $out.0 – in Input Mapping it represents the internal structure of the component attributes – it acts as a record so that you can use it in CTL2. Its metadata is fixed by the component definition – check the transformation dialog to see all the attributes you can map.

Dictionary and Output Mapping

I have not yet mentioned how the DB extraction job communicates with its dependent jobs, particularly, where and how it gets the XML file name. Obviously, it needs to ‘send out’ its internal parameter – the XML name. This is carried out via one of CloverETL’s earlier features – the dictionary.

Dictionary is an interface between the graph and its surroundings, e.g. other jobflow components, externally called procedures, etc. Using a dictionary to pass parameters between two jobs works like this:

  • First, you create a dictionary entry in the ETL graph (via Outline → Dictionary) and configure its data type. This determines what kind of data will be exchanged – string in this case.
  • Second, you populate the dictionary entry with a value within your ETL graph – either using the SetJobParameters component or in CTL (e.g. Reformat).
  • Now, in the parent jobflow, you can map the dictionary entry content in ExecuteGraph’s Output mapping. The dictionary entry will appear in the left panel under Dictionary folder:

Note – Dictionary can also be manipulated with the GetJobInput and SetJobOutput components. These allow you to ‘read and write’ (respectively) from/to a dictionary inside an ETL graph. GetJobInput is usually a starting component that reads the dictionary value and maps it to its output edge, passing it to the rest of the graph. SetJobOutput, on the other hand, will typically be placed at the end of your graph where you want to make the results accessible to the parent.

Error Handling

So far, I have not taken into account any errors either on the graph or on the jobflow level. Let’s look at the middle step in my jobflow hierarchy – what happens if, for example, the file server goes down during the FTP transfer?

  • Obviously, the XML file will not be uploaded completely and the ETL graph will fail.
  • Next, the error will be signalled to the ExecuteGraph component which called the graph.
  • The error info is sent to the second output port of the component where you can react to it – stop the jobflow by placing the Fail component there, send email to the support team (EmailSender), or handle the error by another graph or nested jobflow.

In the picture below, I react to an FTP transfer error by writing an error log (notice this is handled by a separate graph) and then stopping the jobflow using Fail component:

There are many more possible reactions to failures. I could have configured the executing component not to Stop processing on fail – if an error occurred in the executed graph, the jobflow would not break and incoming tokens would still trigger new graph executions (i.e. ignoring errors).
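
The difference between the two settings is easy to see in a toy model. The Java sketch below is not CloverETL code: the class, the “ports” and the job function are all made up to mimic the behavior described above, where each incoming token triggers one job run, successes leave through the first output port and failures through the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of an executing component with two output ports.
class ExecuteJobModel {
    final List<String> successPort = new ArrayList<>(); // models the first output port
    final List<String> errorPort = new ArrayList<>();   // models the second output port

    // Runs one job per token; 'job' returns true when the run succeeds.
    void run(List<String> tokens, Predicate<String> job, boolean stopOnFail) {
        for (String token : tokens) {
            if (job.test(token)) {
                successPort.add(token);
            } else {
                errorPort.add(token);
                if (stopOnFail) {
                    return; // remaining tokens are never processed
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a.xml", "broken.xml", "c.xml");
        Predicate<String> upload = name -> !name.startsWith("broken");

        ExecuteJobModel strict = new ExecuteJobModel();
        strict.run(tokens, upload, true);
        System.out.println("stop on fail: ok=" + strict.successPort + " failed=" + strict.errorPort);

        ExecuteJobModel lenient = new ExecuteJobModel();
        lenient.run(tokens, upload, false);
        System.out.println("continue:     ok=" + lenient.successPort + " failed=" + lenient.errorPort);
    }
}
```

In the strict run the third token is never processed, while in the lenient run it still triggers a job even though the second one failed.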

Step Three – Analyse Jobflow Stats

The penultimate step in my jobflow analyses previous job runs from a statistical point of view. First, it combines (Combine) tokens from preceding jobs and then computes the average duration of each one. If that goes okay, the jobflow ends with Success.

Step Four – Encrypt Output

This overall result is then signalled to a logic one level higher (the top jobflow in the main diagram), which executed all the underlying tasks using a single ExecuteJobflow component. Using ExecuteScript, it calls a binary utility to produce a hash of the XML created earlier. The ListFiles component in front of it helps assemble the complete file URL.

CloverETL Server – LDAP settings

Introduction

The purpose of this post is to explain the CloverETL Server LDAP configuration and to provide necessary guidance and some how-to’s to learn LDAP and CloverETL Server integration. By going through this guide, you will be able to centralize your CloverETL Server user management into your LDAP/Active Directory.

LDAP is a powerful, standardized concept of organizing information. With that often come a few major trade-offs:

  • It’s highly complex for beginners
  • There are vendor-specific deviations from the standard
  • Error messages can be cryptic

Connecting to LDAP

Before configuring the CloverETL Server to work with LDAP, you should have a basic understanding of how LDAP works. If you’re familiar with it already, you can skip to the next chapter.

For basic information about LDAP, please read this Wikipedia article. You may want to read it right now before continuing with this one.

You also need an LDAP server instance. It makes sense to first check the installation by logging into your OS, mail account, etc. It can also be helpful to visualize your LDAP structure so you can get a better idea about what you’re doing. If your LDAP server provider does not have a good tool for that, you can try JXplorer. It’s free and simple to use. Talk to your system administrator to get a better understanding of your LDAP setup.

Distinguished Names (DN)

LDAP is based on storing objects in its database. These objects are referred to by their Distinguished Names (DN). A DN is basically a record of the path through the LDAP tree. An example of such a DN: cn=John Doe,ou=people,dc=MyCompany,dc=com

To reach such an object you need to drill down in reverse order like this:

  • find an object with attribute “dc” with value “com” in the root of LDAP directory
  • among its children nodes, find one with attribute “dc” set to “MyCompany”
  • among its children nodes, find one with attribute “ou” set to “people”
  • among its children nodes, find one with attribute “cn” set to  “John Doe”, that is it!
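
The same drill-down order can be printed with the JDK's own DN parser. This is just a small sketch (the class name is made up); it relies on javax.naming.ldap.LdapName, which indexes the components of a DN from the right, so iterating over them walks from the root towards the object.

```java
import java.util.ArrayList;
import java.util.List;
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

// Hypothetical helper that lists the components of a DN in drill-down order.
class DnPath {

    static List<String> drillDown(String dn) {
        try {
            List<String> steps = new ArrayList<>();
            // Index 0 is the right-most component, i.e. the one closest to the root.
            for (Rdn rdn : new LdapName(dn).getRdns()) {
                steps.add(rdn.getType() + "=" + rdn.getValue());
            }
            return steps;
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid DN: " + dn, e);
        }
    }

    public static void main(String[] args) {
        for (String step : drillDown("cn=John Doe,ou=people,dc=MyCompany,dc=com")) {
            System.out.println(step);
        }
    }
}
```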

Here is an example object with its attributes:

dn: cn=John Doe,ou=people,dc=MyCompany,dc=com
cn: John Doe
objectClass: organizationalPerson
sn: Doe
uid: doej
displayName: John Doe
givenName: John
mail: johndoe@mycompany.com

LDAP search filter

The LDAP search filter is an expression used to search the LDAP tree. You can run such an expression in JXplorer via the Search->Search dialog. Fill in Start Searching From with the start node (called base in the Server) and fill in Text Filter. You can see the usage in the following images.

An example of user search:

An example of a search for the groups assigned to a user:

When a search is successful, you should see the nodes found in the “Results” tab on the left. Click a result to see the object (select the simple or text mode of the HTML View).

LDAP authentication workflow in the CloverETL Server

The following steps describe how CloverETL Server and LDAP work together:

  1. The user fills the username and password into the login form and submits it; the Server checks the user type: either local (managed by the Server) or LDAP (this is what we’ll set up). Assume the user type is LDAP;
  2. The Server tries to connect to the configured LDAP server;
  3. If the connection is successful, the Server tries to find the user by username (see below);
  4. If the user is found, the Server tries to list the groups assigned to this user;
  5. The Server tries to match the user’s LDAP groups to Server groups; i.e., all groups that are defined in the Server are intersected with the user’s LDAP groups. All others are ignored. Thus, LDAP groups identify the Server user groups which define the user’s permissions;
  6. The user might need to be a member of an allowed group (see security.ldap.allowed_ldap_groups below);
  7. The password is checked against LDAP;
  8. The user is logged in if the password is correct and the user has at least one group assigned.
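
Steps 5 to 8 boil down to a set intersection followed by three checks. The Java sketch below models just that logic; it is not the Server's implementation, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the group matching and the final login decision.
class LdapLoginModel {

    // Step 5: only LDAP groups that are also defined in the Server are kept.
    static Set<String> matchGroups(Set<String> serverGroupCodes, Set<String> ldapGroupCodes) {
        Set<String> matched = new TreeSet<>(ldapGroupCodes);
        matched.retainAll(serverGroupCodes);
        return matched;
    }

    // Steps 6 to 8: allowed-group check, password check, at least one matched group.
    static boolean canLogIn(boolean inAllowedGroup, boolean passwordOk, Set<String> matched) {
        return inAllowedGroup && passwordOk && !matched.isEmpty();
    }

    public static void main(String[] args) {
        Set<String> serverGroups = new HashSet<>(Arrays.asList("admins", "my_test_group"));
        Set<String> ldapGroups = new HashSet<>(Arrays.asList("my_test_group", "printers"));
        Set<String> matched = matchGroups(serverGroups, ldapGroups);
        System.out.println("matched groups: " + matched);
        System.out.println("login allowed: " + canLogIn(true, true, matched));
    }
}
```

Note that a user with a correct password but no matching group is still rejected in this model, which mirrors the last step of the workflow.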

CloverETL Server LDAP settings

Let’s take a look at what you need to do in the Server configuration to enable LDAP authentication. For details on Server settings, please refer to this page in the documentation. It’s important to restart your application server after each change of configuration to apply it.

First, locate the clover.xml configuration. For example, on Tomcat server, each setting mentioned below would be written to [tomcat]\conf\Catalina\localhost\clover.xml file. Refer to the documentation to find the file in your installation.

A sample LDAP configuration in clover.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Context path="/clover" crossContext="true">
<Manager pathname=""/>
<Parameter name="security.authentication.allowed_domains" value="clover,LDAP" override="false" />
<Parameter name="security.ldap.url" value="ldap://mail.mycompany.com:389" override="false" />
<Parameter name="security.ldap.user_search.base" value="ou=people,dc=MyCompany,dc=com" override="false" />
<Parameter name="security.ldap.user_search.filter" value="(uid=${username})" override="false" />
...
</Context>

In the XML above, you can see the Parameter tags. The following chapter will describe all LDAP-related parameters you can use in your configuration.
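
When a setting does not seem to take effect, it can help to list what is actually written in the context file. The following Java sketch does that with the JDK's XML parser. It is a helper of my own (hypothetical class name), it only reads the name and value attributes of the Parameter tags, and it says nothing about how the Server itself loads the file.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper that lists the Parameter entries of a context file.
class CloverContextParams {

    static Map<String, String> read(String xml) {
        try {
            NodeList nodes = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("Parameter");
            Map<String, String> params = new LinkedHashMap<>(); // keeps file order
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                params.put(p.getAttribute("name"), p.getAttribute("value"));
            }
            return params;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed context file", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Context path=\"/clover\" crossContext=\"true\">"
                + "<Parameter name=\"security.authentication.allowed_domains\""
                + " value=\"clover,LDAP\" override=\"false\" />"
                + "<Parameter name=\"security.ldap.url\""
                + " value=\"ldap://mail.mycompany.com:389\" override=\"false\" />"
                + "</Context>";
        read(xml).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```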

In case you encounter errors in any of the steps below, check your logs. You can read more about the Server logs location here. Interesting files are (for Tomcat):

  • [tomcat]\temp\cloverlogs\userAction.log - contains a log of user actions and their results (login succeeded/failed)
  • [tomcat]\temp\cloverlogs\all.log - contains more technical information

Enable LDAP log in (workflow – step 1)

security.authentication.allowed_domains

Enabled authentication methods. Can be either clover or LDAP or both separated by a comma (i.e. clover,LDAP).

Test correct settings

You should now be able to create a user with domain set to LDAP. Let’s create a new user doej for further testing.

Connection to LDAP server (workflow – step 2)

security.ldap.ctx_factory

The fully qualified name of the class implementing the context provider. Use “com.sun.jndi.ldap.LdapCtxFactory” to start with.

security.ldap.timeout

Timeout when accessing your LDAP server. Use “5000” (the unit is milliseconds) to start with.

security.ldap.records_limit

Limit on the number of records returned by the server. Use “50” to start with.

security.ldap.url

The URL of your server including the port, in the format ldap://hostname:port. The default LDAP port is 389, or 636 for an SSL connection. For Microsoft Active Directory specifics, please see the corresponding paragraph in the “Pitfalls” section. You can check your URL using JXplorer.

security.ldap.userDN
security.ldap.password

If your LDAP server requires authentication, fill in the credentials of the user who will perform the queries. It’s recommended that you create a dedicated user with minimal rights sufficient for this purpose. You can test the credentials by logging in via JXplorer to your LDAP server. “userDN” stands for User Distinguished Name.

Test correct settings

Try to log in to the CloverETL Server as the newly created user (‘doej’ in our example). If the settings are incorrect, “Login failed” will appear with a message similar to one of these:

  • my.domainX.com:3893 (server was not found)
  • my.domain.com:38931; socket closed (cannot open connection to found server)
  • …or another connection-related message

Use JXplorer and experiment to get working settings.
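
If you would rather test the connection settings from code, they translate quite naturally into a JNDI environment. The sketch below only builds that environment; the class name is made up, and the mapping of security.ldap.timeout to the com.sun.jndi.ldap.connect.timeout property is my assumption, not something the Server documentation confirms.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper that turns the connection settings into a JNDI environment.
class LdapEnv {

    static Hashtable<String, String> build(String ctxFactory, String url, int timeoutMs,
                                           String userDn, String password) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, ctxFactory);  // security.ldap.ctx_factory
        env.put(Context.PROVIDER_URL, url);                    // security.ldap.url
        // Assumption: the timeout setting corresponds to the JNDI connect timeout.
        env.put("com.sun.jndi.ldap.connect.timeout", String.valueOf(timeoutMs));
        if (userDn != null) {
            env.put(Context.SECURITY_AUTHENTICATION, "simple");
            env.put(Context.SECURITY_PRINCIPAL, userDn);       // security.ldap.userDN
            env.put(Context.SECURITY_CREDENTIALS, password);   // security.ldap.password
        }
        return env;
    }

    public static void main(String[] args) {
        Hashtable<String, String> env = build("com.sun.jndi.ldap.LdapCtxFactory",
                "ldap://mail.mycompany.com:389", 5000, null, null);
        System.out.println(env.get(Context.PROVIDER_URL));
        // Passing env to new javax.naming.directory.InitialDirContext(env)
        // would attempt the actual connection.
    }
}
```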

User lookup (workflow – step 3)

security.ldap.user_search.base

This property describes the DN of a node where the search for user object starts. Use the topmost node which contains all required users (as subnodes). For our example mentioned above, you would use “ou=people,dc=MyCompany,dc=com”.

security.ldap.user_search.scope

This setting specifies the behavior of your search:

  • SUBTREE – search recursively all child nodes of security.ldap.user_search.base
  • ONELEVEL – search just immediate child nodes of security.ldap.user_search.base
  • OBJECT – the object is selected directly, i.e. only security.ldap.user_search.base node is checked
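
The three values have direct counterparts in the standard JNDI API, which makes the meaning easy to check in a test program. The sketch below shows one plausible mapping; the class name is made up and whether the Server maps the values exactly this way is an assumption.

```java
import javax.naming.directory.SearchControls;

// Hypothetical mapping of the scope setting to JNDI search scopes.
class SearchScopes {

    static int toJndiScope(String setting) {
        switch (setting.trim().toUpperCase()) {
            case "SUBTREE":
                return SearchControls.SUBTREE_SCOPE;   // base node and everything below it
            case "ONELEVEL":
                return SearchControls.ONELEVEL_SCOPE;  // immediate children only
            case "OBJECT":
                return SearchControls.OBJECT_SCOPE;    // the base node itself
            default:
                throw new IllegalArgumentException("Unknown scope: " + setting);
        }
    }

    // Builds search controls that could be passed to a JNDI search call.
    static SearchControls controls(String scopeSetting) {
        SearchControls controls = new SearchControls();
        controls.setSearchScope(toJndiScope(scopeSetting));
        return controls;
    }

    public static void main(String[] args) {
        System.out.println(controls("SUBTREE").getSearchScope() == SearchControls.SUBTREE_SCOPE);
    }
}
```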

security.ldap.user_search.filter

The filter used to find the user object. The value can be something like (uid=${username}), where ${username} is substituted by the login typed into the login form and uid is the name of the LDAP attribute that should match the user’s login.
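
The substitution itself is simple text replacement, as the following Java sketch shows. It is an illustration only: the class name is hypothetical, and the escaping of special filter characters (as defined in RFC 4515) is something I added as a precaution, not a documented behavior of the Server.

```java
// Hypothetical helper that fills a placeholder in an LDAP filter template.
class LdapFilterTemplate {

    // Escapes characters that have a special meaning in LDAP search filters (RFC 4515).
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char ch : value.toCharArray()) {
            switch (ch) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(ch);
            }
        }
        return sb.toString();
    }

    // Replaces ${placeholder} in the template with the escaped value.
    static String fill(String template, String placeholder, String value) {
        return template.replace("${" + placeholder + "}", escape(value));
    }

    public static void main(String[] args) {
        System.out.println(fill("(uid=${username})", "username", "doej"));
    }
}
```

Without the escaping step, a login such as “*” would turn the filter into a wildcard search, which is why it is worth keeping in any code that builds filters from user input.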

Test correct settings

Again, you can test this filter in your LDAP tool. See the LDAP search filter section. Just replace “${username}” with the user login. For example, (uid=doej) should find our user. Do not forget to set the correct start point for your search (see security.ldap.user_search.base).

It should also be possible to log into the Server already,  but the user will not have any group assigned yet.

Groups assigned to User lookup (workflow – step 4)

security.ldap.groups_search.base

This property describes the DN of a node where the search for group objects starts. Use the topmost node which contains all required groups (as subnodes). For our example, ou=groups,dc=MyCompany,dc=com.

security.ldap.groups_search.filter

This filter should return all group objects of which the user found in step 3 is a member. For example, (&(objectClass=group)(member=${userDN})). This search returns all objects of class “group” whose attribute “member” is set to the user’s DN (${userDN} is replaced with the DN of the user object from step 3). You may want to change this query if your class or attribute names differ.

security.ldap.groups_search.scope

Same as security.ldap.user_search.scope.

Test correct settings

You can test this filter in your LDAP tool. See the LDAP search filter section. Just replace “${userDN}” with a user DN. For example, (&(objectClass=group)(member=cn=John Doe,ou=people,dc=MyCompany,dc=com)) should find the groups of our user. Do not forget to set the correct start point for your search (see security.ldap.groups_search.base).

Group binding (workflow – steps 5 and 6)

security.ldap.groups_search.attribute.group_code

Once the groups are found, the value of the attribute named by this setting is extracted from each of them and matched against the “code” field of the Server groups. For example, if this setting is “cn” and a group’s cn is “my_test_group”, the “code” field of the Server group must be set to “my_test_group” to be matched. The user is assigned membership in all matched groups.

security.ldap.allowed_ldap_groups

This field may contain a list of group DNs which the user must be a member of to be able to log in.

For example, cn=test1,ou=groups,dc=MyCompany,dc=com;cn=test2,ou=groups,dc=MyCompany,dc=com will allow access only to users with membership in “test1” and “test2”.

It can be set to _ANY_, which turns off this feature and allows any LDAP user to log in.

Test correct settings

Now you should be able to log in to the CloverETL Server and be assigned the proper groups based on your LDAP groups.

Pitfalls

SSL access

For the CloverETL Server running in an application server, the SSL setting is fully transparent and managed on the application server level. Please follow the instructions for your application server.

Here are some useful tips:

  • Add a server certificate to Java default truststore. See here (section “Importing Certificates”)
  • Create your own trust store and replace the Java default trust store via a system property (sample here)

You also need to set the system property com.sun.jndi.ldap.connect.pool.protocol=ssl in a place where you normally set other system properties (e.g. in “[tomcat]/bin/catalina.sh” add to “CATALINA_OPTS” -Dcom.sun.jndi.ldap.connect.pool.protocol=ssl). Please note that this will not work if set in environment variables. In the end, you should have:

  • -Djavax.net.ssl.trustStore=keystore.ks (if you have your own trust store)
  • -Djavax.net.ssl.trustStorePassword=MY_PASSWORD (if you have your own trust store)
  • -Dcom.sun.jndi.ldap.connect.pool.protocol=ssl
  • -Djavax.net.debug=true (recommended for setup phase)

Microsoft Active Directory (AD) – Global Catalog and Referrals

If your Microsoft AD is using “referrals,” you may get a message like Unprocessed Continuation Reference(s); remaining name 'DC=ad,DC=mycompany,DC=org' in the CloverETL Server log. Referrals are a technique for linking information scattered across the LDAP directory into one node. See LDAP Referrals for details. For example, when users in your LDAP directory are placed in multiple locations, you can virtually aggregate them into a single location by using referrals. By default, Microsoft AD runs services on these ports:

  • 389 – the default LDAP port; you can use this if you don’t have referrals
  • 3268 – global catalog port which is able to follow referrals. See What Is the Global Catalog? for details.

Thus, if you run into the error mentioned above, it may help to change the Server setting security.ldap.url to use the global catalog. For example: security.ldap.url=ldap://my.domain:3268

Known issues

All versions of the Clover Server prior to 3.3.0-M3 contain the bug reported in issue CLS-735: when an attribute value used in a DN contains a comma “,” character, the login fails. For instance, if the “cn” in the example above is changed to “John, Doe”, the DN is handled incorrectly and causes problems during the login process. This matters mainly for DNs used in the “member” attribute of a “group” object.
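To see why a comma inside an attribute value is delicate, the following sketch escapes a value as described by RFC 4514 and compares a naive split on commas with an escape-aware one. It is meant only to illustrate the general pitfall; it makes no claim about the actual cause of CLS-735, and the helper names are hypothetical.

```python
# Hypothetical helpers illustrating commas inside DN attribute values (RFC 4514).

DN_SPECIALS = '"+,;<>\\'


def escape_rdn_value(value):
    """Backslash-escape characters that are special in a DN attribute value."""
    out = []
    last = len(value) - 1
    for i, ch in enumerate(value):
        if ch in DN_SPECIALS:
            out.append("\\" + ch)
        elif ch == "\0":
            out.append("\\00")
        elif i == 0 and ch in " #":
            out.append("\\" + ch)
        elif i == last and ch == " ":
            out.append("\\ ")
        else:
            out.append(ch)
    return "".join(out)


def split_dn(dn):
    """Split a DN into RDNs, ignoring commas that are escaped with a backslash."""
    parts, current, escaped = [], [], False
    for ch in dn:
        if escaped:
            current.append(ch)
            escaped = False
        elif ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == ",":
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


dn = "cn=" + escape_rdn_value("John, Doe") + ",ou=people,dc=MyCompany,dc=com"
print(dn)             # the comma in the name is escaped with a backslash
print(dn.split(","))  # naive split: the name is wrongly cut in two
print(split_dn(dn))   # escape-aware split: the name stays in one RDN
```

Any code that treats every comma as an RDN separator will break such a DN apart, which is why DNs like this need careful handling when they appear in the “member” attribute.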

The Time Has Come: Getting More From Your Data with Event Analyzer

With the Event Analyzer, the new extension for CloverETL Designer, processing time-based data in the CloverETL environment is simple – and incredibly useful.

With the Event Analyzer, you can now process and analyze data with time-based characteristics, including log records, transactions, measurements, alerts, and more, all in the CloverETL environment. This new addition, available in beta, has many uses. As time is a key attribute in our world, data in relation to it can lead to valuable insights for businesses.

What’s the Benefit?

Sometimes you can’t know what you’re looking for until you’ve found it. With the Event Analyzer, uncover inconsistencies and truths in data to help you rework and rewire commercial processes – or even just understand your customers and the daily activities in your business better.

The Event Analyzer gives users a look into customer actions, fraud or unusual behavior, inefficient systems, discrepancies between systems, and even SLA violations – in essence, valuable information hidden in the sea of data and commotion. The extension can help you to understand your time data better so you can put this information to good use.

The Event Analyzer provides a powerful set of components to process records in the context of time such as:

  • FollowRecogniser for detection of event sequences
  • NonFollowRecogniser for missing events
  • ChangeRecogniser for changes in the flow of events
  • And RunningAggregate for computing time-based characteristics

A Real-World Example

The video below details an analysis of records from an online retailer. In this example, the e-shop analyst wants to understand two things about the purchasing customer: first, when the customer entered the store, and second, when he or she made a purchase. An additional stipulation is a defined time frame: the analyst only wants the users who purchased something in ten minutes or less. With the Event Analyzer, these specifications are easy to set.

How is this done?

We can see the records of when a visitor came to the e-shop and also when he or she made the order. We sort these records chronologically. Next, we correlate the two time events and split the data into two event streams for further analysis: the first stream being the entrance and the second being the purchasing event, or “order confirmation.”

To look for an event sequence where the landing page visit is followed by the order confirmation page in less than 10 minutes, we use the FollowRecognizerTwoPorts component. It’s important to configure the component by defining the field that holds the event occurrence timestamp for both input ports. Also, setting the parameter “Join key” to the value “user session” ensures that only events belonging to the same user session are connected.
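The rule itself can be sketched in plain Python to make the logic concrete. This is not how the component is implemented or configured; it is a minimal stand-alone illustration in which the function name, the tuple layout (session, timestamp), and the choice of the earliest landing per session are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Hypothetical stand-alone illustration of "landing followed by confirmation
# in less than 10 minutes", joined on the user session.


def follow_within(landings, confirmations, max_gap=timedelta(minutes=10)):
    """Return (session, landing_time, confirmation_time) for matching sessions."""
    # Remember the earliest landing page visit of each session.
    first_landing = {}
    for session, ts in sorted(landings, key=lambda event: event[1]):
        first_landing.setdefault(session, ts)

    matches = []
    for session, ts in sorted(confirmations, key=lambda event: event[1]):
        start = first_landing.get(session)
        # Keep only confirmations that follow the landing in less than max_gap.
        if start is not None and timedelta(0) <= ts - start < max_gap:
            matches.append((session, start, ts))
    return matches


landings = [
    ("s1", datetime(2013, 1, 1, 10, 0)),
    ("s2", datetime(2013, 1, 1, 10, 5)),
]
confirmations = [
    ("s1", datetime(2013, 1, 1, 10, 7)),   # 7 minutes after landing: kept
    ("s2", datetime(2013, 1, 1, 10, 30)),  # 25 minutes after landing: dropped
]

for match in follow_within(landings, confirmations):
    print(match)
```

The two input lists play the role of the two event streams, and the session identifier plays the role of the join key.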

Results

Detection results are sent to the output port of the component and the component allows users to transform the output data using CloverETL transformations. The output of the graph is the filtered result set based on the rules defined in the components.

And with that, the e-shop analyst now can clearly see all customers who purchased within ten minutes – insight he did not have before using the EA extension. Make sure to watch the whole video for the step-by-step view of this example.

Easy as 1, 2, 3

As you can see, there’s much you can learn from your time-based data. A deeper look can lead to new ideas, changes, improved security measures, and renewed value for your business. Take advantage of the possibilities. Learn more about what’s going on with the Event Analyzer Extension for CloverETL. Download it in beta today.

 

DataMotion: CloverETL’s Newest OEM Partner

CloverETL offers a choice. This has been part of our philosophy since day one. Flexibility. Scalability. Robustness. These principles are fundamental to us, and they are also what sets us apart in the market. With these strengths at the forefront of our offering, our mission is simple: to craft a tailored approach for customers looking for that “just right” ETL/Data Integration relationship. This just makes sense to us. So with this firm belief, we continue to look for new ways to make great things happen for our customers.

Our OEM strategy is no different. As described in an earlier blog, CloverETL works as OEM in three ways: OEM Partner, OEM Embedded, and OEM Embedded White Label.

For us, the ability for customers to shape and style a unique OEM relationship with us really transforms the OEM conversation. When a partnership is involved, it’s a serious marriage—one in which we work closely together, count on each other, and move forward towards success. A strong OEM relationship comes not only when two product offerings sync well, but when the companies’ goals do too. This is what we strive for because after all, an OEM partner’s success is, in a sense, ours too.

The OEM Partner

In certain cases, a company may just need a little something extra. This is a great way for two companies to shine individually, yet together. Naturally, this is what we call the partner approach.

OEM Partner companies use CloverETL alongside their offer to provide an integrated package to their end-user clients. And this is great not only for our customers, but also for their clients, who gain access to the Clover community through both our resources and licensing of CloverETL. We support these indirect customers—users who come to CloverETL through our OEM. We welcome them; they are the in-laws, so to speak. Helping our partners means making sure that their clients get exactly what they need to move forward with projects too. Here is where our flexible architecture is a plus: our dev team can work on our ETL offering, while their team can offer the right support to their users who are implementing it. This partnership means we work together to bring the right solutions to the right customers.

CloverETL and DataMotion

Let’s take a look at a real life partner of ours—a really exciting and interesting case. DataMotion, a company that offers software, consulting, development and implementation services of CRM, Direct Marketing and Data Quality solutions in Brazil, fits under this partner umbrella well.

Before they found Javlin, “DataMotion was a frustrated user of SQLServer SSIS. So, we were searching for a new, powerful and accessible ETL solution,” said Ricardo Rego, Managing Partner of DataMotion.

He continues, “In the middle of last year, we started to look at the market for a new data integration flagship technology. We were performing due diligence in readiness for start-up. Part of this planning phase included selecting a software vendor that could support the delivery of professional service engagements and help the business grow using a common integration platform. Our decision went to Javlin and CloverETL.”

With them, CloverETL works as a critical software component together with their growing suite of data solutions. As business in Brazil propels forward, especially in the data-heavy verticals of healthcare, retail, and finance, companies are looking to harness the business value from their data assets. For this, they turn to DataMotion with CloverETL for support. “With CloverETL, DataMotion will be able to provide a solid rock data management and integration services to all SME in Brazil and Latin America. This is a special momentum where the Brazilian economy is growing very fast and the enterprise commitment combined with the quality of the information is critical and mandatory to the success of businesses,” said Rego.

As we have seen working with a diverse mix of partner relationships, this approach is definitely something special: it works like a “power couple” to provide real value to data clients.