CloverETL's Blog

January 25, 2010

Iteration through the record fields in CTL

Filed under: Using CloverETL — Tags: , , — Vaclav Matous @ 9:52 am

Recently, I have been facing a very common problem. Imagine this scenario: I have two files – the first one with origin records and the second one with slightly modified new records. Each record had a unique key and aproximately 50 fields. My task was to compare these two files and find out how many fields in every pair differ from each other in the corresponding records.

The simplified graph can be seen in following picture:

A comparison of two records can be processed by the CloverETL DataIntersection component which joins the records with the same keys. In the joined records, you can compare fields that are not part of the key. But remember, you have to write the comparison in the CTL transformation ;-) . Of course you could write the following block of code for each pair of compared fields:

int count = 0;
if(nvl($0.field_N, '') != nvl($1.field_N, '')) {
count++;
//but typically more actions :-(
}
//and imagine this block 50times :-(
//final mapping
$0.key := $0.key;
$0.count := count;

But this solution takes too much time when you have to repeat it for many fields (approx. 50 in my case). It is also very slow, uncomfortable and increases the probability of making a mistake in your code (e.g. omitting some fields). Fortunately, CloverETL allows you to iterate through the fields of processed records! :-) The code is then more briefer and more generic:

//declaration of variables for copies of input records
record(Metadata1) myrec1;
record(Metadata1) myrec2;
function transform() {
int i = 0;
int count = 0;
//asign value of input records to local variables
myrec1 = @0; //myrec1 is a copy of a current record on input port 0
myrec2 = @1; //myrec2 is a copy of a current record on input port 1
//iterate through fields, suppose that field with index 0 is the key
for(i = 1; i < length(myrec1); i++) {
if(nvl(myrec1[i], '') != nvl(myrec2[i], '')) {
count++;
}
}
//final mapping
$0.key := $0.key;
$0.count := count;
}

Someone could object to the necessity of making copies of records. In this case I have good news. CTL in CloverETL version 2.9 introduces the possibility of iterating directly through the fields of input records.

@0[i];//i-th field of the input record on port 0

Moreover new functions for getting field names and data types are introduced in version 2.9. Personally, I am looking forward to such features that will make CTL code simplier and clearer.

September 16, 2009

The Beauty of Inline CTL Expressions

Filed under: Using CloverETL — Tags: , , , — Martin Janík @ 2:19 pm

Have you ever wondered how to write data records to a file with current date in its name? Then I’ve got a brand new solution for you! Since version 2.8, CloverETL supports inline CTL expressions within graph attributes and parameters. What does that mean? It means that you can use attributes and parameters containing CTL expressions which are evaluated dynamically at run-time!

Do you recall referencing graph parameters using the ${PARAM} syntax? Using inline CTL expressions is even simpler! Let’s assume you’d like to store some data records to a unique file every day, e.g. orders_2009-09-16.dat. Changing the file name every day might be pretty annoying. Well, it’s much more pleasant to use an inline CTL expression. Take UniversalDataWriter for example and simply set its fileURL attribute to something like this:

${DATAOUT_DIR}/orders_`date2str(today(), "yyyy-MM-dd")`.dat

Notice the `date2str(today(), "yyyy-MM-dd")` part — that’s an inline CTL expression. Yes, that’s all you need to do! Just enclose your CTL expression within back quotes and place it anywhere you like. Except CTL transformations of course, that would be meaningless. ;-)

The good news is that you can reference graph parameters from inline CTL expressions. Such graph parameters might again contain CTL expressions. And so forth… Assuming you defined parameters FIRST_NAME and LAST_NAME, defining another parameter, let’s say FULL_NAME, in the following way is perfectly valid:

`substring('${FIRST_NAME}', 0, 1)`. ${LAST_NAME}

Beware, two adjacent back quotes are treated as an empty CTL expression and always evaluated to an empty string. You might also ask how to use back quotes within inline CTL expressions. Well, it’s pretty straightforward, just escape them using a back slash, i.e. \`. (Both these features work since version 2.8.1.)

By default, evaluation of inline CTL expressions is turned on. If you want to turn this feature off for any reason, you can simply do so by setting the GraphProperties.EXPRESSION_EVALUATION_ENABLED configuration property to false.

We hope you’ll find this brand new feature useful! You can let us know where and why you use it by leaving a comment to this post.

Blog at WordPress.com.