CloverETL's Blog

January 12, 2010

DataDirect’s OracleDB JDBC driver speed test

Filed under: Using CloverETL — Tags: , , , , , , — Agata Vackova @ 12:39 pm

Purpose

Compare the speed of data loading into Oracle database (Oracle Database 11g Release 11.1.0.6.0 – Enterprise Edition) with Oracle corp. JDBC driver, DataDirect JDBC Oracle driver and direct data loading (OracleDataWriter component – sqlldr utility) in CloverETL.

Test description

Graph used for testing:

DDdbLoad.grf

The above graph loads data into database table that contains 3 number columns and 127 varchar columns.

Database table for storing data is truncated before each data loading – DBExecute components, each with query: TRUNCATE TABLE dd_test1 REUSE STORAGE

Phase 1: loading data with DDBulkLoad (DataDirect) object from csv file (loader.load(file))

Phase 3: loading data with DDBulkLoad (DataDirect) object from ResultSet (loader.load(resultSet)) – created ResultSet implementation, that reads data from DataRecord (read from the edge).

Phase 4: loading data with DBOutputTable with Oracle corp’s  JDBC driver:

Manifest-Version: 1.0
Specification-Title:    Oracle JDBC driver classes for use with JDK14
Sealed: true
Created-By: 1.4.2_08 (Sun Microsystems Inc.)
Implementation-Title:   ojdbc14.jar
Specification-Vendor:   Oracle Corporation
Specification-Version:  Oracle JDBC Driver version - "10.2.0.1.0XE"
Implementation-Version: Oracle JDBC Driver version - "10.2.0.1.0XE"
Implementation-Vendor:  Oracle Corporation
Implementation-Time:    Wed Jan 25 01:28:31 2006

Phase 7: loading data with DBOutputTable with DataDirect Oracle JDBC driver – enabled bulk load feature

Phase 9: loading data with OracleDataWriter component from csv file (sqlldr utility)

Phase 11: loading data with OracleDataWriter component from edge (sqlldr utility)

Test processing

Graph run 3 times for 10,000,000 records with default DataDirect settings.

Graph run 3 times for 1,000,000 records with default DataDirect settings.

Graph run 3 times for 1,000,000 records with following settings:

Test results

Results in seconds.

1,000,000 records with default DataDirect settings:

Phase 1: 178, 167, 132 – min: 132, max: 178, average: 159

Phase 3: 128, 166, 152 – min: 128, max: 166, average: 149

Phase 5: 228, 246, 290 – min: 228, max: 290, average: 255

Phase 7: 176, 170, 239 – min: 170, max: 239, average: 195

Phase 9: 44, 45, 56 – min: 44, max: 56, average: 48

Phase 11: 104, 95, 106 – min: 95, max: 104, average: 102

1,000,000 records with custom settings:

Phase 1: 163, 152, 142 – min: 142, max: 163, average: 152

Phase 3: 166, 133, 134 – min: 133, max: 166, average: 144

Phase 5: 278, 263, 260 – min: 260, max: 278, average: 267

Phase 7: 239, 172, 209 – min: 172, max: 239, average: 207

10,000,000 records with default DataDirect settings:

Phase 1: 1553, 1818, 1352 – min: 1352, max: 1818, average: 1574

Phase 3: 1475, 1299, 1298 – min: 1298, max: 1475, average: 1357

Phase 5: 3041, 2592, 2550 – min: 2550, max: 3041, average: 2728

Phase 7: 1824, 1623, 1722 – min: 1722, max: 1824, average: 1723

Phase 9: 404, 432, 472 – min: 404, max: 472, average: 436

Phase 11: 1096, 975, 1012 – min: 975, max: 1096, average: 1028

Summary

Loading data was slowest when DBOutputTable with original Oracle corp. driver was used. All loadings with DataDirect driver were faster than with Oracle corp. driver but the usage of DDBulkLoad object (DataDirect) plainly increases the speed of loading data in comparison with setting EnableBulkLoad=true and using DBOutputTable. The results for loading data from csv file and edge (result set) are very similar with slight predomination of ResultSet method. All three methods with DataDirect driver usage, get to more steady execution times with number of records to load.

The fastest way of loading data is unquestionably direct data loading with sqlldr utility. Even when inter-storing data in pipe, the sqlldr utility is about 50% faster than any other method, but is less convenient.
1,000,000 records
10,000,000 records

1 Comment »

  1. Progress|DataDirect is aware of these findings from CloverETL and has taken quick action to contact CloverETL directly to discuss test parameters and ways to improve the performance results and findings using the DataDirect JDBC Oracle driver with bulk load.

    Comment by Jonathan bruce — January 19, 2010 @ 6:17 pm


RSS feed for comments on this post. TrackBack URI

Leave a comment

Blog at WordPress.com.