Export from a Database to Excel

I was motivated to write this post by inquiries such as this one on MySQL forum about how to export data from database to Excel.  MySQL is not the only one though, similar requests can be found on Oracle or Postgres forums too. Of course a CSV format or doing some coding is always an option, but Clover lets you generate XLS/XLS(X) effortlessly and with some advantages:

  • Clover does NOT require Excel to do be installed on machine where we are doing the export. (Yes, this means you can export even on your Linux MySQL installation.)
  • We can export from any database that has a JDBC driver.(MySQL, Oracle, Informix, DB2, Postgres)
  • Export can run on any platform – Windows, OS X, Linux.
  • It is possible to do additional validation and manipulation before writing to Excel.

To keep things simple we will work just with a single database table called customers stored in MySQL:

Our goals in this post are:

  1. Connect to the database and download the customer data.
  2. Write the customer data to Excel sheet, including the header with column names.

And finally, to make it more interesting

3. Split customers into multiple sheets based on the state of their origin.

Let’s build the graphs now:

Step 1: Connecting to MySQL database

Create a new project and a new graph (see the videos if you don’t know what I am talking about), go to Outline window and new database connection.

Connection to MySQL database

Connection to my MySQL database

Step 2: Building the graph

Build the graph from DBInputTable and XLSDataWriter components. DBInputTable reads data from database table, or result of SQL query, while ExcelDataWriter writes it to Excel spreadsheet.

Graph exporting from MySQL to Excel

Graph exporting from MySQL to Excel

You will need a metadata definition to describe the structure of your database records. The easiest way to do this is to use a wizard from Outline window. Right click on Metadata entry and select Metadata > Extract from database. Once done, assign your metadata on the link (edge) between the DBInputTable and XLSDataWriter. Basic structure of our export graph is laid out.

Metadata extracted from database

Metadata extracted from database

We’ll configure the components now.

Configure the DBInputTable to use the MySQL connection we created in the previous step. Supply the SQL query to retrieve the data. Mine is just simple “select *”, but it can be any statement – joins, aggregation, while-filtering etc.

DBInputTable Configuration

DBInputTable Configuration

Next, configure the XLSDataWriter to write into desired output file. Notice the property settings. To include header on the very first line, we set the Metadata row property is to 1. Data must be appended after the header row. That is why the Start row property is set to 2.

XLSDataWriter configuration

XLSDataWriter Configuration

Step 4. Run the graph

Start the graph now and check the exported data in your output file. If you just wanted to export the data you are done!

Exported data in Excel format

Exported data in Excel format

Optionally: Split data to sheets based on field value

Clover has a nifty feature of being able to write data into multiple sheets, splitting them based on a field value. Let’s split our customers into sheets based on the state of their origin. To do this we only need to change a setting  in XLSDataWriter configuration. Set the Sheet name property to value $state. This tells the writer to examine the value of state field of each record to be written our and place it to a matching sheet.

Export to multiple=

The resulting Excel file now contains multiple worksheets (one for each US state) and all records in the same worksheet have the same state.

Data exported to multiple=

After the reading is set up, you can further extend the graph with some more components – filters, sorters, duplicate removals, or reformat to compute new values or change structure of output. Of course, designing a reverse process – import from Excel to a database is similarly straightforward. Happy exporting!

Writing to Infobright Database

In response to customer requests we created a new component that writes data into Infobright software, a very popular column-oriented relational database. The CloverETL’s just released version 2.9 offers this new InfobrightDataWriter component.

Infobright is a highly compressed column-oriented database, based on MySql engine. In this database data are stored column-by-column instead of more typically row-by-row. There are many advantages of column-orientation, including the ability to do more efficient data compression and allowing compression to be optimized for each particular data type. The higher efficiency can be achieved because each column stores a single data type as opposed to rows that typically contain several data type. However the main purpose of Infobright software is to deliver a scalable data warehouse database optimized for analytic queries.

Our new InfobrightDataWriter component allows to easily load data into Infobright database directly from CloverETL transformations. It can be used both with Infobright Community Edition (ICE) as well as with Infobright Enterprise Edition (IEE). With IEE it is possible to use fast data loading in binary format. In binary format particular rows aren’t separated by any special character and there are no values’ delimiters or qualifiers either.

The connector uses named pipe for data transfer. In Linux systems it is system pipe, on Windows you need native library placed on the Java library path. The library is part of  infobright-core-vX_X package that can be downloaded from Infobright site (Download Contributed Software).

For remote load it is necessary to start Infobright remote load agent on the server where Infobright is running (part of infobright-core-v3).

Example graph:

 

InfobrightDataWriter configuration:

The above graph loads data from selected input fields (“Clover fields” attribute) to database table test (“Database table” attribute) on remote server. It uses fast binary method of data loading (“Data format” attribute). Data loaded into database can be sent to output port. Input metadata has free format but the output one has to meet certain conditions:

  • comma as a field delimiter
  • system new line ('n' for Linux, 'rn' for Windows) as a record delimiter
  • date/time formatting: yyyy-MM-dd HH:mm:ss (can be date only, time only or both)

As we load data to the remote server, the Infobright agent must run on the server. To start the agent we need to call:

java -jar infobright-core-3.0-remote.jar -p 6666 -l all

The port the agent is listening to must be the same as the agent port set on the component (“Remote agent port” attribute).

Full description of the component can be found on CloverETL’s wiki pages (InfobrightDataWriter component).