Scanning Excel files with the database scanner

This article covers some hints that might be useful when configuring the migration-center database scanner for scanning Excel files.

Requirements

MS Office is not necessarily to be installed on the Jobserver machine but at least the Excel ODBC driver needs to be installed. One of the easiest ways to install the Excel ODBC driver is to download and install the Microsoft Access Database Engine 2010 Redistributable that includes the necessary drivers: http://www.microsoft.com/en-us/download/confirmation.aspx?id=13255

 

How to configure connectionURL for scanning Excel files?

The connectionURL parameter should be like that:

jdbc:odbc:DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=PATH-TO-EXCEL-FILE;ReadOnly=1

Note: All extensions (*.xls, *.xlsx, *.xlsm, *.xlsb) are required.

 

Solving the issue when the scanner scans null values for some fields even when they are not null in the sheet column

When a column in the sheet contains mixed data (numbers and strings) the ODBC driver uses the first few rows to detect the column type. If the first values are numbers, the ODBC driver tries to handle all the subsequent values as numbers as well and therefore the values that are strings (not numbers) will be ignored, resulting in a “null” value being scanned by mc. To avoid this problem, the parameter IMEX=1 must be set in the metadata query, e.g.:

<query type="main-metadata" name="metadata query">

select * from [Excel8.0;HDR=YES;IMEX=1;Database=PATH-TO-EXCEL-FILE].[sheet1$] where "id" = ?

</query>

Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.
Powered by Zendesk