Month: September 2009

Unlocking data trapped in mdb files – on Linux

I thought this a nice title for this short post about getting your data out of MS Access files on Linux.
Some weeks ago I had the task to get a schema of the US census summary files into PostgreSQL to subsequently being able to load the entire US data for those summary files (SF1 and SF3).
Browsing census SF1 data via PgAdmin in PostgreSQL
After some searching on the web I found schemas for those summary files in Access mdb format (SF1 template and SF3 template). Many of the tables in the summary files have well over 100 columns (see screen shoot in pgAdmin on the left hand side) and there are more than 200 tables for both summary files combined. So that was definitely not a quick “punch-the-table-schema-in-quickly” job to recreate the tables in PostgreSQL. Again searching on the web I stumbled about this post mentioning the excellent mdbtools. Great – just the right tool found in time. Since the server where the work is supposed to happen runs Ubuntu there was even no need to compile the mdb tools program at all.

After installing mdb tools via
sudo apt-get install mdb-schema
I could easily export the schemas from the SF1and SF3 mdb files as SQL statements:
mdb-schema -S sf1.mdb postgres > sf1.sql

After running the sql files I had the schemas loaded in almost no time. On this blog there is even a python program using the mdb tools that allows you to directly extract data from mdb files into csv format – sweet.
Have fun unlocking your data !