The City of Chicago has generously released and documented its fully open source Extract-Transform-Load (ETL) toolkit and framework, which uses Pentaho’s open source data integration tool (Kettle) to automatically publish data to the city’s Socrata Open Data Portal. The toolkit provides several utilities and a framework to help other governments deploy their own automated ETLs.
This toolkit includes the following functionality:
Because this framework is based on Pentaho Kettle, it can extract and transform data from a variety of sources, such as MySQL, PostgreSQL, Oracle, SQL Server, various NoSQL databases, APIs, and text files.
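For readers new to the ETL pattern, the extract, transform, and publish steps described above can be sketched in a few lines of Python. This is only a conceptual illustration and is not the toolkit's code: the toolkit itself is built on Kettle transformations, not Python. The `permits` table, its columns, and the function names are all hypothetical, an in-memory SQLite database stands in for a source such as MySQL or Oracle, and writing CSV text stands in for publishing to a data portal.

```python
import csv
import io
import sqlite3


def extract(conn):
    """Extract: read raw rows from the (hypothetical) source table."""
    return conn.execute(
        "SELECT id, name, opened FROM permits ORDER BY id"
    ).fetchall()


def transform(rows):
    """Transform: trim and upper-case names, skipping rows with no name."""
    cleaned = []
    for row_id, name, opened in rows:
        if name is None or not name.strip():
            continue
        cleaned.append({"id": row_id, "name": name.strip().upper(), "opened": opened})
    return cleaned


def load(records):
    """Load: serialize to CSV text, standing in for an upload to a portal."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "name", "opened"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def run_etl():
    # Assumption: an in-memory SQLite table with made-up sample rows
    # plays the role of the real source database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE permits (id INTEGER, name TEXT, opened TEXT)")
    conn.executemany(
        "INSERT INTO permits VALUES (?, ?, ?)",
        [
            (1, "  main st cafe ", "2015-01-05"),
            (2, None, "2015-01-06"),
            (3, "Lake Bakery", "2015-02-10"),
        ],
    )
    try:
        return load(transform(extract(conn)))
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_etl(), end="")
```

Keeping the three stages as separate functions mirrors how an ETL tool chains independent steps, so the source or the destination can be swapped without touching the cleaning logic.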
For more information and instructions on setting up the framework, refer to: