We are excited to announce the release of DataSync 1.0!
One of the most important improvements in this release is that data publishers can now use the “replace” operation as the default way to update essentially any dataset, even very large ones (millions of rows). This is possible because the new “replace via FTP” method in DataSync automatically detects which rows have been added, updated, or deleted and publishes only those changes to the dataset. For the vast majority of datasets, this removes the need for data publishers to script their own process for determining which rows have changed since the last dataset update. Publishers no longer have to use the “upsert” method to update their datasets, a method that often requires significant developer resources.
With DataSync 1.0, automating data publishing is as easy as extracting all the data into a CSV or TSV file and creating a simple DataSync job to publish that file to the Socrata dataset. The data publisher can then use Windows Task Scheduler or Cron to schedule the DataSync job to run automatically (for example, every day).
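To make concrete the kind of change detection publishers previously had to script themselves, here is a minimal Python sketch that compares two full CSV extracts by a key column. It is an illustration only, not DataSync's actual implementation; the function names, the `id` key column, and the sample data are all assumptions made for this example.

```python
import csv
import io


def load_rows(csv_text, key_column):
    """Parse CSV text into a dict mapping each row's key to the full row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column]: row for row in reader}


def diff_snapshots(old_csv, new_csv, key_column):
    """Return (added, updated, deleted) row keys between two full extracts."""
    old_rows = load_rows(old_csv, key_column)
    new_rows = load_rows(new_csv, key_column)
    # Keys present only in the new extract were added.
    added = sorted(k for k in new_rows if k not in old_rows)
    # Keys present only in the old extract were deleted.
    deleted = sorted(k for k in old_rows if k not in new_rows)
    # Keys present in both extracts but with different values were updated.
    updated = sorted(
        k for k in new_rows if k in old_rows and new_rows[k] != old_rows[k]
    )
    return added, updated, deleted


if __name__ == "__main__":
    # Hypothetical sample data: yesterday's and today's full extracts.
    previous = "id,name,status\n1,Main St,open\n2,Oak Ave,open\n3,Elm Rd,closed\n"
    current = "id,name,status\n1,Main St,open\n2,Oak Ave,closed\n4,Pine Ln,open\n"
    added, updated, deleted = diff_snapshots(previous, current, "id")
    print("added:", added)
    print("updated:", updated)
    print("deleted:", deleted)
```

Even this simplified version requires a stable key column and access to the previous extract, which is part of why handing the comparison to the “replace via FTP” method is simpler for most publishers.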
If you are already using DataSync, you just need to download the new JAR file below and replace your existing JAR file. If you are not using a previous version of DataSync, you can simply download version 1.0 below. Note that DataSync 1.0 requires Java version 1.7. If you do not have version 1.7 (for example, if you are still using version 1.6), you can download Java 1.7 here.
Download DataSync 1.0: https://github.com/socrata/datasync/releases/download/1.0/datasync_1.0.jar
DataSync documentation has also been dramatically improved and expanded. There is now comprehensive documentation for using DataSync exclusively as a command-line tool (headless mode).
We also invite you to contribute to the documentation using a GitHub pull request to the gh-pages branch of the DataSync repository.
DataSync 1.0 comes with additional enhancements and new features, many of which are based on customer requests:
A new job type, the Metadata Job (File -> New.. -> Metadata Job). Many thanks to Brian Williamson for the generous open source code contribution to DataSync that added this new job type!
Other small features:
View the full list of features added in version 1.0 here
Want to leave a question, comment, suggestion, or bug report on DataSync? Submit it to the DataSync GitHub repository issue tracker. All you need is a free GitHub account.
Watch the GitHub repository to remain up to speed with new features on the roadmap and deploy schedules for future versions.