Heads Up! The ability to modify a dataset requires special permissions.
The SODA Producer Upsert API allows you to create, update, and delete rows in a single operation, using their row identifiers. This is an excellent way to keep your Socrata dataset in sync with an internal system.
Please note that all operations that modify datasets must be authenticated as a user who has access to modify that dataset and may optionally include an application token.
The dataset for this example is the USGS Earthquakes Sample Dataset, which has its publisher-specified row identifier set to earthquake_id.
We’ll format our upsert payload as a JSON array of objects. In this payload, we’ll be creating a new record for demo1234, updating the existing record for 71842370, and deleting the record for 00388609.
This example contains only a few records, but upsert operations can easily update thousands of records at a time.
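As a minimal sketch, a payload of this shape could be built in Python as shown here. Only earthquake_id and the :deleted key come from this page; the other field names (source, magnitude, depth, region) are assumptions modeled on the dataset's column titles, so check them against your dataset's actual API field names.

```python
import json

# Field names other than earthquake_id and :deleted are assumptions
# modeled on the dataset's column titles.
payload = [
    # No row with this identifier exists yet, so the server creates one.
    {
        "earthquake_id": "demo1234",
        "source": "demo",
        "magnitude": 1.2,
        "depth": 7.9,
        "region": "Washington",
    },
    # A row with this identifier already exists, so the server updates it.
    {
        "earthquake_id": "71842370",
        "source": "nc",
        "magnitude": 1.4,
        "depth": 0,
        "region": "Northern California",
    },
    # Only the identifier and the :deleted flag are needed to delete a row.
    {"earthquake_id": "00388609", ":deleted": True},
]

# Serialize the list of objects into a JSON array for the request body.
body = json.dumps(payload, indent=2)
print(body)
```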
Note a few things about this payload:
First, the only difference between the create object for demo1234 and the update object for 71842370 is that no row exists with that identifier for the first, while one does for the second. The server automatically determines whether a record already exists and updates it rather than creating a duplicate.
Second, for 00388609 we’ve included the special :deleted key with a value of true. This tells the server to delete the record matching that ID, so we don’t need to include the rest of the object.
Together, these properties make the upsert operation idempotent: it can be retried any number of times, and the dataset will end up in the same state each time. This makes upsert requests very safe to retry if something goes wrong.
Once you’ve constructed your payload, upserting it is as simple as POSTing it to your dataset’s endpoint, along with the appropriate authentication and (optional) application token information:
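One way such a request might be assembled in Python, using only the standard library, is sketched below. The endpoint URL, username, password, and token are hypothetical placeholders, the helper name build_upsert_request is made up for illustration, and HTTP Basic authentication is assumed here as one of the available authentication methods. The request is built but not sent.

```python
import base64
import json
import urllib.request


def build_upsert_request(endpoint, rows, username, password, app_token=None):
    """Build (but do not send) a POST request carrying a JSON upsert payload."""
    raw = f"{username}:{password}".encode("utf-8")
    credentials = base64.b64encode(raw).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Basic {credentials}",
    }
    # The application token is optional, so only add it when provided.
    if app_token:
        headers["X-App-Token"] = app_token
    return urllib.request.Request(
        endpoint,
        data=json.dumps(rows).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Hypothetical endpoint and credentials, for illustration only.
request = build_upsert_request(
    "https://data.example.com/resource/abcd-1234.json",
    [{"earthquake_id": "00388609", ":deleted": True}],
    username="user@example.com",
    password="not-a-real-password",
    app_token="HYPOTHETICAL_APP_TOKEN",
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen and read the response body.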
You’ll get back a response detailing what went right or wrong:
That means we created one record, updated a second, and deleted a third. If we were to retry that operation, we’d get the following:
Since our record was already created, our “create” became an “update”. And because our third row was already deleted, there was nothing to delete with that ID.
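The retry behavior described above can be modeled locally with a small in-memory simulation. This is only a sketch of the upsert semantics, not the server's implementation: the apply_upsert helper, the count names, and the starting magnitude values are all hypothetical.

```python
def apply_upsert(rows, payload, key="earthquake_id"):
    """Apply an upsert payload to an in-memory table keyed by row identifier.

    Returns counts of created, updated, and deleted rows.
    """
    counts = {"created": 0, "updated": 0, "deleted": 0}
    for record in payload:
        row_id = record[key]
        if record.get(":deleted"):
            # Deleting a missing row is a no-op, so retries are harmless.
            if rows.pop(row_id, None) is not None:
                counts["deleted"] += 1
        elif row_id in rows:
            rows[row_id].update(record)
            counts["updated"] += 1
        else:
            rows[row_id] = dict(record)
            counts["created"] += 1
    return counts


# Hypothetical starting state: two of the three identifiers already exist.
table = {
    "71842370": {"earthquake_id": "71842370", "magnitude": 1.3},
    "00388609": {"earthquake_id": "00388609", "magnitude": 2.0},
}
payload = [
    {"earthquake_id": "demo1234", "magnitude": 1.2},
    {"earthquake_id": "71842370", "magnitude": 1.4},
    {"earthquake_id": "00388609", ":deleted": True},
]

first = apply_upsert(table, payload)
state_after_first = {k: dict(v) for k, v in table.items()}
second = apply_upsert(table, payload)

print(first)   # {'created': 1, 'updated': 1, 'deleted': 1}
print(second)  # {'created': 0, 'updated': 2, 'deleted': 0}
print(table == state_after_first)  # True
```

The second run reports different counts, but the table itself is unchanged, which is what makes the operation safe to retry.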
You can use properly formatted comma-separated value (CSV) data to do updates, just as you can with JSON. Just make sure you follow a few rules:
A value that contains a comma must be enclosed in double quotes (").
A double quote inside a quoted value must be doubled (Marc "Dr. Complainingstone" Millstone would become "Marc ""Dr. Complainingstone"" Millstone").
Here’s an example:
Source,Earthquake ID,Version,Datetime,Magnitude,Depth,Number of Stations,Region,Location
demo,demo1234,1,03/26/2014 10:38:01 PM,1.2,7.9,1,Washington,"(47.59815, -122.334540)"
nc,71842370,2,09/14/2012 10:14:21 PM,1.4,0,21,Northern California,"(38.8023, -122.7685)"
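Rather than escaping values by hand, you could let a CSV library apply the quoting rules. The sketch below uses Python's standard csv module, whose default settings quote fields that contain commas and double any embedded quotes; the rows are the ones from the example above.

```python
import csv
import io

header = ["Source", "Earthquake ID", "Version", "Datetime", "Magnitude",
          "Depth", "Number of Stations", "Region", "Location"]
rows = [
    ["demo", "demo1234", "1", "03/26/2014 10:38:01 PM", "1.2", "7.9", "1",
     "Washington", "(47.59815, -122.334540)"],
    ["nc", "71842370", "2", "09/14/2012 10:14:21 PM", "1.4", "0", "21",
     "Northern California", "(38.8023, -122.7685)"],
]

# Write the header and rows; only fields that need quoting get quoted.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(header)
writer.writerows(rows)
csv_body = buffer.getvalue()
print(csv_body)

# A value with embedded double quotes is wrapped and its quotes are doubled.
quoted = io.StringIO()
csv.writer(quoted, lineterminator="\n").writerow(
    ['Marc "Dr. Complainingstone" Millstone']
)
print(quoted.getvalue())
```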
Just like before, upserting it is as simple as POSTing it to your dataset’s endpoint, along with the appropriate authentication and (optional) application token information. Make sure you use a content type of text/csv:
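A minimal Python sketch of the CSV variant follows. As before, the endpoint, credentials, token, and the helper name build_csv_upsert_request are hypothetical, and HTTP Basic authentication is an assumption. The point to notice is that the body is sent as UTF-8 encoded text with a text/csv content type. The request is built but not sent.

```python
import base64
import urllib.request


def build_csv_upsert_request(endpoint, csv_text, username, password, app_token=None):
    """Build (but do not send) a POST request carrying a CSV upsert payload."""
    raw = f"{username}:{password}".encode("utf-8")
    headers = {
        # The content type tells the server to parse the body as CSV.
        "Content-Type": "text/csv",
        "Authorization": "Basic " + base64.b64encode(raw).decode("ascii"),
    }
    if app_token:
        headers["X-App-Token"] = app_token
    return urllib.request.Request(
        endpoint,
        data=csv_text.encode("utf-8"),
        headers=headers,
        method="POST",
    )


csv_text = (
    "Source,Earthquake ID,Magnitude\n"
    "demo,demo1234,1.2\n"
)

# Hypothetical endpoint and credentials, for illustration only.
csv_request = build_csv_upsert_request(
    "https://data.example.com/resource/abcd-1234.json",
    csv_text,
    username="user@example.com",
    password="not-a-real-password",
)
print(csv_request.get_header("Content-type"))
```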
You’ll get back a response like you did in the previous example, detailing what went right and wrong:
Finally, note that appending and upserting geographic information to be geocoded by Socrata requires writing a string into a single location column. Please refer to our Support Portal documentation for specific information on location columns.