Row Identifiers

What is a Row Identifier?

A Socrata dataset is essentially a collection of rows. Each row is uniquely designated by its “row identifier”, much like a driver’s license number or social security number identifies an individual. For those familiar with database concepts, row identifiers act the same way as primary keys.

Internal Identifiers vs Publisher-Specified Identifiers

Row identifiers come in two flavors:

  • Internal identifiers are auto-generated by the Socrata platform every time a new row is created.
  • Publisher-specified identifiers are configured by the dataset owner and use a field of unique values within the dataset as the row identifier.

Depending on which dataset you're accessing, internal row identifiers may be simple integers or alphanumeric strings. Both kinds are used in exactly the same way.

To learn more about how to access internal row identifiers, read the System Fields documentation.
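As a rough sketch of what accessing internal identifiers can look like, the snippet below builds a query URL that requests system fields alongside the regular columns. It assumes the dataset accepts the `$select=:*, *` form and exposes the internal identifier as the `:id` system field, as described in the System Fields documentation. The helper name `with_system_fields` is hypothetical, and the Chicago Crimes endpoint is used only as an illustration.

```python
from urllib.parse import quote, urlencode


def with_system_fields(resource_url, limit=1):
    """Build a query URL that requests system fields plus all regular columns.

    Assumption: the endpoint accepts `$select=:*, *`; see the System Fields
    documentation for the authoritative behavior.
    """
    params = {"$select": ":*, *", "$limit": limit}
    # Leave SoQL punctuation readable; only the space gets percent-encoded.
    query = urlencode(params, safe="$:,*", quote_via=quote)
    return f"{resource_url}?{query}"


print(with_system_fields("https://data.cityofchicago.org/resource/6zsd-86xi.json"))
```

If the assumption holds, each returned row should carry an `:id` value that can then be used as its row identifier.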

Retrieving Rows By Their Identifiers

To use a row identifier to look up a row, simply append it to the resource endpoint for that dataset. For example, to look up row 1 from the White House Visitor Records dataset using its row identifier:

https://open.whitehouse.gov/resource/p86s-ychb/1.json

In contrast, the Chicago Crimes dataset is configured to use a publisher-specified identifier. To look up the crime with the ID 10399602:

https://data.cityofchicago.org/resource/6zsd-86xi/10399602.json
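The lookup pattern shown in both examples can be sketched as a small helper that appends a row identifier to a resource endpoint. The function name `row_url` and its `fmt` parameter are hypothetical, and percent-encoding the identifier is a defensive assumption for alphanumeric or punctuated IDs rather than documented platform behavior.

```python
from urllib.parse import quote


def row_url(resource_endpoint, row_id, fmt="json"):
    """Return the URL for a single row, given the dataset's resource endpoint."""
    # Encode the identifier so it always stays within one path segment.
    encoded_id = quote(str(row_id), safe="")
    return f"{resource_endpoint.rstrip('/')}/{encoded_id}.{fmt}"


# Internal identifier (White House Visitor Records)
print(row_url("https://open.whitehouse.gov/resource/p86s-ychb", 1))
# Publisher-specified identifier (Chicago Crimes)
print(row_url("https://data.cityofchicago.org/resource/6zsd-86xi", 10399602))
```

The same helper works for both flavors of identifier, since the URL shape does not depend on how the identifier was assigned.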

Establishing a Publisher-Specified Identifier

To set a row identifier, you must either own the dataset or have been granted the Publisher or Administrator role on a Socrata customer site. In short, if you can't modify the dataset, you can't set a row identifier.

A publisher-specified row identifier can be established for any Socrata dataset. A common choice is an ‘ID’ column containing a number or code that uniquely identifies each row of data. For example, the ‘Inspection ID’ column of Chicago’s Food Inspections dataset is a publisher-specified row identifier.
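Because a row identifier column must hold unique values, it can help to check a candidate column locally before configuring it. The following is a minimal sketch of such a pre-check, not a platform feature: the function name `check_identifier_column`, the field name `inspection_id`, and the sample rows are all hypothetical.

```python
from collections import Counter


def check_identifier_column(rows, column):
    """Return (missing_count, duplicate_values) for a candidate identifier column."""
    values = [row.get(column) for row in rows]
    # Rows with no value cannot be uniquely identified by this column.
    missing = sum(1 for v in values if v is None or v == "")
    counts = Counter(v for v in values if v is not None and v != "")
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return missing, duplicates


# Hypothetical sample data for illustration only.
sample_rows = [
    {"inspection_id": "A1", "result": "Pass"},
    {"inspection_id": "A2", "result": "Fail"},
    {"inspection_id": "A1", "result": "Pass"},
    {"result": "Pass"},
]
missing, duplicates = check_identifier_column(sample_rows, "inspection_id")
print(f"missing: {missing}, duplicates: {duplicates}")
```

A column is a reasonable candidate only when the check reports no missing values and no duplicates.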

How to Set a Row Identifier

  1. When viewing a dataset, click the dark red “About” button in the upper right.
  2. In the side menu that appears, click “Edit metadata” (if you do not see this link, ensure you are logged in).
  3. Scroll to the “API Endpoint” subheader. Below it, select the row identifier from the drop-down menu, which lists all the columns in the dataset.
  4. After selecting the appropriate column, click “Save” at the bottom.