Forecasting with RSocrata

Ever since the City of Chicago team built the RSocrata connector, I have wanted to put together a simple tutorial showing how to forecast data in R on top of Socrata datasets. Most recently, the City of Chicago updated the connector to include write.socrata, which writes a data frame back to a Socrata dataset and opens up a number of interesting uses.

This example pulls a dataset of EMS incidents by month from the City of Austin Open Data Portal and forecasts the next two years of EMS incidents.

You will need three packages:

devtools [installs the GitHub version of RSocrata rather than the CRAN release]
RSocrata [reads from and writes to Socrata]
forecast [generates the forecasted values]

Step 0: getting RSocrata from GitHub
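
A minimal sketch of this step, assuming the development version lives in the Chicago/RSocrata repository on GitHub and that devtools and forecast are installed from CRAN:

```r
# Install devtools from CRAN if it is not already present
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install the GitHub version of RSocrata
# (assumption: the repository is Chicago/RSocrata)
devtools::install_github("Chicago/RSocrata")

# forecast is available from CRAN
if (!requireNamespace("forecast", quietly = TRUE)) {
  install.packages("forecast")
}

library(RSocrata)
library(forecast)
```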


The first real step is to import the dataset as an R data frame.

library(RSocrata)

# API endpoint for EMS - Ambulance Responses by Month
EMSIncidents <- read.socrata("")

The next step is to create a time series object from the response column in the dataset.

# Create a monthly time series from "count_responses_all"
myts <- ts(EMSIncidents$count_responses_all, start = c(2010, 1), end = c(2015, 2), frequency = 12)
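
To see what ts() is doing without hitting the API, here is a small sketch that builds a monthly series from a synthetic stand-in for the EMS data. The demo_incidents data frame and its values are invented for illustration; only the column name count_responses_all comes from the dataset above.

```r
# Synthetic stand-in for the Austin EMS data (values are invented):
# 62 monthly counts covering January 2010 through February 2015.
set.seed(42)
n_months <- 62
month_index <- seq_len(n_months)
demo_incidents <- data.frame(
  count_responses_all = round(
    9000 + 20 * month_index +                # gentle upward trend
      500 * sin(2 * pi * month_index / 12) + # yearly cycle
      rnorm(n_months, sd = 100)              # noise
  )
)

# frequency = 12 marks the data as monthly; start = c(2010, 1) is January 2010
demo_ts <- ts(demo_incidents$count_responses_all,
              start = c(2010, 1), frequency = 12)

# Check the (year, month) of the last observation
end(demo_ts)

# Quick visual check of trend and seasonality
plot(demo_ts, ylab = "EMS responses (synthetic)")
```

Leaving out end lets ts() infer it from the length of the vector. Supplying end, as the tutorial does, pins the series to a fixed window, so rows beyond that window are dropped until end is updated.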

[Figure: Time Series Plot]

The next step is to account for the seasonality in the data by decomposing the series into seasonal, trend, and remainder components.

# Seasonal decomposition
fit <- stl(myts, s.window = "periodic")
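
As a rough illustration of what the decomposition returns, the sketch below runs stl() on a synthetic monthly series (invented values, not the Austin data) and inspects the three components.

```r
# Synthetic monthly series: trend + yearly cycle + noise (invented values)
set.seed(42)
month_index <- seq_len(62)
demo_ts <- ts(
  9000 + 20 * month_index + 500 * sin(2 * pi * month_index / 12) +
    rnorm(62, sd = 100),
  start = c(2010, 1), frequency = 12
)

# s.window = "periodic" forces the same seasonal pattern in every year
demo_fit <- stl(demo_ts, s.window = "periodic")

# The fitted components are stored as columns of a multivariate time series:
# "seasonal", "trend", and "remainder"
components <- demo_fit$time.series
colnames(components)

# The three components add back up to the original series
recomposed <- rowSums(components)
all.equal(as.numeric(recomposed), as.numeric(demo_ts))

# Four stacked panels: data, seasonal, trend, remainder
plot(demo_fit)
```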


Next we can forecast the series a number of periods ahead.

# Projected Forecast

[Figure: Projected Forecast]
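
One way the forecasting step might look, sketched on synthetic data rather than the Austin dataset. The horizon h = 24 is an assumption chosen to match the two-year goal stated at the top. Applied to an stl fit, forecast() by default forecasts the seasonally adjusted series and then adds the seasonal component back.

```r
library(forecast)

# Synthetic monthly series: trend + yearly cycle + noise (invented values)
set.seed(42)
month_index <- seq_len(62)
demo_ts <- ts(
  9000 + 20 * month_index + 500 * sin(2 * pi * month_index / 12) +
    rnorm(62, sd = 100),
  start = c(2010, 1), frequency = 12
)
demo_fit <- stl(demo_ts, s.window = "periodic")

# Forecast 24 months (two years) past the end of the series
demo_projected <- forecast(demo_fit, h = 24)

# Point forecasts as a monthly time series
demo_projected$mean

# History plus forecast with shaded prediction intervals
plot(demo_projected)
```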

Let’s save these forecasted values in their own data frame.

projected <- forecast(fit) # stores the result as a list-like "forecast" object
projected.DF <- as.data.frame(projected) # converts the forecast to a data frame
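
Before uploading, it may help to tidy the data frame. The sketch below, again on synthetic data, renames the columns to identifier-style names and moves the period labels out of the row names. The new column names are my own choice for illustration, not anything Socrata or the forecast package requires.

```r
library(forecast)

# Synthetic monthly series: trend + yearly cycle + noise (invented values)
set.seed(42)
month_index <- seq_len(62)
demo_ts <- ts(
  9000 + 20 * month_index + 500 * sin(2 * pi * month_index / 12) +
    rnorm(62, sd = 100),
  start = c(2010, 1), frequency = 12
)
demo_projected <- forecast(stl(demo_ts, s.window = "periodic"), h = 24)

# One row per forecast month: point forecast plus 80% and 95% interval bounds
demo_df <- as.data.frame(demo_projected)

# Hypothetical column names without spaces, easier to map to dataset fields
names(demo_df) <- c("point_forecast", "lo_80", "hi_80", "lo_95", "hi_95")

# Keep the period label as a regular column instead of row names
demo_df$period <- rownames(demo_df)
rownames(demo_df) <- NULL

head(demo_df)
```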

Next, let’s write this to Socrata.

# Store user email and password
socrataEmail <- Sys.getenv("SOCRATA_EMAIL", "")
socrataPassword <- Sys.getenv("SOCRATA_PASSWORD", "XXXXXXX")

datasetToAddToUrl <- "" # dataset
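
The upload itself could be sketched roughly as follows. upload_forecast is a hypothetical helper, not part of RSocrata; it only guards against calling write.socrata with an empty endpoint or empty credentials. The "UPSERT" update mode is an assumption here ("REPLACE" is the other mode write.socrata accepts), and the target dataset is assumed to already have columns matching the data frame.

```r
# Hypothetical helper (not part of RSocrata): refuses to upload unless the
# endpoint and credentials have actually been filled in.
upload_forecast <- function(df, endpoint, email, password) {
  stopifnot(
    is.data.frame(df),
    nzchar(endpoint),
    nzchar(email),
    nzchar(password)
  )
  # Arguments in order: data frame, dataset JSON endpoint, update mode,
  # account email, account password
  RSocrata::write.socrata(df, endpoint, "UPSERT", email, password)
}

# Usage, with the variables defined above:
# upload_forecast(projected.DF, datasetToAddToUrl, socrataEmail, socrataPassword)
```

Reading the credentials from environment variables, as above, keeps them out of the script and out of version control.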


Let’s use Socrata to visualize this data as well.

Check out the code on GitHub.

RSocrata saves the Day!