Using the New York State Online Geocoding API with Python

I’ve been very lucky doing geographic analysis in New York State, as the majority of base map layers I need, and in particular street centerline files for geocoding, are available statewide at the NYS GIS Clearinghouse. I’ve written in the past about how to use various Google APIs for geo data, and here I will show how to use the NYS SAM address database through the state’s ESRI online geocoding service. I explored this because Google’s terms of service are restrictive, and the NYS composite locator should (in theory) be more comprehensive and up to date in its matches.

This works basically the same as most online APIs (at least in my limited experience): submit a particular URL and get JSON in return, then parse the JSON for whatever info you need. The code is meant to be used within SPSS, but the function itself just takes a single-field address string and returns the top hit as a list of length 3: the matched address as a unicode string, followed by the x and y coordinates. (It is of course a valid Python function, so you could use it in any environment you want.) The coordinate system is specified using ESRI’s WKID (see the list for projected and geographic coordinate systems). In the code I use WKID 4326, which is WGS 1984, so it returns the longitude and latitude for the address. When the search returns no hits, the function returns [None,None,None].

*Function to use NYS geocoding API.
BEGIN PROGRAM Python.
import urllib, json

def ParsNYGeo(jBlob):
  if not jBlob['candidates']:
    data = [None,None,None]
  else:
    add = jBlob['candidates'][0]['address']
    y = jBlob['candidates'][0]['location']['y']
    x = jBlob['candidates'][0]['location']['x']
    data = [add,x,y]
  return data

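def NYSGeo(Add, WKID=4326):
  # Build the request URL: single-line address, top match only, output in the requested WKID
  base = "http://gisservices.dhses.ny.gov/arcgis/rest/services/Locators/SAM_composite/GeocodeServer/findAddressCandidates?SingleLine="
  wkid = "&maxLocations=1&outSR=" + str(WKID)  # str() so WKID can be an integer or a string
  end = "&f=pjson"
  mid = Add.replace(' ','+')
  MyUrl = base + mid + wkid + end
  # Fetch the response and parse the JSON into [address, x, y]
  soup = urllib.urlopen(MyUrl)
  jsonRaw = soup.read()
  jsonData = json.loads(jsonRaw)
  MyDat = ParsNYGeo(jsonData)
  return MyDat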

t1 = "100 Washington Ave, Albany, NY"
t2 = "100 Washington Ave, Poop"

Out = NYSGeo(t1)
print Out

Empt = NYSGeo(t2)
print Empt
END PROGRAM.
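
For anyone running this outside of SPSS on Python 3, where urllib.urlopen and the print statement no longer exist, here is a minimal sketch of the same idea. The names build_url, parse_candidates and nys_geocode are my own for illustration and are not part of any library. The endpoint is the one given in the follow-up comment on this post, and since the state has renamed the service before, treat that URL as an assumption that may need updating. The one substantive change is using urlencode to build the query string, so commas and other special characters in the address get escaped rather than only the spaces.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint (from the follow-up comment on this post); it may have moved.
BASE = ("http://gisservices.dhses.ny.gov/arcgis/rest/services/Locators/"
        "Street_and_Address_Composite/GeocodeServer/findAddressCandidates")

def build_url(address, wkid=4326, base=BASE):
    """Build the request URL; urlencode escapes spaces, commas, etc."""
    params = {"SingleLine": address, "maxLocations": 1,
              "outSR": wkid, "f": "pjson"}
    return base + "?" + urlencode(params)

def parse_candidates(blob):
    """Return [address, x, y] for the top hit, or [None, None, None]."""
    candidates = blob.get("candidates") or []
    if not candidates:
        return [None, None, None]
    top = candidates[0]
    return [top["address"], top["location"]["x"], top["location"]["y"]]

def nys_geocode(address, wkid=4326, timeout=10):
    """Query the service (needs network access) and parse the top hit."""
    with urlopen(build_url(address, wkid), timeout=timeout) as resp:
        return parse_candidates(json.loads(resp.read().decode("utf-8")))
```

Calling nys_geocode("100 Washington Ave, Albany, NY", wkid=26918) would then play the same role as NYSGeo does in the SPSS version, provided the service is still reachable at that address. Keeping the URL building and the JSON parsing in separate functions also means both can be checked without hitting the server.
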

As the test addresses in the code sample show, you need both the street address and the city in one field. Here is a quick example with some data in SPSS. Just the zip code by itself doesn’t return any results. There are some funny results in this test run though, and yes, that Washington Ave. Extension has caused me geocoding headaches in the past.

*Example using with SPSS data.
DATA LIST FREE / MyAdd (A100).
BEGIN DATA
"100 Washington Ave, Albany"
"100 Washinton Ave, Albany"
"100 Washington Ave, Albany, NY 12203"
"100 Washington Ave, Albany, NY, 12203"
"100 Washington Ave, Albany, NY 12206"
"100 Washington Ave, Poop"
"12222"
END DATA.
DATASET NAME NY_Add.

SPSSINC TRANS RESULT=GeoAdd lon lat TYPE=100 0 0 
  /FORMULA NYSGeo(Add=MyAdd).

LIST ALL.
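
Outside of SPSS, the SPSSINC TRANS step amounts to applying the function to each address in turn and collecting the three returned values as new columns. A rough Python sketch of that loop is below. The names geocode_many and fake_geocoder are hypothetical, and fake_geocoder is only a stand-in that returns placeholder values so the example runs offline. In real use you would pass in a function that actually queries the service.

```python
import time

def geocode_many(addresses, geocoder, pause=0.0):
    """Apply geocoder to each address; return (input, match, x, y) rows."""
    rows = []
    for add in addresses:
        try:
            match, x, y = geocoder(add)
        except Exception:
            # Network or parsing failure: record a non-match and keep going.
            match, x, y = None, None, None
        rows.append((add, match, x, y))
        if pause:
            time.sleep(pause)  # optional delay between requests
    return rows

def fake_geocoder(add):
    """Stand-in geocoder with one placeholder entry (not real coordinates)."""
    lookup = {"100 Washington Ave, Albany": ["PLACEHOLDER MATCH", 0.0, 0.0]}
    return lookup.get(add, [None, None, None])

for row in geocode_many(["100 Washington Ave, Albany", "12222"], fake_geocoder):
    print(row)
```

Wrapping each call in a try block is a design choice on my part: one failed request then shows up as a row of missing values instead of stopping the whole batch, which mirrors how non-matches come back as [None,None,None].
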

6 Comments

  1. Mary Thomas

     /  October 17, 2017

    I have tried this and the google api with no luck. This is exactly what I need. Currently the error on this post is MyUrl = base + mid + WKID + end
    TypeError: cannot concatenate 'str' and 'int' objects

    can you help?

    • It looks like I hard coded wkid in my original post. Try this function:

      #############
      def NYSGeo(Add, WKID=4326):
        base = "http://gisservices.dhses.ny.gov/arcgis/rest/services/Locators/Street_and_Address_Composite/GeocodeServer/findAddressCandidates?SingleLine="
        wkid = "&maxLocations=1&outSR=" + str(WKID)
        end = "&f=pjson"
        mid = Add.replace(' ','+')
        MyUrl = base + mid + wkid + end
        soup = urllib.urlopen(MyUrl)
        jsonRaw = soup.read()
        jsonData = json.loads(jsonRaw)
        MyDat = ParsNYGeo(jsonData)
        return MyDat

      t1 = "100 Washington Ave, Albany, NY"

      Out = NYSGeo(t1,WKID=26918)
      print Out
      #############

      Using that you should be able to pass in WKID as either a string or an integer and it will work out fine. (Also note I changed the base URL, as they appear to have changed the geocoding service.)

      • Mary Thomas

         /  October 17, 2017

        Thanks, will try. I am changing all inputs to our server. I am also trying your post, Online geocoding in R using the NYS GIS server, and having some syntax errors, but working through them.

        Thank you for the fast response. Been working on this for days.

        Mary Thomas


  2. Ok, I presume you have seen this post as well then (https://andrewpwheeler.wordpress.com/2015/09/22/online-geocoding-in-r-using-the-nys-gis-server/). If you have any questions, always feel free to send me an email.

