Geoclient: An Open NYC Geocoding API

Overview
Geoclient is a geocoding API that recognizes and geocodes addresses, intersections and blockfaces (on street and two cross streets) located in New York City. The Geoclient service provides a RESTful web service interface to the Department of City Planning’s (DCP) Geosupport system. Geoclient provides “pass-through” style access to native Geosupport functions. It does not change or modify Geosupport functionality in any way. However, Geoclient does add some very useful features (some of which are considered standard by today’s developers). While Geoclient is intended for programmatic use, DCP hosts GOAT, a website that provides direct access to Geosupport native functions. Geoclient developers often use GOAT to compare results or test data interactively.

Access
There are two primary Geoclient installations. One is for internal use by official City agencies, and the second is intended for the public developer community. However, the data and logic behind all instances of Geoclient are exactly the same. Sharing the same code base and deployment scripts makes both installations easier to maintain and support. For the public, access to Geoclient is through the NYC Developer Portal. For City agencies, contact us directly.

Functionality
Geoclient provides two notable enhancements that simplify and optimize access to Geosupport. The first is single-field search, which parses a single input string into the discrete location elements required by the different Geosupport functions. For example, an input of “59 Maiden Lane, Manhattan” is parsed into its house number (59), street name (Maiden Lane) and borough (Manhattan) for submission to Geosupport. Note that parsing is not case-sensitive, punctuation is ignored, and most standard street pre- and post-modifiers, types and directionals are recognized.

A fully qualified (i.e., expanded) address is often referred to as a ‘normalized’ address. For example, the input ‘314 w 100 st mn’ is equivalent to the normalized ‘314 West 100 Street, Manhattan’. Normalization is native to Geosupport. If your data is already parsed into discrete address elements, using the /geoclient/v1/address endpoint directly is less ambiguous and slightly more efficient.
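
As a rough illustration of calling the address endpoint with already-parsed elements, the sketch below only builds the request URL. The base URL is a placeholder, and the query parameter names (houseNumber, street, borough) are assumptions to be checked against the Geoclient API documentation. Authentication is omitted entirely.

```python
from urllib.parse import urlencode

# Placeholder host (assumption); use the base URL issued by the NYC Developer Portal.
BASE_URL = "https://api.example.invalid"


def address_request_url(house_number, street, borough, base_url=BASE_URL):
    """Build a GET URL for the /geoclient/v1/address endpoint.

    The parameter names below are assumptions for illustration.
    """
    params = {"houseNumber": house_number, "street": street, "borough": borough}
    # urlencode escapes spaces and punctuation so the values are URL-safe.
    return f"{base_url}/geoclient/v1/address?{urlencode(params)}"


url = address_request_url("314", "West 100 Street", "Manhattan")
print(url)
```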

Building on its ability to parse single-field search locations, Geoclient can also be used to “guess” the intended target of an incomplete or ambiguous input. One example is the submission of an address without a borough. As long as Geoclient can recognize the search text as an address (house number and street), the /geoclient/v1/search endpoint will try that address in all five boroughs. By default, if the address exists in one or more boroughs, all locations will be fully geocoded and returned as possible matches. This behavior can be customized, for example, by calling the service with the optional ‘exactMatchForSingleSuccess’ parameter set to true. In that case, if the address exists in only one borough, it will be geocoded and returned as an exact match.
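
The borough fallback described above can be sketched roughly as follows. This is not Geoclient's actual implementation: the lookup function, the toy data and the status labels (EXACT_MATCH, POSSIBLE_MATCH, NO_MATCH) are all hypothetical and exist only to show the decision logic.

```python
BOROUGHS = ["Manhattan", "Bronx", "Brooklyn", "Queens", "Staten Island"]

# Toy data for illustration only; "1 Example Street" is invented.
KNOWN = {
    ("59", "maiden lane"): ["Manhattan"],
    ("1", "example street"): ["Brooklyn", "Queens"],
}


def toy_lookup(house_number, street, borough):
    """Hypothetical stand-in for a Geosupport address call."""
    if borough in KNOWN.get((house_number, street.lower()), []):
        return {"houseNumber": house_number, "street": street, "borough": borough}
    return None


def search_without_borough(house_number, street, lookup=toy_lookup,
                           exact_match_for_single_success=False):
    """Try the address in all five boroughs and classify the outcome."""
    hits = []
    for borough in BOROUGHS:
        result = lookup(house_number, street, borough)
        if result is not None:
            hits.append(result)
    if not hits:
        return {"status": "NO_MATCH", "results": []}
    # A single hit is promoted to an exact match only when the caller asks for it.
    if len(hits) == 1 and exact_match_for_single_success:
        return {"status": "EXACT_MATCH", "results": hits}
    return {"status": "POSSIBLE_MATCH", "results": hits}


default = search_without_borough("59", "Maiden Lane")
exact = search_without_borough("59", "Maiden Lane", exact_match_for_single_success=True)
several = search_without_borough("1", "Example Street", exact_match_for_single_success=True)
```

With the flag left at its default, even a single hit comes back as a possible match. An address found in more than one borough stays a candidate list regardless of the flag.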

Using the previous example, entering “59 maiden ln” without the borough will result in the same response as entering “59 Maiden Lane, Manhattan”. The figure below shows the response for “59 Maiden Ln” using the Pre-K Finder application.

[Figure: geoclient_imputed]

In cases where the same address exists in multiple boroughs, Geoclient will return a candidate list of possible addresses. In each case, the candidate address is validated to ensure a successful response. See example below.

[Figure: geoclient_candidate]

Developers can customize this and several other search features as documented by the Geoclient API.

Hopefully this will clear up some of the misconceptions associated with Geoclient and contribute to the knowledge base. And please check back in the coming weeks for a more detailed Geoclient post. In addition, expect some enhancements over the coming months. Nice work Matt!

NYC Addressing: A Primer

There is no mystery or intrigue in defining the primary function of an address: to locate or identify a property. And although we often take addressing (defined here as the process of assigning and using addresses) for granted, it provides an essential function to all. This is evident in daily life, where addresses are used by individuals, corporations and governments as they interact and conduct business. Common examples across this spectrum are the delivery of mail and packages, police and fire departments responding to 911 emergency service calls, and generally navigating the areas we inhabit or visit. Addressing is the fuel that makes our cities, towns and villages run clean.

The mystery and intrigue come from improper or confusing addresses, which can cause problems or delays in the delivery of services and the response to emergency incidents. Numerous stories exist of problems encountered by first responders dispatched to problematic addresses. Standardized and predictable addressing makes locating an address quicker and easier for all parties. When and where possible, it is best to assign addresses:

  • in logical numeric sequence;
  • consistently across a single block (all with or without hyphens);
  • with odd and even house numbers on separate street sides;
  • to the street a property fronts;
  • that are not duplicates of existing addresses.
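
Two of these guidelines (odd and even numbers on separate street sides, and no duplicates) are easy to check mechanically. The sketch below is a simplified illustration that assumes plain integer house numbers and a hypothetical 'L'/'R' side code; it ignores hyphenated numbers and suffixes.

```python
from collections import Counter


def check_block(addresses):
    """Return a list of issues for one block.

    `addresses` is a list of (house_number, side) pairs, where house_number
    is an int and side is 'L' or 'R' (a hypothetical encoding).
    """
    issues = []
    # Guideline: no duplicates of existing addresses.
    counts = Counter(number for number, _ in addresses)
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    if duplicates:
        issues.append(f"duplicate house numbers: {duplicates}")
    # Guideline: odd and even house numbers on separate street sides.
    for side in ("L", "R"):
        parities = {number % 2 for number, s in addresses if s == side}
        if len(parities) > 1:
            issues.append(f"odd and even numbers mixed on side {side}")
    return issues


clean = check_block([(1, "L"), (3, "L"), (2, "R"), (4, "R")])
messy = check_block([(1, "L"), (2, "L"), (2, "R")])
```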

There is no single authority overseeing addressing within NYC. Addresses are assigned by the Topographical Units of the respective Borough Presidents’ (BP) Offices. That is, the Queens BP assigns addresses only within Queens, and so forth. NYC DoITT provides a secure web-based application for BPs to make address and street name assignments. The application ensures centralized storage of address assignments, notification of responsible parties (911), and consistency of address assignments across boroughs.

Addresses are assigned to buildings for the following general cases:

  • new construction;
  • additional entrance to an existing business;
  • change of address for an existing building;
  • storefront business.

Unique Cases

It sometimes seems as if NYC has each and every possible address anomaly, although that is most certainly not the case. Below are just a few of the types of address anomalies found in NYC.

Vanity Address: an address for a building that uses a street or place name on which the building does not front. The figure below provides an excellent example and illustrates the challenges vanity addresses pose. Imagine trying to find 16 Penn Plaza while standing in front of 2 Penn Plaza.

Penn Plaza Area

Hyphenated Address: often referred to as Queens-style addresses, a hyphenated address has a hyphen in the house number (e.g., 70-111). The left side of the hyphen represents the nearest cross street exclusive of avenues and the right side of the hyphen represents the house number.
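
A minimal sketch of separating the two parts of a hyphenated house number, assuming the number arrives as a plain string. The function name is hypothetical, and Geosupport's own handling of hyphenated numbers is certainly more involved.

```python
def split_hyphenated(house_number):
    """Split a house number at its first hyphen.

    Returns (left, right); left is None when there is no hyphen.
    """
    left, sep, right = house_number.partition("-")
    if not sep:
        return None, house_number
    return left, right


cross_street_part, house_part = split_hyphenated("70-111")
```

For “70-111” the left part is “70” and the right part is “111”. A non-hyphenated number such as “59” yields no left part.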

Edgewater Park: a gated community in the Bronx, Edgewater Park is divided into alphabetic sectors (A, B, C…) which are used in lieu of a street name for addressing. Geosupport uses Edgewater Park as the street name to avoid confusion to the extent possible. An example address is 111C Edgewater Park. See figure below.

Edgewater Park

Miscellaneous

House number containing fraction and letter: 138 1/2 B Edgewater Park, Bronx.

Odd and even house numbers on the same side of street: Park Row, Manhattan (see figure below)

Hyphenated and non-hyphenated address on the same block: Ann Street, Manhattan (see figure below)

Park Row Addresses

Address Data

There are two primary methods for modeling and managing addresses in a geospatial database. The first is by street, which is commonly referred to as a street centerline. This method models the high and low house numbers on a street segment (i.e., block) for each side of the street. Geocoders then interpolate an input address proportionately between the high and low house number range on the respective side of the street. Geocoded addresses using this method are approximations of actual addresses and include hypothetical non-existent addresses.
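
The interpolation step can be sketched as simple linear interpolation along one side of a segment. This is an illustration under simplifying assumptions (a straight segment, planar coordinates, no offset from the centerline), not the method of any particular geocoder; the function and parameter names are invented for the example.

```python
def interpolate_address(house_number, low, high, start, end):
    """Estimate a point for house_number on a segment side.

    low/high are the house number range for that side; start/end are the
    (x, y) coordinates of the segment ends matching low and high.
    """
    if not low <= house_number <= high:
        raise ValueError("house number outside the segment range")
    # Fraction of the way along the segment; use the midpoint for a one-number range.
    t = 0.5 if high == low else (house_number - low) / (high - low)
    x = start[0] + t * (end[0] - start[0])
    y = start[1] + t * (end[1] - start[1])
    return x, y


point = interpolate_address(150, low=100, high=200, start=(0, 0), end=(200, 0))
```

Here house number 150 lands halfway along the segment. The function returns a point whether or not a building with that number exists, which is why this method can produce hypothetical addresses.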

The second, and more recent, approach is to represent each individual address as a point; the resulting data are referred to as address points. With this method, each address is generally modeled within the building in which it falls. Both methods are used by NYC and both data sets are available to the public.

Address points is a geospatial dataset that models the approximate entrance of a building and includes the property’s signed address (house number, street name). Address points were developed by NYC DoITT and completed in 2012. The data were subsequently released to the public in 2013. Since that time, the data have been released on a quarterly basis.

Data sources

CSCL (Citywide Street Centerline) – models only physical streets and does not have duplicate segments for cases where there are alternate street names.

LION – an extract from CSCL that includes both roadbed (modeling each side of a dual carriageway) and generic (modeling a dual carriageway as a single line) representations. LION provides both to support legacy use of the data. In addition, LION has duplicate segments for each alternate name of a street segment.

Address points – a point dataset representing all known addresses.

Other Resources

Manhattan BP – http://manhattanbp.nyc.gov/downloads/pdf/address-assignments-v-web.pdf