Skip Headers
Oracle® Spatial Developer's Guide
11g Release 2 (11.2)

Part Number E11830-02
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
View PDF

11 Geocoding Address Data

Geocoding is the process of associating spatial locations (longitude and latitude coordinates) with postal addresses. This chapter includes the following major sections:

11.1 Concepts for Geocoding

This section describes concepts that you must understand before you use the Spatial geocoding capabilities.

11.1.1 Address Representation

Addresses to be geocoded can be represented either as formatted addresses or unformatted addresses.

A formatted address is described by a set of attributes for various parts of the address, which can include some or all of those shown in Table 11-1.

Table 11-1 Attributes for Formal Address Representation

Address Attribute Description

Name

Place name (optional).

Intersecting street

Intersecting street name (optional).

Street

Street address, including the house or building number, street name, street type (Street, Road, Blvd, and so on), and possibly other information.

In the current release, the first four characters of the street name must match a street name in the geocoding data for there to be a potential street name match.

Settlement

The lowest-level administrative area to which the address belongs. In most cases it is the city. In some European countries, the settlement can be an area within a large city, in which case the large city is the municipality.

Municipality

The administrative area above settlement. Municipality is not used for United States addresses. In European countries where cities contain settlements, the municipality is the city.

Region

The administrative area above municipality (if applicable), or above settlement if municipality does not apply. In the United States, the region is the state; in some other countries, the region is the province.

Postal code

Postal code (optional if administrative area information is provided). In the United States, the postal code is the 5-digit ZIP code.

Postal add-on code

String appended to the postal code. In the United States, the postal add-on code is typically the last four numbers of a 9-digit ZIP code specified in "5-4" format.

Country

The country name or ISO country code.


Formatted addresses are specified using the SDO_GEO_ADDR data type, which is described in Section 11.2.1.

An unformatted address is described using lines with information in the postal address format for the relevant country. The address lines must contain information essential for geocoding, and they might also contain information that is not needed for geocoding (something that is common in unprocessed postal addresses). An unformatted address is stored as an array of strings. For example, an address might consist of the following strings: '22 Monument Square' and 'Concord, MA 01742'.

Unformatted addresses are specified using the SDO_KEYWORDARRAY data type, which is described in Section 11.2.3.

11.1.2 Match Modes

The match mode for a geocoding operation determines how closely the attributes of an input address must match the data being used for the geocoding. Input addresses can include different ways of representing the same thing (such as Street and the abbreviation St), and they can include minor errors (such as the wrong postal code, even though the street address and city are correct and the street address is unique within the city).

You can require an exact match between the input address and the data used for geocoding, or you can relax the requirements for some attributes so that geocoding can be performed despite certain discrepancies or errors in the input addresses. Table 11-2 lists the match modes and their meanings. Use a value from this table with the MatchMode attribute of the SDO_GEO_ADDR data type (described in Section 11.2.1) and for the match_mode parameter of a geocoding function or procedure.

Table 11-2 Match Modes for Geocoding Operations

Match Mode Description

EXACT

All attributes of the input address must match the data used for geocoding. However, if the house or building number, base name (street name), street type, street prefix, and street suffix do not all match the geocoding data, a location in the first match found in the following is returned: postal code, city or town (settlement) within the state, and state. For example, if the street name is incorrect but a valid postal code is specified, a location in the postal code is returned.

RELAX_STREET_TYPE

The street type can be different from the data used for geocoding. For example, if Main St is in the data used for geocoding, Main Street would also match that, as would Main Blvd if there was no Main Blvd and no other street type named Main in the relevant area.

RELAX_POI_NAME

The name of the point of interest does not have to match the data used for geocoding. For example, if Jones State Park is in the data used for geocoding, Jones State Pk and Jones Park would also match as long as there were no ambiguities or other matches in the data.

RELAX_HOUSE_NUMBER

The house or building number and street type can be different from the data used for geocoding. For example, if 123 Main St is in the data used for geocoding, 123 Main Lane and 124 Main St would also match as long as there were no ambiguities or other matches in the data.

RELAX_BASE_NAME

The base name of the street, the house or building number, and the street type can be different from the data used for geocoding. For example, if Pleasant Valley is the base name of a street in the data used for geocoding, Pleasant Vale would also match as long as there were no ambiguities or other matches in the data.

RELAX_POSTAL_CODE

The postal code (if provided), base name, house or building number, and street type can be different from the data used for geocoding.

RELAX_BUILTUP_AREA

The address can be outside the city specified as long as it is within the same county. Also includes the characteristics of RELAX_POSTAL_CODE.

RELAX_ALL

Equivalent to RELAX_BUILTUP_AREA.

DEFAULT

Equivalent to RELAX_POSTAL_CODE.


11.1.3 Match Codes

The match code is a number indicating which input address attributes matched the data used for geocoding. The match code is stored in the MatchCode attribute of the output SDO_GEO_ADDR object (described in Section 11.2.1).

Table 11-3 lists the possible match code values.

Table 11-3 Match Codes for Geocoding Operations

Match Code Description

1

Exact match: the city name, postal code, street base name, street type (and suffix or prefix or both, if applicable), and house or building number match the data used for geocoding.

2

The city name, postal code, street base name, and house or building number match the data used for geocoding, but the street type, suffix, or prefix does not match.

3

The city name, postal code, and street base name match the data used for geocoding, but the house or building number does not match.

4

The city name and postal code match the data used for geocoding, but the street address does not match.

10

The city name matches the data used for geocoding, but the postal code does not match.

11

The postal code matches the data used for geocoding, but the city name does not match.


11.1.4 Error Messages for Output Geocoded Addresses

Note:

You are encouraged to use the MatchVector attribute (see Section 11.1.5) instead of the ErrorMessage attribute, which is described in this section.

For an output geocoded address, the ErrorMessage attribute of the SDO_GEO_ADDR object (described in Section 11.2.1) contains a string that indicates which address attributes have been matched against the data used for geocoding. Before the geocoding operation begins, the string is set to the value ???????????281C??; and the value is modified to reflect which attributes have been matched.

Table 11-4 lists the character positions in the string and the address attribute corresponding to each position. It also lists the character value that the position is set to if the attribute is matched.

Table 11-4 Geocoded Address Error Message Interpretation

Position Attribute Value If Matched

1-4

(Reserved for future use)

????

5

House or building number

#

6

Street prefix

E

7

Street base name

N

8

Street suffix

U

9

Street type

T

10

Secondary unit

S

11

Built-up area or city

B

12-13

(Not used by Spatial)

(Ignore any values in these positions.)

14

Region

1

15

Country

C

16

Postal code

P

17

Postal add-on code

A


11.1.5 Match Vector for Output Geocoded Addresses

For an output geocoded address, the MatchVector attribute of the SDO_GEO_ADDR object (described in Section 11.2.1) contains a string that indicates how each address attribute has been matched against the data used for geocoding. It gives more accurate and detailed information about the match status of each address attribute than the ErrorMessage attribute (described in Section 11.1.4). Before the geocoding operation begins, the string is set to the value ?????????????????. Each character of this string indicates the match status of an address attribute.

Table 11-5 lists the character positions in the string and the address attribute corresponding to each position. Following the table is an explanation of what the value in each character position represents.

Table 11-5 Geocoded Address Match Vector Interpretation

Position Attribute

1-4

(Reserved for future use)

5

House or building number

6

Street prefix

7

Street base name

8

Street suffix

9

Street type

10

Secondary unit

11

Built-up area or city

14

Region

15

Country

16

Postal code

17

Postal add-on code


Each character position in Table 11-5 can have one of the following possible numeric values:

  • 0: The input attribute is not null and is matched with a non-null value.

  • 1: The input attribute is null and is matched with a null value.

  • 2: The input attribute is not null and is replaced by a different non-null value.

  • 3: The input attribute is not null and is replaced by a null value.

  • 4: The input attribute is null and is replaced by a non-null value.

11.2 Data Types for Geocoding

This section describes the data types specific to geocoding functions and procedures.

11.2.1 SDO_GEO_ADDR Type

The SDO_GEO_ADDR object type is used to describe an address. When a geocoded address is output by an SDO_GCDR function or procedure, it is stored as an object of type SDO_GEO_ADDR.

Table 11-6 lists the attributes of the SDO_GEO_ADDR type. Not all attributes will be relevant in any given case. The attributes used for a returned geocoded address depend on the geographical context of the input address, especially the country.

Table 11-6 SDO_GEO_ADDR Type Attributes

Attribute Data Type Description

Id

NUMBER

(Not used.)

AddressLines

SDO_KEYWORDARRAY

Address lines. (The SDO_KEYWORDARRAY type is described in Section 11.2.3.)

PlaceName

VARCHAR2(200)

Point of interest (POI) name. Example: CALIFORNIA PACIFIC MEDICAL CTR

StreetName

VARCHAR2(200)

Street name, including street type. Example: MAIN ST

IntersectStreet

VARCHAR2(200)

Intersecting street.

SecUnit

VARCHAR2(200)

Secondary unit, such as an apartment number or building number.

Settlement

VARCHAR2(200)

Lowest-level administrative area to which the address belongs. (See Table 11-1.)

Municipality

VARCHAR2(200)

Administrative area above settlement. (See Table 11-1.)

Region

VARCHAR2(200)

Administrative area above municipality (if applicable), or above settlement if municipality does not apply. (See Table 11-1.)

Country

VARCHAR2(100)

Country name or ISO country code.

PostalCode

VARCHAR2(20)

Postal code (optional if administrative area information is provided). In the United States, the postal code is the 5-digit ZIP code.

PostalAddOnCode

VARCHAR2(20)

String appended to the postal code. In the United States, the postal add-on code is typically the last four numbers of a 9-digit ZIP code specified in "5-4" format.

FullPostalCode

VARCHAR2(20)

Full postal code, including the postal code and postal add-on code.

POBox

VARCHAR2(100)

Post Office box number.

HouseNumber

VARCHAR2(100)

House or building number. Example: 123 in 123 MAIN ST

BaseName

VARCHAR2(200)

Base name of the street. Example: MAIN in 123 MAIN ST

StreetType

VARCHAR2(20)

Type of the street. Example: ST in 123 MAIN ST

StreetTypeBefore

VARCHAR2(1)

(Not used.)

StreetTypeAttached

VARCHAR2(1)

(Not used.)

StreetPrefix

VARCHAR2(20)

Prefix for the street. Example: S in 123 S MAIN ST

StreetSuffix

VARCHAR2(20)

Suffix for the street. Example: NE in 123 MAIN ST NE

Side

VARCHAR2(1)

Side of the street (L for left or R for right) that the house is on when you are traveling along the road segment following its orientation (that is, from its start node toward its end node). The house numbers may be increasing or decreasing.

Percent

NUMBER

Number from 0 to 1 (multiply by 100 to get a percentage value) indicating how far along the street you are when traveling following the road segment orientation.

EdgeID

NUMBER

Edge ID of the road segment.

ErrorMessage

VARCHAR2(20)

Error message (see Section 11.1.4). Note: You are encouraged to use the MatchVector attribute instead of the ErrorMessage attribute.

MatchCode

NUMBER

Match code (see Section 11.1.3).

MatchMode

VARCHAR2(30)

Match mode (see Section 11.1.2).

Longitude

NUMBER

Longitude coordinate value.

Latitude

NUMBER

Latitude coordinate value.

MatchVector

VARCHAR2(20)

A string that indicates how each address attribute has been matched against the data used for geocoding (see Section 11.1.5).


You can return the entire SDO_GEO_ADDR object, or you can specify an attribute using standard "dot" notation. Example 11-1 contains statements that geocode the address of the San Francisco City Hall; the first statement returns the entire SDO_GEO_ADDR object, and the remaining statements return some specific attributes.

Example 11-1 Geocoding, Returning Address Object and Specific Attributes

SELECT SDO_GCDR.GEOCODE('SCOTT', 
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'), 
    'US', 'RELAX_BASE_NAME') FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
SDO_GEO_ADDR(0, SDO_KEYWORDARRAY(), NULL, 'CARLTON B GOODLETT PL', NULL, NULL, '
SAN FRANCISCO', NULL, 'CA', 'US', '94102', NULL, '94102', NULL, '1', 'CARLTON B
GOODLETT', 'PL', 'F', 'F', NULL, NULL, 'L', .01, 23614360, '????#ENUT?B281CP?',
1, 'RELAX_BASE_NAME', -122.41815, 37.7784183, '????0101010??000?')

SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').StreetType  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
PL                                                                              
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').Side  RROM DUAL;
 
S                                                                               
-                                                                               
L                                                                               
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').Percent  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
                                                                             .01
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').EdgeID  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
                                                                        23614360
 
SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').MatchCode  FROM DUAL;
 
SDO_GCDR.GEOCODE('SCOTT',SDO_KEYWORDARRAY('1CARLTONBGOODLETTPL','SANFRANCISCO
--------------------------------------------------------------------------------
                                                                               1

SELECT SDO_GCDR.GEOCODE('SCOTT',
  SDO_KEYWORDARRAY('1 Carlton B Goodlett Pl', 'San Francisco, CA  94102'),
  'US', 'RELAX_BASE_NAME').MatchVector  FROM DUAL;
 
SDO_GCDR.GEOCODE('SC
--------------------
????0101010??000?

11.2.2 SDO_ADDR_ARRAY Type

The SDO_ADDR_ARRAY type is a VARRAY of SDO_GEO_ADDR objects (described in Section 11.2.1) used to store geocoded address results. Multiple address objects can be returned when multiple addresses are matched as a result of a geocoding operation.

The SDO_ADDR_ARRAY type is defined as follows:

CREATE TYPE sdo_addr_array AS VARRAY(1000) OF sdo_geo_addr;

11.2.3 SDO_KEYWORDARRAY Type

The SDO_KEYWORDARRAY type is a VARRAY of VARCHAR2 strings used to store address lines for unformatted addresses. (Formatted and unformatted addresses are described in Section 11.1.1.)

The SDO_KEYWORDARRAY type is defined as follows:

CREATE TYPE sdo_keywordarray AS VARRAY(10000) OF VARCHAR2(9000);

11.3 Using the Geocoding Capabilities

To use the Oracle Spatial geocoding capabilities, you must use data provided by a geocoding vendor, and the data must be in the format supported by the Oracle Spatial geocoding feature. For information about getting and loading this data, go to the Spatial page of the Oracle Technology Network (OTN):

http://www.oracle.com/technology/products/spatial/

Find the link for geocoding, and follow the instructions.

To geocode an address using the geocoding data, use the SDO_GCDR PL/SQL package subprograms, which are documented in Chapter 23:

11.4 Geocoding from a Place Name

If you know a place name (point of interest) but not its locality details, you can create a PL/SQL function to construct an SDO_GEO_ADDR object from placename and country input parameters, as shown in Example 11-2, which creates a function named create_addr_from_placename. The SELECT statement in this example uses the SDO_GCDR.GEOCODE_ADDR function to geocode the address constructed using the create_addr_from_placename function.

Example 11-2 Geocoding from a Place Name and Country

create or replace function create_addr_from_placename(
placename varchar2,
country varchar2)
return sdo_geo_addr
as
  addr sdo_geo_addr ;
  begin
  addr := sdo_geo_addr() ;
  addr.country := country ;
  addr.placename := placename ;
  addr.matchmode := 'default' ;
  return addr ;
  end;
  /
 
SELECT sdo_gcdr.geocode_addr('SCOTT', 
  create_addr_from_placename('CALIFORNIA PACIFIC MEDICAL CTR', 'US')) 
FROM DUAL;

If you know at least some of the locality information, such as settlement, region, and postal code, you can get better performance if you can provide such information. Example 11-3 provides an alternate version of the create_addr_from_placename function that accepts additional parameters. To call this version of the function, specify actual values for the placename and country parameters, and specify an actual value or a null value for each of the other input parameters.

Example 11-3 Geocoding from a Place Name, Country, and Other Fields

create or replace function create_addr_from_placename(
placename varchar2,
city varchar2,
state varchar2,
postalcode varchar2,
country varchar2)
return sdo_geo_addr
as
  addr sdo_geo_addr ;
  begin
  addr := sdo_geo_addr() ;
  addr.settlement := city ;
  addr.region := state ;
  addr.postalcode := postalcode ;
  addr.country := country ;
  addr.placename := placename ;
  addr.matchmode := 'default' ;
  return addr ;
  end;
  /
 
SELECT sdo_gcdr.geocode_addr('SCOTT', 
  create_addr_from_placename('CALIFORNIA PACIFIC MEDICAL CTR', 
    'san francisco', 'ca', null, 'US')) FROM DUAL;

11.5 Data Structures for Geocoding

Oracle uses the following tables for geocoding:

The GC_PARSER_PROFILES and GC_PARSER_PROFILEAFS tables store address format definitions of all supported counties. These tables are used by the internal address parser in parsing postal addresses into addressing fields. The data for these two tables is provided by your data provider or by Oracle. (If these tables are not supplied by your data provider, you will need to install and populate them as explained in Section 11.6.) The remaining tables store geocoding data provided by data vendors.

Each user that owns the tables containing geocoding data (that is, each user that can be specified with the username parameter in a call to an SDO_GCDR subprogram) must have one GC_PARSER_PROFILES table, one GC_PARSER_PROFILEAFS table, and one GC_COUNTRY_PROFILE table. Each such user can have multiple sets of the other tables (GC_xxx_<suffix>). Each set of tables whose names end with the same suffix stores geocoding data of a country. For example, the following set of tables can be used to store geocoding data of the United States:

Geocoding data of one country cannot be stored in more than one set of those tables. The table suffix is defined by data venders and is specified in the GC_TABLE_SUFFIX column in the GC_COUNTRY_PROFILE table (described in Section 11.5.2).

The following sections describe the vendor-supplied tables that store geocoding data, in alphabetical order by table name.

Section 11.5.8 describes the indexes that you must create in order to use these tables for geocoding.

11.5.1 GC_AREA_<suffix> Table

The GC_AREA_<suffix> table (for example, CG_AREA_US) stores administration area information for the country associated with the table name suffix. This table contains one row for each administration area, and it contains the columns shown in Table 11-7.

Table 11-7 GC_AREA_<suffix> Table

Column Name Data Type Description

AREA_ID

NUMBER(10)

Area ID number. (Required)

AREA_NAME

VARCHAR2(64)

Area name. (Required)

LANG_CODE

VARCHAR2(3)

3-letter ISO national language code for the language associated with the area. (Required)

ADMIN_LEVEL

NUMBER(1)

Administration hierarchy level for the area. (Required)

LEVEL1_AREA_ID

NUMBER(10)

ID of the level-1 area to which the area belongs. In the administration hierarchy, the level-1 area is the country. (Required)

LEVEL2_AREA_ID

NUMBER(10)

ID of the level-2 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional)

LEVEL3_AREA_ID

NUMBER(10)

ID of the level-3 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional)

LEVEL4_AREA_ID

NUMBER(10)

ID of the level-4 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional)

LEVEL5_AREA_ID

NUMBER(10)

ID of the level-5 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional)

LEVEL6_AREA_ID

NUMBER(10)

ID of the level-6 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional)

LEVEL7_AREA_ID

NUMBER(10)

ID of the level-7 area to which the area belongs, if applicable. You must specify an area ID for each level in the administration hierarchy to which this area belongs. (Optional)

CENTER_LONG

NUMBER

Longitude value of the center of the area. The center is set to the closest road segment to the center longitude and latitude values. Oracle recommends that these two attributes be set properly. If these values are not set, the longitude and latitude coordinates of the geocoded result of an area will be (0,0). (Optional)

CENTER_LAT

NUMBER

Latitude value of the center of the area. (See the explanation for the CENTER_LONG column.) (Optional)

ROAD_SEGMENT_ID

NUMBER(10)

ID of the road segment to which the area center is set. This value must be set correctly if the geocoder is intended to work with the Oracle Spatial routing engine (described in Chapter 13); otherwise, it can be set to any nonzero value, but it cannot be null. (Required)

POSTAL_CODE

VARCHAR2(16)

Postal code for the center of the area. Oracle recommends that this attribute be set correctly. If this value is null, the postal code attribute of the geocoded result of an area will be null. (Optional)

COUNTRY_CODE_2

VARCHAR2(2)

2- letter ISO country code of the country to which the area belongs. (Required)

PARTITION_ID

NUMBER

Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required)

REAL_NAME

VARCHAR2(64)

The real name of the area, as spelled using the local language. This column is useful for area names that are not in English. For example, the German name of city MUNICH is MÜNCHEN. It is allowed to be spelled as MUNCHEN, but its REAL_NAME value should be MÜNCHEN. In the area table for Germany, areas with name MÜNCHEN and MUNCHEN both refer to the same area, and they both have the same real name MÜNCHEN. If the area name does not have any non-English characters, set REAL_NAME to be the same as AREA_NAME. (Required)

IS_ALIAS

VARCHAR2(1)

Contains T if this area is an alias of another area that is an officially recognized administrative area; contains F if this area is not an alias of another area that is an officially recognized administrative area. For example, Manhattan is not an officially recognized administrative area, but it is used by the public to refer to a part of New York City. In this case, Manhattan is an alias of New York City. (Required)

NUM_STREETS

NUMBER

The number of streets inside this area. (Optional)


11.5.2 GC_COUNTRY_PROFILE Table

The GC_COUNTRY_PROFILE table stores country profile information used by the geocoder. This table contains one row for each supported country, and it contains the columns shown in Table 11-8.

Table 11-8 GC_COUNTRY_PROFILE Table

Column Name Data Type Description

COUNTRY_NAME

VARCHAR2(60)

Country name. (Required)

COUNTRY_CODE_3

VARCHAR2(3)

3- letter ISO country code. (Required)

COUNTRY_CODE_2

VARCHAR2(2)

2- letter ISO country code. (Required)

LANG_CODE_1

VARCHAR2(3)

3-letter ISO national language code. Some country might have multiple national languages, in which case LANG_CODE_2 and perhaps other columns should contain values. (Required)

LANG_CODE_2

VARCHAR2(3)

3-letter ISO national language code. (Optional)

LANG_CODE_3

VARCHAR2(3)

3-letter ISO national language code. (Optional)

LANG_CODE_4

VARCHAR2(3)

3-letter ISO national language code. (Optional)

NUMBER_ADMIN_LEVELS

NUMBER(1)

Number of administration hierarchy levels. A country can have up to 7 administration area levels, numbered from 1 to 7. The top level area (country) is level 1. For the United States, the administration hierarchy is as follows: level 1 = country, level 2 = state, level 3 = county, level 4 = city. (Required)

SETTLEMENT_LEVEL

NUMBER(1)

Administration hierarchy level for a settlement, which is the lowest area level used in addressing. In the United States, this is the city level. (Required)

MUNICIPALITY_LEVEL

NUMBER(1)

Administration hierarchy level for a municipality, which is the second-lowest area level used in addressing. In the United States, this is the county level. (Optional)

REGION_LEVEL

NUMBER(1)

Administration hierarchy level for the region level used in addressing. (Optional)

SETTLEMENT_IS_OPTIONAL

VARCHAR2(1)

Contains T if settlement information is required in the address data; contains F if settlement information is not required in the address data. (Required)

MUNICIPALITY_IS_OPTIONAL

VARCHAR2(1)

Contains T if municipality information is required in the address data; contains F if municipality information is not required in the address data. (Required)

REGION_IS_OPTIONAL

VARCHAR2(1)

Contains T if region information is required in the address data; contains F if region information is not required in the address data. (Required)

POSTCODE_IN_SETTLEMENT

VARCHAR(1)

Contains T if each postal code must be completely within a settlement area; contains F if a postal code can include area from multiple settlements. (Required)

SETTLEMENT_AS_CITY

VARCHAR(1)

Contains T if a city name can identify both a municipality and a settlement; contains F if a city name can identify only a settlement. For example, in the United Kingdom, London can be both the name of a municipality area and the name of a settlement area, which is inside the municipality of London. This is common in large cities in some European countries, such as the UK and Belgium. (Required)

CACHED_ADMIN_AREA_LEVEL

NUMBER

(Reserved for future use.)

GC_TABLE_SUFFIX

VARCHAR2(5)

Table name suffix identifying the country. For example, if the value of GC_TABLE_SUFFIX is US, the names of tables with geocoding data for this country end with _US (for example, CG_AREA_US). (Required)

CENTER_LONG

NUMBER

Longitude value of the center of the area. (Optional)

CENTER_LAT

NUMBER

Latitude value of the center of the area. (Optional)

SEPARATE_PREFIX

VARCHAR2(1)

Contains T if the street name prefix is a separate word from the street name; contains F if the street name prefix is in the same word with the street name. For example, in an American street address of 123 N Main St, the prefix is N, and it is separate from the street name, which is Main. (Optional; not currently used by Oracle)

SEPARATE_SUFFIX

VARCHAR2(1)

Contains T if the street name suffix is a separate word from the street name; contains F if the street name suffix is in the same word with the street name. For example, in an American street address of 123 Main St NW, the suffix is NW, and it is separate from the street name, which is Main, and from the street type, which is St. (Optional; not currently used by Oracle)

SEPARATE_STYPE

VARCHAR2(1)

Contains T if the street type is a separate word from the street name; contains F if the street type is in the same word with the street name. For example, in a German street address of 123 Beethovenstrass, the type is strass, and it is in the same word with the street name, which is Beethoven. (Optional; not currently used by Oracle)

AREA_ID

NUMBER

Not currently used by Oracle. (Optional)

VERSION

VARCHAR2(10)

Version of the data. The first version should be 1.0. (Required)


11.5.3 GC_INTERSECTION_<suffix> Table

The GC_INTERSECTION_<suffix> table (for example, GC_INTERSECTION_US) stores road intersection information. An intersection is typically associated with multiple roads. Each row represents an intersection and two different roads that intersect with each other at this intersection. This table contains the columns shown in Table 11-9.

Table 11-9 GC_INTERSECTION_<suffix> Table

Column Name Data Type Description

ROAD_ID_1

NUMBER

ID number of the first road on which the intersection is located. (Required)

ROAD_SEGMENT_ID_1

NUMBER

ID number of the road segment on the first road on which the intersection is located. (Required)

ROAD_ID_2

NUMBER

ID number of the second road on which the intersection is located. (Required)

ROAD_SEGMENT_ID_2

NUMBER

ID number of the road segment on the second road on which the intersection is located. (Required)

INTS_LONG

NUMBER

Longitude coordinate value of the intersection. (Required)

INTS_LAT

NUMBER

Latitude coordinate value of the intersection. (Required)

HOUSE_NUMBER

NUMBER

The leading numerical part of the house number at the intersection. (See the explanation of house numbers after Table 11-13 in Section 11.5.7.) (Required)

HOUSE_NUMBER_2

VARCHAR2(10)

The second part of the house number at the intersection. (See the explanation of house numbers after Table 11-13 in Section 11.5.7.) (Required)

SIDE

VARCHAR2(1)

Side of the street on which the house at the intersection is located. Possible values: L (left) or R (right). (Required)

COUNTRY_CODE_2

VARCHAR2(2)

2- letter ISO country code of the country to which the house at the intersection belongs. (Required)

PARTITION_ID

NUMBER

Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required)


11.5.4 GC_POI_<suffix> Table

The GC_POI_<suffix> table (for example, GC_POI_US) stores point of interest (POI) information for the country associated with the table name suffix. This table contains one or more rows for each point of interest. (For example, it can contain multiple rows for a POI if the POI is associated with multiple settlements.) The GC_POI_<suffix> table contains the columns shown in Table 11-10.

Table 11-10 GC_POI_<suffix> Table

Column Name Data Type Description

POI_ID

NUMBER

ID number of the POI. (Required)

POI_NAME

VARCHAR2(64)

Name of the POI. (Required)

LANG_CODE

VARCHAR2(3)

3-letter ISO national language code for the language for the POI name. (Required)

FEATURE_CODE

NUMBER

Feature code for the POI, if the data vendor classifies POIs by category. (Optional)

HOUSE_NUMBER

VARCHAR2(10)

House number of the POI; may contain non-numeric characters. (Required)

STREET_NAME

VARCHAR2(80)

Street name of the POI. (Required)

SETTLEMENT_ID

NUMBER(10)

ID number of the settlement to which the POI belongs. (Required if the POI is associated with a settlement)

MUNICIPALITY_ID

NUMBER(10)

ID number of the municipality to which the POI belongs. (Required if the POI is associated with a municipality)

REGION_ID

NUMBER(10)

ID number of the region to which the POI belongs. (Required if the POI is associated with a region)

SETTLEMENT_NAME

VARCHAR2(64)

Name of the settlement to which the POI belongs. (Required if the POI is associated with a settlement)

MUNICIPALITY_NAME

VARCHAR2(64)

Name of the municipality to which the POI belongs. (Required if the POI is associated with a municipality)

REGION_NAME

VARCHAR2(64)

Name of the region to which the POI belongs. (Required if the POI is associated with a region)

POSTAL_CODE

VARCHAR2(16)

Name of the postal code of the POI. (Required)

VANITY_CITY

VARCHAR2(35)

Name of the city popularly associated with the POI, if it is different from the actual city containing the POI. For example, the London Heathrow Airport is actually located in a town named Hayes, which is part of greater London, but people tend to associate the airport only with London. In this case, the VANITY_CITY value is London. (Optional)

ROAD_SEGMENT_ID

NUMBER

ID of the road segment on which the POI is located. (Required)

SIDE

VARCHAR2(1)

Side of the street on which the POI is located. Possible values: L (left) or R (right). (Required)

PERCENT

NUMBER

Percentage value at which POI is located on the road. It is computed by dividing the distance from the street segment start point to the POI by the length of the street segment. (Required)

TELEPHONE_NUMBER

VARCHAR2(20)

Telephone number of the POI. (Optional)

LOC_LONG

NUMBER

Longitude coordinate value of the POI. (Required)

LOC_LAT

NUMBER

Latitude coordinate value of the POI. (Required)

COUNTRY_CODE_2

VARCHAR2(2)

2- letter ISO country code of the country to which the POI belongs. (Required)

PARTITION_ID

NUMBER

Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required)


11.5.5 GC_POSTAL_CODE_<suffix> Table

The GC_POSTAL_CODE_<suffix> table (for example, GC_POSTAL_CODE_US) stores postal code information for the country associated with the table name suffix. This table contains one or more rows for each postal code. (For example, it can contain multiple rows for a postal code if the postal code is associated with multiple settlements.) The GC_POSTAL_CODE_<suffix> table contains the columns shown in Table 11-11.

Table 11-11 GC_POSTAL_CODE_<suffix> Table

Column Name Data Type Description

POSTAL_CODE

VARCHAR2(16)

Postal code. (Required)

SETTLEMENT_NAME

VARCHAR2(64)

Name of the settlement to which the postal code belongs. (Required if the postal code is associated with a settlement)

MUNICIPALITY_NAME

VARCHAR2(64)

Name of the municipality to which the postal code belongs. (Required if the postal code is associated with a municipality)

REGION_NAME

VARCHAR2(64)

Name of the region to which the postal code belongs. (Required if the postal code is associated with a region)

LANG_CODE

VARCHAR2(3)

3-letter ISO national language code for the language associated with the area. (Required)

SETTLEMENT_ID

NUMBER(10)

ID number of the settlement to which the postal code belongs. (Required if the postal code is associated with a settlement)

MUNICIPALITY_ID

NUMBER(10)

ID number of the municipality to which the postal code belongs. (Required if the postal code is associated with a municipality)

REGION_ID

NUMBER(10)

ID number of the region to which the postal code belongs. (Required if the postal code is associated with a region)

CENTER_LONG

NUMBER

Longitude value of the center of the area. The center is set to the closest road segment to the center longitude and latitude values. Oracle recommends that these two attributes be set properly. If these values are not set, the longitude and latitude coordinates of the geocoded result of an area will be (0,0). (Optional)

CENTER_LAT

NUMBER

Latitude value of the center of the area. (See the explanation for the CENTER_LONG column.) (Optional)

ROAD_SEGMENT_ID

NUMBER(10)

ID of the road segment to which the area center is set. This value must be set correctly if the geocoder is intended to work with the Oracle Spatial routing engine (described in Chapter 13); otherwise, it can be set to any nonzero value, but it cannot be null. (Required)

COUNTRY_CODE_2

VARCHAR2(2)

2- letter ISO country code of the country to which the area belongs. (Required)

PARTITION_ID

NUMBER

Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required)

NUM_STREETS

NUMBER

The number of streets inside this area. (Optional)


11.5.6 GC_ROAD_<suffix> Table

The GC_ROAD_<suffix> table (for example, GC_ROAD_US) stores road information for the country associated with the table name suffix. A road is a collection of road segments with the same name in the same settlement area; a road segment (defined in the GC_ROAD_SEGMENT_<suffix> table) is the segment of the road between two continuous intersections. The GC_ROAD_<suffix> table contains one or more rows for each road. (For example, it can contain multiple rows for a road if the road is associated with multiple settlements.) The GC_ROAD_<suffix> table contains the columns shown in Table 11-12.

Table 11-12 GC_ROAD_<suffix> Table

Column Name Data Type Description

ROAD_ID

NUMBER

ID number of the road. (Required)

SETTLEMENT_ID

NUMBER(10)

ID number of the settlement to which the road belongs. (Required if the road is associated with a settlement)

MUNICIPALITY_ID

NUMBER(10)

ID number of the municipality to which the road belongs. (Required if the road is associated with a municipality)

PARENT_AREA_ID

NUMBER(10)

ID number of the parent area of the municipality to which the road belongs. (Required if the road is associated with a parent area)

LANG_CODE

VARCHAR2(3)

3-letter ISO national language code for the language for the road name. (Required)

NAME

VARCHAR2(64)

Name of the road, including the type (if any), the prefix (if any), and the suffix (if any). For example, N Main St as NAME, with Main as BASE_NAME. (Required)

BASE_NAME

VARCHAR2(64)

Name of the road, excluding the type (if any), the prefix (if any), and the suffix (if any). For example, N Main St as NAME, with Main as BASE_NAME. (Required)

PREFIX

VARCHAR2(32)

Prefix of the road name. For example, N Main St as NAME, with N as PREFIX. (Required if the road name has a prefix)

SUFFIX

VARCHAR2(32)

Suffix of the road name. For example, Main St NW as NAME, with NW as SUFFIX. (Required if the road name has a suffix)

STYPE_BEFORE

VARCHAR2(32)

Street type that precedes the base name. For example, Avenue Victor Hugo as NAME, with Avenue as STYPE_BEFORE and Victor Hugo as BASE_NAME. (Required if the road type precedes the base name)

STYPE_AFTER

VARCHAR2(32)

Street type that follows the base name. For example, Main St as NAME, with St as STYPE_AFTER and Main as BASE_NAME. (Required if the road type follows the base name)

STYPE_ATTACHED

VARCHAR2(1)

Contains T if the street type is in the same word with the street name; contains F if the street type is a separate word from the street name. For example, in a German street address of 123 Beethovenstrass, the street type is strass, and it is in the same word with the street name, which is Beethoven. (Required)

START_HN

NUMBER(5)

(Should be set to the same value as CENTER_HN; not currently used by Oracle)

CENTER_HN

NUMBER(5)

Leading numerical part of the center house number. The center house number is the left side house number at the start point of the center road segment, which is located in the center of the whole road. (See the explanation of house numbers after Table 11-13 in Section 11.5.7.) (Required)

END_HN

NUMBER(5)

(Should be set to the same value as CENTER_HN; not currently used by Oracle)

START_HN_SIDE

VARCHAR2(1)

(Should be set to the same value as CENTER_HN_SIDE; not currently used by Oracle)

CENTER_HN_SIDE

VARCHAR2(1)

Side of the road of the center house number: L for left or R for right. The center house number is the left side house number at the start point of the center road segment, which is located in the center of the whole road. (See the explanation of house numbers after Table 11-13 in Section 11.5.7.) (Required)

END_HN_SIDE

VARCHAR2(1)

(Should be set to the same value as CENTER_HN_SIDE; not currently used by Oracle)

START_LONG

NUMBER

(Should be set to the same value as CENTER_LONG; not currently used by Oracle)

START_LAT

NUMBER

(Should be set to the same value as CENTER_LAT; not currently used by Oracle)

CENTER_LONG

NUMBER

Longitude value of the center house number. The center house number is the left side house number at the start point of the center road segment, which is located in the center of the whole road. (See the explanation of house numbers after Table 11-13 in Section 11.5.7.) (Required)

CENTER_LAT

NUMBER

Latitude value of the center house number. (See also the explanation of the CENTER_LONG column.) (Required)

END_LONG

NUMBER

(Should be set to the same value as CENTER_LONG; not currently used by Oracle)

END_LAT

NUMBER

(Should be set to the same value as CENTER_LAT; not currently used by Oracle)

START_ROAD_SEG_ID

NUMBER(5)

(Should be set to the same value as CENTER_ROAD_SEG_ID; not currently used by Oracle)

CENTER_ROAD_SEG_ID

NUMBER(5)

ID number of the road segment at the center point of the road. (Required)

END_ROAD_SEG_ID

NUMBER(5)

(Should be set to the same value as CENTER_ROAD_SEG_ID; not currently used by Oracle)

POSTAL_CODE

VARCHAR2(16)

Postal code for the road. (Required)

COUNTRY_CODE_2

VARCHAR2(2)

2- letter ISO country code of the country to which the road belongs. (Required)

PARTITION_ID

NUMBER

Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required)

CENTER_HN2

VARCHAR2(10)

The second part of the center house number. (See the explanation of house numbers after Table 11-13 in Section 11.5.7.) (Required)


11.5.7 GC_ROAD_SEGMENT_<suffix> Table

The GC_ROAD_SEGMENT_<suffix> table (for example, GC_ROAD_SEGMENT_US) stores road segment information for the country associated with the table name suffix. A road segment is the segment of the road between two continuous intersections, while a road (defined in the GC_ROAD_<suffix> table) is a collection of road segments with the same name in the same settlement area. The GC_ROAD_SEGMENT_<suffix> table contains one row for each road segment, and it contains the columns shown in Table 11-13.

Table 11-13 GC_ROAD_SEGMENT_<suffix> Table

Column Name Data Type Description

ROAD_SEGMENT_ID

NUMBER

ID number of the road segment. (Required)

ROAD_ID

NUMBER

ID number of the road containing this road segment. (Required)

L_ADDR_FORMAT

VARCHAR2(1)

Left side address format. Specify N if there are one or more house numbers on the left side of the road segment; leave null if there is no house number on the left side of the road segment. (Required)

R_ADDR_FORMAT

VARCHAR2(1)

Right side address format. Specify N if there are one or more house numbers on the right side of the road segment; leave null if there is no house number on the right side of the road segment. (Required)

L_ADDR_SCHEME

VARCHAR2(1)

Numbering scheme for house numbers on the left side of the road segment: O (all odd numbers), E (all even numbers), or M (mixture of odd and even numbers). (Required)

R_ADDR_SCHEME

VARCHAR2(1)

Numbering scheme for house numbers on the right side of the road segment: O (all odd numbers), E (all even numbers), or M (mixture of odd and even numbers). (Required)

START_HN

NUMBER(5)

The lowest house number on this road segment. (Required)

END_HN

NUMBER(5)

The highest house number on this road segment. (Required)

L_START_HN

NUMBER(5)

The leading numerical part of the left side starting house number. (See the explanation of house numbers after this table.) (Required)

L_END_HN

NUMBER(5)

The leading numerical part of the left side ending house number. (See the explanation of house numbers after this table.) (Required)

R_START_HN

NUMBER(5)

The leading numerical part of the right side starting house number. (See the explanation of house numbers after this table.) (Required)

R_END_HN

NUMBER(5)

The leading numerical part of the right side ending house number. (See the explanation of house numbers after this table.) (Required)

POSTAL_CODE

VARCHAR2(16)

Postal code for the road segment. If the left side and right side of the road segment belong to two different postal codes, create two rows for the road segment with identical values in all columns except for POSTAL_CODE. (Required)

GEOMETRY

SDO_GEOMETRY

Spatial geometry object representing the road segment. (Required)

COUNTRY_CODE_2

VARCHAR2(2)

2- letter ISO country code of the country to which the road segment belongs. (Required)

PARTITION_ID

NUMBER

Partition key used for partitioning geocoder data by geographic boundaries. If the data is not partitioned, set the value to 1. (Required)

L_START_HN2

VARCHAR2(10)

The second part of the left side starting house number. (See the explanation of house numbers after this table.) (Required if the left side starting house number has a second part)

L_END_HN2

VARCHAR2(10)

The second part of the left side ending house number. (See the explanation of house numbers after this table.) (Required if the left side ending house number has a second part)

R_START_HN2

VARCHAR2(10)

The second part of the right side starting house number. (See the explanation of house numbers after this table.) (Required if the right side starting house number has a second part)

R_END_HN2

VARCHAR2(10)

The second part of the right side ending house number. (See the explanation of house numbers after this table.) (Required if the right side ending house number has a second part)


The starting house number is the house number at the starting point of the street segment, which is the first shape point of the road segment geometry (GEOMETRY column). The ending house number is the house number at the ending point of the street segment, which is the last shape point of the road segment geometry. The left and right side starting house numbers do not have to be lower than the left and right ending house numbers.

A house number is divided into two parts: the leading numerical part and the second part, which is the rest of the house number. The leading numerical part is the numerical part of the house number that starts from the beginning of the whole house number string and ends just before the first non-numeric character (if any). If the house number contains any non-numeric characters, the second part of the house number is the part from the first non-numeric character through the last character. For example, if the house number is 123, the leading numerical part is 123 and the second part is null; however, if the house number is 123A23, the leading numerical part is 123 and the second part is A23.

11.5.8 Indexes on Tables for Geocoding

To use the vendor-supplied tables for geocoding, indexes must be created on many of the tables, and the names of these indexes must follow certain requirements.

Example 11-4 lists the format of CREATE INDEX statements that create the required indexes. In each statement, you must use the index name, table name, column names, and (if multiple columns are indexed) sequence of column names as shown in Example 11-4, except that you must replace all occurrences of <suffix> with the appropriate string (for example, US for the United States). Note that the first index in the example is a spatial index. Optionally, you can also include other valid keywords and clauses in the CREATE INDEX statements.

Example 11-4 Required Indexes on Tables for Geocoding

CREATE INDEX idx_<suffix>_road_geom ON gc_road_segment_<suffix> (geometry) INDEXTYPE IS mdsys.spatial_index;
CREATE INDEX idx_<suffix>_road_seg_rid ON gc_road_segment_<suffix> (road_id, start_hn, end_hn);
CREATE INDEX idx_<suffix>_road_id ON gc_road_<suffix> (road_id);
CREATE INDEX idx_<suffix>_road_setbn ON gc_road_<suffix> (settlement_id, base_name);
CREATE INDEX idx_<suffix>_road_munbn ON gc_road_<suffix> (municipality_id, base_name);
CREATE INDEX idx_<suffix>_road_parbn ON gc_road_<suffix> (parent_area_id, country_code_2, base_name);
CREATE INDEX idx_<suffix>_road_setbnsd ON gc_road_<suffix> (settlement_id, soundex(base_name));
CREATE INDEX idx_<suffix>_road_munbnsd ON gc_road_<suffix> (municipality_id, soundex(base_name));
CREATE INDEX idx_<suffix>_road_parbnsd ON gc_road_<suffix> (parent_area_id, country_code_2, soundex(base_name));
CREATE INDEX idx_<suffix>_inters ON gc_intersection_<suffix> (country_code_2, road_id_1, road_id_2);
CREATE INDEX idx_<suffix>_area_name_id ON gc_area_<suffix> (country_code_2, area_name, admin_level);
CREATE INDEX idx_<suffix>_area_id_name ON gc_area_<suffix> (area_id, area_name, country_code_2);
CREATE INDEX idx_<suffix>_poi_name ON gc_poi_<suffix> (country_code_2, name);
CREATE INDEX idx_<suffix>_poi_setnm ON gc_poi_<suffix> (country_code_2, settlement_id, name);
CREATE INDEX idx_<suffix>_poi_ munnm ON gc_poi_<suffix> (country_code_2, municipality_id, name);
CREATE INDEX idx_<suffix>_poi_ regnm ON gc_poi_<suffix> (country_code_2, region_id, name);
CREATE INDEX idx_<suffix>_ postcode ON gc_postal_code_<suffix> (country_code_2, postal_code);

11.6 Installing the Parser-Profile Tables

The Oracle Geocoder profile tables are typically supplied by a data provider. Use the data provider's profile tables for geocoding whenever they are available. For users building their own geocoder schema, Oracle provides the GC_PARSER_PROFILES and GC_PARSER_PROFILEAFS tables. However, if (and only if) profile tables are not supplied with the data tables by the data provider, you will need to install these Oracle-supplied profile tables.

The Oracle-supplied tables contain parser profiles for a limited number of countries. If your profiles of interest are included in the Oracle-supplied tables, you can use them directly or update them if necessary. If profiles for your country or group of countries of interest are not included, you will need to add them manually; however, for a quick start, you can copy the Oracle-supplied profiles of a country with a similar address format to your country of interest, and edit these copied profiles as needed.

To install and query the Oracle-supplied profile tables, perform the following steps:

  1. Log on to your database as the geocoder user. The geocoder user is the user under whose schema the geocoder schema will be loaded.

  2. Create the GC_PARSER_PROFILES and GC_PARSER_PROFILEAFS tables by executing the SDO_GCDR.CREATE_PROFILE_TABLES procedure:

    SQL> EXECUTE SDO_GCDR.CREATE_PROFILE_TABLES;
    
  3. Populate the GC_PARSER_PROFILES and GC_PARSER_PROFILEAFS tables by running the sdogcprs.sql script in the $ORACLE_HOME/md/admin/ directory. For example:

    SQL> @$ORACLE_HOME/md/admin/sdogcprs.sql
    
  4. Query the profile tables to determine if parser profiles for your country of interest are supplied, by checking if its country code is included in the output of the following statements:

    SQL> SELECT DISTINCT(country_code) FROM gc_parser_profiles;
    SQL> SELECT DISTINCT(country_code) FROM gc_parser_profileafs;
    

11.7 Using the Geocoding Service (XML API)

In addition to the SQL API, Oracle Spatial also provides an XML API for a geocoding service that enables you to geocode addresses. A Java geocoder application engine performs international address standardization, geocoding, and POI matching, by querying geocoder data stored in the Oracle database. The support for unparsed addresses adds flexibility and convenience to customer applications.

This geocoding service is implemented as a Java 2 Enterprise Edition (J2EE) Web application that you can deploy in either an Oracle Application Server or standalone Oracle Application Server Containers for J2EE (OC4J) environment.

Figure 11-1 shows the basic flow of action with the geocoding service: a client locates a remote the geocoding service instance, sends a geocoding request, and processes the response returned by the geocoding service instance.

Figure 11-1 Basic Flow of Action with the Spatial Geocoding Service

Description of Figure 11-1 follows
Description of "Figure 11-1 Basic Flow of Action with the Spatial Geocoding Service"

As shown in Figure 11-1:

  1. The client sends an XML geocoding request, containing one or more input addresses to be geocoded, to the geocoding service using the HTTP protocol.

  2. The geocoding service parses the input request and looks up the input address in the database.

  3. The geocoding service sends the geocoded result in XML format to the client using the HTTP protocol.

11.7.1 Deploying and Configuring the Geocoding Service

To enable the geocoding service to process geocoding requests and to generate responses, you must deploy the geocoder.ear file (in $ORACLE_HOME/md/jlib) using OC4J or the Oracle Application Server. This section describes the basic steps.

  1. Add the following element inside the <web-site> element in your http-web-site.xml or default-web-site.xml file of OC4J:

    <web-app application="geocoder" 
              name="web" 
              load-on-startup="true" 
              root="/geocoder" />
    
  2. Use the Oracle Application Server console to deploy the geocoder.ear file, or add the following element inside the <application-server> element in the server.xml file of OC4J (replace <ROUTE_SERVER_HOME> accordingly):

    <application name="geocoder" 
                 path="../applications/geocoder.ear"
                 parent="default"
                 start="true" />
    
  3. Add the following element inside the <application-server> element in the server.xml file of OC4J:

    <max-http-connections value="10" />
    

    It is important to limit the number of concurrent requests that the geocoding service can process at any given time to prevent java.lang.OutOfMemoryError errors.

  4. Configure the geocodercfg.xml file, as explained in Section 11.7.1.1.

  5. Start OC4J using the following command options:

    -server 
    -Xms<HEAP_SIZE>
    -Xmx<HEAP_SIZE>
    -XX:NewSize=<YOUNG_GENERATION_SIZE>
    -XX:MaxNewSize=<YOUNG_GENERATION_SIZE>
    -Dsun.rmi.dgc.server.gcInterval=3600000 
    -Dsun.rmi.dgc.client.gcInterval=3600000
    -verbose:gc  (optional)
    

    <HEAP_SIZE> must be at least 512 MB, and has a recommended size of at least 1024 MB (1 GB). Make sure that this memory is physical memory and not virtual memory.

    <YOUNG_GENERATION_SIZE> should be one-fourth (25%) of the <HEAP_SIZE> value.

    -verbose:gc will print all minor and major Java garbage collections. Monitoring these statistics could be useful for memory resource planning. If you find that garbage collections are occurring frequently or are lasting several seconds, you probably need to allocate more physical memory to the Java VM.

    Note:

    The amount of memory the Java VM will need depends mostly on two parameters: the <max-http-connections value="..." /> element in the <application-server> element in server.xml, and the partition_cache_size_limit parameter in geocodercfg.xml.

    The following command is an example that starts OC4J. Note that the -config flag is an OC4J command line parameter, not a VM option.

    c:\jdk1.5.0_06\bin\java -server 
                            -Xms1024m 
                            -Xmx1024m 
                            -XX:NewSize=256m 
                            -XX:MaxNewSize=256m 
                            -Dsun.rmi.dgc.server.gcInterval=3600000 
                            -Dsun.rmi.dgc.client.gcInterval=3600000
                            -verbose:gc 
                            -jar c:\oc4j\j2ee\home\oc4j.jar 
                            -config c:\oc4j\j2ee\home\config\server.xml
    
  6. Verify your deployment by visiting the URL in the following format:

    http://<hostname>:<port>/geocoder
    

    You should see a welcome page. You should also see a message in the console window in which you started OC4J indicating that the Oracle Spatial geocoding service was successfully initialized.

    If you do not see a welcome message, the geocoding service is probably not configured properly to run in your environment. In this case, edit the <J2EE_HOME>/applications/geocoder/web/WEB-INF/config/geocodercfg.xml file to reflect your environment and your preferences. (The geocodercfg.xml file is inside the geocoder.ear file, and it will not be visible until OC4J expands it into the geocoder directory structure under <J2EE_HOME>.) When you are finished editing, restart OC4J, and verify your deployment.

  7. Consult the supplied examples. The page http://<hostname>:<port>/geocoder/ has links at the bottom in a section named Test Samples. These examples demonstrate various capabilities of the geocoding service. This is the best way to learn the XML API, which is described in Section 11.7.2.

11.7.1.1 Configuring the geocodercfg.xml File

You will need to edit the <database> element in the default geocodercfg.xml file that is included with Spatial, to specify the database and schema where the geocoding data is loaded. Example 11-5 is an excerpt from a modified geocodercfg.xml file.

Example 11-5 Modified geocodercfg.xml File

. . .
   <database name="local"
             host="sdolnx2"
             port="1523"
             sid="orcl"
             mode="thin"
             user="geocoder_wld"
             password="!password" />
. . .

In the <database> element of the geocodercfg.xml file:

  • name is a descriptive name for the database connection.

  • host, port, and sid identify the database.

  • mode identifies the JDBC driver to use.

  • user and password are the user name and password for the database user in whose schema the geocoding data is stored.

  • For password, precede the password value with an exclamation point (!), so that when OC4J next restarts, it will automatically obfuscate and replace the clear text password.

11.7.2 Geocoding Request DTD and Example

For a geocoding request (HTTP GET or POST method), it is assumed the request has a parameter named xml_request whose value is a string containing the XML document for the request. The input XML document describes the input addresses that need to be geocoded. One XML request can contain one or more input addresses. Several internationalized address formats are available for describing the input addresses. (The input XML API also supports reverse geocoding, that is, a longitude/latitude point to a street address.)

The DTD for a geocoding request is as follows:

<!-- geocode_request DTD includes the GML Feature DTD as an external entity reference. The complete URL for the DTD: http://www.opengis.org/techno/specs/00-029/gmlfeature.dtd  -->
<!ENTITY % GMLFEATUREDTD SYSTEM "gmlfeature.dtd" %GMLFEATUREDTD; <!ELEMENT geocode_request (address_list)>
<!ATTLIST geocode_request >
<!ELEMENT address_list (input_location+)>
<!ELEMENT input_location (Point |input_address)>
<!ATTLIST input_location
  id CDATA #IMPLIED
  multimatch_number CDATA "4">
<!ELEMENT input_address (us_form1 | us_form2 | gdf_form | gen_form | unformatted)>
<!ATTLIST input_address match_mode CDATA #IMPLIED >
<!ELEMENT gdf_form EMPTY>
<!ATTLIST gdf_form
  name CDATA #IMPLIED
  street CDATA #IMPLIED
  intersecting_street CDATA #IMPLIED
  builtup_area CDATA #IMPLIED
  order8_area CDATA #IMPLIED
  order2_area CDATA #IMPLIED
  order1_area CDATA #IMPLIED
  country CDATA #IMPLIED
  postal_code CDATA #IMPLIED
  postal_addon_code CDATA #IMPLIED
>
<!ELEMENT gen_form EMPTY>
<!ATTLIST gen_form
  name CDATA #IMPLIED
  street CDATA #IMPLIED
  intersecting_street CDATA #IMPLIED
  sub_area CDATA #IMPLIED
  city CDATA #IMPLIED
  region CDATA #IMPLIED
  country CDATA #IMPLIED
  postal_code CDATA #IMPLIED
  postal_addon_code CDATA #IMPLIED
>
<!ELEMENT us_form1 EMPTY>
<!ATTLIST us_form1
  name CDATA #IMPLIED
  street CDATA #IMPLIED
  intersecting_street CDATA #IMPLIED
  lastline CDATA #IMPLIED
>
<!ELEMENT us_form2 EMPTY>
<!ATTLIST us_form2
  name CDATA #IMPLIED
  street CDATA #IMPLIED
  intersecting_street CDATA #IMPLIED
  city CDATA #IMPLIED
  state CDATA #IMPLIED
  zip_code CDATA #IMPLIED
>
<!ELEMENT unformatted (address_line) >
<!ATTLIST unformatted country CDATA #IMPLIED >
<!ELEMENT address_line EMPTY >
<!ATTLIST value #REQUIRED >

Example 11-6 is a request to geocode several four addresses (representing two different actual physical addresses), using different address formats and an unformatted address.

Example 11-6 Geocoding Request (XML API)

<?xml version="1.0" encoding="UTF-8" ?>
<geocode_request>
  <address_list>
    <input_location id="1">
      <input_address>
        <us_form2 name="Oracle" street="500 Oracle Parkway" city="Redwood City" state="CA" zip_code="94021" />
      </input_address>
    </input_location>
    <input_location id="2">
       <input_address>
         <gdf_form street="1 Oracle Drive" builtup_area="Nashua" order1_area="NH" postal_code="03062" country="US" />
       </input_address>
    </input_location>
    <input_location id="3">
       <input_address>
         <gen_form street="1 Oracle Drive" city="Nashua" region="NH" postal_code="03062" country="US" />
       </input_address>
    </input_location>
    <input_location id="4">
       <input_address>
         <unformatted country="UNITED STATES">
           <address_line value ="Oracle NEDC"/>
           <address_line value ="1 Oracle drive "/>
           <address_line value ="Nashua "/>
           <address_line value ="NH"/>
         </unformatted >
       </input_address>
     </input_location>
  </address_list>
</geocode_request>

11.7.3 Geocoding Response DTD and Example

A geocoding response contains one or more standardized addresses including longitude/latitude points, the matching code, and possibly multiple match and no match indication and an error message.

The DTD for a geocoding response is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT geocode_response (geocode+)>
<!ELEMENT geocode (match*)>
<!ATTLIST geocode id CDATA #REQUIRED
  match_count CDATA #IMPLIED
>
<!ELEMENT match (output_address)>
<!ATTLIST match sequence CDATA #REQUIRED
  longitude CDATA #REQUIRED
  latitude CDATA #REQUIRED
  match_code CDATA #REQUIRED
  error_message CDATA #IMPLIED >
<!ELEMENT output_address EMPTY>
<!ATTLIST output_address
  name CDATA #IMPLIED
  house_number CDATA #IMPLIED
  street CDATA #IMPLIED
  builtup_area CDATA #IMPLIED
  order1_area CDATA #IMPLIED
  order8_area CDATA #IMPLIED
  country CDATA #IMPLIED
  postal_code CDATA #IMPLIED
  postal_addon_code CDATA #IMPLIED
  side CDATA #IMPLIED
  percent CDATA #IMPLIED
  edge_id CDATA #IMPLIED>

Example 11-7 is the response to the request in Example 11-6 in Section 11.7.2.

Example 11-7 Geocoding Response (XML API)

<?xml version="1.0" encoding="UTF-8" ?>
<geocode_response>
  <geocode id="1" match_count="1">
    <match sequence="0" longitude="-122.26193971893862" latitude="37.53195483966782" match_code="10" error_message="????#ENUT?B281C??">
      <output_address name="" house_number="500" street="ORACLE PKY" builtup_area="REDWOOD CITY" order1_area="CA" order8_area="" country="US" postal_code="94065" postal_addon_code="" side="L" percent="0.33166666666666667" edge_id="28503563" />
    </match>
  </geocode>
  <geocode id="2" match_count="1">
    <match sequence="0" longitude="-71.45937299307225" latitude="42.70784494226865" match_code="1" error_message="????#ENUT?B281CP?">
      <output_address name="" house_number="1" street="ORACLE DR" builtup_area="NASHUA" order1_area="NH" order8_area="" country="US" postal_code="03062" postal_addon_code="" side="L" percent="0.01" edge_id="22325991" />
    </match>
  </geocode>
  <geocode id="3" match_count="1">
    <match sequence="0" longitude="-71.45937299307225" latitude="42.70784494226865" match_code="1" error_message="????#ENUT?B281CP?">
      <output_address name="" house_number="1" street="ORACLE DR" builtup_area="NASHUA" order1_area="NH" order8_area="" country="US" postal_code="03062" postal_addon_code="" side="L" percent="0.01" edge_id="22325991" />
    </match>
  </geocode>
  <geocode id="4" match_count="1">
    <match sequence="0" longitude="-71.45937299307225" latitude="42.70784494226865" match_code="1" error_message="????#ENUT?B281CP?">
      <output_address name="" house_number="1" street="ORACLE DR" builtup_area="NASHUA" order1_area="NH" order8_area="" country="US" postal_code="03062" postal_addon_code="" side="L" percent="0.01" edge_id="22325991" />
    </match>
  </geocode>
</geocode_response>