CiviCRM Geocoding


A short note on one way of setting up geocoding in CiviCRM on Drupal for UK contact addresses.

Objective is two fold:

  1. Display a map for contacts

  2. Use latitude/longitude to answer questions like ‘How many volunteers live with X miles of event Y’

For this we will be using:

  1. OpenStreetMap as the mapping provider.

  2. The OpenStreetMap Search API for geocoding.

  3. Freemaptools UK postcode to latitude/longitude cross reference data.
    At the time of writing this is delivered as a compressed set of SQL insert statements.

Before going any further, you will need direct access to your web server and MySQL database for this install.

To get started you will need:

  1. A working Drupal/CiviCRM install, CiviCRM version 4.6.19 for this test.

  2. Install the Civicrm OpenStreetMap geocoding extension

    $ cd /var/www/vhosts/crm-demo.example.org.uk/sites/default/files/civicrm/custom_ext
    
    $ wget https://github.com/systopia/de.systopia.osm/archive/master.zip
    
    $ unzip master.zip
    
    # Installation expects different name to that in zip file
    $ mv de.systopia.osm.master de.systopia.osm
    
    $ rm master.zip
    
    # install the extension via the usual route:
    # Administer -> System Settings -> Manage Extensions
  3. Enable mapping and geocoding in CiviCRM at Administer -> System Settings -> Mapping and geocoding.

    3.1 Mapping Provider = OpenStreetMaps
    3.2 Geocoding provider = OpenStreetMapCoding

    No need for an API key.

  4. Download the Freemaptools post code data and install it.
    There is a link on that page to SQL Script to create the required table.
    Extract the download (~230MB), create the required table and run the insert file into mysql, e.g.

    mysql> CREATE TABLE IF NOT EXISTS postcodelatlng (
        -> id int(11) NOT NULL AUTO_INCREMENT,
        -> postcode varchar(8) NOT NULL,
        -> latitude decimal(18,15) NOT NULL,
        -> longitude decimal(18,15) NOT NULL,
        -> PRIMARY KEY (id)
        -> ) AUTO_INCREMENT=1;
    
    # assuming your database name is crm01 ...
    $ mysql -p crm01 < ukpostcodesmysql.sql

    There should now be over 1.7 million rows in the postcodelatlng table.
    As we will be looking up rows via postcode we need an index on the postcode column, e.g.

    mysql> create unique index postcodelatlng_postcode on postcodelatlng (postcode);
    
    Query OK, 1734489 rows affected (10.40 sec)
    Records: 1734489  Duplicates: 0  Warnings: 0

At this point everything should be working, or at least there should be no obvious errors when you edit an address.
However, there are some things to take care of to get this working in the UK, see Post Codes

The final piece of the puzzle - a query to return all the contacts less than X miles away from a starting point.
Following query is a starting point for that, it is based on a good answer on StackOverflow

SELECT c.display_name, o.postal_code, dc.display_name,
  round(69.0 *
    DEGREES(ACOS(COS(RADIANS(o.geo_code_1))
         * COS(RADIANS(d.geo_code_1))
         * COS(RADIANS(o.geo_code_2 - d.geo_code_2))
         + SIN(RADIANS(o.geo_code_1))
         * SIN(RADIANS(d.geo_code_1))))
	,2) AS distance_in_miles /* change constant to 111.111 for Km */
  FROM civicrm_contact AS c
  JOIN civicrm_address AS o ON ( o.contact_id = c.id )
  JOIN civicrm_address AS d ON o.contact_id <> d.contact_id
  JOIN civicrm_contact AS dc ON ( dc.id = d.contact_id AND dc.is_deceased = 0)
 WHERE
	c.id = 1 /* the origin contact */
 AND 69.0 *
    DEGREES(ACOS(COS(RADIANS(o.geo_code_1))
         * COS(RADIANS(d.geo_code_1))
         * COS(RADIANS(o.geo_code_2 - d.geo_code_2))
         + SIN(RADIANS(o.geo_code_1))
         * SIN(RADIANS(d.geo_code_1)))) < 20 /* distance from origin */
order by 4 desc /* return furthest away contacts first */

That query is the basis for a CiviCRM custom search which allows people to select an origin contact, the types of contact to report on and their distance form the origin.

Trouble Shooting

Post codes

OpenStreetMaps knows very little about UK post codes or, as it turns out, rural street names.
That causes a problem for OpenStreetMaps geocoding extension as the code needs “at least a city or postal_code, a street and a country”.

My short term approach was to edit the code and unset the street and postcode parameters, e.g.

$ cd /var/www/vhosts/crm-demo.example.org.uk/sites/default/files/civicrm/custom_ext/de.systopia.osm
$ vi CRM/Utils/Geocode/OpenStreetMapCoding.php

    // There should be at least a city or postal_code, a street and a country
    if ( !(array_key_exists('street', $params) && array_key_exists('country', $params) && (array_key_exists('city', $params) || array_key_exists('postalcode', $params)))) {
      // the error logging is disabled, because it potentially produces a lot of log messages
      //CRM_Core_Error::debug_log_message('Geocoding failed. Address data is incomplete.');
      $values['geo_code_1'] = $values['geo_code_2'] = 'null';
      return FALSE;
    }

// OpenStreetMaps coverage of UK street names and post codes is not good so unset those param's to avoid false negatives
unset( $params['street'] );
unset( $params['postalcode'] );

That done, addresses should successfully geocode down to the city level.

To get postcode geocoding we need to directly update the civicrm_address table using data from the postcodelatlng table we created earlier.
The update is relatively straighforward but does require an additional index on civicrm_address, on the postal_code column (I was surprised this index did not already exist). This index is critical for update performance.

mysql> create index index_postal_code on civicrm_address (postal_code);
Query OK, 0 rows affected (0.16 sec)
Records: 0  Duplicates: 0  Warnings: 0

Now we are ready to update the civicrm_address table using this update:

update
  civicrm_address a
, postcodelatlng p
set
  a.geo_code_1 = p.latitude
, a.geo_code_2 = p.longitude
where p.postcode = a.postal_code
and (
  /* Only update where the geo_codes are null or not equal to the lat/lon data */
	( a.geo_code_1 is null or ( a.geo_code_1 != p.latitude ) )
or  ( a.geo_code_2 is null or ( a.geo_code_2 != p.longitude ) )
)
and a.country_id = 1226 /* UK addresses only */

Query OK, 8888 rows affected (5.01 sec)
Rows matched: 8888  Changed: 8888  Warnings: 0

We’ll be setting up a cron job to run that update overnight every day to catch new and updated addresses.

Note: If you want to roll your own data then the post code data is available free from Ordanance Survey Code-Point Open. That data is supplied as Easting and Northing points which need converting to Latitude and Longitude, Ordnance Survey provide a spreadsheet which can be used to get the calculation.

Map display

A successfully geocoded addresses causes a Map link to appear next to the address in a contact record.
Clicking the link will use the geocoded latitude/longitude to get a map from OpenStreetMaps.

There is one issue with this - by default the OSM map uses a http conection. If your CRM site is using https (which it should be) then the OSM connection is likely to be blocked by some browsers and all you will see is a big white screen. The developer console will show something like this:

Mixed Content: The page at 'https://crm-demo.example.org.uk/civicrm/contact/map?reset=1&cid=1&lid=1' was loaded over HTTPS, but requested an insecure script 'http://openlayers.org/api/OpenLayers.js'. This request has been blocked; the content must be served over HTTPS.

The obvious problem is the http load from openlayers.org which can be found in template /sites/all/modules/civicrm/templates/CRM/Contact/Form/Task/Map/OpenStreetMaps.tpl, i.e.

<script src="http://openlayers.org/api/OpenLayers.js" type="text/javascript"></script>

The OpenLayers.js file is available via https but the certificate is invalid for openlayers.org as it is registered to github.

So the first part of solution is to serve OpenLayers.js locally by downloading it into the directory of your choice, e.g. using wget http://openlayers.org/api/OpenLayers.js, and editing the above template (in your custom templates directory of course).
On it’s own that causes a separate bunch of issues as the default URL scheme in that js is http, e.g.

OpenLayers.js:165 GET https://crm-demo.example.org.uk/theme/default/style.css initialize @ OpenLayers.js:165initMap @ map?reset=1&cid=1&lid=1:175
map?reset=1&cid=1&lid=1:1
Mixed Content: The page at 'https://crm-demo.example.org.uk/civicrm/contact/map?reset=1&cid=1&lid=1' was loaded over HTTPS, but requested an insecure image 'http://b.tile.openstreetmap.org/12/2038/1354.png'. This content should also be served over HTTPS.

A lot of tiles are loaded to create the map image, which results in a lot of ‘mixed content’ warnings.

(Almost) the final solution, serve OpenLayers.js locally and edit it to change the URL scheme to https, e.g. in OpenLayers.js:

url:["https://a.tile.openstreetmap.org/${z}/${x}/${y}.png","https://b.tile.openstreetmap.org/${z}/${x}/${y}.png","https://c.tile.openstreetmap.org/${z}/${x}/${y}.png"]

Last remaining niggle was the hardcoded path to style.css in OpenLayers.js, e.g., this.theme=OpenLayers._getScriptLocation()+"theme/default/style.css";. While not a major issue it does cause a 404 during page load (unless you happen to have a style.css in that location of course).
That is easily fixed by changing the path to something appropriate for your Drupal install.
We chose to put OpenLayers.js in the /sites/all/themes directory and create an empty file named style.css in the same directory.
The URL in OpenLayers.js then becomes this.theme=OpenLayers._getScriptLocation()+"style.css";

Maps now load without any issues.

Notes

There is a “Geocode and Parse Addresses” scheduled job which we found useful for testing but have not enabled.
The job has some useful parameters:
* geocoding=1 - 0, disable geocoding
* parse=0 - 1 = enable address parsing - requires other admin changes
* start=2 - start processing at this contact ID
* end=2 - end processing at this contact ID
* throttle=0 - 1 = 5 second delay between addresses

Our advice - don’t enable this job to run daily.

  1. it will potentially hammer the OpenStreetMap server, see the extension Readme for why enabling throttling is a good idea.

  2. it will likely fail on most addresses in the UK.

Update
Recent versions of CiviCRM (4.7.22) and OpenStreetMaps do not appear to have the problem with https reported earlier.

Comment on this article using form below. Requires email login only for authentication. HTML forbidden, Markdown only.