In my previous article we explored Teradata's spatial features using geometric shapes on a 10x10 grid.  I hope this was a useful introduction into the new ST_Geometry data type and its powerful set of functions.  And now that we've covered the basics, let's take a step into the world of geospatial.  By geospatial, we're now talking about locations based on geographic reference points on Earth; for example longitude and latitude.   (This can and does get more complicated, but we'll try to keep this as light as possible).


So just like we did in the previous article, we'll start with an example using ST_Geometry for point data. Here's a table to store city locations as points on a map.

CREATE TABLE cities
(
    id INTEGER NOT NULL PRIMARY KEY,
    name VARCHAR(50),
    location ST_GEOMETRY
);

 

This table includes an ID column, a name column and an ST_Geometry column for storing the location. This of course is the field that we'll be using to flex Teradata's geospatial powers.

Now to add some data.

INSERT INTO cities VALUES (1, 'New York','POINT( -73.960257 40.768823)');
INSERT INTO cities VALUES (2, 'San Francisco', 'POINT(-122.420782 37.795219 )');
INSERT INTO cities VALUES (3, 'San Diego', 'POINT(-117.074830 32.995484 )');
INSERT INTO cities VALUES (4, 'Los Angeles', 'POINT(-118.243 34.05219 )');
INSERT INTO cities VALUES (5, 'Atlanta','POINT( -84.3897 33.75449)');
INSERT INTO cities VALUES (6, 'Dallas','POINT(-96.7971 32.78108 )');

 


The first difference you should notice is the value of the X,Y coordinates. We're using geospatial references; longitude and latitude.
Longitude are the verticle lines on a map, representing the X coordinate.  And latitude are the horizontal lines, representing the Y coordinate.

OK, so far nothing too complex here.  Let's try our distance query just like we did with our simply geometry example.

SELECT C1.name, C2.name, C1.location.ST_Distance(C2.location) AS Distance
FROM cities C1, cities C2
WHERE C1.id != C2.id
ORDER BY 1,3;

 

Results:

	name			name			Distance
	Atlanta			New York		12.57
	Atlanta			San Diego		32.69
	Atlanta			Los Angeles		33.85
	Atlanta			San Francisco		38.25
	Los Angeles		San Diego		1.58
	Los Angeles		San Francisco		5.61
	Los Angeles		Atlanta			33.85
	Los Angeles		New York		44.79
	New York		Atlanta			12.57
	New York		San Diego		43.81
	New York		Los Angeles		44.79
	New York		San Francisco		48.55
	San Diego		Los Angeles		1.58
	San Diego		San Francisco		7.18
	San Diego		Atlanta			32.69
	San Diego		New York		43.81
	San Francisco		Los Angeles		5.61
	San Francisco		San Diego		7.18
	San Francisco		Atlanta			38.25
	San Francisco		New York		48.55

 


Hmm, what's going on here? Atlanta to New York is 12.57?  What units are those - they can't be miles or kilometers?
What do these numbers mean?

OK, let's back up a little and talk through this.  We've entered our coordinates as longitude and latitude, right.  So what should expect from the distance calculation.  To Teradata these are arbitrary units.  We haven't done anything to tell the database that these now represent geospatial coordinates (long/lat).  We created the table and added data just the same way we did for our geometry shapes, so how is the database to know anything differently?

The answer to this problem lies with an ST_Geometry attribute called SRS - Spatial Reference Systems.  Until we do something to indicate that we are using a geographic SRS, all the ST_Distance function is going to do is assume that these are simple units.  And in that simple world, yes, Atlanta and New York are only 12.57 units apart.

This is where the complexity comes in; converting longitude and latitude calculations into units like miles or meters.  Not nearly as easy as it sounds.  A quick web search brings us information like "on Earth, the length of an arcdegree of north-south latitude difference is about 60 nautical miles, 111 kilometres or 69 statute miles at any latitude."

Let's try this simple rule of thumb value and try our calculation using each unit of latitude as 69 miles.  If we take our Atlanta to New York value of 12.57 and multiply by 69 we then have 867.33 miles.  That seems more reasonable.  A web search confirms that the driving distance at about 850 miles.   But wait, another web search has the directly flying distance between these cities at 748 miles.  That's the one we should be using to compare with our database results.  Why then is our calculation off by over a hundred miles?   Well, let's go back to latitude and longitude.

Latitudes are parallel lines running horizontal across the globe.   They are always the same distance apart (69 miles).  Longitude lines, though, are a much different story.  At the equator they too are roughly 69 miles apart, but as they reach the North and South poles, the converge and the distance between them goes to zero.  If we are going to accurately calculate distances, we're going to need to account for these narrowing longitude uints (Atlanta and New York are separated by almost 11 degrees longitude).  Doing this will involve telling Teradata about our coordinates when we ask it to perform any calculations.  The method we'll use for this is the ST_Transform() method.   ST_Transform allows us to convert ST_Geometry data between different SRS units.  In our example, we need to transform our coordinates into geodetic units so that the ST_Distance() calculation can accurately account for the narrowing longitude lines.

This topic does start to get more and more complex, so let's skip ahead and take a look at the answer to this query.  That should make it a little easier to work our way backwards to explain how we create this query.

SELECT C1.name, C2.name,
C1.location.ST_Transform(SRS_1.srtext,SRS_2.srtext).ST_Distance(C2.location.ST_Transform(SRS_1.srtext,SRS_2.srtext))*.6/1000 as Miles
FROM cities C1 ,cities C2,
     sysspatial.SPATIAL_REF_SYS SRS_1,
     sysspatial.SPATIAL_REF_SYS SRS_2
WHERE C1.id != C2.id
AND   SRS_1.AUTH_SRID = 32616 -- UTM 16 / WGS84
AND   SRS_2.AUTH_SRID = 4326 -- WGS84
ORDER BY 1,3;

 

Results:

	name			name			Miles
	Atlanta			New York		728.80
	Atlanta			San Diego		1872.56
	Atlanta			Los Angeles		1929.26
	Atlanta			San Francisco		2142.69
	Los Angeles		San Diego		105.88
	Los Angeles		San Francisco		375.60
	Los Angeles		Atlanta			1929.26
	Los Angeles		New York		2430.39
	New York		Atlanta			728.80
	New York		San Diego		2396.82
	New York		Los Angeles		2430.39
	New York		San Francisco		2560.22
	San Diego		Los Angeles		105.88
	San Diego		San Francisco		481.32
	San Diego		Atlanta			1872.56
	San Diego		New York		2396.82
	San Francisco		Los Angeles		375.60
	San Francisco		San Diego		481.32
	San Francisco		Atlanta			2142.69
	San Francisco		New York		2560.22

 


First let's look at the new distance calculation between Atlanta and New York - 728.80 miles. Yeah, that looks like the right answer.  And another check on the San Diego to New York distance of 2396.82 checks out as well. Looks like we have the right query.   Now let's take a closer look to understand what we did to get these results.

We can see that our select statement now includes two calls to ST_Transform() for the distance calculation.   That makes sense; we're converting both the C1.location and the C2.location for this new calculation.   We also see a unit conversion factor here, "*.6/1000". That's a pretty standard way of converting from meters to miles, so apparently ST_Distance() is returning meters.   All right, we're good to go, right?

Well, before we accept this query as our general solution to geospatial calculations, let's dig a little deeper into these two SRS transformations.  While we do seem to have a query that is working, we need to make sure we have a solution that will work for all locations, not just these US coordinates.  Looking back at the SQL and those commented lines of the query, we should be asking "What is UTM 16 and WGS84".

WGS stands for World Geodetic System and is a standard for use in mapping the Earth's spheroidal surface.  The latest revision is WGS 84, which is also the reference coordinate system used by Global Positioning System (GPS) devices.  Geodetic references are needed because the earth is not a perfect sphere.

UTM, or Universal Transverse Mercator, is a modern mapping coordinate system developed by the US Army Corps of Engineers in the 1940s.  UTM is a grid-based coordinate system that differs from traditional longitude and latitude in that it divides the world into a series of sixty zones.  This allows distance calculations to be done more easily with less distortion than the trigonometric formulas needed for longitude/latitude.   These calculations were particularly important during WWII and it was this work during the war that led to adoption of UTM in the post-war years. See http://egsc.usgs.gov/isb/pubs/factsheets/fs07701.html for some background information.

So back to our query.  What we've done here is taken the location data for our cities, which are UTM based locations, and converted them to geodetic values for more accurate distance calculations.  But the key here is to 'know your data'.  This query works here for US data.  Other regions will be using different coordinate reference systems for their locations.   Teradata does provide over a thousand SRS references to be used by ST_Transform().  Take a look at the table sysspatial.SPATIAL_REF_SYS table:

SELECT AUTH_SRID, SRTEXT 
FROM sysspatial.SPATIAL_REF_SYS

 This returns 1338 rows from my database.  Finding the right one for your data may take some research and trial and error.


Summary

In these past two articles we've had a hands-on tour of simple spatial objects and the powerful functional capabilities of Teradata and a good introduction to the concepts of geospatial data and the considerations required for using our calculations.  In the next articles, we'll continue working with geospatial data and methods, learn how to import external spatial reference data and finally how to visualize our data by creating viewable maps.  Stay tuned - this is great stuff!

Discussion
orozcohsu 3 comments Joined 03/13
09 Oct 2013

a question here,
I want to calcuate the distance for longitude east   121.516 latitude north 25.041 (Taiwan/Asia)
so, I insert one additional row into table you mentioned, and try to query the result.
But it always shows ST_Transform: Error in call to transform...
 
Some could help me looking this problem? I doubt it maybe need to change the number of
"SRS_1.AUTH_SRID = 32616 -- UTM 16 / WGS84  or  SRS_2.AUTH_SRID = 4326 -- WGS84 "
 
thanks in advance !

 

–121.516194

 

jobbazar 2 comments Joined 12/09
17 Jun 2016

Hi 
Taiwan is not on the same UTM 16, it is different, 

WGS 84 / UTM zone 51N
Did you figure this out? If not do you still need any help? Please post your SQL details for any help/error.
Thanks!

You must sign in to leave a comment.