Letting a DataSet read its CRS from data files
Most datasets have a Coordinate Reference System (CRS) that gives the meaning of the coordinates of all their features. Occasionally, however, a dataset contains features in several different Coordinate Reference Systems. Such a dataset still has a Crs property, but that CRS acts as a spokesperson for the collective and is used by certain dataset methods, for example when the dataset extent is requested.
For some kinds of datasets, you must specify a Crs property explicitly, so you must find out from your data provider what their coordinates mean.
For other datasets, the Crs is fixed, or can always be retrieved from data files. For example, a DtedDataSet is always in Wgs84LongLat, so there is no problem. If a type of dataset does not have its Crs property exposed in Carmenta Studio, its Crs should always be created automatically.
However, some file formats may or may not contain CRS information, and even if they do, it is not certain that Carmenta Engine can extract it. A difficulty is that different data providers may use different methods to encode the metadata. However, two methods are widely used:
If a coordinate reference system exists in the EPSG database, then it has a unique EPSG code in the range 1024 – 32767, which should suffice to identify the system. (Note that every type of subcomponent of a coordinate reference system has its own EPSG code, which can be confusing since the codes are unique only within their own type. Code 6135 is an example: the CRS with that EPSG code is CIGD11 which is used on the Cayman Islands, but the Geodetic Datum with that EPSG code is the Old Hawaiian datum.)
The Well-Known Text format, WKT, can describe both geometries and coordinate reference systems. Unfortunately, the WKT syntax for coordinate reference systems has several vendor-specific dialects (see WellKnownTextFlavor), so interoperability problems can occur. The latest WKT versions, informally known as WKT2, should improve matters.
So, many datasets in Carmenta Engine can attempt to create their Crs automatically from data files, namely:
The EcwDataSet.
The GdalDataSet.
The ImageDataSet.
The Jpeg2000DataSet.
The MapPackageDataSet.
The MifDataSet.
The MrSidDataSet.
The OgrDataSet.
The RaveGeoDataSet.
The ShapefileDataSet.
These datasets have null as the default value for the Crs property. If you specify this property explicitly, the datasets will use the value you provide. But if you leave the property undefined, these datasets will try to automatically set the Crs property from available metadata.
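The precedence described above can be sketched in a few lines of Python. This is only a model of the rule, not Carmenta Engine code: the function `resolve_crs` and the `read_from_metadata` callback are hypothetical names, and CRS values are represented as plain strings.

```python
from typing import Callable, Optional


def resolve_crs(explicit_crs: Optional[str],
                read_from_metadata: Callable[[], Optional[str]]) -> str:
    """Return the explicit CRS if one was given, otherwise try the file metadata."""
    if explicit_crs is not None:
        # An explicitly specified value always wins.
        return explicit_crs
    # The metadata is consulted only when nothing was specified.
    detected = read_from_metadata()
    if detected is None:
        raise ValueError("no CRS specified and none found in the metadata")
    return detected


print(resolve_crs("EPSG:3021", lambda: "EPSG:4326"))  # explicit value is used
print(resolve_crs(None, lambda: "EPSG:4326"))         # falls back to the metadata
```

The sketch raises an error when neither source yields a CRS, mirroring the failure cases discussed in the next section.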
Automatic reading of CRS from file can fail
Automatic reading of a Coordinate Reference System from file can sometimes throw an exception, or if you are unlucky you may get no exception but see the geodata appear more or less incorrectly georeferenced. This section lists some of the reasons for failure.
The metadata must be present and correct
A geodata format that allows metadata for the Coordinate Reference System does not guarantee that the metadata is actually present, or that it is correct. If Carmenta Engine fails to read the metadata, it may be a good idea to check whether the GDAL utility gdalinfo can understand it.
The projection method must be supported
There are some types of map projections that Carmenta Engine does not support, for example the French Lambert nearly-conformal conic. But in practice, that is rarely a problem.
Deprecated or erroneous EPSG entries are dangerous
One reason for failure is simply that the EPSG database contains errors. When a serious error is discovered, the database entry is not corrected in the next version; it is just marked as deprecated, and a new correct entry is created, with a new code.
A change of name was once considered serious by EPSG, although it would not affect how coordinates are interpreted. For example, EPSG:2400, named "RT90 2.5 gon W", was deprecated in 2003 and replaced with EPSG:3021, named "RT90 2.5 gon V". Since the only reason for the deprecation was to use a Swedish rather than an English abbreviation, it is harmless to tag geodata with EPSG:2400 instead of EPSG:3021. (Nowadays, a name change will not cause deprecation by EPSG.)
On the other hand, sometimes it is the numerical parameters that are wrong in an EPSG entry. One example is the entry EPSG:2171 for a Polish coordinate reference system, which wrongly claimed that the False Northing should be 5647000 m. When the error was discovered, the 2171 entry was deprecated (in October 2005), and EPSG:3120 was created with the correct False Northing of 5467000 m.
Now, if the metadata for some geodata files just says "EPSG:2171", you cannot know whether
the wrong False Northing is used, in agreement with the 2171 entry,
or if someone knew the correct False Northing and used it when creating the geodata files, and looked in the EPSG database (before October 2005) only to find the code 2171, failing to notice that the False Northing was wrong there.
When Carmenta Engine reads a Coordinate Reference System deprecated by EPSG, it uses the definition in the EPSG database, be it right or wrong. One reason is that there is no way to distinguish between a harmless and a truly serious deprecation. Another is that the metadata may very well contain the complete description of the deprecated EPSG entry, perhaps as Well-Known Text.
So, your dataset may think that it could read the metadata successfully. Nevertheless, if it happened to be correct Polish data that was erroneously tagged with EPSG:2171, you would see a displacement of the data by 180 kilometers.
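The size of that displacement follows directly from the two False Northing values quoted above. A small Python check, using only the numbers given in this section:

```python
# False Northing values quoted above, in meters.
WRONG_FALSE_NORTHING_M = 5_647_000    # deprecated entry EPSG:2171
CORRECT_FALSE_NORTHING_M = 5_467_000  # replacement entry EPSG:3120

# Interpreting correct data with the wrong entry shifts every northing by this amount.
offset_m = WRONG_FALSE_NORTHING_M - CORRECT_FALSE_NORTHING_M
offset_km = offset_m / 1000

print(f"Displacement along the northing axis: {offset_km} km")
```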
The geodetic datum needs a datum shift
To construct a new GeodeticDatum in Carmenta Engine, we need to supply a DatumShift that can transform the coordinates to some known geodetic datum, usually WGS84. And in some metadata formats, a datum shift can be specified explicitly. For example, Well-Known Text can contain a Helmert datum shift written like this:
TOWGS84[ 582, 105, 414, 1.04, 0.35, -3.08, 8.3]
which can be converted to a HelmertDatumShift in Carmenta Engine.
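To illustrate what the seven numbers mean, here is a minimal Python sketch that parses such a clause and applies it to geocentric (X, Y, Z) coordinates. It is not Carmenta Engine code, and the function names are hypothetical. It assumes the common reading of TOWGS84: three translations in meters, three rotations in arc-seconds, and a scale change in parts per million, applied with the Position Vector convention in its usual small-angle form. Under the Coordinate Frame convention the rotation signs would be reversed, so check which convention your data provider uses.

```python
import math
import re

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def parse_towgs84(text):
    """Extract dx, dy, dz (m), rx, ry, rz (arc-seconds) and scale (ppm)."""
    match = re.search(r"TOWGS84\s*\[([^\]]*)\]", text)
    if match is None:
        raise ValueError("no TOWGS84 clause found")
    values = [float(v) for v in match.group(1).split(",")]
    if len(values) != 7:
        raise ValueError("this sketch expects exactly 7 parameters")
    return values


def helmert_to_wgs84(xyz, params):
    """Apply a 7-parameter Helmert shift (Position Vector convention assumed)."""
    dx, dy, dz, rx, ry, rz, ppm = params
    # Convert rotations to radians and the scale change to a factor.
    rx, ry, rz = (r * ARCSEC_TO_RAD for r in (rx, ry, rz))
    k = 1.0 + ppm * 1e-6
    x, y, z = xyz
    return (dx + k * (x - rz * y + ry * z),
            dy + k * (rz * x + y - rx * z),
            dz + k * (-ry * x + rx * y + z))


params = parse_towgs84("TOWGS84[ 582, 105, 414, 1.04, 0.35, -3.08, 8.3]")
print(params)
```

The input and output of `helmert_to_wgs84` are geocentric coordinates; converting between those and latitude/longitude on the respective ellipsoids is a separate step not shown here.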
But the datum shift is not mandatory in Well-Known Text, and other metadata formats may not allow a datum shift at all. In fact, a common opinion is that although the data provider must specify the geodetic datum clearly, for example by an EPSG code for the datum or for the CRS, they should not prescribe the datum shift to WGS84. Instead, it is the responsibility of the data user to select a suitable datum shift. The reason is that datum shifts are empirical and imprecise rather than mathematically exact. So, there can be tradeoffs between a simple kind of datum shift that gives medium accuracy and a more accurate datum shift that requires access to a large grid shift file. Or if the geodetic datum covers a large area, say the European Datum 1950, there can be a tradeoff between a medium-accuracy datum shift usable in the whole datum area and a high-accuracy datum shift usable only in Spain, for example. Since the data provider does not know the accuracy required by the data user, or whether the data user cares only about a part of the geodetic datum area, it is better that the data user makes the choice.
On the other hand, the end user of a GIS system may not be aware that there are such things as geodetic datums that differ from WGS84. So, while it is a fact that most data providers do not prescribe datum shifts, it can be unclear exactly who is best placed to choose them, since there are some alternatives:
You can let Carmenta choose datum shifts and hope that they are accurate enough.
Or you – an application developer using Carmenta Engine – can choose datum shifts yourself for the specific geodata that you provide to your end users.
Or you may have to push the responsibility of choosing datum shifts to your own end users, if your application is not restricted to use only geodata provided by you, but is flexible enough to allow end users to find and import other geodata.
So this is a meta-choice: who should do the choosing? And the meta-choice is up to you, since Carmenta Engine can be used in all three ways, but obviously it would be simplest for you to just rely on Carmenta's choices of datum shifts.
For each geodetic datum, Carmenta Engine does have a default datum shift, but the accuracy depends on the quality of the geodetic datum, as well as the size of its area of use, and Carmenta's knowledge about the datum. In the best cases, our default datum shifts can give sub-meter accuracy when transforming datum coordinates to WGS84. But in the very worst cases, Carmenta knows nothing about how some geodetic datum is related to WGS84; then the default datum shift will just guess that the datum's latitude and longitude values are the same as in WGS84, and the error from that guess could be up to about a kilometer.
As an example of an intermediate case, take the North American Datum 1927. Since all of our default datum shifts either identify the datum with WGS84 or else use a simple HelmertDatumShift, our default datum shift for this datum is a simple Helmert one that has a worst error of 60 meters, at the westernmost of the Aleutian Islands. You can construct a better datum shift for a restricted part of North America yourself by using a more advanced kind of datum shift, a GridFileDatumShift, but then you must find and download a grid shift file from some trusted source, since they are not included in the Carmenta Engine distribution.
So this means that you can choose to always use the Carmenta Engine default datum shifts, but the errors can be up to a kilometer. And that can make sense in some contexts, like in meteorology, or if you want the ability to preview new geodata sources easily before you decide to worry about datum shifts. However, trusting all our default datum shifts is something you must opt in to, by specifying that you want to use the setting DatumShiftChoice = Lenient. Otherwise, the setting DatumShiftChoice = Conservative will be used; this setting will choose the default datum shift only if Carmenta regards it as generally useful, and otherwise throw an exception. The definition of "generally useful" is a bit vague (see DatumShiftChoice for details), but roughly it means that, unless you opt in to possible kilometer errors, you can expect the accuracy from a default datum shift to be no worse than 10 meters in most cases. So that can save you from some blunders.
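The difference between the two settings can be modeled roughly as follows. This Python sketch is not the Carmenta Engine API: the enum mirrors the setting names used above, but the `DefaultShift` class, its `generally_useful` flag, and the function `choose_default_shift` are hypothetical stand-ins for Carmenta's internal judgement.

```python
from dataclasses import dataclass
from enum import Enum


class DatumShiftChoice(Enum):
    CONSERVATIVE = "Conservative"
    LENIENT = "Lenient"


@dataclass(frozen=True)
class DefaultShift:
    name: str
    generally_useful: bool  # hypothetical flag; the real criteria are vaguer


def choose_default_shift(shift: DefaultShift, choice: DatumShiftChoice) -> DefaultShift:
    """Accept any default shift when Lenient; otherwise only generally useful ones."""
    if choice is DatumShiftChoice.LENIENT or shift.generally_useful:
        return shift
    raise ValueError(f"default datum shift {shift.name!r} rejected under Conservative")


# A datum about which nothing is known: the default shift is only a guess.
guess = DefaultShift(name="identity guess", generally_useful=False)
print(choose_default_shift(guess, DatumShiftChoice.LENIENT).name)
```

With `DatumShiftChoice.CONSERVATIVE`, the same call would raise an error instead of silently accepting a shift that might be off by a kilometer.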
But if your customers need an accuracy of just a couple of meters, even some of our "conservative" default datum shifts can be too inaccurate. So in that case, you should choose the datum shifts yourself for all geodata you provide to your end users, and investigate their average and worst-case accuracy. Our default datum shifts do have a DatumShift.Info property giving metadata such as the accuracy, so you can check that, but it is usually an accuracy with 67% confidence rather than the worst case.
Or, as we mentioned earlier, if your application allows your end users to find and import new geodata, you must place the responsibility on them, but there are Carmenta Engine methods you can use to construct a GUI that helps them to choose a datum shift (or you can construct such a GUI for your own in-house use, of course). See next section.
Can I read the CRS from file, and just assign the datum shift manually?
Yes, there is some API support for doing this. It will probably work best when you implement a wizard that generates a configuration file (px file) from geodata.
To find out the Crs of the geodata, the wizard should first construct an appropriate DataSet in code, using crs = null and datumShiftChoice = Lenient (see previous section). The Lenient setting gives the best chances to extract the Crs from the geodata files, but the default datum shift can be questionable. So the wizard should display information from the DatumShift.Info, like the accuracy, the area of use, and the information source.
If the accuracy is acceptable, and the area of use is the same as the area of use for the geodetic datum (can be retrieved from the EPSG code of the datum), then the wizard user can approve the datum shift, and the wizard can use the same DataSet settings in the configuration file that it generates.
Otherwise, the wizard user should be presented with a list of alternative datum shifts. The wizard can then get the geodetic datum of the Crs and call one of the following methods:
GeodeticDatum.AlternativeDatumShifts(gridFileRootPath)
GeodeticDatum.AlternativeDatumShiftsForArea(areaName, gridFileRootPath)
GeodeticDatum.AlternativeDatumShiftsForArea(areaBounds)
GeodeticDatum.AlternativeDatumShiftsForArea(areaBounds, gridFileRootPath)
Strictly speaking, these methods give a collection of DatumShiftInfo instances. A DatumShiftInfo gives documentation about some datum shift, such as accuracy and information source, and you can construct the datum shift from the info.
The gridFileRootPath argument tells where the grid files for any GridFileDatumShift are expected to be. If this argument is omitted, this kind of datum shift will not be returned.
The areaName argument can be given to reduce the number of alternatives. For example, with the ED50 geodetic datum, calling ED50.alternativeDatumShifts(gridFileRootPath) will give 35 alternatives, but if the user can tell the wizard that all geodata is in Spain, then calling ED50.alternativeDatumShiftsForArea("Spain", gridFileRootPath) will give only 7 alternatives (those with a DatumShiftInfo.AreaOfUseDescription containing "Spain").
Instead of the area name for the geodata, one could give the area bounds, which is easier for the wizard to do automatically. The wizard would get the DataSet.Bounds, which is a rectangle expressed in the DataSet.Crs coordinates. Then, the wizard must call Crs.ProjectTo(Wgs84LongLat, bounds), which gives one or two rectangles expressed in Wgs84LongLat, the form required by GeodeticDatum.AlternativeDatumShiftsForArea. The results would be restricted to those with a DatumShiftInfo.AreaOfUseBounds that overlaps the given areaBounds. (The reprojection of the dataset bounds to Wgs84LongLat would have to use the leniently chosen datum shift, but accurate results are not important here.)
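The overlap criterion for area bounds can be pictured with a small Python sketch. It is only an illustration of rectangle overlap in longitude/latitude, not the Carmenta Engine implementation. The `Rect` type, the function names, and the candidate names and bounds are all invented for the example. Because the reprojected dataset bounds may consist of one or two rectangles, the filter accepts a list.

```python
from typing import List, NamedTuple, Tuple


class Rect(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the two longitude/latitude rectangles share any area or edge."""
    return (a.min_lon <= b.max_lon and b.min_lon <= a.max_lon and
            a.min_lat <= b.max_lat and b.min_lat <= a.max_lat)


def shifts_for_area(candidates: List[Tuple[str, Rect]],
                    area_rects: List[Rect]) -> List[str]:
    """Keep candidates whose area of use overlaps any of the data rectangles."""
    return [name for name, bounds in candidates
            if any(overlaps(bounds, r) for r in area_rects)]


# Invented candidates and bounds, for illustration only.
candidates = [
    ("shift_A", Rect(-10.0, 35.0, 5.0, 44.0)),
    ("shift_B", Rect(10.0, 55.0, 25.0, 70.0)),
]
data_bounds = [Rect(-4.0, 40.0, -3.0, 41.0)]

print(shifts_for_area(candidates, data_bounds))
```

Since only overlap is tested, a small error in the reprojected bounds rarely changes the result, which is why the leniently chosen datum shift is good enough for this step.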
Choosing a datum shift
When the wizard user chooses one of the alternatives, it is not always the one with the highest accuracy that is best. For example, let us look at the results from calling ED50.alternativeDatumShiftsForArea("Spain", gridFileRootPath):
Name | Accuracy | Area Name | Area Description | Information Source |
---|---|---|---|---|
ED50_to_WGS84_1 | 10 m | Europe - west (DMA ED50 mean) | Austria; Belgium; Denmark; Finland; Faroe islands; France; Germany (west); Gibraltar; Greece; Italy; Luxembourg; Netherlands; Norway; Portugal; Spain; Sweden; Switzerland. | U.S. Defense Mapping Agency TR8350.2 September 1987. |
ED50_to_WGS84_13 | 9 m | Europe - Portugal and Spain | Portugal; Spain - mainland. | U.S. Defense Mapping Agency TR8350.2 revision of August 1993. |
ED50_to_WGS84_27 | 1.5 m | Spain - Balearic Islands | Spain - Balearic Islands. | Centro Nacional de Información Geográfica via EuroGeographics; http://crs.bkg.bund.de/crs-eu/ |
ED50_to_WGS84_28 | 1.5 m | Spain - mainland except northwest | Spain - mainland except northwest (north of 41°30'N and west of 4°30'W). | Centro Nacional de Información Geográfica via EuroGeographics; http://crs.bkg.bund.de/crs-eu/ |
ED50_to_WGS84_29 | 1.5 m | Spain - mainland northwest | Spain - mainland north of 41°30'N and west of 4°30' W. | Centro Nacional de Información Geográfica via EuroGeographics; http://crs.bkg.bund.de/crs-eu/ |
ED50_to_WGS84_C41 | 1 m | Spain - mainland and Balearic Islands onshore | Spain - mainland, Balearic Islands, Ceuta and Melilla - onshore. | IGN: Instituto Geográfico Nacional, www.cnig.es |
If all the ED50 geodata is in Spain, the wizard user would probably choose either ED50_to_WGS84_C41 which uses a grid file, or ED50_to_WGS84_28 which does not need a grid file and covers most of Spain. (For the latter, it may be difficult to find out how accurate it would be in northwest Spain and on the Balearic Islands, since the EPSG database lacks this information.)
However, let us assume that our ED50 dataset with Spanish data will be used alongside another ED50 dataset with French data. Then, it is usually recommended to use a single datum shift whose area of use covers all the ED50 data (in this case, ED50_to_WGS84_1). One could instead use a Spanish datum shift in the Crs for the Spanish dataset and a French datum shift in the Crs for the French dataset, and perhaps get better local accuracy, but the two datum shifts would probably disagree somewhat along the Spanish-French border, and the discontinuity may cause worse problems than a single datum shift would.
This means that a wizard user must know the context in which a dataset will be used, to be able to choose a good datum shift.
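The trade-off between local accuracy and shared coverage can be sketched in Python using three rows from the table above. This is a hypothetical helper, not part of Carmenta Engine, and the substring match on the area description is a deliberately crude stand-in for a real coverage check.

```python
# (name, accuracy in meters, area description), taken from the table above.
CANDIDATES = [
    ("ED50_to_WGS84_1", 10.0,
     "Austria; Belgium; Denmark; Finland; Faroe islands; France; Germany (west); "
     "Gibraltar; Greece; Italy; Luxembourg; Netherlands; Norway; Portugal; Spain; "
     "Sweden; Switzerland."),
    ("ED50_to_WGS84_13", 9.0, "Portugal; Spain - mainland."),
    ("ED50_to_WGS84_C41", 1.0,
     "Spain - mainland, Balearic Islands, Ceuta and Melilla - onshore."),
]


def best_shift_covering(regions, candidates=CANDIDATES):
    """Return the most accurate shift whose description mentions every region."""
    covering = [c for c in candidates if all(r in c[2] for r in regions)]
    if not covering:
        return None
    # A smaller accuracy figure means a more accurate datum shift.
    return min(covering, key=lambda c: c[1])[0]


print(best_shift_covering(["Spain"]))
print(best_shift_covering(["Spain", "France"]))
```

With only Spanish data, the most accurate candidate wins. Once French data must share the same datum shift, the only candidate left is the less accurate one that covers both countries.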
Using the chosen datum shift
After the wizard user has chosen a datum shift, the wizard would like to use it to replace the original datum shift (which was leniently chosen when the dataset Crs was extracted from files). However, the datum shift is a read-only property of a GeodeticDatum, and a geodetic datum is a read-only property of a Crs, so the wizard cannot modify the original Crs. It is necessary to construct a new instance of GeodeticDatum from the original Ellipsoid and PrimeMeridian but the user-selected datum shift. Then, a new Crs instance can be constructed from the new geodetic datum and the original Projection. Finally, the new Crs instance must be attached to a new instance of the dataset, and the new dataset instance should be serialized to a px file. So when this px file is used, the dataset has an explicit Crs containing the datum shift chosen by the user.
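The rebuild-instead-of-modify pattern can be sketched with immutable Python dataclasses. The class and field names below only mimic the concepts named above and are not the Carmenta Engine classes. The string values are placeholders, and serialization to a px file is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeodeticDatum:
    ellipsoid: str
    prime_meridian: str
    datum_shift: str


@dataclass(frozen=True)
class Crs:
    datum: GeodeticDatum
    projection: str


def with_datum_shift(crs: Crs, new_shift: str) -> Crs:
    """Build a new Crs that reuses everything except the datum shift."""
    new_datum = GeodeticDatum(ellipsoid=crs.datum.ellipsoid,
                              prime_meridian=crs.datum.prime_meridian,
                              datum_shift=new_shift)
    return Crs(datum=new_datum, projection=crs.projection)


# Placeholder values standing in for the Crs extracted from the data files.
original = Crs(GeodeticDatum("International 1924", "Greenwich", "lenient default"),
               "UTM zone 30N")
chosen = with_datum_shift(original, "ED50_to_WGS84_1")

print(original.datum.datum_shift)  # the original instance is unchanged
print(chosen.datum.datum_shift)    # the new instance carries the chosen shift
```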