COMPARISON OF DIFFERENT LAND USE OBJECT CLASSES BY MEANS OF SEMANTIC SIMILARITY
Land use and land cover data are often required for public tasks at regional, national and even European level. Several data sets with differing thematic and geometric accuracy are available for these tasks, for example CORINE Land Cover (CLC)1, the Amtliches Topographisch-Kartographisches Informationssystem (ATKIS®)2 or the Global Monitoring for Environment and Security (GMES)3 services.
The CLC and GMES classification systems contain detailed, object-oriented descriptions of land use and land cover classes, which are derived from satellite data in order to provide environmental monitoring at European level. In contrast, ATKIS is an information system that aims at a national topographic land survey with high geometric accuracy and an increasingly high degree of currency for urban areas. However, for many specialized applications in environmental studies this nation-wide data set does not provide the required accuracy of information.
For mapping and updating services it is necessary to join these heterogeneous data sets. This is technically feasible, but a semantic problem remains. To tackle this problem, two steps have to be taken:
i) formalizing the description of object classes
ii) setting up a similarity measure using a knowledge-based model
These approaches are being developed by Delphi IMM as a contribution to the German research project DeCOVER4, a cooperative project of eleven partners funded by the Federal Ministry of Economics and Technology (BMWi) via the German Aerospace Center (DLR e.V.). DeCOVER was initiated to conceptualize and demonstrate innovative and cost-efficient geo-information services, based on semantic interoperability and remote sensing data, for updating land cover data sets in Germany.
Data sets based on heterogeneous classification systems can only be exchanged and integrated if they are comparable. Figure 1 depicts sport and leisure objects classified according to ATKIS and CLC. Some of these objects are largely congruent (compare “ATKIS-2201 Sports facilities” and “CLC-1.4.2 Sport and leisure facilities”) although the names of the object classes differ, whereas other objects in the map (compare “ATKIS-4108 Grove” and “CLC-3.2.1 Natural grassland”) clearly deviate from each other. This difference is caused by different conceptualizations and object descriptions.
The logical step towards making object classes comparable is to bring them to the same level of definition by means of ontologies. An ontology is an explicit description of a common 'world outlook' [Grub03]. The knowledge contained in the ATKIS and CLC catalogues was made explicit in a multi-level process:
i) extraction of the required information from the existing mapping instructions of the catalogues
ii) creation of a basic knowledge model
iii) definition of all object classes as application ontologies [Sch06]
The basic model covers the concepts, object properties and relations of a general object class. Each object class of a classification system requires a more specific application ontology based on the taxonomy and relations of the basic knowledge model [LUT07]. Application ontologies are characterized by concept expressions and property restrictions.
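The relation between a basic knowledge model and an application ontology can be illustrated with a minimal Python sketch. It is not the DeCOVER model: the taxonomy fragment, the names `BASIC_MODEL`, `root_of` and `ObjectClass`, and the rule that a property restriction must name a concept below its parameter are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical fragment of a basic knowledge model: concept -> parent concept.
# Root concepts (parent None) act as the parameters of an object class.
BASIC_MODEL = {
    "land_cover": None,
    "vegetation": "land_cover",
    "water": "land_cover",
    "urban_area": "land_cover",
    "land_use": None,
    "sport": "land_use",
    "recreation": "land_use",
}


def root_of(concept):
    """Walk up the taxonomy and return the root (parameter) of a concept."""
    while BASIC_MODEL[concept] is not None:
        concept = BASIC_MODEL[concept]
    return concept


@dataclass
class ObjectClass:
    """An application ontology reduced to its property restrictions."""
    catalogue: str
    code: str
    name: str
    restrictions: dict  # parameter -> concept taken from the basic model

    def __post_init__(self):
        # Assumed validation rule: every restriction must use a known concept
        # that sits below the parameter it is assigned to.
        for parameter, concept in self.restrictions.items():
            if concept not in BASIC_MODEL:
                raise ValueError(f"unknown concept: {concept}")
            if root_of(concept) != parameter:
                raise ValueError(f"{concept} is not a kind of {parameter}")


# Only the land use restriction is filled in; other parameters are left open.
sports = ObjectClass("ATKIS", "2201", "Sports facilities", {"land_use": "sport"})
print(sports)
```

Keeping every class description tied to one shared taxonomy is what later allows properties from different catalogues to be compared at all.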
The main components ‘land cover’ (for example vegetation, water or urban area) and ‘land use’ are universal properties. Other parameters are the location relative to the sea, the characteristic neighborhood and the genesis (natural or man-made). In addition, an object class can be described by specific properties that depend on its land cover. For a vegetated object class, for example, it may be relevant whether the soil has a specific humidity, which gives further information about a swamp.
The subsequent reasoning step is an automated process that derives, by logical inference, the subsumption of object classes, that is, their hierarchy. Ontology-based reasoning serves to validate the formalized application ontologies within one domain or catalogue. The object class 'mixed forest', for example, is consequently classified as a subclass of 'forest'. For this task automated reasoning works well. Comparing object classes from different catalogues is a different matter, because reasoning only takes equal or more specific properties into account. However, similarity between different object classes must be considered as well. For this reason a similarity measure was developed.
The similarity of two object classes can be quantified with a Network Model or a Feature Model. The Network Model describes the distance between two nodes in a hierarchical tree as the number of edges between them within a multidimensional space [RAD89]. The Feature Model determines similarity by comparing the common properties of two object classes [TVE77]. We used a combination of both models: the Feature Model enables a rough estimate of similarity from the number of common properties, and the Network Model refines it by measuring the distance between two similar properties in the knowledge model. We distinguish between Substantial Similarity and Mapping Similarity and have therefore developed two different algorithms.
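The two underlying models can be sketched in a few lines of Python. The sketch shows an edge-count distance in a concept tree (Network Model) and Tversky's ratio model on property sets (Feature Model). The hierarchy `PARENT`, the function names and the weights `alpha` and `beta` are hypothetical, and the sketch does not show how the authors combine the two models.

```python
# Hypothetical concept hierarchy (child -> parent), for illustration only.
PARENT = {
    "land_use": None,
    "leisure": "land_use",
    "sport": "leisure",
    "recreation": "leisure",
    "agriculture": "land_use",
}


def path_to_root(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def edge_distance(a, b):
    """Network Model: number of edges between two nodes of the tree."""
    steps_from_b = {concept: i for i, concept in enumerate(path_to_root(b))}
    for steps_from_a, concept in enumerate(path_to_root(a)):
        if concept in steps_from_b:
            # First common ancestor found: add the edges on both sides.
            return steps_from_a + steps_from_b[concept]
    raise ValueError("concepts do not share a root")


def feature_similarity(props_a, props_b, alpha=0.5, beta=0.5):
    """Feature Model: Tversky's ratio model on two sets of properties."""
    common = len(props_a & props_b)
    only_a = len(props_a - props_b)
    only_b = len(props_b - props_a)
    denominator = common + alpha * only_a + beta * only_b
    return common / denominator if denominator else 1.0


print(edge_distance("sport", "recreation"))
print(feature_similarity({"leisure", "urban_area"}, {"leisure", "vegetation"}))
```

With `alpha == beta` the ratio model is symmetric; unequal weights make it direction-dependent, which is one conventional way to obtain an asymmetric measure.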
Both algorithms are based on a combined Feature-Network Model. Substantial Similarity treats the two object classes to be compared as symmetric objects. Each pair of properties is therefore evaluated only once, regardless of which object class is the origin and which is the destination (Table 1). The calculated similarity shows how close two object classes are, but not whether objects of one class can be mapped onto the other. The intent of Mapping Similarity is to show the potential transfer from an origin object class into a destination object class (Table 2, Table 3). This asymmetric examination considers only the properties of the source class and therefore indicates the chance of mapping a source object class into a destination object class.
The Substantial Similarity of the object classes ‘CLC 141 Green urban areas’ and ‘ATKIS 2201 Sport facilities’ is 88 % (Table 1). According to this high value the two classes are very similar, but the method does not show whether the class ‘CLC 141 Green urban areas’ could be mapped into ‘ATKIS 2201 Sport facilities’. For this we use Mapping Similarity. Table 2 shows a 65 % chance of mapping ‘CLC 141 Green urban areas’ into ‘ATKIS 2201 Sport facilities’; in the opposite direction the value is 79 %. The Mapping Similarity values are thus lower than the Substantial Similarity value. The difference between the similarity values can be explained by the properties. Figure 2 shows the basic knowledge that is used for all object classes. For the parameter land use [Nutzung], ‘ATKIS 2201 Sport facilities’ has the property ‘Sport’, whereas ‘CLC 141 Green urban areas’ has the property ‘recreation [Erholung] or culture [Kultur]’.
The results are provided in a web application in which heterogeneous land use and land cover catalogues are compared automatically. The web application is written in Java. The Jena framework, which is hosted at sourceforge.net, supports the extraction of several ontology concepts.
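The difference between a symmetric and an asymmetric measure, as described for Substantial and Mapping Similarity, can be illustrated with a toy Python sketch. It is not the published algorithm and does not reproduce the values in Tables 1 to 3: the hierarchy, the two example classes, the scoring rule 1 / (1 + edge distance) and the averaging scheme are all assumptions.

```python
# Hypothetical concept hierarchy (child -> parent); not the DeCOVER model.
PARENT = {
    "land_use": None, "leisure": "land_use",
    "sport": "leisure", "recreation": "leisure",
    "land_cover": None, "urban_area": "land_cover",
    "genesis": None, "man_made": "genesis",
}


def ancestors(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    chain = [concept]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain


def concept_similarity(a, b):
    """Assumed scoring: 1 / (1 + edge distance), or 0 without a common root."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, concept in enumerate(up_a):
        if concept in up_b:
            return 1.0 / (1 + steps_a + up_b.index(concept))
    return 0.0


def substantial_similarity(a, b):
    """Symmetric: each property pair on a shared parameter is scored once."""
    shared = sorted(set(a) & set(b))
    if not shared:
        return 0.0
    return sum(concept_similarity(a[p], b[p]) for p in shared) / len(shared)


def mapping_similarity(source, destination):
    """Asymmetric: only the properties of the source class are considered."""
    scores = [
        concept_similarity(value, destination[p]) if p in destination else 0.0
        for p, value in source.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Two invented object classes (parameter -> concept).
class_a = {"land_use": "sport", "land_cover": "urban_area"}
class_b = {"land_use": "recreation", "land_cover": "urban_area",
           "genesis": "man_made"}

print(substantial_similarity(class_a, class_b))
print(mapping_similarity(class_a, class_b))
print(mapping_similarity(class_b, class_a))
```

In this toy setup the symmetric value is the same in both directions, while the mapping value drops when the source class carries a property that the destination class does not describe.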
We have achieved a semantic comparison of different land use classes at catalogue level by means of ontologies and a similarity measure algorithm. We developed both our basic knowledge model and the similarity measure algorithm in an open-ended way: any other land use catalogue can be added and compared. Our further developments focus on a feature-based search and especially on mapping at object level. This will make it possible to map spatial data classified according to one catalogue onto spatial data of another catalogue, taking all geometrical and topological aspects into account.
1 http://www.corine.dfd.dlr.de