In the previous article, we covered the sources from which data can be gathered for geodata mining. In this article, we look at some of the tools, methods, and relationships between geodata that make the process of geodata mining easier and more effective. First of all, you should know that all the tools and methods described below are compatible with GIS software, which offers the most comprehensive environment for this work, especially when combined with external extensions for geodata mining. So, let’s look at the most important relationships and rules that govern the process of geodata mining and lead us to conclusions about geodata.
Figure: A brief overview of the whole article
These relationships are inspired by statistics and aim to explain what different types of spatial geodata have in common. To be more specific, let’s suppose that we have two spatial objects, A and B, with a topological relationship. Each object consists of a set of points, and the two may be disjoint, overlapping, intersecting, equal, or contained one in the other.
Disjoint: No point of one object is contained in the other.
Overlaps & Intersects: A and B share at least one common point.
Equals: Every point of A is also a point of B, and vice versa.
Covered by, inside, or contained in: One object is smaller and lies within the other. All the points of the smaller object are contained in the bigger one, but the bigger one may have points outside the smaller.
Covers or contains: The inverse case. A contains B if every point of B is also a point of A.
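The relationships above can be sketched with plain Python sets, assuming each spatial object is modelled as a finite set of (x, y) points, as described in the text. The function name `topological_relations` and the sample objects are illustrative assumptions, not part of any GIS package; real GIS software evaluates these predicates on full geometries rather than on point sets.

```python
def topological_relations(a, b):
    """Return the names of the relationships that hold between point sets a and b."""
    a, b = set(a), set(b)
    relations = []
    # Disjoint and intersects are mutually exclusive.
    if a.isdisjoint(b):
        relations.append("disjoint")
    else:
        relations.append("intersects")
    if a == b:
        relations.append("equals")
    if a <= b:  # every point of a is also in b
        relations.append("covered_by")
    if a >= b:  # every point of b is also in a
        relations.append("covers")
    return relations


# Hypothetical sample objects: a 4x4 grid, a block inside it, and a distant object.
square = {(x, y) for x in range(4) for y in range(4)}
inner = {(1, 1), (1, 2), (2, 1), (2, 2)}
far = {(10, 10), (11, 10)}

print(topological_relations(inner, square))  # ['intersects', 'covered_by']
print(topological_relations(square, inner))  # ['intersects', 'covers']
print(topological_relations(square, far))    # ['disjoint']
```

Note that two equal objects satisfy "equals", "covered_by", and "covers" at the same time, which matches the definitions above.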
There are further relationships between geodata, such as the direction and the distance between objects, but they are harder to determine. Nevertheless, it is very important to know these parameters, because they allow objects to be identified more precisely and therefore lead to better conclusions in our research.
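For two point locations, distance and direction can be sketched as follows. This is a minimal example that assumes planar (projected) coordinates, with x pointing east and y pointing north; the function name `distance_and_bearing` is illustrative. For longitude and latitude on the globe, a geodesic calculation would be needed instead.

```python
import math


def distance_and_bearing(p, q):
    """Return (distance, bearing) from point p to point q in planar coordinates.

    The bearing is in degrees, measured clockwise from north (the positive y axis).
    """
    dx = q[0] - p[0]
    dy = q[1] - p[1]
    distance = math.hypot(dx, dy)
    # atan2(dx, dy) gives the angle from the y axis, turning toward the x axis.
    bearing = math.degrees(math.atan2(dx, dy)) % 360
    return distance, bearing


# Hypothetical points: q lies 3 units east and 4 units north of p.
dist, bearing = distance_and_bearing((0, 0), (3, 4))
print(dist)     # 5.0
print(bearing)  # roughly 36.87 degrees, i.e. north-east
```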
Many tools help us in the process of geodata mining. Each of them is built on a programming language.
DBMiner & GeoMiner: Specialized in geodata mining, with a dedicated query language.
GeoDa: Another open source tool, written in C++ (the related PySAL library covers similar ground in Python), used for spatial statistics and spatial regression.
Weka-GDPM: Based on Java; supports many common geodata operations.
Descartes: A tool for analyzing geodata and for visualizing and displaying the results. It works with Python.
Several programming languages are well suited to geodata mining, such as Python, C, FORTRAN, and R. Each language has its advantages and disadvantages, and each is used for different purposes within the same process of geodata mining. Moreover, as we saw above, every tool created for geodata mining is built on a programming language, so we can conclude that programming languages are both the first and the last step of successful geodata mining. We can use them on their own or within a complete software package such as a GIS for more efficient results.
The techniques we use for geodata mining fall into two categories, depending on the results we want to obtain. The first category is descriptive geodata mining, and the second is predictive geodata mining. Descriptive mining describes the characteristics and behavior of a dataset. It is an easy way to understand general properties and to reach high-level conclusions. If, on the other hand, we want to go deeper into the dataset, we can use predictive geodata mining. This technique is more complicated, as it is based on computer algorithms that attempt to discover patterns in geodata, which lead us to further predictions about the environment. A prediction model may of course be wrong, especially when it is produced by an algorithm. For that reason, human intervention is necessary. Nowadays, however, the algorithms are sophisticated and take many parameters into account, so serious mistakes have become much less common, although the results should still be checked.
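As a small illustration of the descriptive side, the sketch below summarises a point dataset by its mean center and its standard distance (the spread of the points around that center). These two measures are chosen here only as common examples of descriptive statistics for point data; the function names and the sample coordinates are assumptions made for illustration.

```python
import math


def mean_center(points):
    """Average x and average y of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def standard_distance(points):
    """Root mean squared distance of the points from their mean center."""
    cx, cy = mean_center(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))


# Hypothetical dataset: the four corners of a 2x2 square.
sample = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(mean_center(sample))        # (1.0, 1.0)
print(standard_distance(sample))  # about 1.414
```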
Furthermore, predictive geodata mining is divided into four subcategories, as shown below:
In clustering, spatial items (in other words, geodata) are grouped into categories called clusters. We create many clusters from the geodata of the same dataset, where each cluster contains items with many similarities, while the clusters differ strongly from one another. In that way, we can deal with big datasets, but we can discover rules only for the non-spatial properties of geodata.
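One widely used clustering algorithm is k-means, sketched below in plain Python. It is named here only as an example; the text does not prescribe a particular algorithm. Each item is assumed to be a tuple of numeric, non-spatial attributes, the sample values are hypothetical, and the first k items are taken as starting centroids, which is a naive but deterministic choice.

```python
import math


def k_means(items, k, iterations=10):
    """Group attribute tuples into k clusters; return (centroids, clusters)."""
    # Assumption: start from the first k items (simple, but sensitive to ordering).
    centroids = [tuple(item) for item in items[:k]]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each item joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for item in items:
            nearest = min(range(k), key=lambda i: math.dist(item, centroids[i]))
            clusters[nearest].append(item)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, clusters


# Hypothetical attributes per item: (building height, number of residents).
data = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = k_means(data, k=2)
print(clusters[0])  # [(1, 1), (1, 2), (2, 1)]
print(clusters[1])  # [(9, 9), (8, 9), (9, 8)]
```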
Association is used to find spatial rules that connect geodata with one another. In this case, we examine not only the non-spatial properties but every single spatial feature. Moreover, the “association” technique can search through the rules already discovered in other geo-datasets to find out whether one or more of them also apply to the case at hand.
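A spatial association rule such as "a school usually has a park nearby" can be scored with support and confidence, as in the following sketch. The feature layout `(kind, (x, y))`, the function name, the sample data, and the exact definition of support used here (matching sources divided by all features) are assumptions made for illustration.

```python
import math


def colocation_rule(features, antecedent, consequent, radius):
    """Score the rule 'antecedent -> a consequent lies within radius'.

    Returns (support, confidence). Features are (kind, (x, y)) tuples.
    """
    if not features:
        return 0.0, 0.0
    sources = [pos for kind, pos in features if kind == antecedent]
    targets = [pos for kind, pos in features if kind == consequent]
    # Count the antecedent features that have at least one consequent nearby.
    hits = sum(
        1 for p in sources if any(math.dist(p, q) <= radius for q in targets)
    )
    support = hits / len(features)
    confidence = hits / len(sources) if sources else 0.0
    return support, confidence


# Hypothetical features in planar coordinates.
features = [
    ("school", (0, 0)), ("park", (0, 1)),
    ("school", (5, 5)), ("park", (5, 6)),
    ("school", (20, 20)), ("factory", (21, 20)),
]
support, confidence = colocation_rule(features, "school", "park", radius=2)
print(support, confidence)  # about 0.33 and 0.67: two of the three schools have a park nearby
```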
In classification, we first create a candidate model and then analyze the geodata according to that model. We could say that it is the inverse process, where the human factor helps to exclude other cases and patterns in order to find the right one.
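One simple way to build such a model from examples that a person has already labelled is a k-nearest-neighbours classifier, sketched below. It is used here purely as an illustrative technique; the text does not name a specific classifier, and the labels and coordinates are hypothetical.

```python
import math
from collections import Counter


def knn_classify(training, point, k=3):
    """Label a point by majority vote among its k nearest labelled examples.

    training is a list of ((x, y), label) pairs.
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled locations supplied by a human analyst.
training = [
    ((1, 1), "residential"), ((1, 2), "residential"), ((2, 1), "residential"),
    ((8, 8), "industrial"), ((9, 8), "industrial"), ((8, 9), "industrial"),
]
print(knn_classify(training, (2, 2)))  # residential
print(knn_classify(training, (7, 7)))  # industrial
```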
Trend detection requires existing geodata about a specific spatial object or location in order to find changes over time. It is therefore a technique that specializes in the way a spatial feature changes with respect to time, distance, size, quality, and other parameters.
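A very basic form of trend detection fits a straight line through repeated measurements of the same feature, as in the sketch below. It uses ordinary least squares, which is only one of many possible approaches; the function name and the measurements are hypothetical.

```python
def linear_trend(times, values):
    """Ordinary least-squares line through (time, value) pairs.

    Returns (slope, intercept); the slope is the change in value per unit of time.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    variance = sum((t - mean_t) ** 2 for t in times)
    slope = covariance / variance
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Hypothetical built-up area (in square kilometres) of one district over time.
years = [2000, 2005, 2010, 2015, 2020]
area = [10.0, 12.0, 14.0, 16.0, 18.0]
slope, intercept = linear_trend(years, area)
print(slope)  # about 0.4, i.e. roughly 0.4 square kilometres of growth per year
```

The function needs at least two distinct time values; with a single measurement the variance is zero and no trend can be computed.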
To learn more about geodata mining and GIS, you can visit the UIZ webpage or call us at +49-30-20679115.