generalization [2014/09/15 14:36] (current)
admin created
 +===== Generalization =====
 +We take generalization to denote the outcome of a data mining task. In OntoDM-core, we consider and model three different aspects of generalizations, each aligned with a different description layer:
 +  * the specification of a generalization,
 +  * a generalization as a realizable entity, and
 +  * the process of executing a generalization.
  
 +Many different types of generalizations have been considered in the data mining literature. The most fundamental types, as proposed by Dzeroski (2006), correspond to the data mining tasks. They include clusterings, patterns, probability distributions, and predictive models.
 +
 +==== Generalization specification ====
 +
 +In OntoDM-core, the //generalization specification// class is a subclass of the OBI class //data representational model//. It specifies the type of the generalization and has two parts: the //data specification//, which describes the data used to produce the generalization, and the //generalization language//, which is the language in which the generalization is expressed. For a //predictive model//, examples of generalization languages include the languages of trees, rules, Bayesian networks, graphical models, and neural networks.
 +
 +As in the case of datasets and data mining tasks, we can construct a taxonomy of generalizations. In OntoDM-core, at the first level, we distinguish between a //single generalization specification// and an //ensemble specification//. An ensemble of generalizations has single generalizations as its parts. We can further extend this taxonomy by taking into account the data mining task and the generalization language.
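
The first level of this taxonomy can be sketched as a small class hierarchy. The Python below is only an illustration under stated assumptions: the class and field names are hypothetical stand-ins for the ontology classes named above and are not part of OntoDM-core itself, which is an ontology rather than a code library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneralizationSpecification:
    """Hypothetical stand-in for the generalization specification class."""
    task: str                     # data mining task the generalization addresses
    data_specification: str       # describes the data used to produce it
    generalization_language: str  # e.g. "trees", "rules"


@dataclass
class SingleGeneralizationSpecification(GeneralizationSpecification):
    """First-level branch: a single generalization."""


@dataclass
class EnsembleSpecification(GeneralizationSpecification):
    """First-level branch: an ensemble, which has single generalizations as parts."""
    parts: List[SingleGeneralizationSpecification] = field(default_factory=list)


# Illustrative values only; they are assumptions, not taken from the ontology.
tree_spec = SingleGeneralizationSpecification(
    task="predictive modelling",
    data_specification="tabular examples with a nominal target",
    generalization_language="trees",
)
ensemble_spec = EnsembleSpecification(
    task="predictive modelling",
    data_specification="tabular examples with a nominal target",
    generalization_language="ensembles of trees",
    parts=[tree_spec, tree_spec],
)
```

Further levels of the taxonomy could be added by subclassing on the task or the generalization language, mirroring the extension described in the text.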
 +
 +==== Dual nature of generalizations ====
 +
 +Generalizations have a dual nature. On the one hand, they can be treated as data structures and, as such, represented, stored, and manipulated. On the other hand, they act as functions: when executed, they take data examples as input and give as output the result of applying the function to each example. In OntoDM-core, we define a generalization as a subclass of the BFO class //realizable entity//. It is an output of a //data mining algorithm execution//.
 +
 +The dual nature of generalizations in OntoDM-core is represented with two classes that belong to two different description layers: //generalization representation//, which is a subclass of //information content entity// and belongs to the specification layer, and //generalization execution//, which is a subclass of //planned process// and belongs to the application layer.
 +
 +A //generalization representation// is a subclass of the IAO class //information content entity//. It is a formalized description of the generalization, for instance in the form of a formula or text. For example, the output of a decision tree algorithm execution in data mining software usually includes a textual representation of the generated decision tree. A //generalization execution// is a subclass of the OBI class //planned process// that has a //dataset// as input and another //dataset// as output. The output dataset is the result of applying the //generalization// to the examples from the input dataset.
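
The distinction between a representation and an execution can be made concrete with a toy predictive model. The sketch below is a minimal illustration, assuming a one-split decision tree; the class `DecisionStump`, its methods, and the example feature and label names are all hypothetical and do not come from OntoDM-core or any data mining library.

```python
from dataclasses import dataclass
from typing import Dict, List

Example = Dict[str, float]


@dataclass(frozen=True)
class DecisionStump:
    """Hypothetical generalization: a decision tree with a single split."""
    feature: str
    threshold: float
    left_label: str
    right_label: str

    def representation(self) -> str:
        # Generalization representation: a textual description of the model.
        return (f"if {self.feature} <= {self.threshold}: {self.left_label} "
                f"else: {self.right_label}")

    def execute(self, dataset: List[Example]) -> List[Dict[str, object]]:
        # Generalization execution: input dataset in, new output dataset out.
        output = []
        for example in dataset:
            if example[self.feature] <= self.threshold:
                label = self.left_label
            else:
                label = self.right_label
            # Copy the example so the input dataset is left unchanged.
            output.append({**example, "prediction": label})
        return output


stump = DecisionStump("petal_length", 2.5, "setosa", "other")
input_dataset = [{"petal_length": 1.4}, {"petal_length": 4.7}]
output_dataset = stump.execute(input_dataset)
text = stump.representation()
```

Here the same object is stored and printed as a data structure through `representation()` and applied as a function through `execute()`, which is one simple way to picture the two description layers.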
