data_mining_algorithm [2014/09/15 14:39] (current)
admin created
Line 1: Line 1:
 +===== Data mining algorithm =====
 +A //data mining algorithm// is an algorithm, implemented in a computer program, that is designed to solve a data mining task. It takes as input a dataset of examples of a given datatype and produces as output a generalization (from a given class) on that datatype. A specific data mining algorithm can typically handle examples of only a limited set of datatypes: for example, a rule learning algorithm might handle only tuples of Boolean attributes and a Boolean class.
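The datatype restriction described above can be illustrated with a minimal sketch. The function `learn_single_attribute_rule` and the `Example` alias are hypothetical names chosen for this illustration and are not part of OntoDM-core; the "generalization" here is simply the index of the Boolean attribute that agrees with the class most often.

```python
from typing import List, Tuple

# An example is a tuple of Boolean attributes plus a Boolean class (hypothetical alias).
Example = Tuple[Tuple[bool, ...], bool]


def learn_single_attribute_rule(dataset: List[Example]) -> int:
    """Toy rule learner: return the index of the attribute that most often equals the class."""
    if not dataset:
        raise ValueError("dataset must contain at least one example")
    for attributes, label in dataset:
        # Enforce the datatype restriction: only Boolean attributes and a Boolean class.
        if not all(isinstance(a, bool) for a in attributes) or not isinstance(label, bool):
            raise TypeError("only tuples of Boolean attributes with a Boolean class are supported")

    n_attributes = len(dataset[0][0])

    def agreement(i: int) -> int:
        # Number of examples where attribute i has the same value as the class.
        return sum(1 for attributes, label in dataset if attributes[i] == label)

    return max(range(n_attributes), key=agreement)


dataset = [((True, False), True), ((False, False), False), ((True, True), True)]
best_attribute = learn_single_attribute_rule(dataset)
```

Passing examples of another datatype (for instance integer attributes) raises a `TypeError`, which mirrors the idea that an algorithm handles only a limited set of datatypes.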
 +In the OntoDM-core ontological framework, we consider three aspects of the DM algorithm entity:
 +  * a DM algorithm (as a specification),
 +  * a DM algorithm implementation, and
 +  * a DM algorithm execution.
 +==== Data mining algorithm as a specification ====
 +//Data mining algorithm// as a specification is a subclass of the IAO class //plan specification//. It has as parts a //data mining task//, an //action specification// (reused from IAO), a //generalization specification//, and a //document// (reused from IAO). The //data mining task// defines the objective that the realized plan should fulfill, namely producing a generalization as output, while the //action specification// describes the actions of the data mining algorithm that are realized in the process of execution. The //generalization specification// denotes the type of generalization produced by executing the algorithm. Finally, having a //document// class as a part allows us to connect the algorithm to the annotations of documents (journal articles, workshop articles, technical reports) that publish knowledge about the algorithm.
 +In analogy with the taxonomies of datasets, data mining tasks, and generalizations, in OntoDM-core we also construct a taxonomy of data mining algorithms. As classification criteria, we use the data mining task the algorithm addresses and the generalization produced as the output of its execution.
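As a rough sketch of the specification and its taxonomy criteria, the parts listed above can be modeled as fields of a record, and algorithms can be grouped by the pair (task, generalization). All class, function, and algorithm names below are hypothetical illustrations, not identifiers from OntoDM-core, and plain strings stand in for the ontology classes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AlgorithmSpecification:
    """Hypothetical record mirroring the parts of the specification."""
    name: str
    data_mining_task: str              # objective the realized plan should fulfill
    action_specification: str          # description of the actions performed
    generalization_specification: str  # type of generalization produced
    documents: Tuple[str, ...] = ()    # annotations of publications, if any

    def taxonomy_key(self) -> Tuple[str, str]:
        # The two taxonomy criteria: the task and the generalization produced.
        return (self.data_mining_task, self.generalization_specification)


def build_taxonomy(specs: List[AlgorithmSpecification]) -> Dict[Tuple[str, str], List[str]]:
    """Group algorithm names by (task, generalization)."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for spec in specs:
        groups[spec.taxonomy_key()].append(spec.name)
    return dict(groups)


# Placeholder entries, invented purely for illustration.
specs = [
    AlgorithmSpecification("tree learner A", "classification", "grow and prune", "decision tree"),
    AlgorithmSpecification("rule learner B", "classification", "sequential covering", "rule set"),
    AlgorithmSpecification("tree learner C", "classification", "grow only", "decision tree"),
]
taxonomy = build_taxonomy(specs)
```

Here `taxonomy[("classification", "decision tree")]` collects both tree learners, while the rule learner falls into a separate group.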
 +==== Data mining algorithm implementation ====
 +//Data mining algorithm implementation// is defined as a sub-class of the BFO class //realizable entity//. It is a concretization of a //data mining algorithm//, in the form of a runnable computer program, and has as qualities //parameters//. The parameters of the algorithm affect its behavior when the algorithm implementation is used as an operator. A parameter itself is specified by a //parameter specification// that includes its name and description.
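The relation between an implementation and its parameters might be sketched as follows. The class names, the program identifier, and the parameter shown are assumptions made for this example only; the text specifies just that a parameter specification includes a name and a description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ParameterSpecification:
    """Specifies a parameter by its name and description."""
    name: str
    description: str


@dataclass(frozen=True)
class AlgorithmImplementation:
    """Hypothetical stand-in for a runnable program that concretizes an algorithm."""
    concretizes: str   # name of the data mining algorithm (specification)
    program: str       # identifier of the runnable program (placeholder)
    parameters: Tuple[ParameterSpecification, ...] = ()

    def parameter_names(self) -> List[str]:
        return [p.name for p in self.parameters]


implementation = AlgorithmImplementation(
    concretizes="tree learner A",          # invented algorithm name
    program="tree_learner_a.py",           # invented program identifier
    parameters=(
        ParameterSpecification("max_depth", "maximum depth of the induced tree"),
    ),
)
```

Note that no parameter values appear here: in the ontology, concrete values belong to the parameter setting of an operator, not to the implementation itself.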
 +==== Data mining software ====
 +In OntoDM-core, we define //data mining software// as a sub-class of //directive information entity// (reused from IAO). It represents a specification of a //data mining algorithm implementation//. It has as parts all the meta-information entities about the software implementation, such as //source code//, //software version specification//, //programming language//, //software compiler specification//, //software manufacturer//, and the //data mining software toolkit// it belongs to. Finally, a //data mining software toolkit// is a specification entity that contains as parts //data mining software// entities.
 +==== Data mining operator ====
 +//Data mining operator// is defined as a sub-class of the BFO class //role//. In that context, it is a role of a //data mining algorithm implementation// that is realized (executed) by a //data mining algorithm execution// process.
 +A //data mining operator// carries information about the specific //parameter setting// of the algorithm, in the context of the realization of the operator in the process of execution. The //parameter setting// is a subclass of //data item// (reused from IAO) and is a quality specification of a //parameter//.
 +==== Data mining algorithm execution ====
 +In OntoDM-core, we define //data mining algorithm execution// as a sub-class of //planned process// (reused from the OBI ontology). A //data mining algorithm execution// realizes (executes) a //data mining operator//, has as input a //dataset//, has as output a //generalization//, has as agent a //computer//, and achieves as a planned objective a //data mining task//.
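To make the relations of an execution concrete, the sketch below pairs an operator (an implementation plus a parameter setting) with a dataset, an agent, and a task, and produces a generalization. Everything here is a hypothetical illustration: `majority_class` is a toy implementation whose "generalization" is a single predicted class, and `tie_break` is an invented parameter.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Example = Tuple[Tuple[bool, ...], bool]


def majority_class(dataset: List[Example], tie_break: bool = False) -> bool:
    """Toy implementation: predict the most frequent class; use tie_break on a tie."""
    positives = sum(1 for _, label in dataset if label)
    negatives = len(dataset) - positives
    if positives == negatives:
        return tie_break
    return positives > negatives


@dataclass
class Operator:
    """Role of an implementation, together with a specific parameter setting."""
    implementation: Callable[..., Any]
    parameter_setting: Dict[str, Any]


@dataclass
class Execution:
    """Planned process that realizes an operator on a dataset."""
    operator: Operator
    dataset: List[Example]   # has as input
    agent: str               # the computer performing the process
    task: str                # planned objective achieved

    def run(self) -> Any:
        # Has as output: the generalization returned by the implementation.
        return self.operator.implementation(self.dataset, **self.operator.parameter_setting)


operator = Operator(majority_class, {"tie_break": True})
execution = Execution(
    operator=operator,
    dataset=[((True,), True), ((False,), False)],
    agent="localhost",
    task="classification",
)
generalization = execution.run()
```

Keeping the parameter setting on the operator rather than on the implementation follows the distinction drawn in the text: the same implementation can play the operator role in many executions, each with a different setting.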
