Context mediation is a field of research concerned with the interchange of information
across different environments; it provides a vehicle for bridging
semantic gaps among disparate entities. Knowledge discovery
is concerned with the extraction of actionable information from
large databases. A challenge that has received relatively little
attention is knowledge discovery in a highly disparate environment,
that is, one with multiple heterogeneous data sources, multiple domain
knowledge sources and multiple knowledge patterns. This thesis tackles
the problem of semantic interoperability amongst data, domain
knowledge and knowledge patterns in a knowledge discovery process
using context mediation.
Context fundamentals are
introduced, which encompass the concepts of context identity,
semantic values, contextual equivalence, contextual orders, contextual
distance, inheritance of contexts, and inter-ontology relationships.
Based on this foundation, the principles of context mediators
for data, domain knowledge and knowledge patterns are outlined,
a context mediator prototype is developed and performance tests
are carried out. Expanding on these rudimentary elements, context
mediation is introduced for data, domain knowledge and knowledge
patterns.
Contextual data mediation
is concerned with semantic conflicts among the heterogeneous data sources
that are used as input for knowledge discovery. In order to treat
contexts as first-class citizens and to allow inheritance as well
as overloading and overriding operations, an object data model
has been chosen, namely ODMG. In addition to the ODMG
metamodel, its object definition language (ODL) and object query
language (OQL) have been extended accordingly.
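
As a rough illustration of the idea behind semantic values and contextual equivalence, the following minimal sketch pairs a raw value with the context in which it is to be interpreted and converts it into a target context before comparison. It is not the ODMG-based model developed in the thesis; the class `SemanticValue`, the functions `mediate` and `contextually_equal`, and the unit table `TO_METRES` are hypothetical names chosen for this example, and the context is reduced to a single unit of measurement.

```python
from dataclasses import dataclass

# Hypothetical conversion table: how many metres one unit of each context represents.
TO_METRES = {"m": 1.0, "km": 1000.0, "cm": 0.01}


@dataclass(frozen=True)
class SemanticValue:
    """A raw value together with the context needed to interpret it."""
    value: float
    context: str  # in this sketch, a context is just a unit name


def mediate(sv: SemanticValue, target_context: str) -> SemanticValue:
    """Convert a semantic value into the target context."""
    factor = TO_METRES[sv.context] / TO_METRES[target_context]
    return SemanticValue(sv.value * factor, target_context)


def contextually_equal(a: SemanticValue, b: SemanticValue,
                       target_context: str = "m", tol: float = 1e-9) -> bool:
    """Two values are equivalent if they agree once mediated into the same context."""
    a_t = mediate(a, target_context)
    b_t = mediate(b, target_context)
    return abs(a_t.value - b_t.value) <= tol


distance_a = SemanticValue(2.0, "km")
distance_b = SemanticValue(200000.0, "cm")
same = contextually_equal(distance_a, distance_b)
```

Here `distance_a` and `distance_b` differ as raw numbers but are treated as equivalent once both are interpreted in a common context, which is the kind of semantic conflict that data mediation is meant to resolve.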
Contextual domain knowledge
mediation deals with the integration
of pre-existing knowledge about data, preferences and biases generated
in multiple contexts, which is incorporated in the knowledge discovery
process. Different types of contextual domain knowledge are formulated,
namely taxonomies, constraints, user preferences, and previously
discovered knowledge. To support both subjective
and objective domain knowledge, context mediation among different
domain knowledge entities is proposed, which is compatible with
the context mediation formulated earlier.
Contextual knowledge
pattern mediation is concerned with
the interpretation of the outputs from data mining algorithms
from different perspectives. An object-oriented framework is presented
that models the output of virtually any knowledge discovery exercise.
Based on the proposed skeleton, two operations are developed,
which allow the viewing or interpreting of data mining output
within different contexts. Contextual ranking allows the ordering
of information based on qualitative and quantitative information.
Three manipulative operations are introduced which provide a further
vehicle to tailor knowledge sets, namely balancing, boosting,
and inversion. Comparison of discovered knowledge provides a powerful
mechanism to evaluate the equivalence between two or more knowledge
patterns or pattern objects. A summary value is introduced, which
is used to calculate pattern equivalence, and example summary values
are given for segments and associations.
All presented techniques,
methods and models are applied in real-world scenarios covering
a wide range of industries, namely web mining and
marketing, manufacturing, meteorology and internationalisation.
Where feasible, industry standards were used, for instance ODMG,
PMML and KQML.
The research carried out
has resulted in almost fifty international publications, including
co-authorship of a book, a journal editorship, four journal
publications, three book chapters, and one best paper award.