Research projects ARRS
01.05.2009 - 30.04.2012
Qualitative models, in contrast to classification and regression models, which predict classes or numerical quantities, describe qualitative relations between the observed variables. An example of such a relation is y=Q(+x), “y increases with x”. Qualitative models usually also include the conditions under which each relation holds, e.g. y=Q(-x, +z) if x<12, and y=Q(-z) otherwise.
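The conditional example above can be written down as a small lookup, purely for illustration. The sketch below assumes a hypothetical representation in which a constraint maps each variable name to the sign of its influence on y; the function names are invented here and do not belong to any existing tool.

```python
def qualitative_constraint(x):
    """Return the constraint that holds at a given x.

    Encodes the example model: y=Q(-x, +z) if x<12, and y=Q(-z) otherwise.
    Each constraint maps a variable name to the sign of its influence on y.
    """
    if x < 12:
        return {"x": "-", "z": "+"}
    return {"z": "-"}


def format_constraint(constraint):
    """Render a constraint in the Q(...) notation used in the text."""
    terms = ", ".join(sign + var for var, sign in constraint.items())
    return "y=Q(" + terms + ")"


print(format_constraint(qualitative_constraint(5)))   # region x < 12
print(format_constraint(qualitative_constraint(20)))  # region x >= 12
```

The condition x<12 plays the role of an internal tree node, and each returned mapping plays the role of a leaf.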
Although such models do not predict exact numerical values, they have several advantages over regression models. Qualitative descriptions are close to the human way of thinking (“the more it rains, the wetter I will get”, and not “the quantity of water in my clothes equals 0.45 l/cm × rt, where r is the rain intensity in cm/h and t is the time spent under it”), so such models are easier to explain and can reveal more information than regression models. They are typically also more robust, since qualitative relations are simpler to model. They are therefore often used as a step before regression modeling, in which the relations in the qualitative model serve as constraints for the regression model (Šuc, 2004).
Despite the attractive properties of qualitative models, there are no efficient algorithms for their construction. One of the rare methods from the field, QUIN (Šuc, 2001), induces trees similar to classification trees, except that their leaves contain qualitative constraints. While using QUIN on real-world data (Žabkar 2005, 2006) we noticed a number of deficiencies. The algorithm becomes very slow as the number of dimensions grows. It cannot treat the variable representing time separately from the other variables, which makes it less suitable for modeling dynamic systems. Because it is based on an impurity measure, it is limited to constructing tree models, which are inappropriate for many practical problems. Finally, its formal definition of a constraint does not correspond to the mathematical definition of a derivative.
To solve the above problems, we developed a new approach to qualitative modeling, based on approximating the partial derivatives of the sampled multidimensional function. The procedure computes the derivative at each point where the function is sampled, using the points in its vicinity; the vicinity is defined either by a triangulation or by the axis along which the derivative is computed. The computed derivatives can be treated as numbers or, to proceed with qualitative modeling, we can observe only their signs. Data preprocessed in this way can subsequently be modeled by any general machine learning algorithm or presented using a suitable visualization method.
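The axis-based variant of this idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual procedure: the helper names, the "tube" around the axis (`tube_width`), the neighbour count `k`, and the use of an ordinary least-squares slope are all choices made for this example.

```python
import math


def axis_slope(points, values, index, axis, tube_width=0.5, k=4):
    """Least-squares slope of the sampled function along `axis` at points[index].

    Hypothetical helper: the neighbours are the k samples nearest to the
    reference point (measured along `axis`) among those lying within
    `tube_width` of the line through the reference point parallel to `axis`.
    """
    origin = points[index]
    candidates = []
    for p, v in zip(points, values):
        off_axis = math.sqrt(sum((p[d] - origin[d]) ** 2
                                 for d in range(len(p)) if d != axis))
        if off_axis <= tube_width:
            candidates.append((abs(p[axis] - origin[axis]), p[axis], v))
    candidates.sort()
    nearest = candidates[:k]  # the reference point itself is included (distance 0)
    xs = [c[1] for c in nearest]
    ys = [c[2] for c in nearest]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0  # no spread along the axis: slope undefined, report 0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def sign(value, eps=1e-9):
    """Reduce a numeric derivative to its qualitative sign."""
    if value > eps:
        return "+"
    if value < -eps:
        return "-"
    return "0"


# Usage: sample f(x, y) = 2x - 3y on a small grid and recover the signs.
points = [(float(x), float(y)) for x in range(6) for y in range(6)]
values = [2 * x - 3 * y for x, y in points]
i = points.index((2.0, 3.0))
print(sign(axis_slope(points, values, i, axis=0)))  # sign of df/dx
print(sign(axis_slope(points, values, i, axis=1)))  # sign of df/dy
```

The resulting signs can be attached to each sample as a class label, after which any general-purpose learner can be trained on them.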
We developed a prototype implementation named Pade (Žabkar 2007a). Its experimental results are excellent even on rather complicated synthetic data, such as the function sin(x)sin(y) over a few periods. Even in its prototype form, the algorithm has already been used successfully in the EU projects XPERO and XMEDIA.
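To give a sense of what such a synthetic benchmark might look like, the following Python sketch samples sin(x)sin(y) at random points and labels each sample with the analytically known signs of its partial derivatives, cos(x)sin(y) and sin(x)cos(y). The function names, the sampling range, and the seed are assumptions made for illustration; this is not the setup used in the cited experiments.

```python
import math
import random


def true_signs(x, y):
    """Signs of the partial derivatives of f(x, y) = sin(x) * sin(y).

    A derivative of exactly zero is labelled "-" here for simplicity;
    this happens only on a set of measure zero.
    """
    dfdx = math.cos(x) * math.sin(y)
    dfdy = math.sin(x) * math.cos(y)
    return ("+" if dfdx > 0 else "-", "+" if dfdy > 0 else "-")


def make_benchmark(n, periods=2, seed=0):
    """Sample n random points over `periods` full periods in each variable.

    Each row is (x, y, f(x, y), sign of df/dx, sign of df/dy).
    """
    rng = random.Random(seed)
    upper = periods * 2 * math.pi
    rows = []
    for _ in range(n):
        x = rng.uniform(0, upper)
        y = rng.uniform(0, upper)
        rows.append((x, y, math.sin(x) * math.sin(y), *true_signs(x, y)))
    return rows


rows = make_benchmark(200)
print(len(rows), "samples; first row:", rows[0])
```

The sign columns give a ground truth against which estimated derivative signs can be compared.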
The goal of the project is to investigate the field theoretically and to develop procedures that are useful for qualitative modeling of real-world data. The expected research problems are:
The usefulness of the developed methods will be tested on synthetic data and, especially, on real-world data from industry, medicine and elsewhere. All methods will be implemented inside the system Orange (Demšar, 2004) and will be freely available to their potential users.
The topic of the project partially overlaps with the anticipated topic of the doctoral dissertation of Jure Žabkar, a researcher at the Faculty of Computer and Information Science, whose co-advisor is the leader of this proposed project and who has already worked on these problems. We also expect to engage other undergraduate and postgraduate students in the project, while our partners on various EU projects will test the developed methods as part of their respective projects.
The developed methods will be quite applicable in practice as qualitative models are useful for:
The implemented procedures will also be used for teaching at the Faculty of Computer and Information Science.
Project funding:
Slovenian Research and Innovation Agency
