# Completed PhD Theses

## Context-based Reasoning in Ambient Intelligence

**Hristijan Gjoreski **, supervisor: Matjaž Gams, co-supervisor: Mitja Luštrek, January 2015

The availability of small, wearable, low-cost, power-efficient sensors, combined with advanced signal processing and information extraction, is driving the revolution in the ambient intelligence (AmI) domain. This revolution has enabled novel approaches and technologies for accurate measurements in the area of healthcare, enhanced sports and fitness training, and life-style monitoring.

Early AmI systems included a single type of sensors that has made it possible to develop the first proof-of-concept applications. As the field has matured, these systems have gained additional sensors, resulting in the development of advanced and more accurate multi-sensor techniques and applications. However, combining multiple sources of information from multiple sensors is a challenging task. The first issue is that each sensor has its own technical configuration (for example, the data sampling rate) and requires different data-processing techniques in order to first align the different sensor data, and later to extract useful information. The second issue is that even if the multi-source data is aligned, it can be challenging to find an intelligent way to combine this multi-source information in order to reason about the user or the environment. While several approaches for combining multiple sources of information and knowledge have been developed (such as Kalman filters, ensemble learning, and co-training), these approaches have not been specialized for AmI tasks.

This thesis addresses the problem of combining multiple sources of information extracted from sensor data by proposing a novel context-based approach called CoReAmI (Context-based Reasoning in Ambient Intelligence). The CoReAmI approach consists of three phases: context extraction, context modeling, and context aggregation. In the first phase, multiple contexts are extracted from the sensor data. In the second phase, the problem is modeled using the already extracted contexts. In the third phase, when evaluating a data sample, the models that correspond to the current context are invoked, and their outputs are aggregated in the final decision.

The feasibility of this approach is shown in the three domains that have emerged as essential building blocks in AmI: activity recognition, energy-expenditure estimation, and fall detection. For each of these domains, the thesis offers an appropriate description of the domain, its relevance, and its most relevant related work. The application of the CoReAmI approach to each problem domain is then described, followed by a thorough evaluation of the approach. The results show that CoReAmI significantly outperforms the competing approaches in each of the domains. This is mainly due to the fact that, by extracting multiple sources of information and combining them by using each source of information as a context, a multi-view perspective is created, which leads to better performance than with conventional approaches.

[download pdf]

[download presentation]

## Visualizing Solution Sets in Multiobjective Optimization

**Tea Tušar**, supervisor: Bogdan Filipič, September 2014

Many real-world optimization problems are inherently multiobjective, for example, searching for trade-off solutions of high quality and low cost. As no single optimal solution exists for such problems, multiobjective optimization algorithms provide a set of optimal (or near-optimal) trade-off solutions to choose from, called an approximation set. Because such algorithms are often stochastic, they return a different approximation set in every run. The Empirical Attainment Function (EAF) is able to describe the probabilistic distribution of multiple approximation sets and can be therefore used to analyze and compare the performance of such algorithms. As multiobjective optimization deals with vectors in the multidimensional objective space, it is very important to be able to visualize them. This thesis addresses two distinct tasks in visualization in multiobjective optimization—visualization of approximation sets and visualization of EAFs.

While scatter plots can be used for visualizing 2D and 3D approximation sets, more advanced approaches are needed to handle four or more objectives. This thesis presents a comprehensive review of the existing visualization methods used in evolutionary multiobjective optimization, showing their outcomes on two novel 4D benchmark approximation sets. In addition, a visualization method that uses prosection (projection of a section) to visualize 4D approximation sets is proposed. The method adequately reproduces the shape, range and distribution of vectors in the observed approximation sets and can handle multiple large approximation sets while being robust and computationally inexpensive. Even more importantly, for numerous vectors, the visualization with prosections preserves the Pareto dominance relation and relative closeness to reference vectors. Visualization with prosections is analyzed theoretically and demonstrated on several approximation sets.

While the visualization of EAFs is rather straightforward in two objectives, the three-objective case presents a great challenge as we need to visualize a large number of 3D cuboids. This thesis addresses the visualization of exact as well as approximated 3D EAF values and differences in these values provided by two competing multiobjective optimization algorithms. First, we compute the 3D cuboids. Then, we show that the exact EAFs can be visualized using Slicing and Maximum Intensity Projection, while the approximated EAFs can be visualized using Slicing, Maximum Intensity Projection and Direct Volume Rendering. In addition, the thesis demonstrates the use of the proposed visualization techniques on a real-world steel-casting optimization problem.

## Multiobjective Discovery of Driving Strategies

**Erik Dovgan**, supervisor: Bogdan Filipič, co-supervisor: Matjaž Gams, January 2014

When driving a vehicle along a given route, several objectives need to be considered, such as traveling time and fuel consumption. This can be viewed as an optimization problem and solved using appropriate optimization algorithms. Most existing optimization algorithms combine objectives into a weighted-sum cost function and solve the corresponding single-objective problem. In principle, it should be advantageous to use a multiobjective approach, since it enables better exploration of the multiobjective search space; however, no results have yet been reported regarding the optimization of driving with this approach.

In order to explore the multiobjective approach, we designed a two-level multiobjective optimization algorithm for discovering driving strategies (MODS). This algorithm finds a set of nondominated driving strategies with respect to two conflicting objectives: traveling time and fuel consumption. The lower-level algorithm, which is based on a deterministic breadth-first search and nondominated sorting, searches for nondominated driving strategies. The upper-level algorithm is an evolutionary algorithm that optimizes the input parameters for the lower-level algorithm. MODS was implemented in three variants. The initial version, MODS1, implements only the multiobjective approach; however, it fails to find better driving strategies than existing optimization algorithms. MODS2 is an improved version of MODS1 that combines multiobjective with single-objective approach and identifies better driving strategies than existing algorithms. Finally, MOCDS is an enhanced version of MODS2 that, in addition to reducing traveling time and fuel consumption, also reduces driving discomfort in order to identify comfortable driving strategies.

The MODS algorithm was tested on data from real-world routes and compared with the existing single-objective algorithms for discovering driving strategies; namely, predictive control and dynamic programming. The results show that, on average, MODS2 significantly outperforms MODS1 and the existing algorithms. In addition, MODS1 outperforms the dynamic programming algorithm, but is outperformed by the predictive control algorithm. Moreover, MOCDS found more comfortable driving strategies than MODS2, without significantly deteriorating their traveling time and fuel consumption. Finally, a human-versus-machine test was conducted by comparing the driving strategies obtained by MODS2 and MOCDS with those obtained by a group of volunteers operating a vehicle simulator. The results show that MODS2 always finds better driving strategies than the volunteers, especially when fuel consumption is to be reduced. The results also show that some volunteers always drive in a similar way in terms of traveling time and fuel consumption, while others vary their driving strategies significantly. On the other hand, the best human driving strategies are similar to the MOCDS driving strategies in terms of traveling time, fuel consumption, and driving discomfort.

## Detection of Anomalous and Suspicious Behavior Patterns from Spatio-Temporal Agent Traces

**Boštjan Kaluža**, supervisor: Matjaž Gams, co-supervisor: Mitja Luštrek, May 2013

Many applications, including smart environments, surveillance, human-robot interaction, and ambient assisted living, involve the problem of learning patterns of agent behavior from sensor data. Deviant behavior is a pattern in the data that either does not conform to the expected behavior, that is, anomalous behavior, or matches previously defined unwanted behavior, that is, suspicious behavior. This thesis proposes a unified framework to analyze agent behavior from prior knowledge and external observations in order to detect deviant behavior patterns, regardless of whether the observed entities are humans, software agents, or even robots. From the behavior analysis perspective, the thesis proposes a novel, efficient encoding that referred to as a spatio-activity matrix. This matrix is able to capture behavior dynamics in a specific time period using spatio-temporal features, whereas its visualization allows visual comparison of different behavior patterns. Furthermore, the thesis introduces a clear problem definition that helps establish a theoretical framework for detecting anomalous and suspicious behavior from agent traces in order to show how to optimally perform detection. It discusses why detection error is often inevitable and prove the lower error bound, and provide several heuristic approaches that either estimate the distributions required to perform detection or to directly rank the behavior signatures using machine learning approaches. The established theoretical framework is extended to show how to perform detection when the agent is observed over longer periods of time and no significant event is sufficient to reach a decision. We specify conditions that any reasonable detector should satisfy, analyze several detectors, and propose a novel approach, referred to as a F-UPR detector, that generalizes utility-based plan recognition with arbitrary utility functions. The unified framework is demonstrated empirically in three studies. The first study addresses detection of decreased behavior that indicates disease or deterioration in the health of elderly persons, while the second study deals with the detection of suspicious passengers in the airport simulation. Finally, the third study concerns the verification of persons at an access control point in high-security application.

PDF will be available soon.

## Searching for Credible Relations in Machine Learning

**Vedrana Vidulin**, supervisor: Matjaž Gams, co-supervisor: Bogdan Filipič, Februar 2012

Can a model that is constructed using data mining (DM) and machine learning (ML) programs be trusted? For example, it is known that a decision-tree model can contain relations that are statistically significant, but in reality meaningless to the human. When the task is domain analysis, meaningless relations are problematic since they can lead to wrong conclusions and can consequently undermine a human’s trust in DM and ML programs. To eliminate the problematic relations from the constructed and similar models, we propose an interactive method called Human–Machine Data Mining (HMDM). The method constructs multiple models in a specific way so that a human can reexamine the relations in different contexts and based on additional evidence to conclude which relations and models are credible, i.e., both meaningful and of high quality. Based on credible relations and models, the human can construct correct conclusions about the domain. The method is demonstrated in three complex domains. An experimental evaluation shows that the method is capable of finding important relations in a domain and that credible models are better in both meaning and quality in comparison to models constructed solely by the DM programs.

## Pathology in Heuristic Search Algorithms

**Mitja Luštrek**, supervisor: Matjaž Gams, co-supervisor: Ivan Bratko, June 2007

Minimax and single-agent search algorithms can be considered two main classes of heuristic search algorithms. Practice shows that in both cases deeper search results in better decisions. Mathematical analyses by previous researchers, however, have shown that the opposite is true: that minimaxing amplifies the error of the heuristic evaluations and hence deeper search gives worse decisions. This phenomenon has been termed search pathology. In the thesis we use a minimax model with real numbers as both true and heuristic position values to show that proper modeling of the heuristic error is enough to prevent the pathology. We proceed to examine how the number of possible position values, branching factor and similarity of nearby positions and affect the pathology. We also show why minimax search is actually beneficial. The pathology in single-agent search is explained on synthetic search trees and path-finding on maps from computer games. We finally explain why single-agent search is beneficial.