Recognizing Complex Activities
based on Activity Spotting
Sensing and understanding the context of the user plays an essential role in human-computer interaction. It may enable natural communication, for instance, with robots that understand our goals, or allow electronic devices to support us at the right time.
In the past, researchers focused mostly on isolated activities and successfully recognized activities such as walking, standing or well-defined gestures.
However, human activity is highly complex: an activity can be performed in many ways and for a multitude of reasons. In my research, I focus both on the detection of isolated activities and on related composite and long-term activities.
Spotting atomic activity events [1, 2]
Human activity can be decomposed into a sequence of isolated activity events. In several experiments, we studied and optimized the recognition of a large variety (44) of isolated activities from a continuous data stream. Among other things, our studies aim to identify the optimal sensor modality, number of sensors, and feature definition.
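Spotting events in a continuous data stream is commonly approached with a sliding window. The following is a minimal sketch of that idea, not the method used in [1, 2]: the function names, the mean/variance features, and the variance threshold (standing in for a trained classifier) are all assumptions made for illustration.

```python
from statistics import mean, pvariance


def sliding_windows(samples, size, step):
    """Yield (start_index, window) pairs over a continuous stream."""
    for start in range(0, len(samples) - size + 1, step):
        yield start, samples[start:start + size]


def window_features(window):
    """Mean and variance of one window (an assumed, simple feature set)."""
    return mean(window), pvariance(window)


def spot_events(samples, size, step, variance_threshold):
    """Return start indices of windows whose variance exceeds the threshold.

    The threshold is a placeholder for a real classifier decision.
    """
    return [
        start
        for start, window in sliding_windows(samples, size, step)
        if window_features(window)[1] > variance_threshold
    ]


# Toy one-axis signal: rest, a short burst of motion, rest.
stream = [0.0] * 8 + [1.0, -1.0] * 4 + [0.0] * 8
print(spot_events(stream, size=4, step=4, variance_threshold=0.5))  # [8, 12]
```

In such a setup, the window size, the step, and the feature set are exactly the kind of parameters that would be tuned per sensor modality.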
Discovering and combining relevant events [3]
Usually, it is not the plain, disconnected sequence of atomic activities that is interesting, but rather the higher-level goal at which these activities are directed. Since composite activities contain a large amount of unrelated activity, using the complete observation [5], i.e., the complete data stream, can be suboptimal and can confuse the recognition.
We can observe for many composite activities that it is sufficient to spot only a few underlying activity events to allow their recognition. For example, having lunch can be characterized by walking at a certain time of day, without even observing the actual eating activity.
In a discriminative analysis [3] using two wearable accelerometers, we found that a surprisingly small fraction of relevant parts can be sufficient to recognize the higher-level composite activity, which allows for efficient recognition systems.
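One simple way to illustrate what "relevant parts" could mean is to rank atomic events by how much more often they occur in one composite activity than in the others. The sketch below is an assumed stand-in for the discriminative analysis in [3], not a reproduction of it; the scoring rule, the function names, and the toy event labels are hypothetical.

```python
from collections import Counter


def occurrence_rate(instances):
    """Fraction of instances in which each event label appears at least once."""
    counts = Counter()
    for events in instances:
        counts.update(set(events))
    return {event: n / len(instances) for event, n in counts.items()}


def discriminative_events(data, target, top_k):
    """Rank events by (rate in target) minus (rate in all other composites)."""
    inside = occurrence_rate(data[target])
    rest = [ev for name, insts in data.items() if name != target for ev in insts]
    outside = occurrence_rate(rest) if rest else {}
    scores = {e: rate - outside.get(e, 0.0) for e, rate in inside.items()}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda e: (-scores[e], e))[:top_k]


# Toy data (invented for illustration): event sequences per composite activity.
data = {
    "lunch": [["walk", "sit", "eat"], ["walk", "eat", "sit"]],
    "office": [["sit", "type"], ["sit", "walk", "type"]],
}
print(discriminative_events(data, "lunch", top_k=2))  # ['eat', 'walk']
```

Events such as "sit", which appear equally often everywhere, score zero and could be dropped from the observation.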
Hierarchical model for composite activities [4]
Considering a construction manual for a mirror (such as in the figure above), one of several tasks is to fix the frame to the panel. This seemingly simple task consists of various steps, and it becomes obvious that composite activities add significant variation. Interruptions can occur, the duration can vary strongly across users, and the underlying activities can happen in a different order. Applying the algorithms used for recognizing atomic activities can be suboptimal, as they require prohibitive amounts of training data. Therefore, we propose a hierarchical model that observes relevant activity events and combines them to recognize composite activities, similar to the way in which letters form words.
Experiments show superior performance compared to the single-layer approaches frequently used in activity recognition. Furthermore, preserving the partonomy of activities allows recognition on different levels, in which the certainty of lower layers is increased by knowledge of the part-of-whole relationship.
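The layering idea can be sketched as a second stage that consumes spotted events rather than raw sensor data. The example below is a deliberately simplified assumption (a weighted bag of parts), not the hierarchical model of [4]; the composite names, part names, and weights are hypothetical. Because it only checks which events are present, it is insensitive to ordering, repetition, and interruptions by unrelated events.

```python
def score_composites(spotted, models):
    """Sum the weights of the spotted events for each composite model."""
    present = set(spotted)
    return {
        name: sum(weight for event, weight in parts.items() if event in present)
        for name, parts in models.items()
    }


def recognize(spotted, models):
    """Return the composite activity with the highest score."""
    scores = score_composites(spotted, models)
    return max(scores, key=scores.get)


# Hypothetical composite models: part -> weight.
models = {
    "fix_frame": {"screw": 1.0, "hold": 0.5},
    "hang_mirror": {"drill": 1.0, "hammer": 0.5, "hold": 0.5},
}
# Lower-layer output, including an unrelated event ("walk") and a repetition.
spotted = ["walk", "hold", "screw", "walk", "screw"]
print(score_composites(spotted, models))  # {'fix_frame': 1.5, 'hang_mirror': 0.5}
print(recognize(spotted, models))         # fix_frame
```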
Transferring and recombining relevant events [4]
Parts that are similar in different composite activities can be shared, much like vocabulary. Instead of re-learning composite activities from scratch, transferring shared parts significantly reduces the training effort for new composite activities.
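As a rough illustration of why sharing reduces training effort, the sketch below splits the parts of a new composite activity into those already available in a shared vocabulary and those that still need to be learned. The function name and the part labels are hypothetical and only serve the example.

```python
def plan_training(composite_parts, vocabulary):
    """Split a new composite's parts into reusable and still-to-train parts."""
    reuse = sorted(p for p in composite_parts if p in vocabulary)
    train = sorted(p for p in composite_parts if p not in vocabulary)
    return reuse, train


# Parts already learned from earlier composite activities (assumed).
vocabulary = {"hold", "screw", "drill"}
reuse, train = plan_training({"hold", "drill", "hammer"}, vocabulary)
print(reuse)  # ['drill', 'hold']
print(train)  # ['hammer']
```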