This thesis describes the classification of human activity for the purpose of detecting falls, later extended to multiple activities. This was done with the final intention of implementing a robotic companion for older persons that can provide a certain level of automated care in case of an emergency. The complexity of this work, combined with the restrictions of the robot, motivated the creation of an infrastructure abstraction that allows deferred (decentralized) processing. The initial work implemented classifiers that use pre-processed skeleton data extracted from RGB-D sensors, with additional steps to make classification robust to changes. RGB-D classification first focused on fall detection and was then extended to general activities that can be classified from skeleton data. A later attempt used CNNs to classify video footage of activities. All of these algorithms were modified to output classifications in real time. The results achieved were around 90% accuracy for a simple fall vs. not-fall task on the TST Fall v2 dataset, 70% global combined accuracy for the 12 actions of CAD60 using skeleton data, and 75% accuracy for the 51 actions of HMDB51, all close to state-of-the-art performance on those datasets. On new activity data based on skeletons and video, however, results were less encouraging: 33.5% accuracy on skeleton data and 37.9% accuracy on video. While these results do not yet allow a robotic platform to perform action detection, the overarching structure of the systems necessary to execute it was demonstrated and used successfully, opening doors for future research using more complex systems.
