
Gesture Recognition for Human-Robot Interaction: An approach based on skeletal points tracking using depth camera

<p align="justify"> Human-robot interaction (HRI) has been a topic of both science fiction and academic speculation even before any robots existed. HRI research is focusing to build an intuitive and easy communication with the robot through speech, gestures and facial expressions. The use of hand gestures provides a better solution than conventional human-machine interfaces. Furthermore, translations of hand gestures can help in accomplishing the ease and naturalness desired for HRI. This has motivated a very active research concerned with computer vision-based analysis and interpretation of hand gestures. </p> <p align="justify"> In this project, we aim to implement the hand gesture recognition for robots with modeling, training, classifying and recognizing gestures based on computer vision algorithms and machine learning techniques. Gestures are modeled based on skeletal points and the features are extracted using NiTE framework using a depth camera. </p> <p align="justify"> In order to recognize gestures, we propose to learn and classify hand gestures with the help of Adaptive Naive Bayes Classifier using Gesture Recognition Toolkit. Furthermore, we aim to build a dashboard that can visualize the interaction between all essential parts of the system. Finally, we attempt to integrate all these functionalities into a system that interacts with a humanoid robot NAO. </p> <p align="justify"> As a result, on one hand, gestures will be used command the robot to execute certain actions and on the other hand, gestures will be translated and spoken out by the robot. </p>

VIDEO

AUTHOR

Aravinth Panchadcharam < http://www.aravinth.info >

ACKNOWLEDGEMENTS

DESIGN

<p align="justify"> Our goal in this project to implement a system that should be integrated into NAO to recognize hand gestures. Existing stereo cameras of NAO are greatly limited by the quality of the input image. Variations in lighting and background clutters would only worsen the problem. On the other hand, depth-based approaches are able to provide satisfactory results for hand gesture recognition even with poor indoor lighting and cluttered background condition. Therefore, we have chosen Asus Xtion which has sensors that capture both RGB and depth data. Asus Xtion is an OpenNI compatible device, thus, we have chosen a NiTE middleware for the purpose of tracking the human skeletal points. </p> <p align="center"> <img src="./figure/content/hri-architecture.jpg" alt="Architecture" width=80% height=80%/> </p> <p align="justify"> We have chosen Gesture Recognition Toolkit (GRT) to train and predict the 3D skeletal modeled gestures with feature based statistical learning algorithm. Adaptive Naive Bayes Classifier (ANBC) is the supervised machine learning algorithm which is chosen for the purpose of classifying and predicting the hand gestures in real time. </p> <p align="justify"> Furthermore, all these interactions must be displayed to visually understand the status of the system. Finally, recognized hand gestures must be translated to robotic actions as following : </p> <p align="justify"> The goal should be reached by studying the various solution to this problem and an appropriate design must be chosen. The main challenge is to find a solution that can integrate all these components into a robust system. </p> <p align="justify"> Furthermore, this system must be tested and results must be presented clearly. Evaluations must be carried out to demonstrate the effectiveness of the classifier and to validate its potential for real time gesture recognition </p>

BACKGROUND

Aldebaran NAO - Humanoid Robot

<p align="center"> <img src="./figure/content/nao-body.jpg" alt="NAO" width=50% height=50%/> </p>

Asus Xtion PRO LIVE - Depth Camera

<p align="center"> <img src="./figure/content/xtion.jpg" alt="XTION" width=50% height=50%/> </p>

OpenNI 2 and NiTE 2 - Skeleton Tracking Algorithm

<p align="center"> <img src="./figure/content/ni-skeleton.jpg" alt="NiTE" width=50% height=50%/> </p>

Gesture Recognition Toolkit (GRT)

Adaptive Naive Bayes Classifier (ANBC) - Classification Algorithm

<p align="center"> <img src="./figure/content/alg-anbc.png" alt="ANBC"/> </p>

IMPLEMENTATION

Gesture Modelling

<p align="center"> <img src="./figure/content/ges-all.png" alt="GESTURES" width=80% height=80%/> </p>

NAO Depth Camera Mount

<p align="center"> <img src="./figure/content/xtion-mount.jpg" alt="MOUNT" width=32% height=32%/> <img src="./figure/content/xtion-mount-3d.jpg" alt="3D PRINT" width=15% height=15%/> </p>

HRI Module

<p align="center"> <img src="./figure/content/hri-flow.jpg" alt="HRI" width=50% height=50%/> </p>

Brain Module

<p align="center"> <img src="./figure/content/brain-flow.jpg" alt="BRAIN" width=50% height=50%/> </p>

CC Module

<p align="center"> <img src="./figure/content/cc-hand.jpg" alt="CC" width=80% height=80%/> </p>

Command Module
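<p align="justify"> The Command Module maps recognized gestures to robot behavior: the gesture is spoken out and the corresponding motion is executed. The following sketch, written against the NAOqi C++ SDK, shows one possible mapping; the robot address, the label-to-action table and the spoken phrases are assumptions for illustration rather than the module's actual implementation. </p>

```cpp
// Hedged sketch of a command module: map a predicted gesture label to a NAO action
// and speak the gesture name. Robot address, labels and actions are illustrative.
#include <alproxies/almotionproxy.h>
#include <alproxies/altexttospeechproxy.h>
#include <string>

// Hypothetical gesture labels as produced by the classifier.
enum Gesture { WALK = 1, TURN_LEFT = 2, MOVE_RIGHT = 3 };

void executeGesture(int predictedClassLabel, const std::string& robotIp) {
    AL::ALMotionProxy motion(robotIp, 9559);
    AL::ALTextToSpeechProxy tts(robotIp, 9559);

    switch (predictedClassLabel) {
        case WALK:
            tts.say("Walk");
            motion.moveToward(0.5f, 0.0f, 0.0f);   // walk forward at half speed
            break;
        case TURN_LEFT:
            tts.say("Turn left");
            motion.moveToward(0.0f, 0.0f, 0.5f);   // rotate counter-clockwise
            break;
        case MOVE_RIGHT:
            tts.say("Move right");
            motion.moveToward(0.0f, -0.5f, 0.0f);  // sidestep to the right
            break;
        default:
            motion.moveToward(0.0f, 0.0f, 0.0f);   // unknown gesture: stop
            break;
    }
}
```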

RESULTS

<p align="center"> <img src="./figure/result/usr-walk.jpg" width=20% height=20%/> <img src="./figure/result/usr-turn-left.jpg" width=20% height=20%/> <img src="./figure/result/usr-move-right.jpg" width=20% height=20%/> </p> <p align="center"> <img src="./figure/result/nao-gm-walk.jpg" width=20% height=20%/> <img src="./figure/result/nao-gm-turn-left.jpg" width=20% height=20%/> <img src="./figure/result/nao-gm-move-right.jpg" width=20% height=20%/> </p>

EVALUATION

Confusion Matrix - Precision, Recall, F-Measure

<p align="center"> <img src="./figure/result/metrics.jpg" width=70% height=70%/> </p>

Classification - Accuracy

<p align="center"> <img src="./figure/result/test-accuracy-anbc.png" width=80% height=80%/> </p>

Training Data - Mean Positions

<p align="center"> <img src="./figure/result/train-all-ges-mean.png" width=80% height=80%/> </p>

Training Data - Min-Max Distance

<p align="center"> <img src="./figure/result/train-walk-all.jpg" width=70% height=70%/> </p>

CONCLUSION & FUTURE WORK