Multi-modal Asian Conversation Mobile Video Dataset for Recognition Task

Dewi Suryani, Valentino Ekaputra, Andry Chowanda

Abstract


Images, audio, and video have long been used by researchers to develop systems for human facial recognition and emotion detection. Most of the available datasets focus on static expressions, short videos of emotion changing from neutral to peak, or differences in sound used to detect a person's current emotion. Moreover, the common datasets were collected and processed in the United States (US) or Europe, and only a few originated in Asia. In this paper, we present our effort to create a unique dataset that fills the gap left by currently available datasets. At the time of writing, our dataset contains 10 full HD (1920×1080) video clips with annotated JSON files, amounting to 100 minutes of footage and 13 GB in total size. We believe this dataset will be useful as training and benchmark data for a variety of research topics in human facial and emotion recognition.
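As a minimal sketch of how such a clip-plus-annotation pairing might be consumed, the Python snippet below decodes one video with OpenCV and reads its JSON annotations. The file names (clip_01.mp4, clip_01.json) and the annotation layout are hypothetical assumptions; the actual schema is defined by the dataset itself.

import json

import cv2  # OpenCV for video decoding


def load_clip(video_path: str, annotation_path: str):
    """Load one video clip and its JSON annotations (hypothetical layout)."""
    with open(annotation_path, "r", encoding="utf-8") as f:
        annotations = json.load(f)  # e.g. per-frame labels; schema assumed

    capture = cv2.VideoCapture(video_path)
    if not capture.isOpened():
        raise IOError(f"Cannot open video: {video_path}")

    frames = []
    while True:
        ok, frame = capture.read()  # frame: 1080x1920x3 BGR array for full HD
        if not ok:
            break
        frames.append(frame)
    capture.release()
    return frames, annotations


if __name__ == "__main__":
    frames, annotations = load_clip("clip_01.mp4", "clip_01.json")
    print(f"Decoded {len(frames)} frames; loaded annotations of type {type(annotations)}")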

Keywords


multi-modal dataset; Asian conversation dataset; mobile video; recognition task; facial features expression; emotion recognition



DOI: http://doi.org/10.11591/ijece.v8i5.pp4042-4046

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).