Robust emotion recognition in thermal imaging with convolutional neural networks and grey wolf optimization

Atchogou, Anselme; Tepe, Cengiz

doi:10.1016/j.image.2025.117363

Atchogou A., Tepe C.

SIGNAL PROCESSING-IMAGE COMMUNICATION, vol. 138, 2025 (SCI-Expanded)

Publication Type: Article / Full Article
Volume Number: 138
Publication Date: 2025
DOI Number: 10.1016/j.image.2025.117363
Journal Name: SIGNAL PROCESSING-IMAGE COMMUNICATION
Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, Civil Engineering Abstracts
Affiliated with Ondokuz Mayıs University: Yes

Abstract

Facial Expression Recognition (FER) is a pivotal technology in human-computer interaction, with applications spanning psychology, virtual reality, and advanced driver assistance systems. Traditional FER using visible-light cameras faces challenges in low-light conditions and in the presence of shadows and reflections. This study explores thermal imaging as an alternative, leveraging its ability to capture heat radiation and overcome lighting issues. We propose a novel approach that combines pre-trained models, particularly EfficientNet variants, with Grey Wolf Optimization (GWO) and various classifiers for robust emotion recognition. Ten pre-trained CNN models, including variants of EfficientNet (EfficientNet-B0, B3, B4, B7, V2L, V2M, V2S), ResNet50, MobileNet, and InceptionResNetV2, are used to extract features from thermal images. GWO is employed to optimize the parameters of four classifiers: Support Vector Machine (SVM), Random Forest, Gradient Boosting, and k-Nearest Neighbors (kNN). Two popular thermal image datasets, IRDatabase and KTFE, are used to assess the proposed methodology. On the IRDatabase dataset, combining EfficientNet-B7 with GWO and either kNN or SVM yielded the highest accuracy, 91.42%, for eight distinct emotions (fear, anger, contempt, disgust, happiness, neutrality, sadness, and surprise). On the KTFE dataset, combining EfficientNet-B7 with GWO and Gradient Boosting yielded the highest accuracy, 99.48%, for seven distinct emotions (anger, disgust, fear, happiness, neutrality, sadness, and surprise). These results demonstrate the effectiveness and reliability of the proposed approach for emotion identification in thermal images, making it a viable way to overcome the drawbacks of conventional visible-light-based FER systems.
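
The abstract does not specify how GWO is configured, so the following is only a minimal sketch of a textbook Grey Wolf Optimizer minimizing a scalar objective over box bounds. It is not the authors' implementation. The names `grey_wolf_optimize` and `toy_validation_error`, the population size, the iteration count, and the search ranges are all illustrative assumptions. In a setup like the one described, the objective would plausibly be a cross-validated error of a classifier (for example an SVM) trained on CNN features; here a simple quadratic bowl stands in for that error so the example runs with the standard library alone.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=20, n_iters=100, seed=0):
    """Minimize `objective` inside box `bounds` with a basic Grey Wolf Optimizer.

    Illustrative sketch only: population size, iteration count and seed are
    assumed defaults, not values taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial pack, one position vector per wolf.
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")

    for t in range(n_iters):
        # Rank the pack; the three best wolves act as alpha, beta and delta.
        scored = sorted(((objective(w), w) for w in wolves), key=lambda p: p[0])
        if scored[0][0] < best_score:
            best_score, best_pos = scored[0][0], list(scored[0][1])
        leaders = [list(w) for _, w in scored[:3]]

        # Coefficient `a` decays linearly from 2 towards 0,
        # shifting the pack from exploration to exploitation.
        a = 2.0 * (1 - t / n_iters)

        new_wolves = []
        for w in wolves:
            new_w = []
            for d in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    dist = abs(C * leader[d] - w[d])
                    estimate += leader[d] - A * dist
                lo, hi = bounds[d]
                # Average the three leader-guided moves and clip to the bounds.
                new_w.append(min(max(estimate / 3, lo), hi))
            new_wolves.append(new_w)
        wolves = new_wolves

    # Score the final pack once more so the last update is not ignored.
    for w in wolves:
        score = objective(w)
        if score < best_score:
            best_score, best_pos = score, list(w)
    return best_pos, best_score


def toy_validation_error(params):
    """Hypothetical stand-in for a classifier's validation error.

    The two parameters can be read as log10(C) and log10(gamma) of an SVM;
    the minimum at (1, -2) is arbitrary.
    """
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


search_bounds = [(-3.0, 3.0), (-5.0, 1.0)]  # assumed search ranges
best_pos, best_score = grey_wolf_optimize(toy_validation_error, search_bounds)
print("best parameters:", best_pos)
print("best objective:", best_score)
```

To use the sketch with a real classifier, `toy_validation_error` would be replaced by a function that trains the classifier with the candidate parameters on the extracted features and returns one minus its validation accuracy. Because GWO calls the objective once per wolf per iteration, the cost of that evaluation dominates the run time.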