Authors
Mechatronic Engineering Department, Al-Muthanna University, Al-Muthanna, Iraq
[email protected]
MSc in Construction Management Engineering, Civil Engineering Department, Al-Muthanna University, Samawah, Iraq
[email protected]
Abstract
This study introduces BBE-Net, a novel BiLSTM-Boosted EfficientNet architecture designed for highly accurate and robust static hand-gesture recognition. The proposed framework integrates EfficientNet as a deep feature extractor and Bidirectional LSTM (BiLSTM) as a spatiotemporal dependency modeler, enabling the network to capture both fine-grained spatial structures and contextual relationships within gesture images. A preprocessing pipeline—consisting of standardized image resizing and histogram equalization—enhances contrast and illumination invariance, producing clearer input representations for feature extraction. EfficientNet generates multi-scale, semantically rich feature maps, which are subsequently refined by BiLSTM layers to model long-range and bidirectional correlations. The resulting discriminative features are classified through an ensemble-learning module that employs Bagging with Decision Trees and majority voting to improve stability and reduce variance. Experiments conducted on the Sebastian Marcel Static Hand Posture Database demonstrate the effectiveness of the proposed method. With extensive augmentation, 10-fold cross-validation, and repeated trials, BBE-Net achieves an accuracy of 99.70%, outperforming several recent state-of-the-art approaches. Analyses using confusion matrices, ROC curves, and class-wise metrics confirm the method’s near-perfect discriminative capability.
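The ensemble stage described above, bagging with majority voting over tree-based classifiers, can be sketched in minimal form. This is an illustrative sketch only: the random stand-in feature vectors, the function names (`bagging_fit`, `bagging_predict`), and the use of one-feature decision stumps in place of full decision trees are assumptions for brevity, not details taken from the paper.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Fit a one-feature threshold classifier (decision stump):
    predict 1 when x[f] >= t, else 0, choosing (f, t) by training accuracy."""
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(x[f] for x in X)):
            pred = [1 if x[f] >= t else 0 for x in X]
            acc = sum(p == yi for p, yi in zip(pred, y)) / len(y)
            if best is None or acc > best[0]:
                best = (acc, f, t)
    _, f, t = best
    return lambda x: 1 if x[f] >= t else 0

def bagging_fit(X, y, n_estimators=15, seed=0):
    """Bagging: train each base classifier on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Majority vote across the bagged base classifiers."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Toy, linearly separable stand-in features (not real gesture features):
X = [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85],
     [0.8, 0.1], [0.9, 0.2], [0.85, 0.15]]
y = [0, 0, 0, 1, 1, 1]
models = bagging_fit(X, y)
print(bagging_predict(models, [0.05, 0.95]),
      bagging_predict(models, [0.95, 0.05]))
```

Bootstrap resampling decorrelates the base classifiers, and the majority vote averages out their individual errors, which is the variance-reduction effect the abstract attributes to the Bagging module.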
