Dialnet


Advancing Face Analysis in Images and Videos: Age Estimation and Drowsiness Detection

  • Author: Salah Eddine Bekhouche
  • Thesis supervisors: Fadi Dornaika (supervisor), Ignacio Arganda-Carreras (tutor)
  • Defended: at the Universidad del País Vasco - Euskal Herriko Unibertsitatea (Spain) in 2025
  • Language: English
  • Doctoral programme: Programa de Doctorado en Ingeniería Informática por la Universidad del País Vasco/Euskal Herriko Unibertsitatea
  • Links
    • Open-access thesis at: ADDI
  • Abstract
    • Deep learning, particularly through sophisticated architectures such as convolutional neural networks (CNNs), has significantly advanced automated facial analysis. Tasks such as recognition and attribute analysis have seen performance boosts. However, achieving truly robust and versatile systems, especially for complex regression tasks like age estimation or dynamic state assessments like driver drowsiness detection, faces persistent challenges. A key hurdle remains developing models that reliably handle extreme variations in real-world conditions, including pose, illumination, expression, occlusions, and intrinsic image quality issues. While techniques like attention mechanisms and specialized network designs exist, accurately interpreting subtle age-related facial changes across a lifetime or detecting fine-grained behavioral cues indicative of drowsiness under these variations remains difficult. Furthermore, ensuring robust generalization across diverse demographics, unseen environments, and varying data acquisition setups often requires more than standard data augmentation or transfer learning, demanding tailored methodological innovations.

      Addressing these specific challenges in facial age estimation and driver drowsiness detection is crucial. Current age estimation methods, often relying on direct regression or simple classification with standard CNNs, can struggle with the nonlinear nature of aging and with sensitivity to variations unrelated to age, and may not adequately capture distinct features relevant to different life stages. Similarly, vision-based drowsiness detection often relies on indicators like PERCLOS (percentage of eyelid closure), yawn frequency, or head pose, typically extracted using conventional computer vision techniques or basic deep learning models. These approaches can be sensitive to illumination changes, fail to integrate multiple cues effectively, lack robustness to individual differences in fatigue expression, and may not fully leverage the rich spatiotemporal information present in video sequences. Existing methods often lack mechanisms to specifically enhance feature discriminability for these challenging tasks or to systematically compare foundational approaches (handcrafted features) against modern deep learning within these specific contexts.
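
      As an illustration of the PERCLOS indicator mentioned above, the following is a minimal sketch and not the thesis's implementation. It assumes a per-frame eye-openness signal in [0, 1] is already available from some upstream eye tracker; the function names, the input format, and the 0.2 threshold (mirroring the common "at least 80% closed" convention) are all assumptions made for this example.

```python
def perclos(eye_openness, closed_threshold=0.2):
    """Return the fraction of frames in which the eye counts as closed.

    eye_openness: per-frame openness in [0, 1], where 1.0 is fully open
        (hypothetical input format, assumed for this sketch).
    closed_threshold: openness at or below which a frame counts as closed;
        0.2 is an assumption, not a value taken from the thesis.
    """
    frames = list(eye_openness)
    if not frames:
        raise ValueError("eye_openness must contain at least one frame")
    closed = sum(1 for value in frames if value <= closed_threshold)
    return closed / len(frames)


def sliding_perclos(eye_openness, window, closed_threshold=0.2):
    """Compute PERCLOS over every full window of `window` consecutive frames."""
    if window <= 0:
        raise ValueError("window must be a positive number of frames")
    frames = list(eye_openness)
    return [
        perclos(frames[start:start + window], closed_threshold)
        for start in range(len(frames) - window + 1)
    ]
```

      A rising value of `sliding_perclos` over time would be one simple fatigue cue; on its own it ignores the other cues (yawning, head pose) that the abstract says should be integrated.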

      This thesis delves into these specific problems within automated facial analysis, proposing and evaluating advanced computational methods focused explicitly on enhancing facial age estimation from static images and driver drowsiness detection from video sequences. Motivated by the limitations of existing approaches and the need for systems robust to real-world variability, our work explores the comparative efficacy of traditional handcrafted features versus contemporary deep learning techniques and introduces novel deep learning architectures and strategies tailored to these tasks. The primary objective is to develop methods that push the boundaries of accuracy, robustness, and practical applicability in these domains. The core contributions, validated through extensive experiments detailed herein using relevant benchmark datasets and evaluation metrics (e.g., Mean Absolute Error for age; Accuracy, F1-score for drowsiness), are:

      A comprehensive comparative study systematically evaluating the performance of established handcrafted features against various deep learning-based features for human facial age estimation. This establishes critical baselines and contextualizes the performance gains achievable with deep learning, while also highlighting scenarios where traditional methods remain competitive or complementary.

      The development and validation of a novel multi-stage deep neural network architecture specifically designed for facial age estimation. This approach aims to improve accuracy and robustness by decomposing the complex regression task into distinct stages, potentially better modeling age-related transformations across the lifespan.

      The design and implementation of a specialized Spatiotemporal Convolutional Neural Network (ST-CNN) incorporating Pyramid Bottleneck Blocks. This architecture is shown to be effective for eye-blink detection, targeting the efficient capture of multi-scale spatiotemporal features crucial for recognizing micro-expressions relevant to drowsiness analysis in video data.

      A new hybrid approach for end-to-end driver drowsiness detection in video sequences. This method utilizes a strategy for selecting and integrating deep features from different network levels or temporal windows, aiming to enhance the discriminative power of the feature representation and improve classification performance by focusing on the most salient fatigue indicators over time.
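
      The abstract does not name the criterion used to select deep features, so the sketch below swaps in a standard filter method, the Fisher score, purely to illustrate what "selecting and integrating" features from several sources can look like. The helper names (`fuse`, `fisher_scores`, `select_top_k`) and the list-of-lists feature format are hypothetical.

```python
from statistics import fmean, pvariance


def fuse(*feature_sets):
    """Concatenate per-sample feature vectors from several sources
    (e.g. different network levels); each source is a list of rows."""
    return [sum((list(row) for row in rows), []) for rows in zip(*feature_sets)]


def fisher_scores(features, labels):
    """Fisher score per dimension: between-class scatter / within-class scatter.
    This criterion is an assumption of the sketch, not the thesis's method."""
    classes = sorted(set(labels))
    scores = []
    for dim in range(len(features[0])):
        column = [row[dim] for row in features]
        overall_mean = fmean(column)
        between = within = 0.0
        for cls in classes:
            values = [v for v, y in zip(column, labels) if y == cls]
            between += len(values) * (fmean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant feature.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top_k(features, labels, k):
    """Keep the k highest-scoring dimensions; return their indices and the
    reduced feature rows."""
    scores = fisher_scores(features, labels)
    ranked = sorted(range(len(scores)), key=lambda d: scores[d], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[d] for d in keep] for row in features]
```

      For example, fusing a hypothetical two-dimensional shallow feature set with a one-dimensional deep one and calling `select_top_k(fused, labels, 1)` keeps only the dimension whose class means differ most relative to its within-class spread.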

      The methodologies employed span comparative analysis, feature engineering, the design of novel deep network architectures (multi-stage CNNs, ST-CNNs with specialized blocks), and hybrid feature selection strategies within deep learning frameworks. Collectively, this research advances the state of the art in robust facial age estimation and video-based drowsiness detection. It contributes valuable comparative insights and introduces tailored deep learning solutions specifically designed to address the limitations of prior methods and overcome persistent challenges encountered in real-world facial analysis applications.
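
      The evaluation metrics named in the abstract (Mean Absolute Error for age; Accuracy and F1-score for drowsiness) have standard definitions, sketched below in plain Python. The function names and the convention that label 1 means "drowsy" are assumptions for illustration; the thesis may compute these metrics with different tooling or averaging schemes.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages (in years)."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


def f1_score(true_labels, predicted_labels, positive=1):
    """Harmonic mean of precision and recall for the `positive` class
    (assumed here to be 1 = drowsy)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

      F1 is reported alongside accuracy because drowsy segments are typically rarer than alert ones, so accuracy alone can look high for a model that rarely predicts the drowsy class.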

