This study presents a multi-task deep learning framework for automated inspection of coffee cherry quality that combines YOLOv8 with color-based segmentation and Vision Transformer-Convolutional Neural Network (ViT-CNN) feature extraction. The model performs three tasks: ripeness stage classification, defect detection, and size and shape analysis. For ripeness detection, YOLOv8 was enhanced with a color segmentation module, achieving class-wise accuracies of 90–95 % for unripe, partially ripe, and fully ripe cherries, with moderate performance (85 %) for overripe samples. ViT-CNN feature maps improved segmentation clarity and bounding-box localization, particularly in high-density clusters. Defect detection covered five categories (healthy, blackened, moldy, wrinkled, and insect-damaged), achieving F1-scores between 0.88 and 0.96 and mean average precision at 50 % intersection over union (mAP@50) above 0.97 for key defect classes after 150 training epochs. Quantitative evaluation of morphological characteristics for size and shape assessment further demonstrated the model's robustness, with insect-damaged cherries reaching a contour accuracy of 0.98 and an Intersection over Union (IoU) of 0.96. In a comparative analysis, the proposed architecture outperformed YOLOv5 and Faster Region-Based Convolutional Neural Network (Faster R-CNN) across all metrics, including precision, recall, F1-score, and mAP. By incorporating contextual embeddings and attention mechanisms, the framework enables accurate, real-time sorting for smart agricultural systems.
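The reported metrics follow standard definitions, which can be illustrated with a minimal sketch. The code below is not the authors' evaluation code. The function names (`box_iou`, `f1_score`, `is_match_at_50`) and the `(x1, y1, x2, y2)` box format are assumptions made for illustration only. It shows how IoU is computed for two axis-aligned boxes, how the 0.5 IoU threshold behind mAP@50 decides whether a detection counts as a match, and how F1 follows from true positive, false positive, and false negative counts.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2). Box format is an assumption."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box at the mAP@50 threshold when IoU >= 0.5."""
    return box_iou(pred_box, gt_box) >= threshold


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical boxes, not data from the study.
    pred, gt = (0, 0, 2, 2), (1, 1, 3, 3)
    print(box_iou(pred, gt))
    print(is_match_at_50(pred, gt))
    print(f1_score(tp=8, fp=2, fn=0))
```

Full mAP@50 additionally averages precision over recall levels and over classes. That step is omitted here to keep the sketch short.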