Muthu Selvam, Rubén González Vallejo
The adoption of AI-powered grading systems in academic institutions promised improved efficiency, consistency, and scalability. However, their deployment also introduced ethical challenges, including algorithmic bias, contextual insensitivity, and reduced transparency, particularly in high-stakes assessments. To address these concerns, the chapter presented a Human-in-the-Loop (HITL) grading framework that integrated AI-generated recommendations with human oversight. The model consisted of four layers: (i) pre-grading configuration with customizable rubrics and model calibration; (ii) preliminary scoring using transformer-based language models; (iii) human validation and contextual adjustment of AI outputs; and (iv) transparent feedback supported by dual-logged audit trails. A case study was conducted at a mid-sized university, where the framework was applied to 800 undergraduate essays. In this implementation, faculty validated 87% of the AI-generated scores with only minor adjustments, while the remaining 13% required overrides due to misinterpretations involving creative expression, linguistic nuance, or cultural context. Grading time was reduced by 40%, and student satisfaction improved thanks to transparent assessment and visible educator involvement. These findings demonstrate that the HITL model can balance automation with ethical oversight, promoting fairer evaluations and preserving academic integrity. It enhanced faculty agency, ensured equity across diverse student populations, and built trust through explainable AI tools such as SHAP and LIME. The chapter concluded by proposing policy guidelines, technical integrations, and communication strategies, while advocating for future applications in multimodal grading and open-source ethical assessment platforms.
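The abstract describes the pipeline only at a high level. As an illustration of how layers (ii) through (iv) might fit together, the following minimal Python sketch pairs an AI score with a human decision in a single dual-logged record; all names (GradeRecord, hitl_grade, ai_scorer, human_review) are hypothetical stand-ins, not the authors' implementation.

    from dataclasses import dataclass, field
    from datetime import datetime, timezone
    from typing import Callable, Optional

    @dataclass
    class GradeRecord:
        """Dual-logged audit entry pairing the AI score with the human decision."""
        essay_id: str
        ai_score: float
        ai_rationale: str
        final_score: Optional[float] = None
        override: bool = False
        reviewer_note: str = ""
        timestamps: dict = field(default_factory=dict)

    def hitl_grade(essay_id: str,
                   essay_text: str,
                   ai_scorer: Callable[[str], tuple[float, str]],
                   human_review: Callable[[float, str], tuple[float, str]]) -> GradeRecord:
        # Layer (ii): preliminary scoring by the AI model
        # (e.g. a fine-tuned transformer behind the ai_scorer callable)
        ai_score, rationale = ai_scorer(essay_text)
        record = GradeRecord(essay_id, ai_score, rationale)
        record.timestamps["ai_scored"] = datetime.now(timezone.utc).isoformat()

        # Layer (iii): human validation; the educator may accept or override
        final_score, note = human_review(ai_score, rationale)
        record.final_score = final_score
        record.override = abs(final_score - ai_score) > 1e-9
        record.reviewer_note = note
        record.timestamps["human_reviewed"] = datetime.now(timezone.utc).isoformat()

        # Layer (iv): both the AI output and the human decision are retained,
        # giving students a transparent trail behind each grade
        return record

    # Example usage with stubbed callables standing in for the real model and reviewer:
    record = hitl_grade(
        "essay-001",
        "Sample essay text...",
        ai_scorer=lambda text: (78.0, "Strong thesis; weak citations."),
        human_review=lambda score, rationale: (82.0, "Credit for creative framing."),
    )
    print(record.override)  # True: the educator overrode the AI score

Keeping the AI score and the human decision side by side in one record is one simple way to realize the dual-logged audit trail the framework calls for, since every override remains inspectable after the fact.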