Metrics for Estimating Validity, Reliability and Bias in Peer Assessment

Please use this identifier to cite or link to this item: http://hdl.handle.net/10045/89471
Item information
Title: Metrics for Estimating Validity, Reliability and Bias in Peer Assessment
Author(s): Molina-Carmona, Rafael | Satorre Cuerda, Rosana | Compañ, Patricia | Llorens Largo, Faraón
Research group(s) or GITE: Informática Industrial e Inteligencia Artificial
Center, Department or Service: Universidad de Alicante. Departamento de Ciencia de la Computación e Inteligencia Artificial
Keywords: Peer assessment | Success rate | Agreement degree | Reliability | Validity | Bias | Confusion matrix | Automatic classification
Knowledge area(s): Computer Science and Artificial Intelligence
Publication date: 2018
Publisher: Tempus Publications
Bibliographic citation: International Journal of Engineering Education. 2018, 34(3): 968-980
Abstract: Peer assessment is a widespread way of evaluating and rating the quality of work in the field of education. Although it proves to be a very effective learning instrument, it is subject to possible problems of reliability and validity and to some potential biases. Most works that study and try to solve these problems focus on specific cases, and the statistics they use to measure reliability, validity or bias are global: they give a measure of these values for the whole process but do not allow the study of individual reviewers. This work takes a different approach. It proposes metrics for the reliability and validity of each reviewer, as well as an approximation to the possible biases that may appear in the assessment process, so that the review process can itself be assessed. An analogy is proposed between the work of a reviewer in a peer assessment process and the operation of an automatic classifier. This has allowed us to leverage the usual measures for evaluating the quality of automatic classifiers to establish the quality of peer assessment. The reviewers are characterized by their confusion matrices and six new indicators: success rate (which estimates validity); agreement degree (as a measure of reliability); assessment median and its interquartile range (for estimating central tendency and restriction of range biases); and average distance to diagonal and its standard deviation (to determine possible leniency and harshness biases). The method provides indicators of each reviewer's performance and supports the detection of different reviewer profiles, so that the teacher can assess the work of the students as reviewers and introduce correction mechanisms in the final assessment of the works. A practical example of application to an engineering degree is provided to illustrate the potential of the method.
URI: http://hdl.handle.net/10045/89471
ISSN: 0949-149X
Language: eng
Type: info:eu-repo/semantics/article
Rights: © 2018 TEMPUS Publications
Peer reviewed: yes
Publisher's version: https://www.ijee.ie/contents/c340318.html
Appears in collections: INV - i3a - Artículos de Revistas
INV - Smart Learning - Artículos de Revistas
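
The abstract compares a reviewer to an automatic classifier and characterizes each reviewer by a confusion matrix and a set of indicators. The following is a minimal sketch of that idea, not the authors' implementation: this record does not give the paper's formulas, so every definition below is an assumption. Grades are assumed to be integer levels from 0 to n_levels - 1, a reference grade (for example, the teacher's) is assumed to be available for each work, the success rate is taken as the fraction of exact matches, and the distance to the diagonal is taken as the signed difference between the assigned and the reference level, so that positive values would suggest leniency and negative values harshness. The agreement degree is left out because the abstract does not say how it is computed. The names `confusion_matrix` and `reviewer_indicators` are hypothetical.

```python
from statistics import mean, median, pstdev, quantiles


def confusion_matrix(reference, assigned, n_levels):
    """Count works by (reference level, level assigned by the reviewer).

    Rows index the reference level and columns the assigned level
    (an assumed convention, not taken from the paper).
    """
    matrix = [[0] * n_levels for _ in range(n_levels)]
    for ref, got in zip(reference, assigned):
        matrix[ref][got] += 1
    return matrix


def reviewer_indicators(reference, assigned, n_levels):
    """Compute assumed versions of five of the six indicators for one reviewer."""
    matrix = confusion_matrix(reference, assigned, n_levels)
    total = len(reference)

    # Success rate: share of works on the diagonal (exact matches).
    hits = sum(matrix[k][k] for k in range(n_levels))

    # Signed distance to the diagonal for every reviewed work.
    offsets = [got - ref for ref, got in zip(reference, assigned)]

    # Quartiles of the grades the reviewer assigned (needs at least 2 grades).
    q1, _, q3 = quantiles(assigned, n=4, method="inclusive")

    return {
        "confusion_matrix": matrix,
        "success_rate": hits / total,
        "median": median(assigned),
        "iqr": q3 - q1,
        "mean_distance_to_diagonal": mean(offsets),
        "std_distance_to_diagonal": pstdev(offsets),
    }


if __name__ == "__main__":
    # Made-up grades for six works on a four-level scale (illustrative only).
    reference = [0, 1, 2, 3, 2, 1]
    assigned = [1, 1, 3, 3, 3, 2]
    for name, value in reviewer_indicators(reference, assigned, 4).items():
        print(name, value)
```

In this made-up example the reviewer never grades below the reference, so the mean distance to the diagonal is positive; under the sign convention assumed here that would be read as a tendency towards leniency. A narrow interquartile range combined with a low success rate could similarly hint at a restriction of range bias, although the thresholds for such readings are not stated in this record.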

Files in this item:
File | Description | Size | Format
2018_Molina-Carmona_etal_IJEE.pdf | Final version (restricted access) | 1.17 MB | Adobe PDF


All documents in RUA are protected by copyright. Some rights reserved.