Theoretical foundations of item analysis

From MoodleDocs

Revision as of 23:23, 29 September 2014


I am trying to understand what the state of the art is with Item analysis. I am struggling to find my way into the literature, but have started this page to record my progress. So far the following references were found after several hours in the OU library around classmark 371.26.--Tim Hunt 05:58, 30 November 2007 (CST)

References I have

J J Barnard, Item Analysis in Test Construction, pp. 195-206, in Geoffrey N Masters & John P Keeves 1999 Advances in Measurement in Educational Research and Assessment, Pergamon.

  • Mentions two topics "Classical Test Theory" and "Item Response Theory".
  • "Item analysis is not a substitute for the originality, effort and skill of the item writer and relatively poor statistical results can be overruled on logical grounds."
  • "The two most basic statistics computed and examined during item analysis are the items' difficulty and discrimination values."
  • The difficulty is basically the average score for the item: the higher the average score, the easier the item.
  • When computing this average, you have to decide whether to ignore students who did not submit an answer, or to include them as zero score. And you have to consider that in a timed test, questions near the end are more likely to be missed.
  • For discrimination, there are different techniques for questions with and without partial scores.
  • For questions that are scored 0/1 (dichotomously scored) point biserial correlation is most commonly used.
  • You should allow for the fact that the score for this item is included in the score for the whole test. However, for tests with many questions, the correction is small.
  • Item reliability index (Gulliksen): the product r_it · s_i, i.e. the item-total correlation times the item's standard deviation.
  • The above is Classical Test Theory.
  • Item Response Theory is based on more computer-intensive techniques, involving fitting models to the data (maximum likelihood estimation).
  • "It can be concluded that CTT and IRT should not be viewed as rival theoretical frameworks. A duet, rather than a duel between CTT and IRT will provide most information to the test developer. The results obtained from a CTT based item analysis can yield useful information in finding flaws in items and guiding the test developer towards choosing an appropriate IRT model. The advantages that IRT parameters offer should subsequently be used for constructing tests for specific purposes, ..."
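The classical statistics in the notes above can be sketched in a few lines of Python. This is a rough illustration only, not Moodle code: the function names are mine, items are assumed to be scored 0/1 (dichotomously), and unanswered items are assumed to have already been encoded as 0 or excluded by the caller, which is exactly the decision Barnard flags.

```python
from statistics import mean, pstdev

def item_difficulty(item_scores):
    """Classical difficulty: the mean item score. Higher mean = easier item."""
    return mean(item_scores)

def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1-scored item and total test scores."""
    mean_1 = mean(t for s, t in zip(item_scores, total_scores) if s == 1)
    mean_0 = mean(t for s, t in zip(item_scores, total_scores) if s == 0)
    p = mean(item_scores)        # proportion of students answering correctly
    q = 1 - p
    sd = pstdev(total_scores)    # population SD of the total scores
    return (mean_1 - mean_0) / sd * (p * q) ** 0.5

def corrected_point_biserial(item_scores, total_scores):
    """Correlate against the total with this item's own score removed,
    since the item contributes to the raw total (the correction is small
    for tests with many questions)."""
    rest = [t - s for s, t in zip(item_scores, total_scores)]
    return point_biserial(item_scores, rest)
```

As the notes say, for a long test the corrected value barely differs from the raw one; for a short test the raw point-biserial flatters every item.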

R L Ebel 1972, Essentials of Educational Measurement, Prentice Hall.

William A Mehrens & Irvin J Lehmann 1973, Measurement and Evaluation in Education and Psychology, Holt Rinehart and Winston Inc.


R L Thorndike 1971, Educational Measurement, American Council on Education.

  • Repeats the point about items at the end of a timed test being omitted by a lot of students leading to skewed statistics.


References to try to obtain

These last two above have probably both been superseded by:

R L Thorndike 2004, Measurement and Evaluation in Psychology and Education (Seventh edition), Prentice Hall.

This looks like it might be worth getting (previous edition cited by J J Barnard):

L Crocker & J Algina 2006, Introduction to Classical and Modern Test Theory, Wadsworth Pub Co.

Other points

  • Another book mentioned that sometimes you want to, for example, analyse test data by group (e.g. male/female) to look for possible discrimination.
  • There is the idea that you can look at the reliability of a test by randomly splitting the class in half, and comparing the statistics for the two halves.
  • What you really want to do is compare item scores to the property you are trying to measure in the test (student's mathematical ability), as opposed to their score on the test as a whole. However, you don't have any measure of the property you are really interested in - the overall test score is the best (only) estimate you have of that.
  • The age of the references I have read so far means that they cannot assume the processing power of modern computers. Therefore, the procedures they describe are unnecessarily simplified.
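The split-half idea in the second bullet can be sketched as follows. This is an assumption-laden illustration, not an established procedure from the references: the function name is mine, and I have chosen per-item difficulty (mean item score) as the statistic to compare between the two random halves of the class.

```python
import random
from statistics import mean

def split_half_check(responses, seed=0):
    """Randomly split the students into two halves and compare per-item
    difficulties (mean item scores) between them. Large gaps suggest the
    statistics are unstable for this class size.

    responses: one list per student, each holding that student's 0/1
    score on every item. Returns the absolute difficulty gap per item."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    students = responses[:]
    rng.shuffle(students)
    mid = len(students) // 2
    half_a, half_b = students[:mid], students[mid:]
    n_items = len(responses[0])
    diff_a = [mean(s[i] for s in half_a) for i in range(n_items)]
    diff_b = [mean(s[i] for s in half_b) for i in range(n_items)]
    return [abs(a - b) for a, b in zip(diff_a, diff_b)]
```

With a small class the gaps will often be large, which is itself the point of the check: it shows how much the item statistics depend on which students happened to sit the test.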

Difficulties

What about repeated attempts at a particular quiz by the same student? What does that do to the analysis?

What about adaptive mode?

Conclusions

It is probably enough for Moodle to offer teachers an easy way to do item analysis. That will obviously catch defective assessment items.

We probably should not try to implement very complicated item-analysis schemes. They are prone to being misused, which is more of a drawback than the extra power they would give when used correctly.

See also