Abstracts
Résumé
La Théorie de la réponse aux items (TRI) est une classe de modèles de mesure très utilisée en éducation. À ce jour, de nombreux logiciels, tel BILOG-MG, sont disponibles afin de procéder à l’estimation des paramètres d’item et de sujet. Parmi ces logiciels, il ne faut pas négliger ICL et R qui sont gratuits et qui peuvent permettre de produire des analyses diversifiées. Cette étude a pour objectif de comparer la qualité d’estimation des paramètres selon une des modélisations issues de la TRI : le modèle de Rasch. Pour ce faire, nous comparons les estimateurs du paramètre de difficulté et de sujet selon trois logiciels : BILOG-MG, ICL et la librairie ltm, disponible sous le logiciel R. Nous procédons à une analyse par simulation informatique et, dans un second temps, nous analysons un test de classement en anglais, langue seconde. Les résultats démontrent que les logiciels étudiés permettent d’obtenir des estimateurs des paramètres similaires, la différence principale entre ces logiciels étant leur temps d’exécution des procédures d’estimation.
Mots-clés :
- modèle de Rasch,
- paramètre de difficulté d’item,
- paramètre de sujet,
- BILOG-MG,
- R,
- ICL
Abstract
Item response theory (IRT) is a class of measurement models widely used in education. Many software packages, such as BILOG-MG, are currently available for estimating item and examinee parameters. Among them, ICL and R deserve mention: both are free and support a wide variety of analyses. The main objective of this study is to compare the quality of parameter estimation under one IRT model, the Rasch model. To do so, we compare the estimates of the item difficulty and subject parameters produced by three software packages: BILOG-MG, ICL, and the R package ltm. The comparison is twofold: we first conduct a simulation study and then analyze a placement test in English as a second language. Our results show that these packages yield similar parameter estimates; the main difference among them is their computation time.
Keywords:
- Rasch model,
- item difficulty parameter,
- subject parameter,
- BILOG-MG,
- R,
- ICL
Resumo
A teoria de resposta aos itens (TRI) é uma classe de modelos de medida muito utilizada em educação. Atualmente, muitos softwares, como BILOG-MG, estão disponíveis para a estimação dos parâmetros de item e de sujeito. Entre estes softwares, não se deve negligenciar o ICL e R, os quais são gratuitos e podem permitir análises diversificadas. Este estudo tem por objetivo comparar a qualidade de estimação dos parâmetros segundo uma das modelizações da TRI: o modelo Rasch. Para isso, comparamos os estimadores do parâmetro de dificuldade e de sujeito segundo três softwares: BILOG-MG, ICL e a biblioteca ltm disponível no software R. Procedemos a uma análise por simulação informática e, num segundo tempo, analisamos um teste de proficiência em inglês como segunda língua. Os resultados demonstram que os softwares estudados permitem obter estimadores de parâmetros similares, sendo que a diferença principal entre estes softwares é o tempo de execução dos procedimentos de estimação.
Palavras-chave:
- modelo Rasch,
- parâmetro de dificuldade de item,
- parâmetro de sujeito,
- BILOG-MG,
- R,
- ICL
Appendices
Références
- Alexander, B. (2006). Web 2.0: A new wave of innovation for teaching and learning? EDUCAUSE Review, 41, 32-44.
- Baker, F. B., & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques (2nd ed.). New York, NY: Dekker.
- Benzécri, J.-P. (1973). La place de l’a priori. Encyclopaedia universalis, Organum, 17, 11-24.
- Bertrand, R., & Blais, J.-G. (2004). Modèles de mesure : L’apport de la théorie de la réponse aux items. Sainte-Foy, Canada: Presses de l’Université du Québec.
- Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397-479). Reading, MA: Addison-Wesley.
- De Ayala, R. J. (2009). The theory and practice of item response theory. New York, NY: The Guilford Press.
- De Boeck, P., Bakker, M., Zwitser, R., Nivard, M., Hofman, A., Tuerlinckx, F., & Partchev, I. (2011). The estimation of item response models with the lmer function from the lme4 package in R. Journal of Statistical Software, 39, 1-28.
- Forsyth, R., Sarsangjan, V., & Gilmer, J. (1981). Some empirical results related to the robustness of the Rasch model. Applied Psychological Measurement, 5, 175-186. doi: 10.1177/014662168100500203
- Hambleton, R. K., & Swaminathan, H. (1985). Fundamentals of item response theory. Newbury Park, CA: Sage.
- Hanson, B. A. (2002). IRT Command Language (ICL). Computer software version 0.020301. Retrieved from http://www.b-a-h.com/software/irt/icl/index.html
- Hulin, C. L., Drasgow, F., & Parson, C. K. (1983). Item response theory: Application to psychological measurement. Homewood, IL: Irwin.
- Jurich, D., & Goodman, J. T. (2009, October). Comparison IRT parameter recovery of mixed format examinations in PARSCALE and ICL. Poster session presented at the meeting of the Northeastern Educational Research Association, Rocky Hill, Connecticut.
- Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practice (2nd ed.). New York, NY: Springer.
- Laurier, M., Froio, L., Pearo, C., & Fournier, M. (1998). Test de classement d’anglais langue seconde au collégial. Rapport technique. Document inédit, Collège de Maison-Neuve, Montréal, Canada.
- Lord, F. M. (1952). A theory of test scores (Psychometric Monograph No. 7). Richmond, VA: Psychometric Corporation.
- Lord, F. M. (1980). Applications of item response theory to practical testing problems. New York, NY: Lawrence Erlbaum.
- Magis, D., Béland, S., & Raîche, G. (2012). difR: Collection of methods to detect dichotomous differential item functioning (DIF) in psychometrics. R package version 4.2. Retrieved from http://CRAN.R-project.org/package=difR
- Magis, D., Béland, S., Tuerlinckx, F., & De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42, 847-862. doi: http://dx.doi.org/10.3758/BRM.42.3.847
- Magis, D., & Raîche, G. (2011). catR: An R package for computerized adaptive testing. Applied Psychological Measurement, 35, 576-577. doi: http://dx.doi.org/10.1177/0146621611407482
- Magis, D., & Raîche, G. (2012). Random generation of response patterns under computerized adaptive testing with the R package catR. Journal of Statistical Software, 48(8), 1-31.
- Mair, P., & Hatzinger, R. (2007a). Extended Rasch modeling: The eRm package for the application of IRT models in R. Journal of Statistical Software, 20, 1-20.
- Mair, P., & Hatzinger, R. (2007b). CML based estimation of extended Rasch models with the eRm package in R. Psychology Science, 49, 26-43.
- Mair, P., Hatzinger, R., & Maier, M. (2010). eRm: Extended Rasch Modeling. R package version 0.13-0. Retrieved from http://CRAN.R-project.org/package=eRm
- Mead, A. D., Morris, S. B., & Blitz, D. L. (2007). Open-source IRT: A comparison of BILOG-MG and ICL features and item parameter recovery. Unpublished document. Retrieved from http://mypages.iit.edu/~mead/MeadMorrisBlitz2007.pdf
- Partchev, I. (2011). irtoys: Simple interface to the estimation and plotting of IRT models. R package version 0.1.4. Retrieved from http://CRAN.R-project.org/package=irtoys
- R Development Core Team (2012). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
- Raîche, G. (2002). Le dépistage du sous-classement aux tests de classement en anglais, langue seconde, au collégial. Document inédit, Gatineau, Canada: Collège de l’Outaouais.
- Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Chicago, IL: The University of Chicago Press.
- Rentz, R. R., & Barshaw, W. L. (1977). The national reference scale for reading: An application of the Rasch model. Journal of Educational Measurement, 14, 161-180.
- Rizopoulos, D. (2006). ltm: An R package for latent variable modelling and item response theory analyses. Journal of Statistical Software, 17, 1-25.
- Rizopoulos, D. (2012). ltm. R package version 0.9-7. Retrieved from http://CRAN.R-project.org/package=ltm
- Rupp, A. A. (2003). Item response modeling with BILOG-MG and MULTILOG for Windows. International Journal of Testing, 3, 365-384. doi: http://dx.doi.org/10.1207/s15327574IJT0304_5
- Sijtsma, K., & Junker, B. W. (2006). Item response theory: Past performance, present developments, and future expectations. Behaviormetrika, 33, 75-102. doi: 10.2333/bhmk.33.75
- Warm, T. A. (1989). Weighted likelihood estimation of ability in item response models. Psychometrika, 54, 427-450.
- Weeks, J. P. (2010). plink: An R package for linking mixed-format tests using IRT-based methods. Journal of Statistical Software, 35, 1-33.
- Weeks, J. P. (2011). plink. R package version 1.3-1. Retrieved from http://CRAN.R-project.org/package=plink
- Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago, IL: MESA Press.
- Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (1996). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items [Computer software]. Chicago, IL: Scientific Software International.