Abstracts
Abstract
Testing agencies require large numbers of high-quality items that are produced in a cost-effective and timely manner. Increasingly, these agencies also require items in different languages. In this paper we present a methodology for multilingual automatic item generation (AIG). AIG is the process of using item models to generate test items with the aid of computer technology. We describe a three-step AIG approach where, first, test development specialists identify the content that will be used for item generation. Next, the specialists create item models to specify the content in the assessment task that must be manipulated to produce new items. Finally, elements in the item model are manipulated with computer algorithms to produce new items. Language is added in the item model step to permit multilingual AIG. We illustrate our method by generating 360 English and 360 French medical education items. The importance of item banking in multilingual test development is also discussed.
Keywords:
- automatic item generation,
- test development,
- technology and testing
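The three steps described in the abstract (identify content, build an item model, manipulate its elements by algorithm, with language added at the item-model step) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the stem wording, the element names `age`, `patient`, and `symptom`, their values, and the function `generate_items` are all hypothetical, and this toy model yields 12 items per language rather than the 360 reported.

```python
from itertools import product

# Hypothetical item model (not from the paper): one stem template per language.
STEMS = {
    "en": "A {age}-year-old {patient} presents with {symptom}. "
          "What is the most likely diagnosis?",
    "fr": "{patient} de {age} ans se présente avec {symptom}. "
          "Quel est le diagnostic le plus probable?",
}

# Hypothetical elements: values are stored in parallel lists, so index i
# carries the same meaning in every language.
ELEMENTS = {
    "age": {"en": ["25", "45", "65"], "fr": ["25", "45", "65"]},
    "patient": {"en": ["man", "woman"], "fr": ["Un homme", "Une femme"]},
    "symptom": {
        "en": ["fever", "abdominal pain"],
        "fr": ["de la fièvre", "une douleur abdominale"],
    },
}


def generate_items(language):
    """Fill the stem with every combination of element values for one language."""
    names = list(ELEMENTS)
    index_ranges = [range(len(ELEMENTS[name][language])) for name in names]
    items = []
    for combo in product(*index_ranges):
        values = {
            name: ELEMENTS[name][language][i] for name, i in zip(names, combo)
        }
        items.append(STEMS[language].format(**values))
    return items


english_items = generate_items("en")
french_items = generate_items("fr")
```

Because both languages iterate over the same index combinations, `english_items[k]` and `french_items[k]` are parallel versions of the same generated item. That pairing is one plausible way to keep a multilingual item bank aligned.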
Abstract (from the French)
Testing agencies need large numbers of high-quality items produced quickly and economically, and increasingly in different languages. In this article, a methodology for multilingual automatic item generation (AIG) is proposed. AIG is the process of using item models to generate test items with the aid of computer technology. A three-step AIG approach is described in which test development specialists first identify the content that will be used to generate the items. These specialists then create item models to specify the content of the assessment task that must be manipulated to produce new items. Finally, the elements of the item model are manipulated with computer algorithms to generate new items. Adding the desired languages at the item model creation step makes multilingual automatic item generation possible. The method is illustrated by generating 360 items in French and 360 items in English in the field of medical education. The importance of creating item banks when developing multilingual tests is also discussed.
Keywords:
- automatic item generation,
- test development,
- technology and assessment
Abstract (from the Portuguese)
Testing agencies need large numbers of high-quality items produced quickly and economically, and increasingly in different languages. This article proposes a methodology for multilingual automatic item generation (AIG). AIG is the process of using item models to generate test items with the support of computer technology. A three-step AIG approach is described in which test development specialists first identify the content that will be used to generate the items. These specialists then create item models to specify the content of the assessment task that must be manipulated to produce new items. Finally, the elements of the item model are manipulated using computer algorithms to generate new items. Adding the desired languages at the item model creation step makes multilingual automatic item generation possible. The method is illustrated by generating 360 items in French and 360 items in English in the field of medical education. The importance of creating item banks in the development of multilingual tests is also discussed.
Keywords:
- automatic item generation,
- test development,
- technology and assessment