Generative AI tools in designing MCQs for English language examinations: Insights from Lecturers
DOI:
https://doi.org/10.54855/979-8-9870112-9-4_1
Keywords:
MCQs, Generative AI, AI-generated exam questions, syntax-based sentence transformations, ReParaphrased classification framework
Abstract
This study examines ChatGPT's ability to generate syntax-based sentence transformation multiple-choice questions (MCQs) using the syntactic types listed in the ReParaphrased classification framework: Negation Switching (NS), Diathesis Alternation (DA), Subordination and Nesting Changes (SNC), Coordination Changes (CC), and Ellipsis (Ell). Using a quantitative approach, the researchers offer their own insights into designing exam questions based on the content of B1 Empower. A statistical analysis of 120 AI-generated test items was conducted to identify the frequency and distribution of each syntactic transformation type and to show which patterns were favored in the generated dataset. The findings suggest that ChatGPT tended to create test items using transformation types with a clear, regular pattern, such as SNC and NS, and made less use of types that require more nuanced contextual understanding, such as Ell and CC. In addition, the AI could create questions quickly and effectively; however, some problems remained, including semantic distortions and awkward constructions, such as double negatives or passives. These results highlight the crucial role of human intervention in proofreading and refining AI-generated questions to ensure the accuracy and relevance of the test items.
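The frequency-and-distribution analysis described in the abstract can be illustrated with a minimal sketch. It assumes each test item has already been labeled by hand with one of the five type abbreviations; the function name `type_distribution` and the sample labels are hypothetical and are not the study's data or procedure.

```python
from collections import Counter

# The five syntax-based types named in the abstract.
TYPES = ["NS", "DA", "SNC", "CC", "Ell"]

def type_distribution(labels):
    """Return {type: (count, percent)} for a list of per-item type labels."""
    unknown = set(labels) - set(TYPES)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    counts = Counter(labels)
    total = len(labels)
    return {
        t: (counts[t], round(100 * counts[t] / total, 1) if total else 0.0)
        for t in TYPES
    }

# Placeholder labels for illustration only (assumption, NOT the study's 120 items).
sample = ["SNC", "NS", "SNC", "DA", "Ell", "NS", "SNC", "CC"]

for t, (n, pct) in type_distribution(sample).items():
    print(f"{t:<4}{n:>3}{pct:>7.1f}%")
```

Fixing the order of `TYPES` keeps types with zero occurrences visible in the output, which matters when the point of the analysis is to show which types are under-represented.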
License
Copyright (c) 2025 Vu Thi Kim Chi, Nguyen Trinh To Anh, Vo Dao Vuong Co

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the picte the right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the proceedings' published version of the work (e.g., post it to an institutional repository, in a journal, or publish it in a book), with an acknowledgment of its initial publication in this proceedings.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process.








