Evaluating ChatGPT’s Reliability in Grading Writing Assignments on the EOP Learning Platform
DOI:
https://doi.org/10.54855/979-8-9870112-8-7_1
Keywords:
AI grading, assessment reliability, ChatGPT, writing evaluation, advantages and limitations
Abstract
As Artificial Intelligence (AI) is used more and more in education, the trustworthiness of AI-graded student writing has become a concern. Using a mixed-methods research design that combines quantitative and qualitative data collection tools (questionnaires and semi-structured interviews), this study investigates the reliability of using ChatGPT to mark students' writing assignments on the EOP online learning platform (https://eop.edu.vn/) compared with human evaluators at the School of Languages and Tourism, Hanoi University of Industry. The findings identify the advantages and limitations of AI-supported grading, highlighting its accuracy, consistency, and alignment with human grading criteria. Teachers' attitudes toward AI scoring are also examined as a further gauge of its accuracy. Based on the findings, recommendations are provided for enhancing AI scoring systems to enable more effective and fairer assessments. The research contributes to the academic literature on the use of AI in education, emphasizing the importance of continually improving AI-driven evaluation tools to enable effective and fairer online learning.
License
Copyright (c) 2025 Tran Yen Van, Le Thi Huong Giang

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the picte the right of first publication with the work simultaneously licensed under a Creative Commons Attribution 4.0 International License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the proceedings' published version of the work (e.g., post it to an institutional repository, in a journal, or publish it in a book), with an acknowledgment of its initial publication in this proceedings.
Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process.