C-ing Beyond AI: A Cloud-Based Approach to Authentic Programming Assessment

Ildar Akhmetov, Maryam Tanha, Logan W. Schmidt, and Saeed Yazdanian
Khoury College of Computer Sciences, Northeastern University, Vancouver, B.C., Canada
i.akhmetov@northeastern.edu, m.tanha@northeastern.edu, l.schmidt@northeastern.edu, s.yazdanian@northeastern.edu

Abstract
Recent advances in large language models (LLMs) introduce unprecedented challenges for academic integrity in programming courses. This paper presents a cloud-based programming assessment system that creates ephemeral coding environments to preserve the authenticity of student work and deter AI-assisted plagiarism. Using Terraform and AWS, the system provisions individualized virtual machines for in-person assessment, mirroring the course environment without granting access to pre-existing code or external resources. Integrated with GitHub Classroom, the system handles assignment distribution, code submission, and resource clean-up. We discuss the design, cost analysis, and preliminary observations from implementation in a CS2 course in C at Northeastern University (Vancouver). Preliminary results indicate that this controlled environment promotes student engagement and discourages reliance on AI for routine tasks. Future work will include studying how this approach impacts learning outcomes and AI usage patterns.

Keywords
Assessment, Programming, Cloud Computing, Academic Integrity, Large Language Models, Terraform, AWS

ACM Reference Format:
Ildar Akhmetov, Maryam Tanha, Logan W. Schmidt, and Saeed Yazdanian. 2025. C-ing Beyond AI: A Cloud-Based Approach to Authentic Programming Assessment. In Proceedings of the 27th Western Canadian Conference on Computing Education (WCCCE '25), April 28–29, 2025, Calgary, AB, Canada. 2 pages. https://doi.org/10.1145/10.60770/4bhf-g950

1 Introduction
Recent developments in large language models (LLMs) such as ChatGPT, Gemini, and Claude pose new risks for academic integrity. Students can easily use AI tools to solve homework or take-home exams, making it difficult to assess their mastery of fundamental concepts. While remote-proctoring solutions and plagiarism detection tools have been used to mitigate cheating, these methods often fail to match the pace of AI advancements and raise ethical and privacy concerns [1, 2, 5, 8, 9]. Educational researchers have examined how LLMs transform both teaching and learning, highlighting the need for new instructional designs and assessment strategies that respond to AI technology's rapid diffusion [4, 7]. Because evidence shows that reliance on AI for routine tasks can hinder deeper conceptual understanding, assessments must be redesigned so that they accurately measure knowledge acquisition. Educators have proposed incorporating in-person practical evaluations, code walks, and assignments that emphasize problem-solving and critical thinking over simple solution generation. In line with the growing trend of online evaluation tools [3, 6], this paper describes a cloud-based approach to creating short-lived, controlled coding environments.

Our goal is to preserve the authenticity of student work by mirroring the course environment while preventing access to previously saved code or external resources. We implemented this system in a CS2 course (algorithms, data structures, and basic computer systems in C) at Northeastern University (Vancouver). While not a formal study, preliminary observations suggest that controlled, face-to-face assessments may reduce reliance on AI assistants and encourage deeper engagement with course material. This paper outlines the system design, cost-effectiveness, and avenues for future research.

2 Cloud-based Programming Assessment System
Our cloud-based assessment system provides ephemeral, isolated programming environments that replicate the environment used in the course. The course requires students to write C programs using a specific toolchain on Rocky Linux, employing vim as the primary text editor. Our goals are summarized as follows:
- faithfully emulate the course environment for in-person exams;
- prevent students from accessing pre-existing work or external resources;
- allow instructors to manage student instances efficiently;
- minimize complexity and cost.

2.1 System Architecture
We use Terraform (https://www.terraform.io) for infrastructure as code, deploying an individual AWS EC2 instance for each student at the beginning of the assessment; these instances are destroyed upon completion. GitHub Classroom serves as the assignment distribution and submission platform, integrated through the GitHub REST API to automate repository creation and access control. Figure 1 shows the overall architecture.

Figure 1: Architecture of the cloud-based programming assessment provisioning system
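To make the provisioning flow concrete, the sketch below shows how a dashboard could drive Terraform from Python, creating and destroying one instance per student. This is a minimal illustration rather than our exact implementation: the module directory, the variable names student_id and github_username, and the workspace-per-student state layout are all assumptions.

    import subprocess

    def terraform(args, module_dir="exam-infra"):
        """Run a Terraform CLI command inside the (hypothetical) module directory."""
        subprocess.run(["terraform", *args], cwd=module_dir, check=True)

    def provision_instance(student_id: str, github_username: str) -> None:
        # A separate workspace per student keeps each instance's state isolated
        # (the -or-create flag requires Terraform >= 1.4).
        terraform(["workspace", "select", "-or-create", student_id])
        terraform(["apply", "-auto-approve",
                   f"-var=student_id={student_id}",
                   f"-var=github_username={github_username}"])

    def destroy_instance(student_id: str, github_username: str) -> None:
        # Tear down the instance once the assessment session ends.
        terraform(["workspace", "select", student_id])
        terraform(["destroy", "-auto-approve",
                   f"-var=student_id={student_id}",
                   f"-var=github_username={github_username}"])

The two entry points map directly onto the provision and destroy actions of the dashboard described in Section 2.2.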
2.2 Implementation Details
Our Python Flask application offers a dashboard that manages student lists, provisions EC2 instances, and monitors their lifecycle (Figure 2). It imports GitHub usernames for upcoming sessions, generates and configures EC2 instances using Terraform, assigns each instance a unique password and a GitHub deploy key for repository access, and tracks instance states. Once assessments have been completed, the application displays instance details and initiates their destruction to clean up resources.

Figure 2: Screenshot of the provisioning dashboard showing instances for each student

Each instance is configured with Rocky Linux, gcc/clang, vim, and any additional dependencies required. On launch, it clones the student's repository from GitHub and allows the student to push their changes via the deploy key. Once the student's session ends, instructors use the dashboard to terminate the instance and revoke write access to the repository via the GitHub REST API.
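The per-instance setup is left implicit above. One plausible realization, sketched here purely for illustration, is for the Flask application to render an EC2 user-data script that Terraform attaches to each instance; the account name student, the clone path /home/student/exam, and the exact package list are assumptions, not details of our configuration.

    def render_user_data(repo_ssh_url: str, deploy_key: str, password: str) -> str:
        """Render an EC2 user-data script for one student instance (illustrative)."""
        return f"""#!/bin/bash
    # Install the course toolchain (dnf is Rocky Linux's package manager).
    dnf install -y gcc clang vim git

    # Create the assessment account and set its unique one-time password.
    useradd -m student
    echo 'student:{password}' | chpasswd

    # Install the repository-specific deploy key and clone the exam repository.
    install -d -m 700 -o student -g student /home/student/.ssh
    printf '%s\\n' '{deploy_key}' > /home/student/.ssh/id_ed25519
    chmod 600 /home/student/.ssh/id_ed25519
    ssh-keyscan github.com >> /home/student/.ssh/known_hosts
    chown -R student:student /home/student/.ssh
    sudo -u student git clone {repo_ssh_url} /home/student/exam
    """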
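Access control through the GitHub REST API can likewise be made concrete with the deploy-key endpoints: a read-write key is attached to the student's repository when the instance is provisioned, and deleting the key at teardown revokes the instance's push access. In the sketch below, only the two endpoints are GitHub's documented API; the token handling and naming scheme are assumptions.

    import requests

    API = "https://api.github.com"

    def _headers(token: str) -> dict:
        return {"Authorization": f"Bearer {token}",
                "Accept": "application/vnd.github+json"}

    def add_deploy_key(token: str, owner: str, repo: str,
                       public_key: str, student_id: str) -> int:
        """Attach a read-write deploy key to the student's repo; returns the key id."""
        r = requests.post(f"{API}/repos/{owner}/{repo}/keys",
                          headers=_headers(token),
                          json={"title": f"exam-{student_id}",
                                "key": public_key,
                                "read_only": False})
        r.raise_for_status()
        return r.json()["id"]

    def revoke_deploy_key(token: str, owner: str, repo: str, key_id: int) -> None:
        """Delete the deploy key after the session, revoking the instance's write access."""
        r = requests.delete(f"{API}/repos/{owner}/{repo}/keys/{key_id}",
                            headers=_headers(token))
        r.raise_for_status()

Deleting the key rather than the repository preserves the submission history for grading while ensuring that nothing can be pushed after the session ends.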
2.3 Cost Analysis
We evaluated system costs during the Spring 2025 offering of CS2, which enrolled 71 students. Students were assessed in groups of nine, each working for up to one hour on a t2.small AWS instance. Over multiple sessions, the total AWS cost was approximately CA$3.00, or about CA$0.04 per student. To control costs, the system provisions instances only for upcoming sessions (usually 1–2 hours in advance) and promptly tears them down after completion.

3 Future Work
Our observations suggest that students increased their engagement with course materials following the in-person coding assessment. Several reported a shift away from routine reliance on AI tools, recognizing that authentic practice was essential for success and mastery of fundamental course concepts.

In upcoming semesters, we plan to conduct a study of the system's impact on learning outcomes and AI usage, expand its use to other programming-intensive courses, and release the source code and documentation for broader educational adoption. By exploring these directions, we will refine methods for assessing genuine coding ability in an era where AI assistance is increasingly prevalent.

References
[1] Debby R. E. Cotton, Peter A. Cotton, and J. Reuben Shipway. 2024. Chatting and cheating: Ensuring academic integrity in the era of ChatGPT. Innovations in Education and Teaching International 61, 2 (2024), 228–239.
[2] Héctor Galindo-Domínguez, Lucía Campo, Nahia Delgado, and Martín Sainz de la Maza. 2025. Relationship between the use of ChatGPT for academic purposes and plagiarism: The influence of student-related variables on cheating behavior. Interactive Learning Environments (2025), 1–15.
[3] J. Geetha, D. S. Jayalakshmi, E. Naresh, and N. Sreenivasa. 2024. Lightweight cloud-based solution for digital education and assessment. Science & Technology Libraries 43, 3 (2024), 274–286.
[4] Ahmed M. Hasanein and Abu Elnasr E. Sobaih. 2023. Drivers and consequences of ChatGPT use in higher education: Key stakeholder perspectives. European Journal of Investigation in Health, Psychology and Education 13, 11 (2023), 2599–2614.
[5] Muntasir Hoq, Yang Shi, Juho Leinonen, Damilola Babalola, Collin Lynch, Thomas Price, and Bita Akram. 2024. Detecting ChatGPT-generated code submissions in a CS1 course using machine learning models. In Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 1. 526–532.
[6] Magendran Munisamy, S. Z. Osman, and Mageswaran Sanmugam. 2024. Code, click, learn: A systematic review of online assessment tools in 21st century programming education. Int J Mod Educ 6, 20 (2024), 358–377.
[7] Iris Cristina Peláez-Sánchez, Davis Velarde-Camaqui, and Leonardo David Glasserman-Morales. 2024. The impact of large language models on higher education: Exploring the connection between AI and Education 4.0. In Frontiers in Education, Vol. 9. Frontiers Media SA, 1392091.
[8] Zachary Taylor, Cy Blair, Ethan Glenn, and Thomas Ryan Devine. 2023. Plagiarism in entry-level computer science courses using ChatGPT. In Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE). IEEE, 1135–1139.
[9] Yunkai Xiao, Soumyadeep Chatterjee, and Edward Gehringer. 2022. A new era of plagiarism: The danger of cheating using AI. In 20th International Conference on Information Technology Based Higher Education and Training (ITHET). IEEE, 1–6.