Enhancing Academic Advising with Generative AI: A Retrieval-Augmented Decision Support System Arhaan Khaku Abdallah Mohamed arhaank@student.ubc.ca University of British Columbia Department of Computer Science Kelowna, British Columbia, Canada abdallah.mohamed@ubc.ca University of British Columbia Department of Computer Science Kelowna, British Columbia, Canada ABSTRACT • Designing a RAG-based chatbot for general academic advising in various fields, focusing on areas beyond course selection. • Assessing the effectiveness of generative AI in handling advising queries related to transfer credits, study strategies and graduation pathways. • Demonstrating the scalability and accessibility benefits of AI-driven advising systems compared to traditional models. • Offering a technical perspective on implementing RAG for decision support in higher education. Academic advising encompasses a broad range of student support services, including guidance on transfer credits, graduation requirements, major and minor declarations, and institutional policies. However, traditional advising methods often face challenges related to scalability, consistency, and accessibility. This paper presents a Retrieval-Augmented Generation (RAG)-based chatbot designed to optimize general academic advising by leveraging generative AI. The model aims to enhance data retrieval, improving the precision and relevance of advising responses. We evaluate the chatbot’s effectiveness based on response accuracy, user satisfaction, and its ability to address nuanced advising scenarios. Our findings indicate that the RAG-based approach improves efficiency and accessibility in academic advising, providing a scalable AI-driven decision support system for higher education institutions. 1 Preliminary findings indicate that RAG-based chatbots can significantly improve response accuracy, consistency, and accessibility in academic advising. The remainder of this paper is structured as follows: Section 2 reviews related work in AI-driven academic advising. Section 3 details our methodology, including system architecture and implementation. Section 4 presents experimental results and evaluation metrics. Section 5 discusses key findings, challenges, and future directions, and Section 6 concludes the paper. INTRODUCTION Academic advising plays a crucial role in guiding students through institutional policies, graduation requirements, transfer credit evaluations, and major/minor declarations. However, traditional advising systems often struggle with scalability, efficiency, and accessibility [6]. Many students face long wait times, unclear or incomplete information, and limited access to advisors, particularly during peak enrollment periods. These inefficiencies can lead to delays in academic progress, misunderstandings about graduation requirements, and difficulties in credit transfer, ultimately impacting student success. To address these challenges, we propose EduRAG, a RAG-based decision support system that enhances general academic advising through AI-driven interaction. Unlike conventional rule-based advising tools, which rely on static databases or predefined responses, EduRAG dynamically retrieves relevant institutional policies and guidelines, such as academic calendars, while leveraging a large language model (LLM) to generate accurate and context-aware responses. The chatbot supports students across various advising topics, such as graduation requirements, transfer credit evaluations, major/minor declarations, and institutional policies. EduRAG delivers an interactive, 24/7 advising experience, providing real-time, personalized guidance to students across various disciplines, including Computer Science, Data Science, Mathematics, Statistics, Physics and Engineering, tailoring recommendations based on each student’s unique academic path. 2 RELATED WORK 2.1 Generative AI in Education and Academic Advising Recent advancements in Generative AI (GenAI), particularly large language models (LLMs), have transformed academic support by enabling personalized tutoring, automated feedback, and student service chatbots [2]. AI-assisted advising reduces advisor workload and enhances response consistency but often struggles with institution-specific policies. Many institutions employ traditional machine learning techniques to develop academic chatbots [7], utilizing approaches such as self-supervised fine-tuning and reinforcement learning with human feedback (RLHF). Others opt for menu-driven chatbots, which are cost-effective but less user-friendly. Most prior research has focused on fine-tuning LLMs on custom datasets, a process that typically requires extensive domain-specific labeled data. However, fine-tuned models remain static unless retrained, limiting adaptability for future retrieval. In contrast, RAG dynamically fetches the latest data from external sources, ensuring up-to-date and policy-compliant responses. Our approach integrates retrieval-based knowledge with GenAI to enhance accuracy and relevance in academic advising. The key objectives and contributions of the proposed work include: 1 WCCCE ’25, March 4, 2025, Vancouver, BC 2.2 Arhaan Khaku and Abdallah Mohamed 3.2 RAG in Research and Industry The RAG Pipeline and Workflow RAG improves response accuracy by grounding AI outputs in domain-specific documents, reducing hallucination [1]. RAG has been applied in healthcare, legal analysis, and finance, with emerging use in adaptive learning [4]. However, its application to academic advising remains unexplored. Our work bridges this gap by implementing a domain-specific RAG system to retrieve institutional regulations, ensuring accurate and scalable advising. Our work builds upon these prior studies by applying RAG techniques to general academic advising. By integrating institutional policy retrieval with generative AI, our system aims to provide students with accurate, real-time guidance while ensuring scalability and accessibility. 3 SYSTEM ARCHITECTURE & DESIGN The core architecture of EduRAG is composed of several key components working together to provide accurate and context-aware responses to user queries. In this section, we discuss the key components of the system architecture, the RAG pipeline, the design choices made, and how these contribute to the system’s overall functionality. 3.1 Figure 1: A depiction on how the RAG pipeline operates The RAG pipeline follows a structured flow from user input to output, ensuring that each stage of the process contributes to a high-quality, accurate response (Figure 4): (1) User Input: The process begins when the user interacts with the chatbot, submitting a query related to academic advising (e.g., "What are the graduation requirements for a Computer Science major?"). (2) Vectorization and Query Encoding: The query is encoded into a vector representation using an embedding model. This vector is then used for similarity search for which it will return the top 3 results. (3) Similarity Search: The query vector is compared to the vectors of stored documents in the pgvector database using cosine similarity. The most relevant documents are retrieved based on their proximity in the vector space. (4) Generation and Response: Once relevant documents are retrieved, they are passed to a generative model (such as GPT or Ollama) along with the query. The model synthesizes a response based on the retrieved documents and returns it to the user. (5) Message History and History-Aware Retrieval: To improve context and the relevance of responses, the system maintains a history of previous interactions. This history is used in history-aware retrieval, where past messages are factored into the similarity search, ensuring that the context is preserved across multiple exchanges. Key Components of the RAG Architecture Our RAG-based system consists of the following primary components (Figure 4): • Data Collection and Preprocessing: The initial process begins with web-scraping data from sources such as academic calendars, faculty websites and university resource pages using the beautifulsoup4 library. All the collected data then gets divided and formatted into 14 different .txt files, each having content-specific data. • Data Loading and Chunking: All the data then gets loaded with a chunk_size of 1000 characters with an overlap of 250 characters, to make sure we do not miss any crucial data out. • Vectorization: This process converts textual data into vector representations using embeddings generated by Ollama’s mxbai-embed-large with 512 token size. The goal is to capture the semantic meaning of each document or query, allowing for efficient similarity-based retrieval. All embeddings then get stored in pgvector. • Similarity Search: Once data has been vectorized, similarity search is performed using efficient algorithms, such as cosine similarity, to identify the most relevant documents or pieces of information to answer user queries. • Generative Model: After retrieval, the system uses a generative model (GPT or Ollama) to synthesize a human-like response by conditioning on both the retrieved documents and the user’s query. • Vector Store (pgvector): To facilitate fast retrieval, vectorized data is stored in a specialized database (pgvector) that supports similarity search over high-dimensional vectors. This allows the system to quickly retrieve relevant information based on user input. 3.3 Design Choices 3.3.1 Prompt Engineering. Effective prompt engineering is crucial for optimizing chatbot performance in RAG-based systems. We carefully designed and refined prompt templates to enhance response accuracy, maintain relevance, and ensure compliance with institutional policies. • Contextual Prompting: Incorporating previous user queries into prompts improved response relevance and coherence. • Task-Specific Prompts: Tailored prompts for advising tasks (e.g., transfer credit evaluations, graduation requirements) ensured domain-specific accuracy. 2 Enhancing Academic Advising with Generative AI: A Retrieval-Augmented Decision Support System • Response Filtering: The model was instructed to ignore irrelevant user queries, preventing off-topic or misleading answers. • Source Attribution: Every response included a verifiable source to enhance transparency and reliability. • Document Retrieval: The chatbot was designed to provide PDFs for relevant applications when necessary. Additionally, we developed a separate prompt template focused on history-aware retrieval, enabling the model to retain context across interactions for more informed advising. highly similar), and 0 means there is no similarity. Cosine similarity is efficient in high-dimensional spaces and well-suited for comparing the semantic similarity of queries and documents. Evolution of Prompt Design In the initial design phase, the chatbot used simple, direct prompts such as: "You are a academic advisor for students. Ignore irrelevant questions asked." However, responses were often generic and lacked institutional specificity. Another issue we faced with this was that the model used to hallucinate when it did not know the answer to a user query. To address this, we refined the prompt to incorporate additional context: "You are an academic adivsor for UBC students. Only answer relevant question, if you don’t know the answer to something, reply I can only help you with academic advising. Answer the question based on the context below." This improved response accuracy. Another feature we wanted to integrate was source attribution. Hence, we engineering the chat prompt template to incorporate this feature: "You are an academic adivsor for UBC students. Only answer relevant question, if you don’t know the answer to something, reply I can only help you with academic advising. Answer the question based on the context below. In addition to your final answer also return a source as a raw link. Do not say, Source: provided context, a source is always a URL. Here are all the sources: Sources: " This iterative refinement ensured that responses remained grounded in authoritative data while enhancing user experience. Figure 2: Visualization of cosine similarity between two vectors. 3.3.3 Message History and History-Aware Retrieval. One important design choice in our system is the use of history-aware retrieval. This approach helps maintain context across multiple interactions with the chatbot. For example, when a student inquires about graduation requirements and follows up with a question about credit transfer, the system uses both the original question and the follow-up query to retrieve more relevant documents and generate more accurate responses. The history-aware retrieval ensures the system considers the entire interaction, making it more conversational and effective. 3.3.4 Data Collection and Processing. Our data is sourced from multiple channels to ensure accuracy and relevance in academic advising. • Web Scraping: Institutional websites, academic handbooks, and official advising pages are scraped to extract up-todate information on graduation requirements, major/minor policies, and transfer credits. • Stored Text Files: Scraped data is stored in structured text files, allowing for efficient access and preprocessing before vectorization. • Data Cleaning and Formatting: Raw textual data undergoes cleaning to remove extraneous information (e.g., HTML tags) and is standardized for consistency. This ensures high-quality embeddings, improving similarity search accuracy and reducing retrieval noise. 3.3.2 Similarity Search: Cosine Similarity. The similarity search relies on cosine similarity, a commonly used metric to measure the cosine of the angle between two vectors in a vector space. Mathematically, cosine similarity is defined as: Cosine Similarity(𝐴, 𝐵) = WCCCE ’25, March 4, 2025, Vancouver, BC 𝐴·𝐵 ∥𝐴∥ ∥𝐵∥ Where: • 𝐴 and 𝐵 are two vectors representing a query and a document, respectively. • 𝐴 · 𝐵 is the dot product of the vectors. • ∥𝐴∥ and ∥𝐵∥ are the magnitudes (Euclidean norms) of the vectors. The cosine similarity measure ranges from −1 to 1, where 1 indicates that the two vectors are identical (i.e., the documents are This structured approach ensures that our data remains current, well-organized, and optimized for retrieval in the advising system. 3.4 Implementation Details The system integrates LangChain for retrieval-augmented generation, FastAPI for backend efficiency, PostgreSQL with pgvector for 3 WCCCE ’25, March 4, 2025, Vancouver, BC Arhaan Khaku and Abdallah Mohamed vector storage, and OpenAI and Ollama models for language processing, ensuring fast and scalable deployment. The user interface is built with RemixJS and shadcn/ui. For efficient document retrieval, we use pgvector to store highdimensional embeddings of institutional guidelines and user queries. PostgreSQL’s indexing and querying capabilities enable fast similarity searches, improving response accuracy and efficiency. 3.5 Overall, students found the chatbot helpful but highlighted the need for improved explanation depth and a more intuitive interface. 4.2 4.2.1 GoGlobal Program. Students in the GoGlobal study abroad program face challenges with credit transfer. In a test scenario, a student inquired about international course evaluations, and the chatbot provided relevant policies, retrieved transfer guidelines, and offered a PDF advising sheet. This case highlights the chatbot’s role in streamlining complex advising tasks. Additional Features The data used for EduRAG allows it to perform several sophisticated tasks, such as: • Mathematical Calculations: EduRAG can calculate transfer credits by applying mathematical formulas based on the credit value of transferred courses and the institution’s conversion rates. • Providing PDF Files: EduRAG can retrieve and provide downloadable advising sheets, such as engineering advising documents or transfer credit forms, directly from the interface. 3.6 Design Rationale The architectural choices made in this system were driven by the need to balance scalability, accuracy, and accessibility. The use of vectorization and pgvector ensures that we can efficiently handle large datasets and provide fast, relevant responses. The choice to use cosine similarity for similarity search enables precise document retrieval, ensuring that students receive accurate and context-aware guidance. By incorporating features like history-aware retrieval, the system can maintain context over multiple interactions, enhancing the overall user experience. These design decisions ultimately result in a robust, scalable academic advising system that can efficiently support students across various advising tasks. 4 Figure 3: GoGlobal Student User Case Scenario 4.2.2 Course and Credit Transfers. A first year student majoring in Data Science had queries related to certain keywords in the academic calendar. It asked EduRAG for advice on those words DEMONSTRATION AND EVALUATION 4.3 In this section, we present the usability testing and evaluation results of our RAG-based academic advising chatbot. We conducted demonstrations with students from the Computer Science (CMPS) and Engineering programs, followed by a detailed analysis of the feedback, case studies, and performance metrics. 4.1 Case Studies and User Scenarios To demonstrate the chatbot’s effectiveness, we present key use cases: Backend Testing with PyTest To ensure the reliability and correctness of our backend, we implemented a comprehensive testing suite to validate API responses, handling user queries, and verifying logical equivalency using GPT4. Specifically, we focused on: • Irrelevant Question Handling: The system is tested to reject non-academic queries by returning a predefined response. • Credit and Transfer Credit Inquiries: We verify that queries related to credit requirements and transfer credit evaluations yield accurate responses. • Graduation and Degree Planning: The chatbot is tested for inquiries regarding degree progression, graduation requirements, and document shipping details. • Major/Minor Declarations: The chatbot’s ability to provide correct information on major and minor self-declaration is assessed. • Conversation History Awareness: To test context retention, we simulate a multi-turn conversation where the chatbot should recall previous exchanges. To validate chatbot responses beyond simple string comparison, we employ GPT-4 for logical equivalence assessment. A dedicated Initial Usability Testing We evaluated the chatbot’s usability with students from Computer Science (CMPS) and Engineering, who interacted with it for tasks like checking graduation requirements, credit transfers, and program policies. 4.1.1 CMPS Student Feedback. CMPS students appreciated the chatbot’s clarity and quick retrieval of graduation requirements, particularly for electives and credit transfers. Some suggested more detailed explanations for complex policies, such as transfer credit applications. 4.1.2 Engineering Student Feedback. Engineering students valued the chatbot’s ability to present PDF advising sheets and answer course prerequisite questions. However, like the CMPS group, they desired more contextual explanations for special cases like course substitutions. 4 464 Enhancing Academic Advising with Generative AI: A Retrieval-Augmented Decision Support System WCCCE ’25, March 4, 2025, Vancouver, BC Table 1: Evaluation metrics for the RAG system. Metric Recall@5 MRR NDCG@5 BLEU Score ROUGE-L Response Time (ms) Token Efficiency (avg tokens/query) • Normalized Discounted Cumulative Gain (NDCG@5): Evaluates ranking quality by assigning higher importance to correctly ranked relevant documents, penalizing incorrect ordering in the top 5 results. Generation Quality • BLEU Score: A precision-based metric that compares generated responses to reference answers using n-gram overlap, commonly used in machine translation. • ROUGE-L: A recall-oriented metric that evaluates the longest common subsequence between generated and reference responses, measuring fluency and coherence. System Efficiency • Response Time (ms): The average time taken to generate a response after retrieving relevant documents. Lower values indicate faster response generation. • Token Efficiency: The average number of tokens processed per query, influencing API cost and latency. Lower values improve cost-effectiveness without degrading response quality. Figure 4: First Year Student User Case Scenario function queries GPT-4 to determine if two responses convey the same meaning, enhancing test robustness. Algorithm 1 Logical Equivalence Assessment 4.5 1: Input: Two response strings, 𝑠𝑡𝑟 1 and 𝑠𝑡𝑟 2 2: Output: Boolean value indicating logical equivalence 4: Parse response to extract decision string 5: if response contains "true" then return True 7: else if response contains "false" then 8: return False 9: else 10: return Error 4.4 Evaluation Metrics Discussion of Results EduRAG demonstrates strong retrieval and response accuracy, with Recall@5 of 87.2% and MRR of 0.79, ensuring relevant document retrieval. However, recall can be improved for complex queries. The high NDCG@5 (0.84) confirms effective ranking, while BLEU (41.6) and ROUGE-L (56.3) scores indicate coherent, high-quality responses. With an average response time of 920ms and token efficiency of 285 tokens per query, the system balances speed and cost-effectiveness. Student feedback highlights its success in diverse advising scenarios, though improvements in explanation depth and conversational flow are needed. Future work will focus on enhancing retrieval, refining responses, and improving interactivity. 3: Send system message and user query to GPT-4 6: Result 87.2% 0.79 0.84 41.6 56.3 920 285 To assess the performance of our RAG system, we evaluate retrieval accuracy, response quality, and system efficiency. The results are summarized in Table 1. Retrieval Performance • Recall@5: Measures the proportion of queries for which a relevant document appears in the top 5 retrieved results. Higher values indicate better retrieval effectiveness. • Mean Reciprocal Rank (MRR): Computes the average reciprocal rank of the first relevant document across queries, emphasizing how early relevant documents appear in ranked results. 5 DISCUSSION The academic advising chatbot developed in this study demonstrates several strengths, while also presenting a few limitations. In this section, we reflect on the overall development process, the challenges faced, and the strengths and limitations of the system. 5.1 Strengths The chatbot’s main strength lies in its ability to quickly retrieve and present accurate academic advising information from a vast dataset. Using the RAG approach, the chatbot is able to combine generative 5 WCCCE ’25, March 4, 2025, Vancouver, BC Arhaan Khaku and Abdallah Mohamed capabilities with a retrieval mechanism to provide responses that are both contextually relevant and grounded in the provided data. Some of the key strengths of the system include: These measures help improve reliability while maintaining AIassisted academic advising as a supportive tool rather than a standalone replacement for human expertise. • High accuracy and precision: The system consistently provides correct and relevant information, as demonstrated by the high accuracy and precision scores. • Quick response times: The chatbot processes queries rapidly, ensuring a smooth user experience with minimal delay. • Ability to handle complex advising scenarios: The chatbot excels in providing tailored advice for unique academic situations, such as transfer credit evaluation and study abroad program details. • Scalability: The RAG-based system is scalable and can handle a wide variety of queries related to graduation requirements, transfer credits, and major/minor information. 5.2 5.4 Limitations Despite its strengths, the chatbot has several limitations: • Limited contextual understanding: While effective for straightforward inquiries, the chatbot may struggle with complex, nuanced questions that require deep reasoning or domain expertise. • Reliance on data quality: The accuracy of responses depends on the quality and completeness of the dataset. Outdated or missing information can lead to incorrect or misleading answers. • AI bias and hallucinations: Large language models can inherit biases present in their training data and may generate hallucinated responses—plausible but incorrect information—potentially leading to misinformation. • Transparency and source attribution: The system does not always explicitly cite sources, making it difficult for users to verify the credibility of responses. Enhancing transparency through source attribution could improve trust. • Limited interactivity: The chatbot could benefit from improved conversational capabilities, such as dynamic followup questions and adaptive responses based on user input. 5.3 Challenges Encountered The development process was not without its challenges. Some of the key issues encountered during the development phase included: • Data formatting and cleaning: The process of transforming raw academic data into a structured format suitable for training the chatbot was time-consuming. Data cleaning, such as ensuring consistency and removing redundancies, was essential to improve the accuracy of the system. • Fine-tuning the model for specific tasks: The taskspecific nature of the queries required continuous finetuning of the model to ensure high accuracy in real-world scenarios. • System integration and testing: Integrating various components, such as the vector store (pgvector), similarity search methods, and chatbot interface, proved to be complex, requiring multiple iterations and debugging. 6 CONCLUSION AND FUTURE WORK This paper presented EduRAG, a chatbot-based academic advising system leveraging RAG architecture to provide accurate, contextaware guidance on graduation requirements, transfer credits, and major/minor selection. By integrating history-aware retrieval and vector-based storage (pgvector), the system enhances advising efficiency while reducing student wait times. Key areas for improvement include expanding data coverage, refining conversational flow, and broadening advising services. Future enhancements will focus on reinforcement learning for response generation, multi-modal interactions, and adaptive learning for personalized support. While EduRAG streamlines routine advising, it does not replace human advisors. Given LLM limitations, it serves as a complementary tool, improving accessibility and efficiency while ensuring students retain access to personalized guidance. REFERENCES [1] Nguyen, L., & Quan, T. (2025). URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots – A Case Study at HCMUT. arXiv. Retrieved from https://doi.org/10.48550/arXiv.2501.16276 [2] Peyton, K., Unnikrishnan, S., & Mulligan, B. (2025). A review of university chatbots for student support: FAQs and beyond. Discov Educ, 4, 21. https://doi.org/10.1007/ s44217-025-00397-7 [3] Osorio Cárdenas, D., & Guatibonza Briceño, P. A. (2025). AI-Assisted Learning: Intelligent Tutoring System for the Introduction to Programming Course. Technical Report, Fundación Universitaria de Ciencias de la Salud. Retrieved from https: //hdl.handle.net/1992/75502 [4] Galstyan, L., Martirosyan, H., Vardanyan, E., & Vahanyan, K. (2024). SmartAdvisor University Chatbot Spring 2024 [5] Ali, U. S. (2024). Ask your Transcript: LLM Driven Insights for Academic Advising. In Proceedings of the 2024 2nd International Conference on Computing and Data Analytics (ICCDA) (pp. 1-4). https://doi.org/10.1109/ICCDA64887.2024.10867349 [6] Doe, J., et al. (2020). From Traditional to Intelligent Academic Advising: A Systematic Literature Review of e-Academic Advising. International Journal of Advanced Computer Science and Applications, 11(4). https://doi.org/10.14569/IJACSA.2020. 0110467 [7] Aguila, A., Tran Ngoc, N., Nguyen, N. A. D., Huynh, K.-T., Mai, A., Le, T. D., & Tuyen, N. T. V. (2024). Large Language Model in Higher Education: Leveraging Llama2 for Effective Academic Advising. Journal Name, X(Y), ZZ-ZZ. https:// example.com Mitigating Biases in LLM-Based Advising To address biases and hallucinations, several mitigation strategies are employed: • Curated knowledge sources: The chatbot relies on vetted academic data to minimize exposure to biased or misleading information. • Human-in-the-loop oversight: While automated, the system allows human advisors to review and refine responses, ensuring accuracy in critical scenarios. • Bias detection mechanisms: Future iterations could integrate bias-detection techniques to flag and adjust responses that exhibit skewed perspectives. • Source transparency: Enhancing citation mechanisms within the chatbot will allow users to verify the origins of retrieved information. 6