
Low Resource RAG: From Slide Data Processing to RAG Systems

For my honors thesis, I developed a retrieval-augmented generation system with Hyundai that answers questions about automotive collision safety tests using multimodal slides, and found that fine-tuned embedding models achieved the highest accuracy.

Technologies: Python, LangChain, Hugging Face, Fine-tuning, vLLM, LLM, SLURM

Thermal Image Data Processing and Analysis Tool

A Python annotation tool that processes FLIR thermal images: it extracts metadata, performs thermal analysis, aligns images, and generates binary masks for regions of interest.

Technologies: Python, cv2, Pillow, flyrpy, EXIF, numpy, GitHub

Cyberbullying Classification

A collection of models, ranging from classical machine learning to fine-tuned LLMs, that detect cyberbullying in text messages. Achieved 99% accuracy using BERT and RoBERTa models for the classification task. Won the Best Project Award in CS334: Machine Learning.

Technologies: Python, Hugging Face, PyTorch, Scikit-learn, Git

Student Dropout Prediction

Trained four tabular deep learning networks to predict whether a student is likely to drop out or graduate, based on 12 features engineered and selected from more than 36. Used feature engineering, hyperparameter tuning, and deep learning models to achieve 91% accuracy.

Technologies: Python, PyTorch, Jupyter, Scikit-learn, Feature Engineering, Git
© 2024 - 2025 Andrew Chung / Developed with SvelteKit, Vite, TypeScript, Figma / Inspired by oklama.com