Monica Nguyen

Applied Data Scientist and Researcher | Senior Data Analyst | AI Career Development Mentor

Projects

Name Role Achievements Summary Year
Personalized Feature Engineering for High-Risk Intensive Care Unit (ICU) Patients
Team Lead
Maintained high performance with compact ML models.
Feature selection techniques (ReliefF, FCBF, and Mutual Information) were employed to identify important features. Given the complexity of ICU patient data, these methods effectively selected the most relevant features for survival prediction modeling. Notably, our machine learning models (SVM, XGBoost, and Random Forest) demonstrated stable performance, exceeding 80% accuracy when using only the top 50% of features for each patient type.
2025
Automating Literature Reviews: Leveraging Retrieval-Augmented Generation (RAG) for Improved Accuracy and Efficiency
Supervisor
Successfully combined vector-based exclusion, LLM-powered inclusion, and RAG-driven retrieval approaches to turn weeks of manual work into hours.
Three prompting styles (OpenAI, Gemini, and Ollama) were used to efficiently analyze 808 scientific documents and extract relevant information on combining music and reminiscence therapy interventions to improve elderly well-being. Despite the growing recognition of these therapies as effective interventions, there is a need for more efficient and systematic analysis of the existing literature. By automating this process, we aimed to accelerate the development of evidence-based interventions and enhance the well-being of elderly populations.
2025
AI-Powered Energy Optimizations through HVAC systems
Team Lead
Won Innovation Prize at the Applied Research Event at Langara College
Won First Prize at Present Around Vancouver (PAV), hosted by Chartered Engineers Pacific & The Institution of Engineering and Technology
Secured $39K funding from Langara College
HVAC optimization is challenging due to variations in building insulation, zone configurations, and unpredictable occupant behavior, which lead to diverse heating and cooling demands. Effective management requires a deep understanding of hardware and software components, including cooling towers, chillers, and air distribution systems. Traditional optimization methods often fall short, but emerging AI technology offers new solutions for energy optimization and carbon reduction in air conditioning systems. Integrating AI with Direct Digital Control Building Management Systems can significantly optimize HVAC performance at Langara College.
2025
Occupancy Detection through the use of CO2 sensor
Supervisor
Successfully classified occupancy into 4 classes
The analyses are being incorporated into the scheduling operation.
This project analyzed CO₂ sensor data to identify the occupancy status of different rooms in order to improve classroom scheduling for building management.
2025
Environmental Impacts on the Alfalfa Leafcutter Bee Performance
Analyst, Supervisor
Incorporating weather data provides insight into leafcutter bees' performance, enabling the TNT Pollination bee manager to improve current practices
To better understand possible environmental factors influencing bee performance, both private and public datasets were leveraged. Through statistical tests, key variables were identified.
2025
Food Recommendation App
Facilitator
The work aims to simplify food choices, save money, and reduce food waste
FlavorGraph was trained on private data to map the relationships between food ingredients and chemical compounds.
2025
Lower Energy Bills using Green Hydrogen Microgrids
Facilitator
Successfully created an interactive dashboard for the company to integrate into their product development
The project aimed to develop a reliable backup energy solution using green hydrogen, a renewable source. By combining this clean energy with advanced predictive analytics and machine learning, we aimed to create an optimized system that minimizes costs and maximizes efficiency. This solution is particularly valuable when solar power is unavailable or wind generation stalls, ensuring a steady supply of dependable energy.
2025
Integrating AI-Powered Chatbots into Business Operations: A Prototype for Enhanced Customer Engagement
Facilitator
Successfully created a prototype for the company to integrate into their product development
To improve customer experience, a prototype chatbot was built to perform event-based recommendation tasks and is being integrated with the existing business operations. This chatbot used OpenAPI for processing natural language. Content-based filtering, Apriori, and clustering were also used.
2024
Improving Respiratory Disease Diagnosis through Multi-Modal Analysis and Deep Learning
Analyst, Supervisor
Trained machine learning and deep learning models demonstrated stable performance, exceeding 90% accuracy
To improve predictive power, an algorithm was developed to integrate different data modalities, such as text and speech. Transformers were used to extract audio features, alongside three different machine learning classifiers, to predict respiratory disease. XGBoost, LightGBM, CNN models, and voice data were used.
2024
Empowering Healthcare Professionals: A Cloud-Based System for Predicting ICU Patient Outcomes with High Accuracy
Team Lead
Web Application
Technical publication available at: https://doi.org/10.3389/fmed.2024.1398565
Innovation in healthcare technology is increasingly helping to manage patients with different diseases. One of the most challenging issues is accurately predicting the survival outcome for mechanically ventilated patients in intensive care units (ICUs). The real challenge is to diagnose patients with greater accuracy and in a timely manner, then prescribe appropriate treatments while keeping prescription errors to a minimum. This project constructed machine learning models to predict patients’ outcomes and deployed a cloud-based ICU prediction system powered by a set of machine learning models, including Bagging (BGG), eXtreme Gradient Boosting (XGB), and Decision Tree models, achieving over 90% accuracy. AWS EC2, PostgreSQL, and Airflow were used.
2024
A Collaborative Approach to Building a Machine Learning-Powered Waste Classification System
Team Lead
Web Application
An application was built to classify 4 different types of waste, namely organic, plastic, landfill, and refundable. Different teams worked across the pipeline, from collecting data and training computer vision models for the classification tasks to optimizing the models and deploying them on AWS cloud-based infrastructure. AWS EC2, PostgreSQL, Airflow, Streamlit, and Ultralytics YOLO were used.
2024
RFP Responses Automation: Leveraging Large Language Models for Efficient Q&A systems
Facilitator
Best performance of 65%.
T5, BART, BERT, and RoBERTa models were trained on private data. F1 and BLEU scores were used to evaluate the trained models. The focus was on Q&A systems and RFP responses.
2024
Comparative Analysis of Hyperparameter Tuning Methods for Improved Classification Accuracy and Resource Efficiency
Facilitator
-
Hyperparameter tuning methods, including Random Search, Bayesian Optimization, and Particle Swarm Optimization, were examined for their impact on the classification accuracy of machine learning models, namely XGBoost, Random Forest, Support Vector Machine, and K-Nearest Neighbors. This gives machine learning practitioners experience in optimizing ML models while using fewer resources.
2024
BERTopic Modelling on Well-being among Elderly in Canada
Team Lead
A near-real-time, interactive dashboard
This project utilized BERTopic (a transformer-based model) to perform topic modeling on a dataset of tweets without predefined labels. The data was extracted from Twitter, and the focus was on the well-being of the elderly in Canada. To develop a near-real-time dashboard, Apache Airflow was used to automate and schedule the entire process. Looker Studio was used to develop the dashboard.
2023
Skin Cancer Detection Application
Team Lead
2 peer-reviewed publications
A web-based prototype for detecting skin cancer was developed and deployed. The deep learning models were built using public datasets for the classification tasks.
2023
Reporting Dashboard to Improve Employees' Performance
Facilitator
A dashboard to optimize employees' KPIs
Conceptualized and implemented a custom extract, transform, and load (ETL) process to extract metadata related to scheduling and employee KPIs from various sources (Twitter, Jira, and Harvest) and ingest it into Airbyte and Snowflake, yielding 10 MB of stored data every 2 days.
2022
Clinician Management Dashboard
Coordinator, Facilitator
A dashboard deployment
Developed a Power BI dashboard that automates the process by pre-processing data from PS Suite, managing it in a local/online database, and visualizing it through a clean user interface. Unified data under a single database and performed analytics on a single datastore to gain valuable insights for the clinic.
2022
Concussion Test Dataset Analysis
Facilitator
A synthetic dataset was successfully generated
Generated insights into the concussion test dataset by creating a synthetic dataset based on patient demographics, detailed concussion tests, and return-to-play data.
2022

Partners

Culinary Compass HCH Infostrux SBC Smartworks Trillabit Prism