Projects

Here you’ll find information about the projects I’ve completed across statistics, data science, machine learning, sports analytics, and related coursework. Together, they reflect both my technical growth and my interest in using data to solve meaningful problems.

Bias in Healthcare Algorithms

Independent research project
Date: November 2024
Tags: R Data Analysis Visualization Healthcare Public Health Ethics Data Simulation Statistical Analysis Predictive Modeling Policy RStudio

This project looks at how healthcare algorithms can end up treating people unfairly, even when they are supposed to help improve care. I focused on a model that used past healthcare spending to predict patient needs, which is a problem because spending does not always reflect how sick someone actually is. Using R, I built a simulation to recreate the bias in the model and then used fairness measures like statistical parity, equalized odds, and disparate impact to compare how the model treated different groups. In my simulation, Black patients were more likely to have their needs underestimated by the model, which could lead to them getting less care than they need. Overall, the project was about showing why accuracy is not the only thing that matters when algorithms are used in healthcare.
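As a rough illustration of the three fairness measures named above, here is a minimal Python sketch. The project itself was done in R; the toy data, the group names, and every function name below are hypothetical and are not the project's code or results. Statistical parity compares how often each group is flagged for extra care, disparate impact is the ratio of those rates, and equalized odds compares true and false positive rates across groups.

```python
from typing import Dict, Sequence


def selection_rate(flags: Sequence[int]) -> float:
    """Share of patients the model flags for extra care (flag == 1)."""
    return sum(flags) / len(flags)


def rate_among(needs: Sequence[int], flags: Sequence[int], need_value: int) -> float:
    """Share flagged among patients whose true need equals need_value."""
    subset = [f for n, f in zip(needs, flags) if n == need_value]
    return sum(subset) / len(subset)


def fairness_report(needs_a, flags_a, needs_b, flags_b) -> Dict[str, float]:
    """Compare group A against reference group B on three common measures."""
    rate_a, rate_b = selection_rate(flags_a), selection_rate(flags_b)
    return {
        # Statistical parity: difference in how often each group is flagged.
        "statistical_parity_difference": rate_a - rate_b,
        # Disparate impact: ratio of the two selection rates.
        "disparate_impact_ratio": rate_a / rate_b,
        # Equalized odds: gaps in true and false positive rates.
        "tpr_gap": rate_among(needs_a, flags_a, 1) - rate_among(needs_b, flags_b, 1),
        "fpr_gap": rate_among(needs_a, flags_a, 0) - rate_among(needs_b, flags_b, 0),
    }


# Made-up toy data: both groups have the same true need, but the
# hypothetical model flags group A far less often than group B.
needs_a = [1, 1, 1, 1, 0, 0, 0, 0]
flags_a = [1, 1, 0, 0, 0, 0, 0, 0]
needs_b = [1, 1, 1, 1, 0, 0, 0, 0]
flags_b = [1, 1, 1, 1, 1, 0, 0, 0]

report = fairness_report(needs_a, flags_a, needs_b, flags_b)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the two groups are equally sick, so any gap in the report comes from the model rather than from real differences in need. A disparate impact ratio well below 1 and a negative true positive rate gap both point to group A being under-flagged.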

⬇ Download PDF

Forecasting NBA Regular-Season Outcomes: Spread, Total, and Offensive Rebounds

Course: Sports Analytics
Date: February 2026
Tags: Python Jupyter Notebook GitHub Sports Analytics Predictive Modeling Machine Learning Regression Feature Engineering Data Cleaning Time Series Model Evaluation

This was a group sports analytics project where we built models to predict three NBA game outcomes: point spread, total points scored, and total offensive rebounds. We started by building a shared dataset that combined home and away team information into one row per game, then created pregame features based on things like team strength, recent form, pace, rest, shooting, and rebounding. We tested a mix of models, including linear models, ridge and lasso regression, random forest, XGBoost, neural networks, and Poisson regression, to see which ones gave the most accurate predictions. A big part of the project was avoiding data leakage: making sure the models only used information that would have been available before tipoff, so the predictions would be realistic. Overall, the project was about using data to better understand NBA games and build models that could make solid, practical game predictions.
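The "only information available before tipoff" rule can be sketched in a few lines of Python. This is a simplified illustration, not the project's pipeline: it assumes one row per team per game (rather than the one-row-per-game layout described above), and the field names `date`, `team`, `points`, and `pts_form` are hypothetical. The key idea is that a team's recent-form feature is computed from earlier games only, and the current game's result is added to the history after the feature is recorded.

```python
from collections import defaultdict, deque
from typing import Dict, List


def add_pregame_form(games: List[Dict], window: int = 3) -> List[Dict]:
    """Attach each team's average points over its previous `window` games.

    The current game's points are never included in its own feature,
    so the feature only uses information known before tipoff.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    rows = []
    # ISO-formatted dates sort correctly as plain strings.
    for game in sorted(games, key=lambda g: g["date"]):
        past = history[game["team"]]
        form = sum(past) / len(past) if past else None  # None: no prior games
        rows.append({**game, "pts_form": form})
        past.append(game["points"])  # update history only after the feature is set
    return rows


# Made-up example rows for a single team.
games = [
    {"date": "2025-10-22", "team": "BOS", "points": 100},
    {"date": "2025-10-24", "team": "BOS", "points": 110},
    {"date": "2025-10-26", "team": "BOS", "points": 120},
    {"date": "2025-10-28", "team": "BOS", "points": 130},
]

featured = add_pregame_form(games)
for row in featured:
    print(row["date"], row["points"], row["pts_form"])
```

The first game has no history, so its feature is left empty rather than filled with a value that would require knowing the future. The same pattern extends to pace, rest days, or rebounding rates.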

⬇ Download PDF

Heartbeat Classification with Sequence Models

Course: Deep Learning Methods for Biomedical Applications with PyTorch
Date: February 2026
Tags: Python Deep Learning Sequence Modeling Time Series Neural Networks RNN LSTM GRU Healthcare

This project focused on classifying heartbeat signals from the MIT-BIH arrhythmia dataset using sequence models. We worked with ECG data where each heartbeat was represented as a sequence of values, and the goal was to predict the type of heartbeat as accurately as possible. To do this, we trained and compared RNNs, LSTMs, and GRUs in PyTorch, and evaluated them using accuracy, F1 score, AUC, and confusion matrices. The project showed how sequence models can be applied to real medical data, and it gave us practice training deep learning models, tuning them, and interpreting how well they performed.
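To make two of the evaluation measures above concrete, here is a small Python sketch of a confusion matrix and an F1 score computed from it. It is an illustration only: the labels and predictions are made up, the function names are hypothetical, and the choice of macro averaging (an unweighted mean of per-class F1) is an assumption, since the summary does not say which F1 variant the project reported.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def macro_f1(matrix: List[List[int]]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(matrix[k]) - tp                       # truly k, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Made-up labels for three heartbeat classes (0, 1, 2).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]

matrix = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = sum(matrix[k][k] for k in range(3)) / len(y_true)
f1 = macro_f1(matrix)

for row in matrix:
    print(row)
print(f"accuracy: {accuracy:.3f}, macro F1: {f1:.3f}")
```

Reporting a per-class view alongside accuracy is useful when some heartbeat types are much rarer than others, because a model can score high accuracy while still missing most examples of a rare class.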

⬇ Download HTML