Data Science Portfolio
- Developed interactive, real-time dashboard of lifetime, weekly, and recent game data for Warzone players
- Retrieved statistics from Call of Duty API based on user gamertag and platform with data validation
- Publicly available to current, past, and future Call of Duty: Warzone players
- Deployed web app via Flask for continuous uptime and availability

- Identified 500+ plausible hackers in GTA Online player base using insights from domain knowledge
- Built 5 supervised ML models and selected AdaBoost as the best for scalability, with an AUC of 0.81
- Selected by ML professor to showcase project in machine learning lecture to 150+ students
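
For readers unfamiliar with the AUC figure quoted above, the following is a minimal sketch of how ROC AUC can be computed from labels and model scores. It is not the project's code: the function name `roc_auc` and the toy labels and scores are hypothetical, and the pairwise formulation is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random negative.

    Hypothetical helper for illustration; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (made up): 1 = flagged player, 0 = regular player
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of flagged players above regular ones.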

- Built a visual interpretability tool for a university grant prediction model developed in a Kaggle competition (Nutritional Label)
- Handled missing data and one-hot encoded categorical values
- Investigated attribute weights and disparities within Birth Country and Home Language features (Recipe & Diversity)
- Generated a statistical description across 12 different subpopulations (Ingredients)
- Determined the robustness of the prediction methodology on data (Stability)
- Measured the group fairness through disparate impact and statistical parity (Fairness)
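
The two group fairness measures named above can be illustrated with a short sketch. This is an assumption-laden example, not the tool's implementation: the function names, the group labels "A" and "B", and the toy predictions are all hypothetical.

```python
def selection_rate(predictions, groups, group):
    """Share of members of `group` that received a positive prediction (1)."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no rows for group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_difference(predictions, groups, protected, reference):
    """Selection rate of the protected group minus that of the reference group."""
    return (selection_rate(predictions, groups, protected)
            - selection_rate(predictions, groups, reference))


def disparate_impact(predictions, groups, protected, reference):
    """Ratio of the protected group's selection rate to the reference group's."""
    return (selection_rate(predictions, groups, protected)
            / selection_rate(predictions, groups, reference))


# Toy data (made up): 1 = grant predicted to succeed
toy_predictions = [1, 0, 1, 1, 0, 1, 0, 0]
toy_groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

spd = statistical_parity_difference(toy_predictions, toy_groups, "B", "A")
di = disparate_impact(toy_predictions, toy_groups, "B", "A")
```

A statistical parity difference near 0 and a disparate impact ratio near 1 indicate similar selection rates across the two groups.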

- Created models to help meteorologists estimate air pollution measurements (RMSLE ~ 0.233)
- Trained on over 7,000 entries of weather information and sensor measurements
- Performed time series decomposition to visualize seasonality, trend, and residual patterns in data
- Engineered lag features based on autocorrelation and partial autocorrelation patterns
- Tuned CatBoostRegressor hyperparameters to select the best-performing models
- Placed in top 27% of Kaggle Tabular Playground Series - Jul 2021 competition
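
As a rough illustration of two ideas from this project, the sketch below computes RMSLE and builds lag features from a series. It assumes plain Python lists rather than the project's actual data pipeline; `rmsle`, `add_lag_features`, and the sample readings are hypothetical names and values.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error; inputs must be greater than -1."""
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


def add_lag_features(series, lags):
    """Turn a series into rows holding the current value and its lagged values.

    Rows start at index max(lags) so that every lag is available.
    """
    max_lag = max(lags)
    rows = []
    for i in range(max_lag, len(series)):
        row = {f"lag_{k}": series[i - k] for k in lags}
        row["target"] = series[i]
        rows.append(row)
    return rows


# Toy sensor readings (made up)
readings = [10, 20, 30, 40]
lagged_rows = add_lag_features(readings, lags=[1, 2])
perfect_score = rmsle([1, 2, 3], [1, 2, 3])
```

In practice the lags would be chosen from the autocorrelation and partial autocorrelation plots rather than fixed at 1 and 2 as here.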

- Built a classifier to label documents by religious sentiment (accuracy ~ 91%)
- Preprocessed data using a TfidfVectorizer to transform text to feature vectors and filter out uncommon words
- Identified the words that contributed to misclassification and the documents in which they appeared
- Generated visual explanations of feature values using SHAP
- Optimized SGDClassifier using chi-square filter feature selection to improve accuracy
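
The preprocessing and modeling steps listed above can be sketched as a scikit-learn pipeline. This is a minimal sketch under stated assumptions, not the project's code: the toy corpus, labels, `min_df=2`, `k=4`, and `random_state=0` are all illustrative choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy corpus (made up): 1 = religious topic, 0 = other
docs = [
    "the church held a prayer service",
    "prayer and faith guide the church",
    "faith and prayer at the church service",
    "the rocket launch reached orbit",
    "orbit data from the rocket launch",
    "the rocket reached orbit after launch",
]
labels = [1, 1, 1, 0, 0, 0]

pipeline = make_pipeline(
    # min_df=2 drops words that appear in fewer than two documents
    TfidfVectorizer(min_df=2, stop_words="english"),
    # keep the k terms with the highest chi-square score against the labels
    SelectKBest(chi2, k=4),
    SGDClassifier(random_state=0),
)
pipeline.fit(docs, labels)
predictions = pipeline.predict(docs)

# Number of terms that survived the chi-square filter
n_selected = int(pipeline.named_steps["selectkbest"].get_support().sum())
```

Keeping the vectorizer, filter, and classifier in one pipeline means the chi-square selection is refit on each training fold during cross-validation, which avoids leaking test data into feature selection.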
