Featured Projects

A collection of professional and personal projects spanning data science, machine learning, and software development.

Real-time Monitoring and Software Development

Electric Power Outage Tracker

Real-time Outage Monitoring

Developed an automated system to track and analyze power outages in a major US county.

  • Implemented web scraping in Python and deployed it using GitHub Actions
  • Created interactive visualizations using Plotly and Folium
  • Developed a dashboard to visualize outage patterns over time
Web Scraping, Python, GitHub Actions, Plotly, Folium

Osdag: Open Steel Design And Graphics

Software Development

Contributed to the development of Osdag, a cross-platform GUI application for designing steel structures in accordance with Indian design standards.

  • Designed and implemented features such as connection design modules, design preferences, installers, and automated structural design drawings
  • Implemented modules using PyQt5, PythonOCC, and sqlite3
  • Established best practices for version control, unit testing, coding standards, and documentation
  • Created cross-platform installers for Linux and Windows using conda, bash, and NSIS
  • Supported outreach activities by creating tutorials and conducting workshops
Python, PyQt5, PythonOCC, SQLite, NSIS

Data Science Projects

Air Quality Prediction in Nairobi

Environmental Monitoring & Forecasting

Developed predictive models for air quality in Nairobi, Kenya, using time series analysis and machine learning.

  • Managed air quality data (particulate matter, temperature, and humidity) using MongoDB
  • Implemented ARIMA and SARIMA models for time series forecasting
  • Created an interactive map to visualize air quality data and forecasts using Folium and Flask
Time Series Analysis, ARIMA, SARIMA, MongoDB, Scikit-learn, Statsmodels, Flask, Folium, Python
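
SARIMA extends ARIMA with seasonal terms, one of which is differencing at the seasonal period. The following is a minimal sketch of that differencing step only, written with the standard library rather than the modeling library used in the project. The `difference` helper, the period of 4, and the readings are hypothetical and chosen purely for illustration.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical readings: a period-4 cycle on top of a steady upward trend.
readings = [10, 20, 30, 20, 12, 22, 32, 22, 14, 24, 34, 24]

# Seasonal differencing (lag equal to the period) removes the repeating cycle.
seasonal = difference(readings, lag=4)

# Ordinary first differencing then removes the remaining constant level.
stationary = difference(seasonal, lag=1)
```

In a fitted SARIMA model these two steps correspond to the seasonal and non-seasonal differencing orders. The autoregressive and moving-average terms are then estimated on the differenced series.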

Consumer Segmentation in the US

Market Analysis & Consumer Behavior

Built a consumer segmentation model to identify distinct consumer groups in the US market, enabling targeted marketing strategies and personalized customer experiences.

  • Analyzed data from the 2019 Survey of Consumer Finances to identify distinct consumer groups
  • Performed exploratory data analysis on demographic and financial data
  • Used Principal Component Analysis (PCA) and high-variance features to select key predictor variables
  • Implemented K-means clustering for consumer segmentation
PCA, KMeans Clustering, Pandas, Scikit-learn, Seaborn, Python
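
The clustering step can be illustrated with a small, standard-library-only sketch of Lloyd's algorithm, the iterative procedure behind K-means. It is not the project's code: the `kmeans` function, the first-k centroid seeding, and the toy points are assumptions made for illustration, and the data does not come from the Survey of Consumer Finances.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; seeds centroids with the first k points (illustrative choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Hypothetical two-feature toy data forming two well-separated groups.
toy_points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
labels, centroids = kmeans(toy_points, k=2)
```

On real survey data the features would normally be scaled first, since K-means relies on Euclidean distance and is sensitive to differences in feature magnitude.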

Nepal Earthquake Building Damage Level Classification

Post-Disaster Damage Classification

Designed ML models to predict the level of building damage from structural data collected after the 2015 Gorkha earthquake in Nepal.

  • Preprocessed structural data on aspects of building location and construction using sqlite3 and pandas
  • Implemented and compared algorithms such as logistic regression, decision trees, and random forests for building damage classification
  • Examined feature importance and produced actionable insights for disaster response teams
Logistic Regression, Decision Tree Classifier, Random Forest, SQLite, Scikit-learn, Pandas, Seaborn, Python

Bankruptcy Prediction for Polish Companies

Financial Risk Assessment

Created ML models to predict the bankruptcy risk of Polish companies based on financial indicators.

  • Preprocessed financial data from the UC Irvine Machine Learning Repository using pandas
  • Addressed class imbalance in bankruptcy data using random oversampling
  • Implemented and compared models such as random forest and gradient boosting for bankruptcy prediction
  • Evaluated models using precision-recall metrics and ROC curves
Random Forest, Gradient Boosting, Imbalanced Data, Pandas, Scikit-learn, Seaborn, Python
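
Random oversampling can be sketched in a few lines of standard-library Python: minority-class rows are duplicated at random until every class is as large as the biggest one. The `random_oversample` helper, the fixed seed, and the toy rows are hypothetical and stand in for whatever tooling and data the project actually used.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = rng.choices(pool, k=target - n)  # sampling with replacement
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels

# Hypothetical toy data: 1 = bankrupt (rare), 0 = not bankrupt.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_res, y_res = random_oversample(X, y)
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation set and inflate the metrics.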

Personal Interest Projects

Codenames AI Bot

Natural Language Processing & Game AI

Developed a bot to play the Codenames board game using NLP and word embeddings, capable of generating and interpreting strategic clues.

  • Leveraged Word2Vec embeddings and cosine similarity to generate and interpret clues
  • Designed an interactive game interface with Python and Flask
NLP, Word Embeddings, Word2Vec, NLTK, Flask, Python
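
The idea of picking a clue by cosine similarity can be sketched as follows. This is an illustrative example only: the `best_clue` scoring rule (closest to the weakest target, minus closeness to the nearest word to avoid) is an assumption, and the three-dimensional vectors are hypothetical stand-ins for real Word2Vec embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def best_clue(candidates, targets, avoid, vectors):
    """Pick the candidate closest to the team's words and farthest from words to avoid."""
    def score(word):
        # A clue is only as good as its weakest link to a target word.
        near = min(cosine(vectors[word], vectors[t]) for t in targets)
        # Penalize similarity to the most confusable word to avoid.
        far = max(cosine(vectors[word], vectors[a]) for a in avoid)
        return near - far
    return max(candidates, key=score)

# Hypothetical 3-d vectors; real embeddings have hundreds of dimensions.
toy_vectors = {
    "ocean": [0.9, 0.1, 0.0],
    "river": [0.8, 0.3, 0.1],
    "bank": [0.1, 0.9, 0.2],
    "water": [0.85, 0.2, 0.05],
    "money": [0.0, 0.95, 0.1],
}
clue = best_clue(
    ["water", "money"],
    targets=["ocean", "river"],
    avoid=["bank"],
    vectors=toy_vectors,
)
```

With these toy vectors, "water" scores higher than "money" because it sits close to both target words and far from "bank". Interpreting a clue is the reverse lookup: rank the board words by cosine similarity to the clue.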