Technical Portfolio

A selection of projects demonstrating my expertise in NLP, MLOps, and Predictive Analysis.

Selected Works

Elementary-HPO

An efficient hyperparameter optimization tool based on quasi-Monte Carlo sampling: it draws candidate configurations from low-discrepancy Sobol sequences, which cover the search space more evenly than purely random sampling.

Hyperparameter Optimization, Poetry, SciPy, GitHub Actions
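
A minimal sketch of the sampling idea, not the project's actual code: it uses SciPy's `scipy.stats.qmc.Sobol` sampler to draw candidate configurations. The search space, the parameter names, and the `sobol_candidates` helper are hypothetical and chosen only for illustration.

```python
from scipy.stats import qmc

# Hypothetical search space: (lower, upper) bounds per hyperparameter.
space = {
    "log10_learning_rate": (-5.0, -1.0),
    "dropout": (0.0, 0.5),
}


def sobol_candidates(space, m=3, seed=0):
    """Draw 2**m low-discrepancy configurations from the given bounds."""
    names = list(space)
    sampler = qmc.Sobol(d=len(names), scramble=True, seed=seed)
    # Sobol sequences keep their balance properties for power-of-two sizes.
    unit = sampler.random_base2(m=m)  # points in the unit hypercube
    lows = [space[name][0] for name in names]
    highs = [space[name][1] for name in names]
    scaled = qmc.scale(unit, lows, highs)  # map to the real bounds
    return [dict(zip(names, map(float, row))) for row in scaled]


candidates = sobol_candidates(space)
for config in candidates:
    print(config)
```

Each dictionary is one configuration to evaluate. Sampling the learning rate on a log scale is an assumption of this sketch, not something stated about the tool.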

Clinical Instruct API

Developed a specialized clinical reasoning model by fine-tuning GPT-2 (355M) with instruction-based datasets. The resulting model outperformed baselines, achieving a 68.33% success rate on complex conversational medical benchmarks.

GPT-2, Fine-tuning, NLP, Python (+1 more)

ReddiTagger

ReddiTagger parses the chaos of social media. By deploying custom Named Entity Recognition (NER) pipelines, this tool extracts key entities from Reddit threads to visualize what topics are truly driving the conversation.

Flask, Dash, Docker, NLP (+1 more)

CIBMTR-Equity Survival Predictions

Developed a robust survival analysis pipeline using LightGBM and lifelines to predict stem cell transplant outcomes. The solution handled censored data effectively, ranking in the top 26% of a global Kaggle competition.

Python, LightGBM, Survival Analysis, Ensemble Methods
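
To illustrate what handling censored data means, here is a minimal, dependency-free sketch of the Kaplan-Meier estimator. It is not the competition pipeline (which used LightGBM and lifelines); the `kaplan_meier` function and the toy data are assumptions made for this example.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each observed event time.

    durations: follow-up time for each subject.
    observed:  True if the event occurred, False if the subject was
               censored (left the study before the event was seen).
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Censored subjects still count as "at risk" until they drop out.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Toy data: two of the five subjects are censored.
durations = [5, 8, 8, 12, 15]
observed = [True, False, True, True, False]
curve = kaplan_meier(durations, observed)
print(curve)
```

Censored subjects never trigger a drop in the curve, but they enlarge the at-risk denominator up to the time they leave, which is why discarding them would bias the estimate.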

Jane Street Kaggle Competition

Designed a custom Transformer architecture based on GPT-2 to analyze high-frequency financial time-series data. This deep learning approach achieved a 4x performance improvement over traditional gradient boosting methods.

PyTorch, Transformer, Time-Series, Deep Learning

Spotify Song Clustering

Performed unsupervised learning on the Spotify Million Song Dataset using TF-IDF vectorization and K-Means clustering. Uncovered hidden lyrical patterns that accurately mapped to distinct musical sub-genres.

K-Means, TF-IDF, Clustering, Data Mining

Want to see the code?

Check out my repositories for more detailed implementations and scripts.