Live Analytics

Data Science Portfolio

Analytics & Intelligence



Portfolio at a Glance

Key metrics that define my data science journey

Best AUC Score: 0.902 (Direct Mail Response Model)
ML Projects: Forecasting, Classification, NLP, CV
Industries Covered: Healthcare, Finance, Retail, Cyber, IoT
Degrees Earned: M.S., MBA, B.Tech
Python Proficiency: Pandas, NumPy, Scikit-learn
Certifications: Azure DP-100, Python, ML

Skills & Growth Visualized

Interactive charts powered by Chart.js

Technical Skills

Projects by Domain

Technology Radar

Career Growth Timeline

Business Intelligence Dashboards

Real dashboards built with Power BI and Tableau

Power BI Dashboard

Interactive dashboard embed coming soon. In the meantime, explore my published dashboards below.

ML in Action: Smartphone Resale Pricing

Client-side linear regression with pre-computed coefficients

Input Features

Prediction Result

Estimated Resale Price: $---
Model Confidence: ---%

Factor Breakdown

Factor | Impact
Click "Predict Price" to see results

Confusion Matrix Explorer

Adjust the classification threshold and watch metrics change in real-time

Accuracy: --
Precision: --
Recall: --
F1 Score: --

Confusion Matrix

            Pred 1    Pred 0
Actual 1    TP: 0     FN: 0
Actual 0    FP: 0     TN: 0

ROC Curve AUC: --
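
How the four metrics move with the threshold can be sketched in a few lines of Python. This is a minimal illustration, not the page's own implementation: the function names and the sample scores and labels are made up for the example, and a prediction counts as positive when its score is at or above the threshold.

```python
def confusion_at_threshold(scores, labels, threshold):
    """Count TP, FP, FN, TN when predicting 1 for score >= threshold."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall and F1 from the four counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical model scores and true labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.5, 0.35):
    counts = confusion_at_threshold(scores, labels, threshold)
    print(threshold, counts, classification_metrics(*counts))
```

Lowering the threshold from 0.5 to 0.35 turns the one false negative into a true positive, so recall rises while the false-positive count stays the same.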

K-Means Clustering Playground

Click on the canvas to place data points, then run K-Means to see clustering in action

Click canvas to add points
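
What "running K-Means" on the placed points involves can be sketched as follows. This assumes the standard Lloyd's algorithm with random initial centroids drawn from the points themselves; the page may initialize or stop differently, and the function name and sample points are illustrative.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm on 2-D points; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random points
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                            + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster: converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignments


# Two well-separated groups of hypothetical canvas points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
print(centroids, assignments)
```

The loop alternates between assigning points and moving centroids until the assignments stop changing, which is the animation a step-by-step playground would typically show.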

Decision Tree Builder

Watch a machine learning algorithm learn simple rules to predict whether to play tennis

How does a Decision Tree work? It asks a series of yes/no questions about the data (like "Is it sunny?") to make a prediction. The algorithm picks the question that best separates the outcomes at each step. Think of it like a game of 20 Questions, but the computer finds the most useful questions automatically.
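
One common way to score "the question that best separates the outcomes" is information gain, the drop in entropy after a split. The sketch below assumes the 14 rows are the classic play-tennis dataset (the page does not list them) and that the builder uses entropy rather than, say, Gini impurity; both are assumptions.

```python
import math
from collections import Counter

COLUMNS = ("outlook", "temp", "humidity", "wind", "play")
# Assumed data: the classic 14-day play-tennis table.
ROWS = [dict(zip(COLUMNS, r)) for r in [
    ("sunny", "hot", "high", "weak", "no"),
    ("sunny", "hot", "high", "strong", "no"),
    ("overcast", "hot", "high", "weak", "yes"),
    ("rain", "mild", "high", "weak", "yes"),
    ("rain", "cool", "normal", "weak", "yes"),
    ("rain", "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"),
    ("sunny", "mild", "high", "weak", "no"),
    ("sunny", "cool", "normal", "weak", "yes"),
    ("rain", "mild", "normal", "weak", "yes"),
    ("sunny", "mild", "normal", "strong", "yes"),
    ("overcast", "mild", "high", "strong", "yes"),
    ("overcast", "hot", "normal", "weak", "yes"),
    ("rain", "mild", "high", "strong", "no"),
]]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, feature, target="play"):
    """Entropy of the target minus the weighted entropy after splitting."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


gains = {f: information_gain(ROWS, f) for f in COLUMNS[:-1]}
best = max(gains, key=gains.get)
print(best, gains)
```

On this data the first question chosen is about the outlook, because the "overcast" branch is already pure (every overcast day is a "yes").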

Training Data: Should We Play Tennis?

Outlook | Temp | Humidity | Wind | Play?
14 days of weather data with outcomes

Decision Rules Learned

Click "Build Tree Step-by-Step" to watch the algorithm learn rules from the data

Bias & Fairness Detector

Analyze ML model fairness across demographic groups in a loan approval scenario

Disparate Impact Ratio: -- (should be between 0.8 and 1.25)
Statistical Parity Diff: -- (should be within +/-0.1)
Equal Opportunity Diff: -- (should be within +/-0.1)
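
The three fairness metrics can be computed from per-applicant records as sketched below. The definitions used here are the usual ones (unprivileged group relative to privileged group); the group names, the record layout, and the sample data are assumptions for illustration, not the page's dataset.

```python
def _rate(values):
    """Mean of a list of 0/1 values, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def fairness_metrics(records, privileged, unprivileged):
    """records: iterable of (group, actual_outcome, predicted_approval)."""
    def approval_rate(group):
        return _rate([pred for g, _, pred in records if g == group])

    def true_positive_rate(group):
        # Among truly creditworthy applicants, the share that was approved.
        return _rate([pred for g, actual, pred in records
                      if g == group and actual == 1])

    rate_p, rate_u = approval_rate(privileged), approval_rate(unprivileged)
    return {
        "disparate_impact": rate_u / rate_p if rate_p else float("inf"),
        "statistical_parity_diff": rate_u - rate_p,
        "equal_opportunity_diff": (true_positive_rate(unprivileged)
                                   - true_positive_rate(privileged)),
    }


# Hypothetical loan decisions: (group, actually creditworthy, approved).
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 1), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 1, 0), ("B", 0, 0), ("B", 0, 0),
]
result = fairness_metrics(records, privileged="A", unprivileged="B")
print(result)
```

In this made-up sample, group B is approved at one third the rate of group A, so all three metrics fall outside the ranges quoted above.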

Approval Rates by Group

True Positive Rates by Group

Regression Playground

Click to add data points, then see how different regression models fit the data

Equation: y = ? R²: --
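
For the simplest of those models, a straight line, the equation and R² can be computed in closed form. The sketch below covers only ordinary least squares with one predictor; the other model types the playground offers are not shown, and the function name and sample points are illustrative.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept, plus R^2."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx  # requires at least two distinct x values
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    r_squared = 1 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


# Hypothetical clicked points.
slope, intercept, r2 = fit_line([(1, 2), (2, 3), (3, 5)])
print(f"y = {slope:.3f}x + {intercept:.3f}, R² = {r2:.3f}")
```

R² compares the line's squared error with the error of always predicting the mean of y, so a value near 1 means the line explains most of the variation in the points.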

Neural Network Playground

Build a network, pick a dataset, and watch it learn decision boundaries in real-time

What is a Neural Network? A neural network is layers of artificial "neurons" that learn to recognize patterns by adjusting the connections between them. Here, the network learns to separate orange dots from blue dots by drawing a decision boundary. Try "Spiral" for a challenge that requires more layers to solve. Watch how adding neurons and layers helps it learn more complex shapes.
Epoch: 0 Loss: -- Ready

Network Architecture

Live Sentiment Analyzer

Type any text and watch NLP break it down word by word

What is Sentiment Analysis? Sentiment analysis is how machines estimate the emotional tone of text. Each word carries a positive, negative, or neutral weight, and the model also handles negation ("not good" flips the score). In real projects, I use this with BERT and other transformer models for customer feedback analysis, social media monitoring, and brand tracking.
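
The word-by-word scoring with negation described above can be sketched with a tiny lexicon. The word weights, the negation list, and the rule that a negation flips the next sentiment-bearing word are all assumptions made for this example; the page's actual lexicon and rules are not shown.

```python
# Hypothetical lexicon: positive weights are good, negative are bad.
LEXICON = {"good": 1.0, "great": 2.0, "love": 2.0,
           "bad": -1.0, "terrible": -2.0, "hate": -2.0}
NEGATIONS = {"not", "no", "never"}


def score_text(text):
    """Return (total score, per-word breakdown) for a piece of text."""
    tokens = [t.strip(".,!?;:\"'").lower() for t in text.split()]
    breakdown = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True  # flip the next sentiment-bearing word
            breakdown.append((token, 0.0))
            continue
        weight = LEXICON.get(token, 0.0)  # unknown words are neutral
        if negate and weight != 0.0:
            weight = -weight
            negate = False
        breakdown.append((token, weight))
    total = sum(weight for _, weight in breakdown)
    return total, breakdown


print(score_text("The service was not good"))
print(score_text("I love this phone, great battery!"))
```

A lexicon approach like this is easy to inspect but ignores context, which is the gap transformer models are meant to close.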

Gradient Descent Visualizer

Watch how gradient descent navigates a loss surface to find the minimum

What is Gradient Descent? Gradient descent is the optimization method behind neural-network training and many other machine learning models. Think of it like rolling a ball downhill to find the lowest point. The "loss surface" is a landscape where lower means better model performance. The learning rate controls step size: too big and it overshoots the minimum, too small and it takes forever to get there. Try different values to see the tradeoff.
Steps: 0 Loss: -- Position: (--, --)
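
The learning-rate tradeoff is easy to reproduce on a simple bowl-shaped surface. The sketch below assumes the loss f(x, y) = x² + y², chosen because its gradient is trivial; the surface used by the visualizer is not specified, and the starting point is arbitrary.

```python
def loss(x, y):
    """Assumed loss surface: a simple bowl with its minimum at (0, 0)."""
    return x ** 2 + y ** 2


def gradient(x, y):
    """Partial derivatives of the loss with respect to x and y."""
    return 2 * x, 2 * y


def gradient_descent(start, learning_rate, steps):
    """Repeatedly step against the gradient; returns every visited position."""
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        grad_x, grad_y = gradient(x, y)
        x -= learning_rate * grad_x
        y -= learning_rate * grad_y
        path.append((x, y))
    return path


for lr in (0.001, 0.1, 1.1):
    final_x, final_y = gradient_descent((3.0, 4.0), lr, steps=50)[-1]
    print(f"lr={lr}: final loss = {loss(final_x, final_y):.3g}")
```

On this surface each step multiplies the position by (1 - 2 × learning rate), so 0.1 shrinks it quickly, 0.001 barely moves it, and 1.1 makes it grow with every step.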

A/B Test Calculator

Enter your experiment data and determine statistical significance

Control (A)

VS

Variant (B)

Rate A: --
Rate B: --
Relative Lift: --
p-value: --
Z-statistic: --
95% CI: --
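
These outputs can all be derived from four inputs: conversions and visitors for each arm. The sketch below assumes a two-sided two-proportion z-test with a pooled standard error and a normal-approximation confidence interval for the difference in rates; the calculator may use a different test, and the sample counts are invented.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test with a 95% CI for rate_b - rate_a."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se_pooled if se_pooled else 0.0
    p_value = 2 * (1 - normal_cdf(abs(z)))
    # The interval uses the unpooled standard error of the difference.
    se_diff = math.sqrt(rate_a * (1 - rate_a) / n_a
                        + rate_b * (1 - rate_b) / n_b)
    margin = 1.96 * se_diff
    return {
        "rate_a": rate_a,
        "rate_b": rate_b,
        "relative_lift": (rate_b - rate_a) / rate_a if rate_a else float("inf"),
        "z": z,
        "p_value": p_value,
        "ci_95": (rate_b - rate_a - margin, rate_b - rate_a + margin),
    }


# Hypothetical experiment: 100/1000 conversions vs 130/1000.
result = ab_test(100, 1000, 130, 1000)
print(result)
```

With these made-up counts the p-value lands below 0.05 and the interval excludes zero, so the variant would be called significant at the 5% level.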

Monte Carlo Simulator

Simulate investment portfolio returns using random walks

Mean Final Value: --
Median Final Value: --
5th Percentile: --
95th Percentile: --
Prob. of Loss: --
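
A minimal version of such a simulation is sketched below. It assumes yearly returns drawn independently from a normal distribution and compounded multiplicatively, with the portfolio value floored at zero; the parameter names and default values are illustrative and may differ from what the simulator uses.

```python
import random
import statistics


def simulate_portfolio(initial, mean_return, volatility, years,
                       n_sims=2000, seed=42):
    """Simulate final portfolio values under random yearly returns."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    finals = []
    for _ in range(n_sims):
        value = initial
        for _ in range(years):
            # Assumption: each year's return is an independent normal draw.
            value = max(0.0, value * (1 + rng.gauss(mean_return, volatility)))
        finals.append(value)
    finals.sort()

    def percentile(p):
        # Nearest-rank style lookup in the sorted outcomes.
        return finals[min(n_sims - 1, int(p / 100 * n_sims))]

    return {
        "mean": statistics.mean(finals),
        "median": statistics.median(finals),
        "p5": percentile(5),
        "p95": percentile(95),
        "prob_loss": sum(v < initial for v in finals) / n_sims,
    }


summary = simulate_portfolio(10_000, mean_return=0.07,
                             volatility=0.15, years=10)
print(summary)
```

The spread between the 5th and 95th percentiles, rather than the mean alone, is what conveys the risk of the portfolio.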

Bayesian Visualizer

See how prior beliefs update with evidence using Beta-Binomial conjugacy

Prior: Beta(alpha, beta)

Observed Evidence

Prior: Beta(2, 5)
Posterior: Beta(10, 17)
Posterior Mean: 0.370
Prior Likelihood Posterior
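
With a Beta prior and binomial data, the update is just addition. The sketch below reproduces the numbers shown above; the evidence of 8 successes and 12 failures is inferred from the difference between the displayed prior and posterior, and the function names are illustrative.

```python
def update_beta(alpha, beta, successes, failures):
    """Beta-Binomial conjugate update: add the observed counts to the prior."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


prior = (2, 5)
# Inferred evidence: 8 successes and 12 failures in 20 trials.
posterior = update_beta(*prior, successes=8, failures=12)
print(posterior)                      # (10, 17)
print(f"{beta_mean(*posterior):.3f}") # 0.370
```

The posterior mean of 0.370 sits between the prior mean (2/7, about 0.286) and the observed rate (8/20 = 0.4), pulled toward the data because 20 observations outweigh the prior's 7 pseudo-counts.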

Time Series Decomposition

Decompose airline passenger data into trend, seasonal, and residual components

Original + Forecast

Trend

Seasonal

Residual
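
A classical decomposition into these three components can be sketched as follows. It assumes an additive model (series = trend + seasonal + residual) with a centered moving average for the trend; airline passenger data is often decomposed multiplicatively instead, and the page's method is not stated. The series in the example is synthetic, not the airline data.

```python
def centered_moving_average(series, period):
    """Trend estimate; None where the window does not fit at the edges."""
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: half-weight the two end points (a 2 x period MA).
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period
    return trend


def decompose_additive(series, period):
    """Split a series into trend, seasonal and residual components."""
    trend = centered_moving_average(series, period)
    detrended = [y - tr if tr is not None else None
                 for y, tr in zip(series, trend)]
    # Average the detrended values at each position within the cycle.
    means = []
    for pos in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == pos]
        means.append(sum(vals) / len(vals))
    offset = sum(means) / period  # center the seasonal component on zero
    seasonal = [means[i % period] - offset for i in range(len(series))]
    residual = [y - tr - s if tr is not None else None
                for y, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a repeating 4-step pattern.
pattern = [2, -1, -3, 2]
series = [10 + 0.5 * t + pattern[t % 4] for t in range(24)]
trend, seasonal, residual = decompose_additive(series, period=4)
print(seasonal[:4])
```

Because the synthetic series has no noise, the residual is zero wherever the trend is defined; on real data the residual is what the trend and seasonal pattern fail to explain.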

Correlation Heatmap Explorer

Explore variable relationships visually: click any cell to see the scatter plot

Hover over cells to see correlation values. Click to view scatter plot.

Select a cell to view scatter plot
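
Each cell of such a heatmap is a correlation coefficient between two variables. The sketch below assumes Pearson correlation, the usual default; the page does not say which coefficient it uses, and the column names and values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)  # undefined for constant columns


def correlation_matrix(columns):
    """columns: dict of name -> values. Returns a nested dict of coefficients."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical variables, for illustration only.
data = {
    "age": [1, 2, 3, 4],
    "price": [8, 6, 4, 2],
    "usage": [2, 4, 6, 8],
}
matrix = correlation_matrix(data)
print(matrix["age"]["price"], matrix["age"]["usage"])
```

The matrix is symmetric with ones on the diagonal, which is why heatmaps of this kind often show only one triangle.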

Data Story: Direct Mail Response Optimization

Scroll through a real ML project from problem to result


The Problem

A software company was spending heavily on direct mail campaigns but seeing diminishing returns. Only 5.1% of recipients responded, so most of the budget went to uninterested prospects. The goal: build a model that predicts who will respond, so that mailings can be targeted to maximize ROI.

5.1% Baseline Response Rate

The Data

Collected 9,517 customer records with 41 features, including demographics, purchase history, web activity, and prior campaign responses. Cleaned missing values, handled outliers, and used stratified sampling to preserve the class proportions in each sample.

9,517 Customer Records Analyzed

Feature Engineering

Applied PCA to reduce 41 features to 12 principal components capturing 87% of variance. Created interaction terms, applied log transforms to skewed distributions, and used clustering to segment customers into 4 behavioral groups.

87% Variance Captured (12 PCs)

The Model

Compared logistic regression, random forest, and gradient boosting. Tuned hyperparameters with 5-fold cross-validation. Logistic regression with PCA features delivered the best balance of interpretability and performance. Used decile ranking for targeting.

5-Fold Cross-Validation Strategy

The Result

Achieved 0.902 AUC, meaning that the model scores a randomly chosen responder above a randomly chosen non-responder 90.2% of the time. The top 2 deciles captured 62% of all responders, allowing the company to mail only the top 20% of the list and cut mailing costs by 80% while retaining most conversions.
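
Both headline numbers have simple definitions that can be sketched directly. The code below computes AUC as the share of responder/non-responder pairs ranked correctly, and the capture rate as the share of responders found in the top fraction of the ranked list. The scores and labels are invented for illustration and are unrelated to the project's data.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # A correctly ordered pair counts 1, a tie counts one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def capture_rate(scores, labels, top_fraction):
    """Share of all positives found in the top-scoring fraction of records."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0],
                    reverse=True)
    cutoff = int(len(ranked) * top_fraction)
    captured = sum(label for _, label in ranked[:cutoff])
    return captured / sum(labels)


# Hypothetical model scores; 1 marks a responder.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(auc(scores, labels))                # 0.9375
print(capture_rate(scores, labels, 0.2))  # 0.5
```

The pairwise loop is quadratic in the number of records, which is fine for a sketch; rank-based formulas give the same value more efficiently on large datasets.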

0.902 AUC Score Achieved

Query My Portfolio

Try SQL queries against my portfolio database

portfolio_db -- SQL Terminal
Welcome to Pratyusha's Portfolio SQL Terminal
Type HELP for available commands, or try the queries below.
---
portfolio_db>

Data Pipeline Simulator

Watch data flow through a real-world ETL pipeline with live processing metrics

Why did I build this? This is a simulation of the ETL (Extract, Transform, Load) pipelines I've built in real projects. In production, I use tools like Apache Airflow, Azure Data Factory, and Python to move data through these exact stages. This demo shows the flow visually: raw data gets ingested, cleaned (bad records dropped), transformed (feature engineering), validated (schema checks), and loaded into a warehouse. Watch the log for realistic events like schema errors and dropped records.
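
The five stages can be sketched as a chain of Python generators, each consuming the previous stage's output. The record fields ("id", "amount"), the cleaning and validation rules, and the in-memory list standing in for a warehouse are all hypothetical; this is an illustration of the flow, not the simulator's code.

```python
def ingest(raw_rows):
    """Ingest: read raw records (here, copy dicts from a list)."""
    for row in raw_rows:
        yield dict(row)


def clean(rows, stats):
    """Clean: drop records with a missing amount."""
    for row in rows:
        if row.get("amount") in (None, ""):
            stats["dropped"] += 1
            continue
        yield row


def transform(rows):
    """Transform: cast types and derive a feature."""
    for row in rows:
        row["amount"] = float(row["amount"])  # assumes numeric strings
        row["is_large"] = row["amount"] >= 100
        yield row


def validate(rows, stats):
    """Validate: a simple schema check on the id field."""
    for row in rows:
        if not isinstance(row.get("id"), int):
            stats["errors"] += 1
            continue
        yield row


def load(rows, warehouse, stats):
    """Load: append validated records to the target store."""
    for row in rows:
        warehouse.append(row)
        stats["processed"] += 1


def run_pipeline(raw_rows):
    stats = {"processed": 0, "dropped": 0, "errors": 0}
    warehouse = []
    staged = validate(transform(clean(ingest(raw_rows), stats)), stats)
    load(staged, warehouse, stats)
    return warehouse, stats


raw = [
    {"id": 1, "amount": "120.5"},
    {"id": 2, "amount": None},     # dropped by the clean stage
    {"id": "x3", "amount": "40"},  # fails the schema check
    {"id": 4, "amount": "99"},
]
warehouse, stats = run_pipeline(raw)
print(stats)
```

Because the stages are generators, each record flows through the whole chain one at a time, which mirrors how the per-stage counters in a live pipeline advance together.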
Stage      Records  Status
Ingest     0        Waiting
Clean      0        Waiting
Transform  0        Waiting
Validate   0        Waiting
Load       0        Waiting

Records Processed: 0
Records Dropped: 0
Errors: 0
Throughput: 0/s
Runtime: 0s

Pipeline Log
