Kavana Manvi Krishnamurthy

The secret ingredient behind my great coding skills is love, and a lot of discipline! I bring a solid foundation in software engineering from my time at Oracle and a Master's in Computer Science. I'm passionate about Machine Learning, Data Analytics, building end-to-end Data Pipelines, and solving tough problems. I'm also an avid learner, curious about everything from algorithm design to how distributed systems work. On my blog I break down the math behind ML algorithms: a lot of Probability, Statistics, and Calculus!

I'd be a great fit for these roles

Data Analyst
Transform data into insights
SQL (3+ years) · R · Python (NumPy, Pandas) · Excel · Tableau
Data Engineer
Build data pipelines & Data Lakehouses
ETL Pipelines · SQL (3+ years) · Spark / PySpark · AWS S3 · GCP · Snowflake
AI / ML Data Scientist
Develop predictive & AI models
SQL · R · Python (scikit-learn, TensorFlow/Keras, PyTorch) · Predictive Modeling · Statistical Analysis · Generative AI · LLMs · Multi-Agents · RAG · Vector Databases

EDUCATION

Master of Science, Computer Science

DePaul University, Chicago  ·  GPA: 3.86

Foundational CS: Object Oriented Programming, Distributed Systems, Database Management, Algorithms & Data Structures

Foundational Data Science: Fundamentals of Data Science in R, Data Visualization, Data Regression & Analysis, Image Processing

Advanced Data Science: Programming ML Algorithms, Advanced Machine Learning, Computer Vision

Bachelor of Science, Computer Science

NIE, India

Courses: Mathematics, Statistics, Data Structures, Algorithms, Compiler Design, Cybersecurity, IoT, Microprocessor & Assembly, Java, J2EE, Database, Computer Architecture, OpenGL, Big Data, Data Mining, AI, Software Engineering, Unix

WORK EXPERIENCE

Software Engineer — Oracle

Software Intern — Oracle

SKILLS

I'm also an expert in data structures, algorithms, and system design.

MACHINE LEARNING BLOG

As a Machine Learning practitioner, I translate complex data into actionable, intelligent solutions. My expertise spans the entire ML lifecycle — from rigorous data preparation and statistical modeling to implementing cutting-edge Deep Learning and Generative AI architectures. I focus on building and deploying scalable models, including Supervised, Unsupervised, and Reinforcement Learning systems, consistently driving innovation by turning theoretical concepts into robust, real-world predictive and creative applications.

PROJECTS

Data Analysis & Visualization

Predicting Birth Weight
Built a regression model for a dataset with 36 explanatory variables and 108,082 observations. Applied forward and backward selection, interaction terms, second-order polynomials, and variable transformations.
R · Regression · Data Analysis · Hypothesis Testing
Police Killing Dashboard
Multi-perspective visualizations on police killings. Sankey Plot for age/race/unarmed incidents, Bar Graph for unarmed killings by race, Star Plot for geographic risk, and Interactive Choropleth Map for dynamic exploration.
R · Tableau · Data Visualization

Machine Learning

Credit Risk
Engineered features and consolidated 30K+ loan purposes using TF-IDF and the OpenAI API, enabling credit risk prediction with 94% accuracy. Trained and tuned Logistic Regression, Random Forest, and XGBoost models using GridSearchCV.
Python · scikit-learn · OpenAI API · NLP · TF-IDF
Autism Screening
Used Mutual Information and Recursive Feature Elimination for feature selection. Evaluated XGBoost, SVM-RBF, Logistic Regression, Random Forest, and MLP. Achieved 97.2% CV accuracy and 1.0 sensitivity with SVM-RBF.
Python · scikit-learn · XGBoost · SVM · MLP
Generative AI
Hands-on with RBMs, VAEs, GANs, and Transformers using TensorFlow 2. Built projects including image generation, deepfakes, music composition, and game agent training with GAIL.
TensorFlow 2 · GAN · VAE · Transformers · GAIL
Obesity Level Classification
Collected, cleaned, and pre-processed data. Applied PCA and K-Means clustering. Achieved 92.95% accuracy using Decision Trees with pruning and KNN tuned for optimal k values.
R · PCA · K-Means · Decision Trees · KNN
Text Classification
Built a sentiment analysis pipeline using Bag of Words and TF-IDF with a Logistic Regression classifier. Trained on 2,000 samples and achieved 84.37% accuracy.
Python · NLP · TF-IDF · Logistic Regression
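
To illustrate the weighting step, here is a minimal sketch of TF-IDF computed on top of bag-of-words counts. It is not the project code: the function name `tfidf`, the whitespace tokenizer, and the three sample sentences are assumptions made for illustration, and it uses the plain unsmoothed formula tf × log(N / df), whereas libraries such as scikit-learn apply smoothing and normalization by default.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document using tf * idf."""
    # Naive whitespace tokenizer (an assumption; real pipelines clean text first).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # bag-of-words counts for this document
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample texts, not the project's data.
docs = ["great movie great cast", "terrible movie", "great fun"]
weights = tfidf(docs)
```

Terms that appear in fewer documents get a larger weight, so in the second sample "terrible" outweighs "movie". The resulting vectors are what a classifier such as Logistic Regression would be trained on.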
Heart Failure Prediction
Implemented SVM, Decision Trees, and ensemble methods. Developed a weighted voting model using Random Forest, Bagging, and Boosting, reaching 88% accuracy.
Python · scikit-learn · SVM · Ensemble Learning · Random Forest · Boosting
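
The weighted voting idea can be sketched in a few lines. This is an illustration under assumptions, not the project code: it assumes soft voting over each model's predicted probability of the positive class, and the function name `weighted_vote`, the probabilities, and the weights are all hypothetical.

```python
def weighted_vote(probabilities, weights):
    """Combine per-model positive-class probabilities into one prediction.

    probabilities: P(class = 1) from each model, in a fixed model order.
    weights: one non-negative weight per model (hypothetical values here).
    Returns (predicted_label, weighted_average_probability).
    """
    if len(probabilities) != len(weights):
        raise ValueError("need one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the model probabilities.
    score = sum(p * w for p, w in zip(probabilities, weights)) / total
    return (1 if score >= 0.5 else 0), score


# Illustrative outputs for three models, e.g. Random Forest, Bagging, Boosting.
model_probs = [0.80, 0.40, 0.55]
model_weights = [2.0, 1.0, 1.0]
label, score = weighted_vote(model_probs, model_weights)
```

Giving a stronger model a larger weight lets it outvote weaker ones when they disagree, which is the main reason to prefer weighted over simple majority voting.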

Software Development

Distributed File Retrieval Engine
Built a distributed client-server file retrieval system with Java and ZeroMQ. Implemented a multithreaded dispatcher/worker server with TF-based ranked search on Chameleon Cloud.
Java · ZeroMQ · Multithreading · Chameleon Cloud · Distributed System
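
As a rough illustration of TF-based ranking, the sketch below builds an inverted index of term counts and scores each file by the summed frequency of the query terms. It assumes "TF" means term frequency, and it is written in Python for brevity even though the project itself used Java; the function names, file names, and texts are hypothetical, and the networking and multithreading parts are left out.

```python
from collections import Counter, defaultdict


def build_index(documents):
    """Map each term to {doc_name: term frequency in that document}."""
    index = defaultdict(dict)
    for name, text in documents.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][name] = count
    return index


def search(index, query, top_k=3):
    """Rank documents by the summed frequency of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        for name, count in index.get(term, {}).items():
            scores[name] += count
    # Highest score first; ties broken by document name for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_k]


# Hypothetical files, not the project's dataset.
documents = {
    "a.txt": "cloud storage and cloud compute",
    "b.txt": "local storage",
    "c.txt": "compute cluster",
}
index = build_index(documents)
results = search(index, "cloud storage")
```

In a dispatcher/worker setup, each worker could hold an index like this for its share of the files, with the dispatcher merging the per-worker rankings.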

CERTIFICATES

Machine Learning

The course covers supervised learning (linear regression, logistic regression, neural networks), unsupervised learning (clustering, dimensionality reduction), and key concepts such as model evaluation, the bias-variance tradeoff, and regularization. It emphasizes the intuition behind the algorithms and uses Octave/MATLAB for hands-on practice.


Developing Large Language Models (LLMs)

Completed a Skill Path in PyTorch and Deep Learning, covering tensors, neural networks, autograd, and optimization. Gained hands-on experience with GANs, transformers, and attention mechanisms, leading up to training and fine-tuning LLMs (e.g., BERT) for real-world NLP tasks.


System Design

Covered abstractions, consistency models, and distributed computing, along with the building blocks of system architecture: CDN, DNS, Load Balancers, Key-Value Store, Databases, NoSQL, Cache, BLOB, Monitoring, Distributed Search, Message Queues, Logging, Sequencer, Pub/Sub, and Shared Counter. Worked through 10+ full system designs, including YouTube, Instagram, Quora, Google Maps, TikTok, Uber, and Uber Eats.
