Stuti Dumre

Data Scientist

Build, train, deploy, and optimize machine learning models, from data preprocessing and feature engineering to production deployment and continuous monitoring.

Machine Learning

With expertise in supervised and unsupervised learning, deep learning (CNNs, RNNs, Transformers), and model optimization, I build end-to-end ML pipelines from data preprocessing to production deployment. Experienced with TensorFlow, PyTorch, scikit-learn, and cloud-based model training on AWS.

Data Engineering

Proficient in designing and implementing robust ETL pipelines, data transformation workflows, and automated data processing systems. Experience with SQL, Python (Pandas, NumPy), AWS (S3, Lambda), and CI/CD automation for scalable data infrastructure.

Visualization & Analysis

Experienced in exploratory data analysis, statistical analysis, hypothesis testing, and A/B testing, with stakeholder-facing visualizations built in Tableau, Power BI, Matplotlib, Seaborn, and Plotly.

About me

I am Stuti, a data scientist and machine learning engineer passionate about building intelligent systems that extract insights from data and solve real-world problems. From data collection and exploration through feature engineering, model training, and hyperparameter optimization to production deployment and monitoring, I can support each step of the process and help organizations make evidence-driven decisions.

With 4+ years of experience in machine learning, deep learning, computer vision, and data engineering, I've worked on projects ranging from OCR systems using CNNs to recommendation engines using collaborative filtering, NLP applications with LLMs, and scalable ML pipelines on AWS. I'm proficient in Python, TensorFlow, PyTorch, scikit-learn, and cloud platforms, with a strong foundation in statistical analysis and A/B testing.

I am available for data science consulting, ML engineering projects, and part-time collaborations.

Portfolio

A sample of my work

Weather Forecasting Application

Real-time dashboard using Streamlit and Gemini API with predictive analytics and time series modeling, deployed on AWS EC2.

ETL Pipeline & ML Deployment

End-to-end data pipeline with ML models (Random Forest, XGBoost) and TinyLLaMA NLP integration, deployed on AWS with CI/CD automation.

AWS ML Infrastructure

Automated CI/CD pipelines for ML model deployment using AWS services (EC2, S3, Lambda, SageMaker), reducing deployment time by 40%.

Kaggle Competition Projects

Multiple competition entries using ensemble methods, model stacking, and advanced feature engineering. Documented on Medium and GitHub.

Experience

Junior Data Scientist

August 2022 – October 2024

Soft Crunch | Kathmandu, Nepal

  • Designed recommendation systems using collaborative filtering, increasing user retention by 15%
  • Built and deployed ML models on AWS (EC2, S3, Lambda, SageMaker) with CI/CD pipelines
  • Performed EDA and feature engineering, creating Tableau visualizations for stakeholders
  • Conducted A/B testing and statistical analysis to evaluate model performance

Junior Data Scientist (R&D)

April 2020 – July 2022

Cloudlaya LLC | San Francisco, CA

  • Developed CNN-based OCR system using TensorFlow/Keras for handwriting recognition
  • Conducted data collection, preprocessing, and image augmentation on diverse datasets
  • Optimized training workflows on AWS EC2, achieving 18% error reduction
  • Applied advanced regularization techniques and statistical analysis for model evaluation

Education

Master of Science - Data Science

Wright State University | Dayton, OH

2025 – Present

Bachelor of Science - Computer Science and Information Technology

Tribhuvan University | Kathmandu, Nepal

2017 – 2021

Technical Skills

Machine Learning & AI

TensorFlow, PyTorch, Keras, scikit-learn, OpenAI, LangChain, XGBoost, LightGBM, CNNs, RNNs, Transformers, NLP, Computer Vision

Data Engineering

Python, SQL, Pandas, NumPy, ETL Pipelines, Data Warehousing, AWS (EC2, S3, Lambda, SageMaker), Docker, CI/CD

Analysis & Visualization

Tableau, Power BI, Matplotlib, Seaborn, Plotly, Statistical Analysis, A/B Testing, Hypothesis Testing, EDA

Contact