Data Engineering & ML
Shikhar Gupta
Transforming Data into Intelligence
Data Engineer specializing in building scalable data platforms, lakehouse architectures, and AI-powered analytics solutions. Currently building production-grade pipelines with Databricks and Delta Lake.
About Me
AI Engineer & Data Architect building intelligent solutions
I'm an AI-focused engineer passionate about transforming data into intelligent systems. My expertise bridges machine learning, LLM applications, and enterprise data platforms, enabling organizations to extract insights and automate decision-making at scale.
I specialize in building RAG architectures, AI chatbots, and vector-powered search systems while maintaining robust data infrastructure. Having developed models with 96% accuracy and deployed production AI systems, I understand the full lifecycle from experimentation to deployment.
Beyond technical implementation, I'm deeply engaged in prompt engineering, fine-tuning, and exploring cutting-edge AI techniques. Continuous learning and staying ahead of AI trends are central to my professional journey.
Years in Data & AI
AI/Data Projects
Model Accuracy
AI Certifications
Machine Learning
Building intelligent models using TensorFlow, PyTorch, and scikit-learn for predictive analytics
AI/LLM Applications
Building next-gen AI solutions with RAG systems, prompt engineering, and vector databases
Data Infrastructure
Architecting scalable pipelines in Databricks to power AI models at enterprise scale
AI Innovation
Exploring emerging AI trends, from transformers to generative AI applications
Professional Experience
Journey through my professional growth and impactful projects
Key Achievements
- ✓ Improved sales forecast accuracy by 17% using XGBoost, LightGBM, and Optuna on 50K POS records
- ✓ Conducted EDA and feature engineering on 25 features, cutting prediction error by 20%
- ✓ Boosted model interpretability through advanced feature analysis
Technologies Used
Featured Projects
Showcasing my most impactful work in data engineering and AI

DeltaGrid – Smart Meter Data Platform
Jan 2026 - Feb 2026
Production-grade lakehouse architecture for smart meter data processing
Impact
500K+ records processed daily

NutriFit RAG – AI-Powered Nutrition Analyzer
Nov 2025 - Dec 2025
LLM-powered nutrition analysis with vector search and source citation
Impact
95% accuracy in nutrient recommendations

YouTube Extractor
Jun 2024 - Jul 2024
LLM-powered video summarization with high semantic accuracy
Impact
98% semantic accuracy
Technical Skills
A comprehensive toolkit built through hands-on experience
Data Engineering & Big Data
Databricks
Apache Spark
PySpark
Delta Lake
Data Pipelines
ETL/ELT
Medallion Architecture
Data Quality Checks
Programming & Databases
Python
SQL
PostgreSQL
MySQL
Pandas
NumPy
Git/GitHub
R
Machine Learning & AI
TensorFlow
PyTorch
Scikit-learn
Deep Learning
NLP
Computer Vision
XGBoost
Transfer Learning
Data Analytics & Visualization
Power BI
Tableau
Matplotlib
Seaborn
Plotly
Excel
KPI Dashboards
Statistical Analysis
Tools & Technologies
Jupyter Notebook
Google Colab
VS Code
AWS
Streamlit
FastAPI
REST APIs
JSON
Business & Analytics
Sales Forecasting
Predictive Analytics
Business Intelligence
Data Storytelling
ROI Analysis
Risk Analysis
Financial Modeling
Trend Analysis
Certifications & Courses
Let's Connect
Reach out to me directly for collaborations or opportunities