ABOUT ME
Hi! I am Mohit Gupta, a passionate AI engineer from Delhi, India. I am in my final year of a B.Tech in Artificial Intelligence and Data Science at the University School of Automation & Robotics, GGSIPU. I am smart, playful, and deeply enthusiastic about building impactful AI systems. I enjoy working on multimodal AI, Retrieval-Augmented Generation (RAG) systems, and scalable deployments, and I am constantly exploring advances in generative AI and large language models. Outside of tech, I am an athletic person who enjoys playing basketball and listening to music. I am always eager to hustle, explore, and contribute to meaningful AI-driven projects.
PROFILE LINKS
GitHub: https://github.com/MohitGupta0123
LinkedIn: https://www.linkedin.com/in/mohitgupta012
Kaggle: https://www.kaggle.com/mohitgupta12
LeetCode: https://leetcode.com/u/MohitGupta012/
SKILLS
I possess expertise across a broad range of technologies and tools:
Programming Languages: Python, C++
Developer Tools and Fundamentals: HTML/CSS, Tailwind, Streamlit, FastAPI, GitHub, SQL, MongoDB, OOP, DBMS
Machine Learning and AI: Machine Learning and Deep Learning algorithms, AutoML, Data Analysis, Power BI, Tableau, PyTorch, TensorFlow, Natural Language Processing (NLP), RAG, LangChain, LangGraph, ChromaDB, CrewAI, AutoGen, OpenCV, Time Series Forecasting, LLM Fine-tuning, Generative AI, Diffusion Models
MLOps and Cloud: MLflow, Docker, Kubernetes, Prometheus, Grafana, AWS S3
EXPERIENCE
Machine Learning Research Intern – Indian Institute of Technology Delhi (Feb 2025 – Present)
Led development of an AI-powered MRI acceleration system using diffusion models and FFT, reducing scan times from 30 minutes to a few minutes.
Engineered and optimized deep learning models in Python and PyTorch to reconstruct high-fidelity MRI images from highly undersampled data (4x acceleration).
Machine Learning Intern – ImagoAI (Jan 2025 – Feb 2025)
Boosted high-value toxin prediction accuracy by 38% using regression-aware augmentation (SMOTE, MixUp) on imbalanced hyperspectral data.
Improved model R² by 0.25 by developing ViT and branched CNN architectures for hyperspectral data, outperforming baseline Dense/CNN models.
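The regression-aware MixUp augmentation mentioned above can be sketched in a few lines of numpy. This is a minimal illustration, not the exact code used at ImagoAI: the Beta-distribution mixing follows the original MixUp recipe, and the continuous target is blended with the same coefficient as the features so labels stay consistent with the inputs.

```python
import numpy as np

def mixup_regression(X, y, n_new=100, alpha=0.4, seed=0):
    """Generate synthetic (X, y) pairs by convexly blending random sample pairs.

    For regression, the target is interpolated with the same lambda as the
    features, so augmented labels remain consistent with the blended inputs.
    """
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X), n_new)
    j = rng.integers(0, len(X), n_new)
    lam = rng.beta(alpha, alpha, size=(n_new, 1))  # Beta(alpha, alpha), as in the MixUp paper
    X_new = lam * X[i] + (1 - lam) * X[j]
    y_new = lam[:, 0] * y[i] + (1 - lam[:, 0]) * y[j]
    return X_new, y_new
```

Because every synthetic target is a convex combination of two real targets, the augmented labels never leave the observed range, which is what makes the scheme safe for imbalanced regression.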
EDUCATION
B.Tech in Artificial Intelligence & Data Science – University School of Automation & Robotics, GGSIPU (Sep 2021 – Apr 2025)
CGPA: 9.032
ACHIEVEMENTS AND EXTRACURRICULARS
1. Led a team in the Amazon ML Challenge 2024 to a top 0.17% finish by implementing a transformer-based model that scored an F1 of 0.57341.
2. Ranked among the top 17 teams out of 1000+ in the GSTN Analytics Hackathon organized by GSTN, winning a cash prize of ₹25,000.
3. Active on GitHub, Kaggle, and LeetCode, continuously contributing to open-source projects and competitive coding challenges.
PROJECTS
1. FRAUD DETECTION MLOPS PIPELINE
The Fraud Detection MLOps Pipeline is an end-to-end system designed to identify potentially fraudulent financial transactions with high accuracy and scalability. It integrates machine learning with robust MLOps practices for seamless experimentation, deployment, and real-time monitoring of fraud detection models, while maintaining modularity and reproducibility across the entire ML lifecycle.
PROJECT OVERVIEW
This pipeline is built around a custom FraudPipeline class that handles feature engineering, preprocessing, class-imbalance management via SMOTE, and threshold tuning to optimize fraud detection metrics. Experiments are tracked with MLflow, enabling parameter logging, artifact storage, and comparative analysis across runs. Deployment uses FastAPI for REST APIs and Streamlit for an interactive prediction dashboard, both containerized with Docker and orchestrated with Kubernetes (Minikube) for scalability. Prometheus and Grafana dashboards continuously monitor system health and API performance, ensuring reliability in production. The system is designed to deliver high recall on fraudulent transactions while minimizing false positives, a critical balance in financial systems.
TECH STACK
The project uses Python (3.12+) as the core language, with Scikit-learn for model building, Imbalanced-learn for SMOTE, and Pandas/NumPy for data manipulation. Deployment and MLOps tooling includes MLflow, FastAPI, Streamlit, Docker, Kubernetes, Prometheus, and Grafana. This combination supports an efficient loop of experimentation, deployment, and monitoring.
ARCHITECTURE
The architecture follows a modular design: data ingestion and preprocessing feed into feature engineering (interaction terms, ratio features, and time-of-day bins), followed by imputation, encoding, scaling, and resampling. Models such as logistic regression, RandomForest, and XGBoost are trained and tuned via precision-recall thresholds. The pipeline is fully containerized and orchestrated through Kubernetes, with real-time metrics exposed to Prometheus and visualized in Grafana dashboards.
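As a sketch of the feature-engineering stage described above, the interaction terms, ratio features, and time-of-day bins could look like the following. Column names such as `amount`, `balance`, and `hour` are illustrative, not the project's actual schema:

```python
import pandas as pd

def engineer_features(df):
    """Add interaction, ratio, and time-of-day features to a transactions frame.

    Expects columns: 'amount', 'balance', 'hour' (0-23). These names are
    placeholders for whatever the real pipeline's schema uses.
    """
    out = df.copy()
    out["amount_x_balance"] = out["amount"] * out["balance"]            # interaction term
    out["amount_to_balance"] = out["amount"] / (out["balance"] + 1e-9)  # ratio feature
    out["tod_bin"] = pd.cut(out["hour"], bins=[-1, 5, 11, 17, 23],      # time-of-day bins
                            labels=["night", "morning", "afternoon", "evening"])
    return out
```

Ratio features like amount-to-balance often separate fraud better than raw amounts, since a small account draining its balance looks very different from a large account making the same payment.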
FEATURES
Real-time fraud prediction via FastAPI REST APIs and a Streamlit web UI.
Experiment tracking with MLflow, logging hyperparameters, metrics, confusion matrices, and PR curves.
Scalable deployment using Dockerized microservices and Kubernetes orchestration.
Robust monitoring, with Prometheus scraping API metrics and Grafana visualizing system health and request patterns.
Automatic preprocessing and SMOTE resampling for highly imbalanced datasets, coupled with dynamic threshold optimization for the best precision-recall trade-off.
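The dynamic threshold optimization can be illustrated with a plain numpy sweep over candidate thresholds. This is a simplified stand-in for the pipeline's tuning step; the real system may optimize a different operating point, such as recall at a fixed precision:

```python
import numpy as np

def best_threshold(y_true, y_prob, grid=None):
    """Sweep decision thresholds and return the one maximizing F1.

    Fraud models output probabilities; the default 0.5 cutoff is rarely
    optimal on imbalanced data, so we pick the cutoff from the PR trade-off.
    """
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        pred = (y_prob >= t).astype(int)
        tp = np.sum((pred == 1) & (y_true == 1))
        fp = np.sum((pred == 1) & (y_true == 0))
        fn = np.sum((pred == 0) & (y_true == 1))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

In practice scikit-learn's `precision_recall_curve` gives the same sweep vectorized, but the explicit loop shows exactly what is being traded off at each cutoff.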
RESULTS AND METRICS
The pipeline demonstrated strong performance on hold-out datasets, achieving up to 99% recall and 98% precision on balanced subsets and effectively minimizing false negatives, which is crucial in fraud detection. Precision-recall curves and confusion matrices highlight the system's effectiveness, and these artifacts are stored for review within the MLflow experiment tracking framework.
FUTURE ENHANCEMENTS
Planned improvements include CI/CD pipelines using GitHub Actions or Jenkins, a model registry (MLflow Registry or Seldon Core), cloud-native deployment on AWS, GCP, or Azure, real-time streaming predictions with Kafka, and explainability with SHAP or LIME.
2. STUDY TRACKER DASHBOARD
The Study Tracker Dashboard is a personalized, visually rich web application built with Streamlit to help students track their study schedules, daily goals, and overall progress interactively. It supports exam preparation with real-time insights into session-wise performance, subject completion rates, and backlog tracking, while ensuring secure authentication and per-user data storage.
PROJECT OVERVIEW
This dashboard enables students to plan and monitor their studies with a dynamic timetable system powered by CSV data, JSON state management, and SQLite-based authentication. Users can log in, view subject-wise schedules, mark completed study sessions, and track their progress via interactive charts and grids. The system automatically refreshes before each session begins, helping students stay on track and manage their time effectively.
FEATURES
Secure login and registration using SQLite and streamlit_authenticator, storing hashed passwords and user-specific data.
Dynamic study plans auto-generated from real-time calendars with per-day session allocations.
Progress tracking with donut charts, bar graphs, and checkbox grids to visualize subject completion.
Backlog detection for missed sessions and dynamic rerun alerts to maintain consistency.
Time analytics, including usage distribution and comparison charts for weekly and overall summaries.
Multi-user support with JSON-based state management, allowing each user to resume progress seamlessly.
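The JSON-based per-user state management can be sketched with the standard library. The file layout (one `<user>.json` per user) and the field names are illustrative, not the app's exact format:

```python
import json
from pathlib import Path

def save_progress(user, progress, root):
    """Persist one user's session-completion state as <root>/<user>.json."""
    path = Path(root) / f"{user}.json"
    path.write_text(json.dumps(progress, indent=2))
    return path

def load_progress(user, root):
    """Load the user's saved state, or an empty dict for first-time users."""
    path = Path(root) / f"{user}.json"
    return json.loads(path.read_text()) if path.exists() else {}
```

Keeping one small JSON file per user is what lets each person resume exactly where they left off without a shared database table for progress.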
TECH STACK
Frontend: Streamlit for UI, Matplotlib for charts
Backend and Authentication: SQLite with streamlit_authenticator
Data Handling: Pandas for CSV and JSON operations
Storage: CSV for study plans, JSON for user progress persistence
RESULTS AND IMPACT
This tool simplifies exam preparation by consolidating study plans, completion rates, and time management into one dashboard. Its clean UI, auto-rerun reminders, and subject backlog insights help students maintain consistency, track milestones, and reduce cognitive load during preparation.
FUTURE ENHANCEMENTS
Planned upgrades include cloud storage for multi-device access, goal-setting and streak tracking, mobile responsiveness, and AI-driven study recommendations based on progress history.
3. GRAPHRAG: EBAY USER AGREEMENT CHATBOT
The GraphRAG: eBay User Agreement Chatbot is a knowledge-graph-powered conversational AI system designed to answer legal and policy questions from the eBay User Agreement. Built with Neo4j, Meta LLaMA 3B (instruction-tuned), and FAISS-backed memory, it combines natural language understanding, graph-based reasoning, and contextual memory to provide accurate, grounded responses to user queries. The chatbot is deployed via Streamlit with real-time LLM response streaming through Hugging Face endpoints.
PROJECT OVERVIEW
This chatbot addresses the challenge of understanding lengthy user agreements by converting them into structured knowledge graphs. User queries pass through Named Entity Recognition (NER) and Relation Extraction (RE) pipelines (using SpaCy and custom rules), are mapped to triples stored in Neo4j, and are enriched with conversation history stored in FAISS. The retrieved triples and memory context are dynamically injected into prompts sent to the Meta LLaMA 3B model, yielding concise, fact-grounded answers and reducing hallucination.
FEATURES
Knowledge-graph-based reasoning: extracts triples from legal documents and queries them in real time using Cypher.
Memory-augmented retrieval: past Q&A context stored in FAISS ensures conversational continuity.
Grounded legal Q&A: responses are fact-checked against structured data from the agreement.
Interactive UI: Streamlit app with chat history, save/load sessions, and Neo4j visualizations.
Streaming responses: real-time LLM output streamed to users for smooth interactions.
TECH STACK
Frontend: Streamlit
Backend and reasoning: Meta LLaMA-3B-Instruct via the Hugging Face API, FAISS for memory, Neo4j for graph storage
Triple extraction and NER: SpaCy and custom RE pipelines
Embeddings: SentenceTransformers for synonym and entity similarity expansion
ARCHITECTURE AND WORKFLOW
The user submits a query via the Streamlit interface.
Entities are extracted using SpaCy-based NER and RE pipelines.
Relevant triples are fetched from the Neo4j knowledge graph using Cypher queries.
The memory module retrieves related past Q&A from FAISS.
The combined context (triples and memory) is injected into a prompt and sent to LLaMA-3B.
The response is streamed to the user and optionally saved for later sessions.
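The retrieval-and-prompt-assembly steps above can be sketched in pure Python. Here an in-memory list of triples and a keyword match stand in for Neo4j/Cypher and the FAISS memory index, and the triples shown are invented examples, not actual clauses from the agreement:

```python
# In the real system the triples live in Neo4j (queried via Cypher) and the
# conversation memory in a FAISS index; a list and substring matching stand
# in for both here.
TRIPLES = [
    ("buyer", "must_pay", "fees"),
    ("seller", "may_cancel", "order"),
]

def retrieve_triples(query, triples=TRIPLES):
    """Return triples whose subject or object appears in the query text."""
    q = query.lower()
    return [t for t in triples if t[0] in q or t[2] in q]

def build_prompt(query, memory):
    """Inject retrieved facts and recent Q&A history into the LLM prompt."""
    facts = "\n".join(f"{s} {p} {o}" for s, p, o in retrieve_triples(query))
    history = "\n".join(memory[-3:])  # last few exchanges, as a memory stand-in
    return f"Facts:\n{facts}\n\nHistory:\n{history}\n\nQuestion: {query}"
```

The key idea is that the model only ever sees facts that were retrieved from the graph, which is what keeps the answers grounded in the source document.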
RESULTS AND IMPACT
The chatbot transforms dense legal documents into queryable knowledge structures and provides precise answers with contextual memory, reducing information overload for users. It demonstrates how GraphRAG techniques can be applied to legal, compliance, and policy-oriented conversational systems.
FUTURE ENHANCEMENTS
Planned improvements include multi-document support, automated graph-building pipelines, integration with LangChain memory chains, and deployment to cloud environments such as Hugging Face Spaces or AWS.
4. FASHION SENSE AI (FASHION VISUAL SEARCH AND PERSONALIZED STYLING ASSISTANT)
The Fashion Sense AI project is a modern AI-powered web application that lets users search for visually similar fashion products, receive personalized outfit recommendations, and simulate style suggestions based on browsing history and trending fashion insights. Built with Streamlit, CLIP embeddings, FAISS indexing, and the Gemma-3B LLM via Hugging Face, it serves as a foundation for AI-driven e-commerce integrations and personal styling tools.
PROJECT OVERVIEW
Unlike traditional search engines that rely on manual tags or filters, Fashion Sense AI offers multimodal search. Users can upload images to retrieve visually similar products, enter descriptive text queries such as "oversized hoodie" or "floral print dress," and receive history-based recommendations or AI-generated outfit completions powered by large language models. The pipeline combines FAISS vector retrieval with trend analysis and recommendation logic for a seamless, personalized shopping experience.
FEATURES
Visual search: upload an image or enter a textual style description to find similar products instantly.
LLM-powered outfit generation: Gemma-3B suggests outfit completions including shoes, accessories, or layering items.
Browsing history simulation: generates simulated user histories to mimic personalized suggestions.
Trend analysis: extracts trending fashion keywords from inventory and scraped web data.
Efficient vector search: FAISS-based nearest-neighbor search over 1152-dimensional embeddings from CLIP and SentenceTransformers.
Modular pipeline: each module (data loader, search, outfit suggestions, trend inference) is reusable and extensible for future e-commerce integrations.
TECH STACK
Frontend: Streamlit
Image embeddings: CLIP (ViT-L/14)
Text embeddings: SentenceTransformers
LLM suggestion engine: Gemma-3B via the Hugging Face Inference API
Nearest-neighbor search: FAISS
Trend analysis: TF-IDF and keyword extraction from product metadata
ARCHITECTURE AND WORKFLOW
The user uploads an image or enters a style query.
The system generates embeddings with CLIP or SentenceTransformers and runs a FAISS search across the indexed inventory.
Results are displayed visually with associated metadata such as price and style tags.
Browsing history, real or simulated, feeds into personalized suggestions.
The Gemma LLM generates complementary outfit recommendations based on style context and recent fashion trends.
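The embedding-search step in this workflow reduces to a cosine-similarity top-k lookup. The numpy sketch below shows the operation FAISS performs at scale (an `IndexFlatIP` over L2-normalized vectors yields the same ranking); the vectors here are toy stand-ins for CLIP/SentenceTransformer embeddings:

```python
import numpy as np

def nearest(query_vec, index_vecs, k=3):
    """Return indices of the k most cosine-similar vectors to the query.

    Normalizing both sides turns the dot product into cosine similarity,
    the same trick used when serving FAISS inner-product indexes.
    """
    def norm(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    sims = norm(index_vecs) @ norm(query_vec)  # cosine similarity per item
    return np.argsort(-sims)[:k]               # highest similarity first
```

FAISS exists precisely because this brute-force matrix product stops scaling at millions of 1152-dimensional vectors; the semantics, however, are identical.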
RESULTS AND IMPACT
Fashion Sense AI demonstrates how multimodal AI can transform fashion discovery, providing more relevant, visually accurate, and trend-driven recommendations than traditional keyword-based search. It serves as a blueprint for integrating visual search, personalization, and LLM-driven styling into e-commerce platforms.
FUTURE ENHANCEMENTS
Planned improvements include user authentication with persistent history, add-to-cart integration for online stores, voice-based outfit search, and fashion Q&A chatbots powered by generative AI.
5. MRI IMAGE RECONSTRUCTION USING K-SPACE
The MRI Image Reconstruction using K-Space project is an interactive Streamlit application that demonstrates step-by-step reconstruction of MRI images from frequency-domain (k-space) data using the 2D Fourier Transform. This educational tool visualizes how low- and high-frequency components contribute to the final reconstructed image, building intuition for MRI physics and the k-space representation.
PROJECT OVERVIEW
The application lets users upload their own DICOM (.dcm) MRI images, or fall back to a default sample, to observe live reconstruction. The app converts the input image to the frequency domain with the Fast Fourier Transform (FFT) and progressively reconstructs the spatial image as more frequency components are added. This progressive visualization highlights the roles of low-frequency (contrast) and high-frequency (detail) components in MRI imaging.
FEATURES
DICOM upload support: accepts .dcm MRI files or uses a default image if none is uploaded.
K-space visualization: displays the magnitude spectrum of the Fourier Transform.
Progressive reconstruction: step-by-step visualization of image formation from frequency components.
Interactive UI: real-time updates with Streamlit session_state for a smooth user experience.
Error handling: falls back to the default image if upload or processing fails.
TECH STACK
Frontend: Streamlit
Numerical processing: NumPy (FFT for k-space transformations)
Visualization: Matplotlib (optional, for additional plots)
DICOM handling: pydicom or similar libraries
ARCHITECTURE AND WORKFLOW
Load a .dcm MRI image from user upload or the default sample.
Convert the image to the frequency domain using the FFT to obtain k-space.
Progressively reconstruct the image by adding low- and high-frequency components step by step.
Display both the k-space magnitude and the live reconstructed image in the Streamlit interface.
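The core of this workflow, masking k-space and inverting the FFT, fits in a few lines of NumPy. This mirrors what the app does conceptually, though the app's exact masking schedule for the progressive animation may differ:

```python
import numpy as np

def reconstruct(image, keep_fraction):
    """Reconstruct an image from the central (low-frequency) part of k-space.

    keep_fraction=1.0 keeps all of k-space and recovers the original image;
    smaller fractions keep only the central low frequencies, preserving
    overall contrast while losing fine detail.
    """
    k = np.fft.fftshift(np.fft.fft2(image))        # k-space with DC centered
    h, w = k.shape
    mask = np.zeros_like(k)
    ch, cw = int(h * keep_fraction / 2), int(w * keep_fraction / 2)
    mask[h // 2 - ch: h // 2 + ch, w // 2 - cw: w // 2 + cw] = 1  # central window
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k * mask)))
```

Sweeping `keep_fraction` from small to 1.0 reproduces the app's progressive visualization: contrast appears first from the low frequencies, edges and texture arrive last.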
RESULTS AND IMPACT
This project provides a visual learning tool for students, researchers, and enthusiasts exploring MRI imaging concepts. By showing how k-space frequencies contribute to final image quality, it bridges the gap between theoretical physics and practical medical imaging.
FUTURE ENHANCEMENTS
Planned improvements include multi-mode reconstruction (e.g., radial or spiral filling), support for raw scanner data, and integration of image quality metrics such as SSIM and PSNR for comparative visualization.
6. CORN VOMITOXIN PREDICTION (DEEPCORN PROJECT)
The Corn Vomitoxin Prediction (DeepCorn Project) is a complete pipeline for predicting vomitoxin_ppb concentration in corn samples using both Machine Learning (ML) and Deep Learning (DL) techniques. The project leverages spectral reflectance data across 448 bands (indexed 0–447) to build robust predictive models and includes a Dockerized inference API for streamlined deployment.
PROJECT OVERVIEW
The repository contains two parallel pipelines, a traditional ML approach and a deep learning approach, with dedicated notebooks for exploratory data analysis, feature scaling, and model training. The deep learning models are built with PyTorch and achieve better performance than the traditional methods. A Dockerized version of the inference pipeline is provided for containerized deployment, making the project production-ready and portable.
FEATURES
Spectral data analysis: uses 448-band reflectance data for vomitoxin prediction.
Dual modeling approaches: a traditional ML pipeline (EDA and model building) and a DL pipeline (optimized PyTorch models).
Optimized deep models: multiple tuned deep learning models, such as DeepCorn_Best_Model_optimized.pth, evaluated on scaled datasets.
Dockerized inference API: a containerized Flask API for easy deployment and testing.
End-to-end pipeline: includes EDA, preprocessing, scaling, model training, and inference.
TECH STACK
Programming language: Python
Machine learning frameworks: Scikit-learn for ML; PyTorch for DL
Containerization: Docker (Flask-based API for inference)
Data handling: Pandas, NumPy
Visualization: Matplotlib, Seaborn for EDA
ARCHITECTURE AND WORKFLOW
Perform EDA and build baseline ML models using the EDA and ML Approach notebook.
Train deep learning models with scaled inputs using the Deep Learning notebook.
Select the best-performing models based on metrics and save their weights (.pth files).
Deploy the optimized deep learning model via the Dockerized Flask API for inference.
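The scale-then-predict step that the Dockerized API wraps can be sketched as follows. The linear head here is a placeholder for the trained PyTorch network, and the class and function names are illustrative, not the repository's actual API:

```python
import numpy as np

class SpectralScaler:
    """Standard-scale 448-band reflectance vectors using training statistics.

    Mirrors the preprocessing the inference API must apply before the model:
    the same mean/std learned on the training set is reused at serve time.
    """
    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0) + 1e-9  # guard against zero-variance bands
        return self

    def transform(self, X):
        return (X - self.mean_) / self.std_

def predict_vomitoxin(X_raw, scaler, weights, bias):
    """Scale incoming spectra, then apply the (placeholder) regression head."""
    return scaler.transform(X_raw) @ weights + bias
```

The point of the sketch is the serving contract: raw spectra in, training-time scaling applied identically, prediction out; only the regression head would be swapped for the loaded .pth model.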
RESULTS AND IMPACT
The deep learning approach achieved the best predictive performance on vomitoxin_ppb concentration, outperforming the baseline ML models. Scaling the input data significantly improved accuracy, enabling better generalization to unseen samples.
FUTURE ENHANCEMENTS
Planned improvements include enhancing model robustness, deploying cloud-native APIs on AWS or GCP, and incorporating real-time spectral data streaming for field-based predictions.
7. IMAGE PROCESSING PROJECT
The Image Processing Project is an interactive Streamlit application that lets users perform various image transformations and visual enhancements, including brightness and contrast adjustment, color space conversion, binary thresholding, and image annotation. Built with OpenCV and NumPy, the application is a hands-on tool for learning basic image processing techniques, with real-time adjustments and downloadable outputs.
PROJECT OVERVIEW
This project provides a clean, user-friendly interface for experimenting with fundamental image processing operations. Users can upload their own images (JPG, JPEG, PNG) and apply transformations such as RGB conversion, grayscale conversion, binary thresholding, and fine-tuned brightness and contrast controls. Annotations such as lines, rectangles, circles, and custom text can be added with configurable colors, sizes, and positions. The app also includes direct download functionality to save processed images locally.
FEATURES
Image upload and display: supports JPG, JPEG, and PNG formats with automatic dimension display (height and width).
Color transformations: convert images to RGB, grayscale, or binary threshold modes.
Brightness and contrast adjustment: real-time sliders for controlling brightness and contrast levels.
Annotations: add lines, rectangles, circles, or text with customizable coordinates, thickness, font styles, and colors.
Interactive UI: built with Streamlit widgets such as sliders, dropdowns, and color pickers for real-time updates.
Download processed images: download any modified image (RGB, grayscale, binary, annotated) in JPG format directly from the app.
TECH STACK
Frontend and app framework: Streamlit
Image processing: OpenCV (cv2)
Data handling: NumPy for array manipulation
UI enhancements: Streamlit components (sliders, color picker, text inputs) and custom CSS for styling
ARCHITECTURE AND WORKFLOW
Image input: the user uploads an image via the sidebar uploader.
Processing options: the user selects the desired processing mode (Original, RGB, Grayscale, Binary, Brightness, Contrast, Annotation).
Transformation: the corresponding OpenCV operations, such as color conversion, thresholding, or scaling, are applied.
Annotation handling: users provide coordinates, thickness, and colors to draw shapes or add text to the image.
Preview and download: processed images are displayed in real time and can be downloaded using the built-in download button.
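The brightness and contrast step is a linear transform applied per pixel. The NumPy version below makes the arithmetic explicit; it is essentially the alpha/beta mapping OpenCV's cv2.convertScaleAbs performs (that function additionally takes an absolute value before saturating to uint8):

```python
import numpy as np

def adjust(img, alpha=1.0, beta=0):
    """Contrast/brightness adjustment: out = clip(alpha * img + beta, 0, 255).

    alpha > 1 stretches contrast, 0 < alpha < 1 compresses it;
    beta shifts brightness up or down. Results are saturated to uint8.
    """
    return np.clip(alpha * img.astype(np.float32) + beta, 0, 255).astype(np.uint8)
```

Computing in float32 before clipping avoids the uint8 wrap-around that naive in-place arithmetic would cause on bright pixels.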
RESULTS AND IMPACT
This project provides an intuitive environment for exploring core image processing concepts interactively. It is especially useful for beginners learning OpenCV, educators demonstrating transformations in real time, and anyone needing a lightweight tool for quick adjustments and annotations.
FUTURE ENHANCEMENTS
Planned improvements include additional filters such as Gaussian blur and edge detection, layered annotations with undo and redo, batch image processing, and a mobile-friendly UI for broader accessibility.
ADDITIONAL INFORMATION
LIFE STORY / BACKGROUND
I grew up with a deep curiosity about technology and problem-solving, which naturally led me toward artificial intelligence. During my undergraduate studies in AI and Data Science at the University School of Automation & Robotics, GGSIPU, I discovered my passion for building end-to-end AI systems, from data pipelines to deployed real-world solutions. Over the past few years, I have worked on projects ranging from fraud detection and medical imaging to fashion recommendation systems, blending creativity with technical rigor. What excites me most is transforming complex problems into scalable AI-driven products that create real-world impact.
SUPERPOWER / STRENGTHS
My biggest strength is my ability to learn quickly and build complete AI solutions from scratch. I thrive when I need to integrate multiple technologies, whether combining deep learning models with MLOps pipelines or creating multimodal systems like my Fashion Sense AI project. I am also known for being hardworking and adaptable; I can switch between research-heavy work, like diffusion models for MRI, and production-focused tasks, like deploying scalable APIs with Kubernetes.
AREAS OF GROWTH
I am actively working on three key areas: leadership skills to effectively manage AI teams and mentor juniors, advanced research in reinforcement learning and agentic AI systems, and scaling distributed AI solutions for production environments. These areas align with my long-term vision of contributing to high-impact AI products at scale.
WORK STYLE AND COLLABORATION
I am a proactive and organized team member who values clear communication and collaboration. I enjoy brainstorming openly with teammates and believe in constructive feedback as a driver of innovation. When challenges or tight deadlines arise, I break problems into smaller actionable steps and stay calm under pressure, ensuring high-quality results without compromising timelines.
MISCONCEPTIONS / UNIQUE TRAITS
A common misconception about me is that I am very serious because I focus deeply on work, especially when solving complex AI problems. In reality, I am playful and enjoy creative brainstorming sessions, whether during hackathons, casual team discussions, or even while playing basketball with friends.
PUSHING BOUNDARIES
I consistently push my boundaries through hackathons, Kaggle competitions, and side projects. For instance, I developed a knowledge-graph-powered chatbot from scratch in just a few weeks, and I have challenged myself to build multimodal systems like Fashion Sense AI that combine vision, language, and retrieval models. These projects keep me at the cutting edge of AI while continuously expanding my technical and creative skill set.
HOBBIES AND PERSONALITY
Outside of work, I am passionate about basketball, traveling to new places, and enjoying music, especially during long coding sessions. These hobbies help me maintain balance and creativity, often inspiring fresh ideas when I return to technical problems.
VISION / FUTURE GOALS
In the next three to five years, I aim to lead AI product development teams that build impactful, scalable solutions in fields like healthcare, finance, and fashion technology. I want to bridge the gap between cutting-edge AI research and user-friendly products, ensuring that innovations are both ethical and accessible.
CORE VALUES
I am driven by innovation, continuous learning, and the desire to create meaningful impact through technology. Ethical AI development and scalability are central to my approach; I believe technology should solve real problems responsibly while inspiring trust and accessibility for everyone.