Commit 351e529 · ahan bose committed · 0 parents

Initial commit: Ahan Bose AI Twin
.gitignore ADDED
@@ -0,0 +1,3 @@
+ .env
+ __pycache__/
+ .streamlit/
app.py ADDED
@@ -0,0 +1,83 @@
+ import os
+ import streamlit as st
+ from dotenv import load_dotenv
+
+ # 1. NEW MODULAR IMPORTS (no 'langchain.chains' needed)
+ from langchain_community.document_loaders import TextLoader, DirectoryLoader
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
+ from langchain_community.vectorstores import FAISS
+ from langchain_huggingface import HuggingFaceEmbeddings, HuggingFaceEndpoint, ChatHuggingFace
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.runnables import RunnablePassthrough
+ from langchain_core.output_parsers import StrOutputParser
+ from sidebar import show_profile, generate_ai_summary
+
+ load_dotenv()
+
+ st.set_page_config(page_title="Ahan Bose - AI Twin", layout="wide")
+ st.title("🤖 Ahan Bose: AI Digital Twin")
+ # Render the sidebar defined in sidebar.py
+ show_profile()
+
+
+ hf_token = os.getenv("HUGGINGFACEHUB_API_TOKEN")
+
+ @st.cache_resource
+ def setup_vector_db():
+     # Load and split the knowledge-base documents
+     loader = DirectoryLoader('./knowledge_base/', glob="./*.txt", loader_cls=TextLoader)
+     docs = loader.load()
+     splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
+     splits = splitter.split_documents(docs)
+
+     # Embed the chunks and build the FAISS vector store
+     embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
+     vectorstore = FAISS.from_documents(splits, embeddings)
+     return vectorstore.as_retriever()
+
+ if not hf_token:
+     st.error("HUGGINGFACEHUB_API_TOKEN is missing. Add it to your .env file.")
+ else:
+     try:
+         retriever = setup_vector_db()
+
+         # 2. SET UP THE LLM
+         llm_endpoint = HuggingFaceEndpoint(
+             repo_id="mistralai/Mistral-7B-Instruct-v0.2",
+             task="conversational",
+             huggingfacehub_api_token=hf_token,
+             temperature=0.5
+         )
+         llm = ChatHuggingFace(llm=llm_endpoint)
+
+         # 3. DEFINE THE PROMPT TEMPLATE
+         template = """You are Ahan Bose's AI Twin. Answer based only on the context provided:
+ {context}
+
+ Question: {question}
+ """
+         prompt = ChatPromptTemplate.from_template(template)
+
+         # 4. THE LCEL PIPE CHAIN (the modern replacement for RetrievalQA)
+         # This builds the chain without needing the legacy 'langchain.chains' module
+         rag_chain = (
+             {"context": retriever, "question": RunnablePassthrough()}
+             | prompt
+             | llm
+             | StrOutputParser()
+         )
+         # AI SUMMARY
+         ai_summary = generate_ai_summary(llm)
+         st.write(ai_summary)
+
+         # 5. UI
+         query = st.text_input("Ask me something:")
+         if st.button("Submit") and query:
+             with st.spinner("Processing..."):
+                 # Simply call invoke on the pipe
+                 response = rag_chain.invoke(query)
+                 st.markdown("### Answer:")
+                 st.write(response)
+
+     except Exception as e:
+         st.error(f"Error: {e}")
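The LCEL chain in app.py composes the retriever, prompt, LLM, and output parser with the `|` operator. As a minimal sketch of what that operator does, here are plain-Python stand-ins (not LangChain's actual `Runnable` implementation; all stage functions below are toy assumptions) where each stage is a callable whose output feeds the next:

```python
# Toy model of LCEL-style composition: `a | b` yields a runnable that
# invokes `a`, then pipes its result into `b`.

class Runnable:
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # Compose: run self first, then feed its output into `other`.
        return Runnable(lambda value: other.invoke(self.invoke(value)))

# Toy stages standing in for the real retriever / prompt / llm / parser.
retrieve = Runnable(lambda q: {"context": "Ahan works at Deloitte.", "question": q})
prompt = Runnable(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
llm = Runnable(lambda p: f"ANSWER({p!r})")
parse = Runnable(lambda s: s.strip())

chain = retrieve | prompt | llm | parse
print(chain.invoke("Where does Ahan work?"))
```

This is why `rag_chain.invoke(query)` is all the UI needs to call: the dict-of-runnables, prompt, model, and parser are already wired into one pipeline.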
knowledge_base/achievements.txt ADDED
@@ -0,0 +1,4 @@
+ Achievement: CAT 2024 - 98.11 percentile.
+ Achievement: Deloitte - Outstanding Performance & Applause Awards.
+ Achievement: Academic - CNR Rao & MRD Merit Scholarships.
+ Achievement: Extra-Curricular - Best Delegate (VIT MUN) and 3rd in WB Swimming Championship.
knowledge_base/experience.txt ADDED
@@ -0,0 +1,9 @@
+ Experience: Deloitte USI (Consultant)
+ Duration: 35 Months (Jun 2022 - Jun 2025)
+ Key Impact: Promoted for technical leadership; ranked in top 1% of practitioners.
+
+ Experience: Continental Automotive (Intern)
+ Key Impact: Cut test time by 60% via automation of 15+ hardware test scripts.
+
+ Experience: PES MUN Society (Vice President)
+ Key Impact: Led National MUN with 200+ delegates and managed INR 1.14 Lakh budget.
knowledge_base/goals.txt ADDED
@@ -0,0 +1,3 @@
+ Career Goal: Lead enterprise-wide digital transformations.
+ Career Goal: Specialize in finance data engineering and cloud automation.
+ Career Goal: Improve enterprise forecast precision and operational agility through AI.
knowledge_base/projects.txt ADDED
@@ -0,0 +1,11 @@
+ Project: Full-stack Cloud Data Pipelines
+ Role: Consultant
+ Details: Built pipelines for real-time P&L reporting; slashed TAT by 80%.
+
+ Project: Enterprise Profitability Planning
+ Role: Consultant
+ Details: Standardized 50+ master data elements for enterprise-wide planning.
+
+ Project: Diplomat Wars
+ Role: Founder/Vice President
+ Details: Launched intra-collegiate contest for 100+ participants.
knowledge_base/skills.txt ADDED
@@ -0,0 +1,15 @@
+ Skill: Data Engineering & Cloud Pipelines
+ Level: Advanced
+ Used In: Deloitte USI - P&L reporting and financial data flows.
+
+ Skill: Automation (Logic-based)
+ Level: Expert
+ Used In: Automating 10K+ folders and cross-cloud financial data.
+
+ Skill: Financial Analytics
+ Level: Advanced
+ Used In: Inventory cost models, CAPEX visibility, and predictive analytics.
+
+ Skill: Technical Testing (CAN/Hardware)
+ Level: Intermediate
+ Used In: Continental Automotive - Airbag unit testing and script automation.
profile.txt ADDED
@@ -0,0 +1,37 @@
+ Name: Ahan Bose
+ Email: pgp25.ahan@spjimr.org
+
+ Education:
+ - PGDM: S.P. Jain Institute of Management & Research (SPJIMR), Mumbai (Class of 2027)
+ - Bachelor of Technology (B.Tech): PES University (CGPA: 8.61/10)
+ - Class XII: FIITJEE PU College, Karnataka PU Board (87.67%)
+ - Class X: The Frank Anthony Public School, ICSE (93.67%)
+
+ Skills:
+ - Finance & Data Engineering: Full-stack cloud data pipelines, real-time P&L reporting, Power BI, finance planning systems integration
+ - Automation & Analytics: Logic-based automation, predictive analytics, cross-cloud financial data flows, CAPEX visibility
+ - Technical & Testing: SQL, Python, CAN protocol testing, hardware test script automation, RCA diagnostics
+ - Leadership: Stakeholder alignment, mentoring (8+ new hires), budget management (INR 1.14 Lakh), content strategy
+
+ Interests:
+ - FinTech and Data Architecture
+ - Model United Nations (MUN) and Debating
+ - Competitive Swimming
+
+ Projects:
+ - Full-stack Cloud Data Pipelines: Enabled real-time P&L reporting and automated financial data flows at Deloitte USI
+ - Enterprise Profitability Planning: Standardized 50+ master data elements for segment-wide planning
+ - Hardware Test Automation: Automated 15+ test scripts, reducing testing time by 60%
+ - Diplomat Wars: Launched an intra-collegiate contest engaging 100+ participants
+
+ Career Goals:
+ To leverage expertise in finance data engineering and cloud automation to lead large-scale digital transformations and improve enterprise forecast precision and operational agility.
+
+ Achievements:
+ - Professional: Promoted to Consultant at Deloitte USI; Ranked in top 1% of 1000+ practitioners; Outstanding Performance Award
+ - Academic: 98.11 percentile in CAT 2024; MRD and CNR Rao Merit Scholarships
+ - Extra-Curricular: Best Delegate at VIT Model UN; 3rd place in 50m and 100m breaststroke at WB District Swimming
+ Certifications:
+ - CAT 2024 (98.11%ile)
+ - Deloitte USI Consultant Promotion
+ - MRD Merit Scholarship (ECE Department)
requirements.txt ADDED
@@ -0,0 +1,19 @@
+ # Web Interface (Streamlit is a common choice for RAG apps)
+ streamlit
+
+ # RAG Framework
+ langchain
+ langchain-community
+ langchain-huggingface
+
+ # Embeddings and Vector Database
+ sentence-transformers
+ faiss-cpu
+
+ # Document Processing
+ pypdf
+ pandas
+
+ # Environment and API Management
+ python-dotenv
+ huggingface_hub
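Given the files in this commit, a typical local setup would look like the following (a sketch, assuming a Unix-like shell; `hf_xxx` is a placeholder token, not a real credential):

```shell
# Create an isolated environment and install the dependencies listed above.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Provide the Hugging Face token that app.py reads via python-dotenv.
# Replace hf_xxx with your own token from huggingface.co/settings/tokens.
echo "HUGGINGFACEHUB_API_TOKEN=hf_xxx" > .env

# Launch the Streamlit app.
streamlit run app.py
```

The `.gitignore` entry for `.env` keeps the token out of version control.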
sidebar.py ADDED
@@ -0,0 +1,93 @@
+ import streamlit as st
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.output_parsers import StrOutputParser
+
+
+ def generate_ai_summary(llm):
+     """Reads profile.txt and generates a short (<=50-word) professional summary."""
+     try:
+         with open("./profile.txt", "r") as f:
+             profile_text = f.read()
+
+         # Simple summarization prompt (trailing spaces keep the fragments from running together)
+         prompt = ChatPromptTemplate.from_template(
+             "Summarize the following professional profile into a short paragraph of no more than 50 words. "
+             "Do not reveal the email ID or phone number, and end with a few sentences highlighting key skills, experience, and projects. "
+             "Write in the third person.\n\nProfile: {text}"
+         )
+
+         # Fast LCEL chain
+         summarizer = prompt | llm | StrOutputParser()
+         return summarizer.invoke({"text": profile_text})
+     except Exception:
+         return "AI Summary currently unavailable. Update profile.txt to enable."
+
+
+
+ def show_profile():
+     # --- CUSTOM CSS STYLING ---
+     st.markdown("""
+         <style>
+         /* Target the sidebar container */
+         [data-testid="stSidebar"] {
+             background-color: #f1f3f5 !important; /* Slightly darker grey for depth */
+             border-right: 2px solid #dee2e6;
+         }
+
+         /* Force all text in the sidebar to be dark grey/black */
+         [data-testid="stSidebar"] .stText,
+         [data-testid="stSidebar"] p,
+         [data-testid="stSidebar"] li,
+         [data-testid="stSidebar"] span {
+             color: #212529 !important; /* Professional dark grey */
+             font-weight: 400;
+         }
+
+         /* Style the subheaders specifically */
+         [data-testid="stSidebar"] h2,
+         [data-testid="stSidebar"] h3 {
+             color: #0d6efd !important; /* Blue for headers */
+             font-weight: 700 !important;
+         }
+
+         /* Style the profile name */
+         .profile-name {
+             font-size: 26px;
+             font-weight: 800;
+             color: #1a73e8 !important;
+             text-align: center;
+             margin-bottom: 10px;
+         }
+
+         /* Style the AI summary box */
+         .stInfo {
+             background-color: #ffffff !important;
+             color: #212529 !important;
+             border: 1px solid #ced4da !important;
+         }
+         </style>
+     """, unsafe_allow_html=True)
+
+     # --- SIDEBAR CONTENT ---
+     with st.sidebar:
+         # Headshot image
+         st.image("https://media.licdn.com/dms/image/v2/D5603AQHnuwh4mMnwYg/profile-displayphoto-crop_800_800/B56ZjcS71_HUAI-/0/1756042608528?e=1772064000&v=beta&t=yer-pM8z72mJMF7Yg_nGDSeNCAT3YD2ybpj__AmxKaI", width=150)
+         st.markdown('<p class="profile-name">Ahan Bose</p>', unsafe_allow_html=True)
+
+         st.write("📍 **Mumbai, India**")
+         st.write("💼 **SPJIMR MBA**")
+
+         st.divider()
+
+         st.subheader("About Me")
+         st.caption("""
+         I build intelligent systems using LangChain and Hugging Face.
+         This Digital Twin is powered by a RAG pipeline to answer
+         questions about my career and projects.
+         """)
+
+         st.divider()
+
+         st.subheader("Connect")
+         st.markdown('<a href="https://www.linkedin.com/in/ahan-bose-spjimr" class="social-badge">LinkedIn</a>', unsafe_allow_html=True)
+
synthetic_data/synthetic_projects.txt ADDED
@@ -0,0 +1,35 @@
+ Professional Case Study 1: Real-Time Financial Digital Twin for Enterprise P&L
+
+ Context
+ A global professional services firm faced delays in financial visibility due to fragmented finance systems and batch-based reporting. Leadership required a near-real-time view of profitability across service lines to improve forecasting accuracy and decision-making.
+
+ Approach
+ As part of the finance data engineering team, I designed a full-stack cloud data pipeline that ingested transactional, budgeting, and forecast data from multiple source systems. The solution standardized financial master data and enabled cross-cloud data synchronization. Automated validation checks and reconciliation logic ensured data accuracy before downstream consumption.
+
+ Outcome
+ The digital twin enabled real-time P&L reporting with significantly reduced manual intervention. Finance leaders gained faster insights into margin movements and cost drivers, improving forecast precision and enabling proactive cost optimization. The solution became a reference architecture for similar implementations across other business units.
+
+ Professional Case Study 2: Enterprise Profitability Planning & Forecast Standardization
+
+ Context
+ A large enterprise struggled with inconsistent profitability planning due to non-standard master data definitions across regions and business segments. This led to forecast mismatches and prolonged planning cycles.
+
+ Approach
+ I supported the design and implementation of a centralized profitability planning framework. Over 50 master data elements (covering cost centers, revenue categories, and allocation drivers) were standardized and integrated into the finance planning system. Automated data quality rules were embedded to flag inconsistencies at source.
+
+ Outcome
+ The standardized planning model reduced forecast variance and shortened planning cycles. Stakeholders gained confidence in scenario analysis outputs, enabling faster strategic decisions during quarterly reviews. The initiative materially improved enterprise-wide financial governance.
+
+ Synthetic Project Descriptions
+
+ Project 1: Cloud-Based Financial Digital Twin Architecture
+
+ Designed a scalable cloud data architecture to mirror enterprise financial operations in near real time. The solution integrated actuals, forecasts, and budgets to simulate financial outcomes under different business scenarios, supporting leadership decision-making and financial stress testing.
+
+ Project 2: Automated CAPEX Visibility & Forecast Analytics
+
+ Built an automated CAPEX tracking and analytics layer using SQL and Python to consolidate spend data across projects. Implemented predictive analytics to flag potential overruns, improving capital allocation discipline and reducing forecast surprises.
+
+ Project 3: Cross-Cloud Financial Data Orchestration
+
+ Developed logic-based automation to orchestrate financial data flows across multiple cloud environments. The system ensured consistency between finance planning tools and reporting dashboards, reducing manual reconciliations and improving reporting reliability.
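Project 2 above describes CAPEX overrun flagging only in prose. A minimal illustrative sketch of the core idea in Python follows; the project names, figures, run-rate projection, and 5% tolerance are all hypothetical assumptions, not the actual implementation:

```python
# Flag projects whose projected total spend exceeds budget by a tolerance.
# All data and the 5% default threshold below are illustrative only.

def projected_spend(actual_to_date: float, pct_complete: float) -> float:
    """Naive run-rate projection: scale spend-to-date to 100% completion."""
    return actual_to_date / pct_complete

def flag_overruns(projects, tolerance=0.05):
    """Return (name, projection) pairs for projects trending over budget."""
    flagged = []
    for name, budget, actual, pct in projects:
        projection = projected_spend(actual, pct)
        if projection > budget * (1 + tolerance):
            flagged.append((name, round(projection, 2)))
    return flagged

portfolio = [
    # (name, budget, actual spend to date, fraction complete)
    ("Plant retrofit", 1_000_000, 600_000, 0.50),  # projects to 1.2M -> over
    ("ERP upgrade", 500_000, 240_000, 0.50),       # projects to 480K -> within
]
print(flag_overruns(portfolio))  # → [('Plant retrofit', 1200000.0)]
```

A production version would of course use richer forecasting than a straight run-rate, but the flag-against-tolerance shape is the same.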
synthetic_data/sythentic_experience.txt ADDED
@@ -0,0 +1,8 @@
+ SYNTHETIC DATA - LEADERSHIP EXPERIENCE
+ Role: Global Cross-Functional Task Force Lead
+ Organization: Deloitte Digital Transformation Group (Synthetic Extension)
+ Experience: Led a team of 10 junior engineers and 4 functional consultants to resolve a critical P&L integration blocker during a high-stakes migration.
+ Actions:
+ - Mentored 5 new hires on cloud data pipeline best practices to accelerate role transitions.
+ - Acted as the primary technical representative in "War Room" sessions, communicating risks directly to C-suite stakeholders.
+ - Standardized cross-functional communication protocols, leading to a 30% reduction in issue resolution time.