Thisaraa13 committed on
Commit 1f14da1 · 1 Parent(s): c6bdf71

initial commit

Files changed (8)
  1. Dockerfile +42 -0
  2. ragas_eval.py +0 -0
  3. ragas_results.txt +12 -0
  4. requirements.txt +191 -0
  5. server.py +53 -0
  6. sethu_agent.py +203 -0
  7. update_kb.py +43 -0
  8. vector_db.ipynb +924 -0
Dockerfile ADDED
@@ -0,0 +1,42 @@
+ # Use the specific Python version you requested
+ FROM python:3.14.3-slim
+
+ # Set the working directory in the container
+ WORKDIR /code
+
+ # Copy the requirements file first to leverage Docker caching
+ COPY requirements.txt .
+
+ # Install dependencies
+ RUN pip install --no-cache-dir --upgrade -r requirements.txt
+
+ # Copy the rest of the application code
+ COPY . .
+
+ # Hugging Face Spaces requires the app to run on port 7860
+ # We use Gunicorn for a stable production server
+ # 'server:app' assumes your Flask object is named 'app' inside 'server.py'
+ CMD ["gunicorn", "-b", "0.0.0.0:7860", "server:app"]
+ # Use the current Python 3.14 stable image
+ FROM python:3.14-slim
+
+ # Set the working directory
+ WORKDIR /code
+
+ # In 2026, most ML libs still require libgomp for parallel processing (FAISS/Torch)
+ RUN apt-get update && apt-get install -y \
+     libgomp1 \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements
+ COPY requirements.txt .
+
+ # Upgrade pip and install (3.14 handles dependencies much faster!)
+ RUN pip install --no-cache-dir --upgrade -r requirements.txt
+
+ # Copy your server.py and other files
+ COPY . .
+
+ # Launch with Gunicorn
+ # Note: Ensure 'gunicorn' was added to your requirements.txt
+ CMD ["gunicorn", "--timeout", "120", "-b", "0.0.0.0:7860", "server:app"]
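The `server:app` argument in the CMD lines follows Gunicorn's `MODULE:VARIABLE` convention: Gunicorn imports the module `server` and serves the WSGI callable bound to the name `app`. A Flask object is one such callable. The following is a minimal sketch of that contract using only the standard library. It is not the repository's `server.py` (which, per the Dockerfile comment, is expected to expose a Flask object); the response body and the `call_app` helper are hypothetical and exist only for illustration.

```python
# Hypothetical stand-in for server.py: any WSGI callable named `app` satisfies "server:app".
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Minimal WSGI callable; Gunicorn's 'server:app' target would resolve to this name."""
    body = b'{"status": "ok"}'
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path="/"):
    """Invoke the callable in-process (no server needed) and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Calling `call_app()` exercises the same entry point Gunicorn would use, which makes it a cheap check that the name in the CMD line and the name in the module agree before building the image.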
ragas_eval.py ADDED
File without changes
ragas_results.txt ADDED
@@ -0,0 +1,12 @@
+
+ ========== RAGAS RUN @ 2026-05-05 14:46:29 ==========
+ {'faithfulness': 0.764, 'answer_relevancy': 0.9803, 'context_recall': 1.0000}
+ Per-question scores:
+ user_input retrieved_contexts response reference faithfulness answer_relevancy context_recall
+ What is Express Entry? [Canada.ca Immigration and citizenship Live in Canada \nExpress Entry\nExpress Entry is an online system that we use to manage immigration\napplications from skilled workers.\nThere are 3 immigration programs managed through Express Entry:\nCanadian Experience Class\nFederal Skilled Worker Program\nFederal Skilled Trades Program\nHow the Express Entry process works:\n1. Create a profile and enter the pool.\n2. We'll invite the candidates with the most points in rounds.\n3. If you're invited to apply, fill out the application.\n4. We'll review your application and make a decision.\nProcessing times\nVaries by program\nFees\nYour application: $CAN 1,525\nFees for your family members:\nSpouse: $CAN 1,525\nDependent child: $CAN 260\n29/03/2026, 11:06 Immigrate through Express Entry - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry.html 1/2, Date modified: 2025-08-21\n29/03/2026, 11:14 Immigrate through Express Entry: Documents - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/documents.html 3/3, Rounds of invitations\nDate modified: 2025-09-29\n\n29/03/2026, 11:17 Express Entry: Create your profile and enter the pool - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/create-profile.html 6/6, pool. 
It’s used to assess your:\nskills\neducation\nlanguage ability\nwork experience\nother factors\nC a l c u l a t e y o u r s c o r e\n29/03/2026, 11:12 Express Entry: Check your score - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/check-score.html 3/4, On this page\nImmigration programs\nKey differences in eligibility criteria\nCheck if you're eligible\nProvincial Nominee Program (PNP)\nImmigration programs\nExpress Entry is an online system that we use to manage immigration\napplications for 3 programs:\nKey differences in eligibility criteria\nBefore you compare programs, here are 3 terms you should know:\nCanadian Experience Class\nFor skilled workers with eligible Canadian work experience\n\nFederal Skilled Worker Program\nFor skilled workers with eligible foreign or Canadian work\nexperience\n\nFederal Skilled Trades Program\nFor skilled workers who are qualified in a skilled trade\n\n29/03/2026, 11:07 Express Entry: Who can apply - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/who-can-apply.html#programs 2/4] **Express Entry** is an online system that Canada uses to manage immigration applications from skilled workers. It handles three federal programs:\n\n1. **Canadian Experience Class** – for people who already have eligible work experience in Canada. \n2. **Federal Skilled Worker Program** – for skilled workers with foreign or Canadian work experience. \n3. **Federal Skilled Trades Program** – for skilled workers who are qualified in a skilled trade.\n\nWhen you create an Express Entry profile, you enter a pool of candidates. Your profile is scored with the **Comprehensive Ranking System (CRS)**, which looks at your skills, education, language ability, work experience, and other factors such as a provincial nomination or a job offer. Candidates with the highest CRS scores are invited to apply during periodic invitation rounds. 
If you receive an invitation, you submit a full application, and Immigration, Refugees and Citizenship Canada (IRCC) reviews it and makes a decision. Express Entry is an online system used by the Canadian government to manage immigration applications from skilled workers. It covers three programs: Canadian Experience Class, Federal Skilled Worker Program, and Federal Skilled Trades Program. NaN NaN NaN
+ What are the eligibility criteria for Express Entry? [On this page\nImmigration programs\nKey differences in eligibility criteria\nCheck if you're eligible\nProvincial Nominee Program (PNP)\nImmigration programs\nExpress Entry is an online system that we use to manage immigration\napplications for 3 programs:\nKey differences in eligibility criteria\nBefore you compare programs, here are 3 terms you should know:\nCanadian Experience Class\nFor skilled workers with eligible Canadian work experience\n\nFederal Skilled Worker Program\nFor skilled workers with eligible foreign or Canadian work\nexperience\n\nFederal Skilled Trades Program\nFor skilled workers who are qualified in a skilled trade\n\n29/03/2026, 11:07 Express Entry: Who can apply - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/who-can-apply.html#programs 2/4, Apply for permanent residence\nmeet the minimum criteria for Express Entry, including being eligible\nfor 1 of the 3 immigration programs it covers, and\nmeet all of the requirements in the instructions for that round\nWhen we hold a category-based round of invitations, we’ll:\nrank candidates in the pool who meet the category requirements,\nand\ninvite top-ranking candidates to apply for permanent residence\n* Find out who's eligible for each category: (required)\nFrench-language proficiency\nHealthcare and social services occupations\nScience, technology, engineering and math (STEM) occupations\nTrade occupations\nEducation occupations\nTransport occupations\nPhysicians with Canadian work experience\nSenior managers with Canadian work experience\nResearchers with Canadian work experience\nSkilled military recruits\n\n29/03/2026, 11:18 Express Entry: Category-based selection - Canada.ca, Entry Year-End Report\nmonitor and assess categories on a regular basis\nWho's eligible for each category\nTo be eligible for an invitation through a category-based round, you\nmust:\n29/03/2026, 
11:18 Express Entry: Category-based selection - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/rounds-invitations/category-based-selection.html 3/5, a completed educational credential and\nan Educational Credential Assessment for immigration purposes\nSelection factors\nIf you meet all the minimum requirements, we'll use selection factor\npoints to help assess your eligibility.\nWe’ll assess your selection factors and assign an overall score out of 100.\nIf you score 67 points or higher, you may qualify for the Federal\nSkilled Worker Program. You can then submit a profile to the Express\nEntry pool.\nIf you score lower than 67 points, you won’t qualify for the\nprogram.\nThese points are different from the points we use to rank your profile in\nthe Express Entry pool.\nLanguage skills (maximum 28 points)\nEducation (maximum 25 points)\nSkilled work experience (maximum 15 points)\nAge (maximum 12 points)\nArranged employment in Canada (maximum 10\npoints)\nAdaptability (maximum 10 points)\n29/03/2026, 11:10 Express Entry: Federal Skilled Worker Program - Canada.ca, If you’re eligible for one of these programs, you can also apply through\nExpress Entry for the Provincial Nominee Program.\n29/03/2026, 11:07 Express Entry: Who can apply - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/who-can-apply.html#programs 3/4] **Eligibility criteria for Express Entry**\n\n1. **Be eligible for one of the three Express‑Entry‑covered programs** \n * Canadian Experience Class – skilled workers with eligible Canadian work experience \n * Federal Skilled Worker Program – skilled workers with eligible foreign or Canadian work experience \n * Federal Skilled Trades Program – skilled workers qualified in a skilled trade \n\n2. 
**If you want to apply under the Federal Skilled Worker Program** \n * You must score **at least 67 points** on the selection‑factor points system (out of 100). \n * Points are awarded in six areas: \n * Language skills – up to 28 points \n * Education – up to 25 points \n * Skilled work experience – up to 15 points \n * Age – up to 12 points \n * Arranged employment in Canada – up to 10 points \n * Adaptability – up to 10 points \n\n3. **For category‑based selection rounds** \n * You must meet the general eligibility above (i.e., be eligible for one of the three programs). \n * You must also satisfy the specific category requirements for that round (e.g., French‑language proficiency, STEM occupation, trade occupation, etc.). \n\nThese are the only criteria that determine whether you can enter the Express Entry pool. If you do not meet them, you will not be eligible to apply. To be eligible for Express Entry, candidates need skilled work experience, language proficiency in English or French, and must meet the requirements of at least one of the three federal economic immigration programs. NaN NaN NaN
+ What documents do I need for Express Entry? [Date modified: 2025-08-21\n29/03/2026, 11:14 Immigrate through Express Entry: Documents - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/documents.html 3/3, documents into English or French depending on where you settle.\nCommon documents include: \n• passports or travel documents\n• all education diplomas and certificates\n• English and/or French Language Proficiency scores\n• official vaccination records\n• medical records\n• dental and eye records\n• driver’s licences\n• birth and death certificates\n• marriage or divorce certificates\n• proof of work experience\nTo learn more about what documents to gather before arrival, consult: Crossing the border: documents you need (web page is \navailable in English and French).\nPrepare a list of the personal items you will be sending to Canada. To obtain the required form for completion, consult: Personal \nEffects Accounting Document (web page is available in English and French).\nIMPORTANT THINGS TO DO \nBEFORE YOU ARRIVE IN CANADA, Canada.ca Immigration and citizenship Live in Canada\nImmigrate through Express Entry Immigrate through Express Entry: Documents\nIRCC sign in\n \n \nExpress Entry\nWho can apply\nCheck your score\nGet your documents ready\nLanguage test results\nEducation credential assessment\nJob offer\nProof of funds\nPolice certificates\nCreate your profile\nRounds of invitations\nApply for permanent residence\nAfter you apply\nIf we approve your application\n29/03/2026, 11:16 Documents for Express Entry: Proof of funds - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/documents/proof-funds.html 1/5, Federal Skilled Worker Program\nFederal Skilled Trades Program\n\n29/03/2026, 11:16 Documents for Express Entry: Proof of funds - 
Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/documents/proof-funds.html 2/5, a permanent resident (PR) visa (if you're from a country that needs a\nvisa)\nYour COPR will have information about who you are, as well as your\nphotograph.\nCheck to make sure your information is correct. It should be the same as\nthe information on your passport. If there is a mistake on your COPR,\ncontact us through your account.\nWe can’t extend your COPR, so you should make sure to travel to\nCanada before it expires.\nYou’ll need to show your COPR to the border services offer when you\narrive in Canada. If you’re already in Canada, you’ll use your COPR to\napply for government services.\nTell us where you applied from\nDate modified: 2025-08-21\nIn Canada\n\nOutside Canada\n\n29/03/2026, 11:20 If your Express Entry application is approved - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/application-approved.html 2/2] **Documents you’ll need for a Canadian Express Entry application**\n\n1. **Passport or other travel documents** – must be valid for the whole stay. \n2. **Language test results** – IELTS, CELPIP, TEF, or another approved test. \n3. **Educational Credential Assessment (ECA)** – if you earned a diploma or certificate outside Canada. \n4. **Proof of funds** – recent bank or investment statements that show you have enough money to support yourself and any family members. \n5. **Police certificates** – from every country where you lived for 6 months or more since age 18. \n6. **Job offer letter** – if you have one and want to include it in your profile. \n7. **Birth certificates** – for you and any dependents. \n8. **Marriage or divorce certificate** – if applicable. \n9. **Proof of work experience** – reference letters, employment contracts, or pay stubs. \n10. **Official vaccination records** – COVID‑19 and any other required vaccines. \n11. 
**Medical records** – if needed for the medical exam. \n12. **Dental and eye records** – if required. \n13. **Driver’s licence** – if you plan to drive in Canada (optional). \n\nThese are the documents listed in the official guidance for a complete Express Entry application. Documents needed include passport, educational credentials, language test results, proof of work experience, police clearance certificates, and medical examination results. NaN NaN NaN
+ How does the CRS score work? [Date modified: 2025-08-21\nGet your documents ready\nTo get an invitation to apply, your CRS score must be above the\nminimum points score of your round of invitations. Cut-off scores may\nvary each round.\nView the full CRS criteria and how points are calculated\nWhen to use this tool\nUse this tool if:\nyou’re eligible for at least 1 Express Entry program and you have not\nfilled out an Express Entry profile but you’d like to see what your CRS\nscore might be if you do, or\nyou were invited to apply for permanent residence, and you want to\nsee if a change to your profile will affect your CRS score\n\n29/03/2026, 11:12 Express Entry: Check your score - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/check-score.html 4/4, Express Entry profile so you stay eligible for these programs.\nCRS calculator\nThis tool will help you calculate your Comprehensive Ranking System\n(CRS) score based on the answers you provide below.\nLegal disclaimer\n\n29/03/2026, 11:12 Express Entry: Check your score - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/check-score.html 2/4, This tool is intended solely for general guidance and reference\npurposes.\nIn the event of any discrepancy between the results of this\nquestionnaire and that provided by the Express Entry electronic\nsystem, the results provided by the system shall govern, in\naccordance with provisions of the Immigration and Refugee\nProtection Act, the Immigration and Refugee Protection\nRegulations, and Minister's Instructions issued under IRPA\ns.10.3.\nThis tool will be updated from time to time in accordance with\nchanges to the Ministerial Instructions governing Express Entry.\n1) What is your marital status?\nSelect...\nWhat is a CRS score\nThe Comprehensive Ranking System (CRS) is a points-based system that\nwe use to assess and score your profile and rank it in 
the Express Entry\npool. It’s used to assess your:\nskills\neducation\nlanguage ability\nwork experience\nother factors\nC a l c u l a t e y o u r s c o r e\n29/03/2026, 11:12 Express Entry: Check your score - Canada.ca, What is a CRS score\nWhen to use this tool\nWe no longer give points for job offers\nAs of March 25, 2025, we’re removing job offer points from the\nComprehensive Ranking System for current and future candidates\nin the Express Entry pool. This includes job offers:\nclassified under Major Group 00 of the National Occupation\nClassification (for example, senior management\npositions) (200 points)\nin any other skilled occupation (50 points)\nIt may take a few days for your score to show correctly.\nWhat isn’t changing\nIf having a valid job offer is part of your eligibility criteria, this\nwon’t change. This includes:\nthe Federal Skilled Trades Program\nthe Federal Skilled Worker Program\nsome streams under provincial nominee programs\nYou should still include the details of any job offers in your\nExpress Entry profile so you stay eligible for these programs.\nCRS calculator\nThis tool will help you calculate your Comprehensive Ranking System\n(CRS) score based on the answers you provide below., The Comprehensive Ranking System (CRS) is a points-based system that\nwe use to assess and score your profile and rank it in the Express Entry\npool.\nOn this page\nCore/human capital factors\nSpouse or common-law partner factors\nSkill transferability factors\nAdditional points\nWe no longer give points for job offers\nAs of March 25, 2025, we’re removing job offer points from the\nComprehensive Ranking System for current and future candidates\nin the Express Entry pool. This includes job offers:\nclassified under Major Group 00 of the National Occupation\nClassification (for example, senior management\npositions) (200 points)\nin any other skilled occupation (50 points)\nIt may take a few days for your score to show correctly.\nWhat isn’t changing\nA. 
Core/human capital factors\nYou will earn points as if you don’t have a spouse or partner if:\nthey are not coming with you to Canada, or\nthey are a Canadian citizen or permanent resident\n] **How the CRS score works**\n\nThe Comprehensive Ranking System (CRS) is a points‑based system that ranks all candidates in the Express Entry pool. \nYour score is calculated from the following areas:\n\n1. **Skills, education, language ability, and work experience** – These are the main categories that give you points. \n2. **Other factors** – Such as a valid job offer (which no longer gives points but is still required for some programs), a provincial nomination, or a sibling in Canada. \n3. **Spouse or common‑law partner** – Points for a partner are counted only if the partner is coming to Canada or is not a Canadian citizen/PR. If the partner is a citizen/PR or not coming, the system treats it as if no spouse is included.\n\n**Key points to remember**\n\n- **Job‑offer points were removed on 25 March 2025**. \n - 200 points used to be awarded for offers in Major Group 00 (senior management). \n - 50 points used to be awarded for offers in any other skilled occupation. \n These points are no longer added to your CRS score, but a job offer is still a mandatory eligibility requirement for certain Express Entry programs.\n\n- **Cut‑off scores vary**. Each invitation round has a minimum CRS score. You must score **above** that cut‑off to receive an invitation to apply.\n\n- **Calculator vs. official system**. \n 1. Use the CRS calculator on Canada.ca to estimate your score. \n 2. Enter the same information into your Express Entry profile. \n 3. If the calculator’s result differs from the official Express Entry system, the system result is the one that counts (per IRPA s.10.3).\n\n- **Profile updates**. If you change any information (e.g., language test results, work experience), re‑run the calculator and update your profile. 
The system will recalculate your CRS score automatically.\n\n- **Invitation to apply**. Once you receive an invitation, the cut‑off score for that round will be the threshold you need to exceed.\n\nIn short, the CRS score is a total of points earned from your personal profile. It is compared against the current invitation threshold, and the official Express Entry system’s calculation is the final authority. The Comprehensive Ranking System (CRS) is a points-based system used to assess and score candidates in the Express Entry pool. Points are awarded for factors like age, education, work experience, and language skills. NaN NaN NaN
+ How do I create an Express Entry profile? [Create your profile\nHow we rank your profile\nKeep your profile up-to-date\nWhat to do while you're in the pool\nIf your work permit expires\nHow the pools work\nIf you want to come to Canada as a skilled immigrant, your first step is to\ncreate your profile and submit it to the Express Entry pool.\nIf you’re eligible, we’ll:\nplace you in a pool with others who are also eligible\ngive you a score and rank you using several factors\ninvite you to apply to be a permanent resident of Canada (if you’re\none of the top candidates)\nCompleting an online Express Entry profile or entering the pool doesn’t\nguarantee that we’ll invite you to apply for permanent residence. You\ncan only apply if we send you an invitation.\nIf we find that you gave us false information or left out important\ndetails about your history or experience, we can:\nrefuse your application\nfind you inadmissible\nbar you for 5 years from applying to come to Canada\nFind out more about consequences for immigration and citizenship\nfraud., Rounds of invitations\nDate modified: 2025-09-29\n\n29/03/2026, 11:17 Express Entry: Create your profile and enter the pool - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/create-profile.html 6/6, Canada.ca Immigration and citizenship Live in Canada\nImmigrate through Express Entry\nIRCC sign in\n \n\nCreate your profile and enter the\npool\nOn this page\nHow the pools work\nExpress Entry\nWho can apply\nCheck your score\nGet your documents ready\nCreate your profile\nRounds of invitations\nApply for permanent residence\nAfter you apply\nIf we approve your application\n29/03/2026, 11:17 Express Entry: Create your profile and enter the pool - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/create-profile.html 1/6, 2. Sign in to your IRCC secure account.\n3. 
Enter your personal reference code when it prompts you (if you used\nthe Come to Canada tool).\n4. Enter your personal details to create a profile.\nYou may exit the profile at any time. Your information will be\nsaved in your account.\nYou won’t be asked to enter information about your dependent\nchildren. We’ll only ask you for this information if you’re invited\nto apply.\n5. Submit your profile.\nYou have 60 days to complete and submit your Express Entry\nprofile. Otherwise, you’ll have to start again.\n6. Based on the information you enter, we’ll figure out which\nprogram(s) you’re eligible for and may be invited to apply to.\nIf you’re eligible for more than one program\nSign in to your IRCC secure account\nHow we rank your profile\nYou get points based on the information in your profile, including your\nskills and experience.\nTo be chosen from the pool and invited to apply, you need to have one of, Canada.ca Immigration and citizenship Live in Canada \nExpress Entry\nExpress Entry is an online system that we use to manage immigration\napplications from skilled workers.\nThere are 3 immigration programs managed through Express Entry:\nCanadian Experience Class\nFederal Skilled Worker Program\nFederal Skilled Trades Program\nHow the Express Entry process works:\n1. Create a profile and enter the pool.\n2. We'll invite the candidates with the most points in rounds.\n3. If you're invited to apply, fill out the application.\n4. We'll review your application and make a decision.\nProcessing times\nVaries by program\nFees\nYour application: $CAN 1,525\nFees for your family members:\nSpouse: $CAN 1,525\nDependent child: $CAN 260\n29/03/2026, 11:06 Immigrate through Express Entry - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry.html 1/2] **How to create an Express Entry profile**\n\n1. **Sign in to your IRCC secure account** – use the login that you set up for Canadian immigration services. \n2. 
**Enter your personal reference code** – if you used the “Come to Canada” tool before, type the code it gave you. \n3. **Fill in your personal details** – provide the information requested (name, date of birth, contact details, etc.). \n4. **Submit your profile** – you have 60 days from the start of this step to finish and submit. If you miss that window, you’ll need to start over. \n\nAfter you submit, IRCC will:\n\n- Check which of the three Express Entry‑managed programs (Canadian Experience Class, Federal Skilled Worker Program, or Federal Skilled Trades Program) you qualify for. \n- Add you to the pool of candidates. \n- If your Comprehensive Ranking System (CRS) score is high enough, you may receive an invitation to apply for permanent residence. \n\nOnce invited, you’ll need to complete the permanent‑resident application, gather the required documents (passport, language test, ECA report, work experience letters, proof of funds, etc.), and pay the fees. IRCC will then review your application and make a decision. To create an Express Entry profile, you need to answer questions about your skills, work experience, language ability, education, and other details. If you meet the criteria, you will be placed in the Express Entry pool. NaN NaN NaN
+ What is the Federal Skilled Worker Program? [The Federal Skilled Worker Program is for skilled workers who have\nforeign work experience and want to become permanent residents.\nOn this page\nSkilled workers\nMinimum requirements\nSelection factors\nProof of funds\nAdmissibility\nWhere you can live in Canada\nSkilled workers\nWe choose skilled workers based on education, work experience,\nlanguage skills, and other criteria. They include:\nmanagers and supervisors\npeople in jobs that usually need:\na university degree\na college diploma, or\napprenticeship training\nMinimum requirements\nSkilled work experience\nFirst, find out the National Occupational Classification (NOC) for your job.\nThen check if it meets the requirements for skilled work experience.\nFind your NOC\n29/03/2026, 11:10 Express Entry: Federal Skilled Worker Program - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/who-can-apply/federal-skilled-workers.html 2/7, Federal Skilled Worker Program\nFederal Skilled Trades Program\n\n29/03/2026, 11:16 Documents for Express Entry: Proof of funds - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/documents/proof-funds.html 2/5, The Federal Skilled Trades Program is for skilled workers who want to\nbecome permanent residents based on being qualified in a skilled trade.\nOn this page\nSkilled trade sectors\nSkilled trades work experience\nJob offer or certificate of qualification\nLanguage ability\nEducation\nProof of funds\nAdmissibility\nWhere you can live in Canada\nSkilled trade sectors\nSkilled trades need specialized knowledge and hands-on work. 
You’ll\nneed an apprenticeship for most trades.\nSkilled tradespeople may work in sectors such as:\nconstruction\ntransportation\nmanufacturing and industrial\nnatural resources and agriculture\nSkilled trades work experience\nFirst, find out the National Occupational Classification (NOC) for your job.\nThen check if it meets the requirements for skilled work experience.\nFind your NOC\n29/03/2026, 11:11 Express Entry: Federal Skilled Trades Program - Canada.ca, Then check if it meets the requirements for skilled work experience.\nFind your NOC\n29/03/2026, 11:11 Express Entry: Federal Skilled Trades Program - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/who-can-apply/federal-skilled-trades.html 2/7, to be eligible.\nChoose the NOC that most closely aligns with your work experience.\nThis will:\navoid processing delays\nmake sure you’re assessed against the correct program\n\n29/03/2026, 11:11 Express Entry: Federal Skilled Trades Program - Canada.ca\nhttps://www.canada.ca/en/immigration-refugees-citizenship/services/immigrate-canada/express-entry/who-can-apply/federal-skilled-trades.html 3/7] The Federal Skilled Worker Program is a way for skilled workers who have worked abroad to become permanent residents of Canada. \n\nKey points:\n\n- **Who can apply?** \n - Skilled workers with foreign work experience. \n - Includes managers, supervisors, and people in jobs that normally need a university degree, a college diploma, or apprenticeship training.\n\n- **What does the program look for?** \n - Skilled work experience that matches a National Occupational Classification (NOC) that the program accepts. \n - Education, language skills, and other criteria are also considered.\n\n- **How to start?** \n 1. Find the NOC code for your job. \n 2. 
Check that this NOC meets the program’s skilled‑work‑experience requirements.\n\nThe program is part of Canada’s Express Entry system and focuses on selecting candidates who can contribute to the Canadian economy. The Federal Skilled Worker Program is for skilled workers with foreign work experience who want to immigrate to Canada permanently based on their ability to become economically established. NaN 0.980285 1.0
+ ===============================================
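The summary line in the run above is a Python dict literal, so it can be read back without a custom parser. The sketch below parses it and also checks one observation about the rows shown: the `answer_relevancy` summary (0.9803) matches a mean that skips NaN rows (five NaN rows and one score of 0.980285). Whether RAGAS aggregated this run that way is an assumption, not something the file states, and the `faithfulness` summary (0.764) cannot be reproduced from the rows shown because every row prints NaN for it. The helper names `parse_summary` and `nan_mean` are hypothetical.

```python
import ast
import math

# Summary line copied verbatim from the run above.
SUMMARY_LINE = "{'faithfulness': 0.764, 'answer_relevancy': 0.9803, 'context_recall': 1.0000}"


def parse_summary(line):
    """Parse the dict-literal summary line into a dict of metric name -> float."""
    return ast.literal_eval(line)


def nan_mean(values):
    """Mean over the non-NaN entries; returns NaN when every entry is NaN."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


# answer_relevancy column as printed in the per-question rows: five NaN, one score.
answer_relevancy = [math.nan] * 5 + [0.980285]

summary = parse_summary(SUMMARY_LINE)
matches = round(nan_mean(answer_relevancy), 4) == summary["answer_relevancy"]
```

`ast.literal_eval` is used rather than `eval` because it only accepts literals, which is the safer choice for text read from a results file.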
requirements.txt ADDED
@@ -0,0 +1,191 @@
+ aiohappyeyeballs==2.6.1
+ aiohttp==3.13.3
+ aiosignal==1.4.0
+ annotated-doc==0.0.4
+ annotated-types==0.7.0
+ anyio==4.12.1
+ appdirs==1.4.4
+ appnope==0.1.4
+ argon2-cffi==25.1.0
+ argon2-cffi-bindings==25.1.0
+ arrow==1.4.0
+ asttokens==3.0.1
+ async-lru==2.3.0
+ attrs==25.4.0
+ babel==2.18.0
+ beautifulsoup4==4.14.3
+ bleach==6.3.0
+ blinker==1.9.0
+ certifi==2026.2.25
+ cffi==2.0.0
+ charset-normalizer==3.4.5
+ click==8.3.1
+ comm==0.2.3
+ dataclasses-json==0.6.7
+ datasets==4.8.4
+ debugpy==1.8.20
+ decorator==5.2.1
+ deep-translator==1.11.4
+ defusedxml==0.7.1
+ dill==0.4.1
+ diskcache==5.6.3
+ distro==1.9.0
+ docstring_parser==0.17.0
+ executing==2.2.1
+ faiss-cpu==1.13.2
+ fastjsonschema==2.21.2
+ filelock==3.25.2
+ Flask==3.1.3
+ flask-cors==6.0.2
+ fqdn==1.5.1
+ frozenlist==1.8.0
+ fsspec==2026.2.0
+ groq==0.37.1
+ h11==0.16.0
+ hf-xet==1.4.2
+ httpcore==1.0.9
+ httpx==0.28.1
+ httpx-sse==0.4.3
+ huggingface_hub==1.7.1
+ idna==3.11
+ instructor==1.14.5
+ ipykernel==7.2.0
+ ipython==9.12.0
+ ipython_pygments_lexers==1.1.1
+ ipywidgets==8.1.8
+ isoduration==20.11.0
+ itsdangerous==2.2.0
+ jedi==0.19.2
+ Jinja2==3.1.6
+ jiter==0.11.1
+ joblib==1.5.3
+ json5==0.14.0
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jsonschema==4.26.0
+ jsonschema-specifications==2025.9.1
+ jupyter==1.1.1
+ jupyter-console==6.6.3
+ jupyter-events==0.12.0
+ jupyter-lsp==2.3.0
+ jupyter_client==8.8.0
+ jupyter_core==5.9.1
+ jupyter_server==2.17.0
+ jupyter_server_terminals==0.5.4
+ jupyterlab==4.5.6
+ jupyterlab_pygments==0.3.0
+ jupyterlab_server==2.28.0
+ jupyterlab_widgets==3.0.16
+ langchain==1.2.12
+ langchain-classic==1.0.3
+ langchain-community==0.4.1
+ langchain-core==1.2.22
+ langchain-groq==1.1.2
+ langchain-huggingface==1.2.1
+ langchain-openai==1.1.12
+ langchain-text-splitters==1.1.1
+ langgraph==1.1.2
+ langgraph-checkpoint==4.0.1
+ langgraph-prebuilt==1.0.8
+ langgraph-sdk==0.3.11
+ langsmith==0.7.17
+ lark==1.3.1
+ markdown-it-py==4.0.0
+ MarkupSafe==3.0.3
+ marshmallow==3.26.2
+ matplotlib-inline==0.2.1
+ mdurl==0.1.2
+ mistune==3.2.0
+ mpmath==1.3.0
+ multidict==6.7.1
+ multiprocess==0.70.19
+ mypy_extensions==1.1.0
+ nbclient==0.10.4
+ nbconvert==7.17.0
+ nbformat==5.10.4
+ nest-asyncio==1.6.0
+ networkx==3.6.1
+ notebook==7.5.5
+ notebook_shim==0.2.4
+ numpy==2.4.4
+ openai==2.29.0
+ orjson==3.11.7
+ ormsgpack==1.12.2
+ packaging==26.0
+ pandas==3.0.1
+ pandocfilters==1.5.1
+ parso==0.8.6
+ pexpect==4.9.0
+ pillow==12.1.1
+ platformdirs==4.9.4
+ prometheus_client==0.24.1
+ prompt_toolkit==3.0.52
+ propcache==0.4.1
+ psutil==7.2.2
+ ptyprocess==0.7.0
+ pure_eval==0.2.3
+ pyarrow==23.0.1
+ pycparser==3.0
+ pydantic==2.12.5
+ pydantic-settings==2.13.1
+ pydantic_core==2.41.5
+ Pygments==2.19.2
+ pypdf==6.9.2
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.2.2
+ python-json-logger==4.1.0
+ PyYAML==6.0.3
+ pyzmq==27.1.0
+ ragas==0.4.3
+ referencing==0.37.0
+ regex==2026.2.28
+ requests==2.32.5
+ requests-toolbelt==1.0.0
+ rfc3339-validator==0.1.4
+ rfc3986-validator==0.1.1
+ rfc3987-syntax==1.1.0
+ rich==14.3.3
+ rpds-py==0.30.0
+ safetensors==0.7.0
+ scikit-learn==1.8.0
+ scikit-network==0.33.5
+ scipy==1.17.1
+ Send2Trash==2.1.0
+ sentence-transformers==5.3.0
+ setuptools==82.0.1
+ shellingham==1.5.4
+ six==1.17.0
+ sniffio==1.3.1
+ soupsieve==2.8.3
+ SQLAlchemy==2.0.48
+ stack-data==0.6.3
+ sympy==1.14.0
+ tenacity==9.1.4
+ terminado==0.18.1
+ threadpoolctl==3.6.0
+ tiktoken==0.12.0
+ tinycss2==1.4.0
+ tokenizers==0.22.2
+ torch==2.10.0
+ tornado==6.5.5
+ tqdm==4.67.3
+ traitlets==5.14.3
+ transformers==5.3.0
+ typer==0.24.1
+ typing-inspect==0.9.0
+ typing-inspection==0.4.2
+ typing_extensions==4.15.0
+ tzdata==2025.3
+ uri-template==1.3.0
+ urllib3==2.6.3
+ uuid_utils==0.14.1
+ wcwidth==0.6.0
+ webcolors==25.10.0
+ webencodings==0.5.1
+ websocket-client==1.9.0
+ Werkzeug==3.1.6
+ widgetsnbextension==4.0.15
+ xxhash==3.6.0
+ yarl==1.23.0
+ zstandard==0.25.0
+ gunicorn==21.2.0
server.py ADDED
@@ -0,0 +1,53 @@
+ from flask import Flask, request, jsonify
+ from flask_cors import CORS
+ from sethu_agent import agent_inference, translate_to_english
+
+ app = Flask(__name__)
+ CORS(app)
+
+
+ @app.get("/")
+ def health_check():
+     return jsonify({
+         "status": "ok",
+         "message": "Legal / Immigration Context Retrieval API is running"
+     })
+
+
+ @app.post("/chat")
+ def chat():
+     data = request.get_json(force=True, silent=True) or {}  # malformed body -> 400 below
+
+     question = data.get("question")
+     language = data.get("language", "en")  # "en" or "si"
+
+     if not question:
+         return jsonify({"error": "Question is required"}), 400
+
+     try:
+         # If Sinhala, translate question to English for retrieval
+         backend_question = question
+         if language == "si":
+             backend_question = translate_to_english(question)
+
+         # Pass language so LLM responds in correct language directly
+         answer = agent_inference(backend_question, language)
+
+         print("ANSWER:", answer)
+
+         return jsonify({
+             "question": question,
+             "answer": str(answer)
+         })
+
+     except Exception as e:
+         import traceback
+         traceback.print_exc()
+         return jsonify({
+             "error": "Internal server error",
+             "details": str(e)
+         }), 500
+
+
+ if __name__ == "__main__":
+     app.run(host="0.0.0.0", port=5001, debug=True)  # local development only; the Docker image runs gunicorn on port 7860
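The request flow of the `/chat` handler above can be sketched without Flask as a plain function, which makes the branching easy to exercise on its own. This is a minimal sketch, not the project's code: `handle_chat`, `fake_translate`, and `fake_infer` are hypothetical names, and the two stubs only stand in for `translate_to_english` and `agent_inference`.

```python
from typing import Callable, Dict, Tuple


def handle_chat(
    payload: Dict[str, str],
    translate: Callable[[str], str],
    infer: Callable[[str, str], str],
) -> Tuple[Dict[str, str], int]:
    """Mirror the /chat flow: validate, translate if Sinhala, then answer."""
    question = payload.get("question")
    language = payload.get("language", "en")  # "en" or "si"
    if not question:
        return {"error": "Question is required"}, 400
    # Retrieval runs on English text, so Sinhala questions are translated first.
    backend_question = translate(question) if language == "si" else question
    answer = infer(backend_question, language)
    # The response echoes the original (untranslated) question.
    return {"question": question, "answer": str(answer)}, 200


# Hypothetical stubs standing in for translate_to_english and agent_inference.
def fake_translate(text: str) -> str:
    return f"EN({text})"


def fake_infer(question: str, language: str) -> str:
    return f"[{language}] answer to: {question}"


body, status = handle_chat(
    {"question": "hello", "language": "si"}, fake_translate, fake_infer
)
print(status, body["answer"])
```

Note that the language code is passed through to the inference step even though the question itself has been translated, so the answer can still be produced in the caller's language.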
sethu_agent.py ADDED
@@ -0,0 +1,203 @@
+ import os
+ from dotenv import load_dotenv
+
+ from langchain_huggingface import HuggingFaceEmbeddings
+ from langchain_community.vectorstores import FAISS
+ from langchain_core.prompts import PromptTemplate
+ from langchain_core.output_parsers import StrOutputParser
+ from langchain_groq import ChatGroq
+ from deep_translator import GoogleTranslator
+
+ load_dotenv()
+
+ GROQ_API_KEY = os.getenv("GROQ_API_KEY")
+ if not GROQ_API_KEY:
+     raise ValueError("GROQ_API_KEY is not set; add it to your .env file or the environment")
+
+ # =============================================================================
+ # Embeddings
+ # =============================================================================
+ model_name = "BAAI/bge-base-en"
+ encode_kwargs = {"normalize_embeddings": True}
+
+ bge_embeddings = HuggingFaceEmbeddings(
+     model_name=model_name,
+     model_kwargs={"device": "cpu"},
+     encode_kwargs=encode_kwargs,
+ )
+
+ # =============================================================================
+ # Load FAISS vector store
+ # =============================================================================
+ persist_directory = "faiss_index"  # renamed from chroma_vector_db
+
+ vector_store = FAISS.load_local(
+     persist_directory,
+     bge_embeddings,
+     allow_dangerous_deserialization=True,
+ )
+
+ retriever = vector_store.as_retriever(search_kwargs={"k": 5})
+
+ # =============================================================================
+ # LLM
+ # =============================================================================
+ llm = ChatGroq(
+     model="openai/gpt-oss-20b",
+     temperature=0,
+ )
+
+ # =============================================================================
+ # LAYER 1 — Context Extraction Prompt
+ # =============================================================================
+ # This is the NEW intermediate step.
+ # Its only job is to read the raw retrieved chunks and produce a clean,
+ # structured breakdown — it does NOT write the final answer yet.
+ # =============================================================================
+ layer1_extraction_prompt = PromptTemplate(
+     template="""
+ You are a structured information extractor for Canadian immigration documents.
+ Your ONLY job is to read the retrieved text chunks below and extract the
+ relevant facts into a structured format. Do NOT write a conversational answer.
+
+ If the user is asking a general question about how the Express Entry system works,
+ summarize all relevant information found across all categories below.
+
+ User question: {question}
+
+ Retrieved document chunks:
+ {context}
+
+ Extract and organise the relevant information under these categories.
+ If a category has no relevant information, write "Not found in context."
+
+ ELIGIBILITY REQUIREMENTS:
+ <list any eligibility rules, criteria, or conditions found>
+
+ PROCEDURAL STEPS:
+ <list any step-by-step process or procedure found>
+
+ REQUIRED DOCUMENTS:
+ <list any documents, certificates, or forms mentioned>
+
+ KEY FACTS & DEFINITIONS:
+ <list any important numbers, scores, timelines, or definitions>
+
+ IMPORTANT WARNINGS OR EXCEPTIONS:
+ <list any warnings, exceptions, or special conditions>
+ """,
+     input_variables=["question", "context"],
+ )
+
+ # =============================================================================
+ # LAYER 2 — Response Generation Prompt
+ # =============================================================================
+ # This takes the STRUCTURED OUTPUT from Layer 1 (not raw chunks) and
+ # writes the final, readable answer for the user.
+ # =============================================================================
+ layer2_generation_prompt = PromptTemplate(
+     template="""
+ You are Setu, a bilingual immigration guidance assistant for Sri Lankan
+ applicants applying to Canadian Express Entry.
+
+ {language_instruction}
+
+ Using ONLY the structured context provided below, write a clear, helpful,
+ and accurate answer to the user's question. Do not invent any information.
+ If the structured context does not contain enough information to answer,
+ say: "I could not find specific information about this in my knowledge base."
+
+ User question: {question}
+
+ Structured context (extracted from official documents):
+ {structured_context}
+
+ Rules:
+ - Answer in clear, plain language suitable for a non-legal reader.
+ - Use a step-by-step format if the question is procedural.
+ - Use short paragraphs if the question is informational.
+ - Do not add information that is not in the structured context above.
+ """,
+     input_variables=["question", "structured_context", "language_instruction"],
+ )
+
+ # Chain each prompt with the LLM
+ layer1_chain = layer1_extraction_prompt | llm | StrOutputParser()
+ layer2_chain = layer2_generation_prompt | llm | StrOutputParser()
+
+
+ # =============================================================================
+ # Helper — format retrieved docs into plain text
+ # =============================================================================
+ def _format_docs(docs) -> str:
+     if not docs:
+         return "No retrieved context."
+     chunks = []
+     for i, doc in enumerate(docs, start=1):
+         source = doc.metadata.get("source", "Unknown source")
+         content = doc.page_content.strip()
+         chunks.append(f"[Document {i}] Source: {source}\n{content}")
+     return "\n\n".join(chunks)
+
+
+ # =============================================================================
+ # agent_inference — the public function called by server.py
+ # =============================================================================
+ def agent_inference(question: str, language: str = "en") -> str:
+     """
+     Dual-layer RAG pipeline:
+
+     Step 1 — Retrieve
+         Fetch the top-k relevant chunks from the FAISS vector store.
+
+     Step 2 — Layer 1: Context Extraction
+         Send the raw chunks to the LLM with a strict extraction prompt.
+         Output: structured categories (eligibility, procedures, documents…)
+
+     Step 3 — Layer 2: Response Generation
+         Send the STRUCTURED output (not the raw chunks) to the LLM with a
+         generation prompt. Output: the final user-facing answer.
+     """
+     if not question or not question.strip():
+         return "Please provide a valid question."
+
+     # ── Step 1: Retrieve ─────────────────────────────────────────────────────
+     docs = retriever.invoke(question)
+     raw_context = _format_docs(docs)
+
+     # ── Step 2: Layer 1 — Extract & Structure ────────────────────────────────
+     structured_context = layer1_chain.invoke(
+         {
+             "question": question,
+             "context": raw_context,
+         }
+     )
+
+     # ── Step 3: Layer 2 — Generate Final Answer ──────────────────────────────
+     final_answer = layer2_chain.invoke(
+         {
+             "question": question,
+             "structured_context": structured_context,
+             "language_instruction": "You MUST respond entirely in Sinhala language." if language == "si" else "Respond in English.",
+         }
+     )
+
+     return final_answer
+
+
+ __all__ = ["agent_inference", "translate_to_english", "translate_to_sinhala"]
+
+
+ def translate_to_english(text: str) -> str:
+     try:
+         return GoogleTranslator(source="si", target="en").translate(text)
+     except Exception as e:
+         print(f"Translation error (si→en): {e}")
+         return text
+
+ def translate_to_sinhala(text: str) -> str:
+     try:
+         return GoogleTranslator(source="en", target="si").translate(text)
+     except Exception as e:
+         print(f"Translation error (en→si): {e}")
+         return text
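The dual-layer flow in `agent_inference` can be illustrated with a stub in place of the LLM, which shows that the second call receives only the first call's output and never the raw chunks. This is a simplified sketch under stated assumptions: `two_layer_answer`, `format_docs`, `stub_llm`, the dict-based documents, and the file name `fst.pdf` are all hypothetical, and the prompts are shortened placeholders rather than the templates in the file above.

```python
from typing import Callable, Dict, List


def format_docs(docs: List[Dict[str, str]]) -> str:
    """Join retrieved chunks into one labelled text block."""
    if not docs:
        return "No retrieved context."
    parts = []
    for i, doc in enumerate(docs, start=1):
        source = doc.get("source", "Unknown source")
        parts.append(f"[Document {i}] Source: {source}\n{doc['text'].strip()}")
    return "\n\n".join(parts)


def two_layer_answer(
    question: str,
    docs: List[Dict[str, str]],
    llm: Callable[[str], str],
    language: str = "en",
) -> str:
    """Run extraction first, then generate from the extraction only."""
    # Layer 1 sees the raw chunks and returns a structured summary.
    structured = llm(
        f"EXTRACT\nQuestion: {question}\nChunks:\n{format_docs(docs)}"
    )
    # Layer 2 never sees the raw chunks, only the Layer 1 output.
    instruction = (
        "You MUST respond entirely in Sinhala language."
        if language == "si"
        else "Respond in English."
    )
    return llm(
        f"ANSWER\n{instruction}\nQuestion: {question}\n"
        f"Structured context:\n{structured}"
    )


# Recording stub: stores each prompt and returns a numbered placeholder.
calls: List[str] = []


def stub_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"<output {len(calls)}>"


docs = [{"source": "fst.pdf", "text": " You need an apprenticeship for most trades. "}]
result = two_layer_answer("Who can apply?", docs, stub_llm, language="si")
print(result)
print(len(calls))
```

One consequence of this design is that anything Layer 1 drops is invisible to Layer 2, so the quality of the extraction prompt bounds the quality of the final answer.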
update_kb.py ADDED
@@ -0,0 +1,43 @@
+ import os
+ from langchain_community.document_loaders import PyPDFDirectoryLoader
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
+ from langchain_huggingface import HuggingFaceEmbeddings
+ from langchain_community.vectorstores import FAISS
+
+ # ── Load all PDFs from KB folder ─────────────────────────────────────────────
+ print("Loading PDFs from KB folder...")
+
+ loader = PyPDFDirectoryLoader("KB")
+ docs = loader.load()
+
+ print(f"Loaded {len(docs)} pages from KB folder.")
+
+ # ── Split into chunks ─────────────────────────────────────────────────────────
+ print("Splitting into chunks...")
+
+ splitter = RecursiveCharacterTextSplitter(
+     chunk_size=1000,
+     chunk_overlap=200
+ )
+
+ all_chunks = splitter.split_documents(docs)
+ print(f"Created {len(all_chunks)} chunks.")
+
+ # ── Load embeddings ───────────────────────────────────────────────────────────
+ print("Loading embedding model...")
+
+ embeddings = HuggingFaceEmbeddings(
+     model_name="BAAI/bge-base-en",
+     model_kwargs={"device": "cpu"},
+     encode_kwargs={"normalize_embeddings": True},
+ )
+
+ # ── Build and save FAISS vector store ────────────────────────────────────────
+ print("Building vector store...")
+
+ persist_directory = "faiss_index"
+
+ vector_store = FAISS.from_documents(all_chunks, embeddings)
+ vector_store.save_local(persist_directory)
+
+ print(f"Done! Database saved to '{persist_directory}'")
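To make the effect of `chunk_size=1000` and `chunk_overlap=200` concrete, the sketch below cuts text with a fixed sliding window. It is deliberately simpler than `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraphs and sentences; `sliding_chunks` is a hypothetical helper that only illustrates how consecutive chunks share their boundary text.

```python
from typing import List


def sliding_chunks(
    text: str, chunk_size: int = 1000, chunk_overlap: int = 200
) -> List[str]:
    """Cut text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    # Each new window starts chunk_size - chunk_overlap characters after the last.
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


# A 2500-character sample whose content varies, so the overlap is observable.
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])
print(chunks[0][800:] == chunks[1][:200])
```

The overlap means a sentence that straddles a boundary appears whole in at least one chunk, at the cost of indexing some text twice.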
vector_db.ipynb ADDED
@@ -0,0 +1,924 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "### Run"
8
+ ]
9
+ },
10
+ {
11
+ "cell_type": "code",
12
+ "execution_count": 1,
13
+ "metadata": {},
14
+ "outputs": [
15
+ {
16
+ "name": "stdout",
17
+ "output_type": "stream",
18
+ "text": [
19
+ "hello\n"
20
+ ]
21
+ }
22
+ ],
23
+ "source": [
24
+ "print(\"hello\")"
25
+ ]
26
+ },
27
+ {
28
+ "cell_type": "code",
29
+ "execution_count": 2,
30
+ "metadata": {},
31
+ "outputs": [
32
+ {
33
+ "name": "stdout",
34
+ "output_type": "stream",
35
+ "text": [
36
+ "Requirement already satisfied: langchain_community in e:\\sethu\\server\\.venv\\lib\\site-packages (0.4.1)\n",
37
+ "Requirement already satisfied: langchain in e:\\sethu\\server\\.venv\\lib\\site-packages (1.1.3)\n",
38
+ "Requirement already satisfied: langchain-core<2.0.0,>=1.0.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (1.2.0)\n",
39
+ "Requirement already satisfied: langchain-classic<2.0.0,>=1.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (1.0.0)\n",
40
+ "Requirement already satisfied: SQLAlchemy<3.0.0,>=1.4.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (2.0.45)\n",
41
+ "Requirement already satisfied: requests<3.0.0,>=2.32.5 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (2.32.5)\n",
42
+ "Requirement already satisfied: PyYAML<7.0.0,>=5.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (6.0.3)\n",
43
+ "Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (3.13.2)\n",
44
+ "Requirement already satisfied: tenacity!=8.4.0,<10.0.0,>=8.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (9.1.2)\n",
45
+ "Requirement already satisfied: dataclasses-json<0.7.0,>=0.6.7 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (0.6.7)\n",
46
+ "Requirement already satisfied: pydantic-settings<3.0.0,>=2.10.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (2.12.0)\n",
47
+ "Requirement already satisfied: langsmith<1.0.0,>=0.1.125 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (0.4.59)\n",
48
+ "Requirement already satisfied: httpx-sse<1.0.0,>=0.4.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (0.4.3)\n",
49
+ "Requirement already satisfied: numpy>=2.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain_community) (2.3.5)\n",
50
+ "Requirement already satisfied: aiohappyeyeballs>=2.5.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from aiohttp<4.0.0,>=3.8.3->langchain_community) (2.6.1)\n",
51
+ "Requirement already satisfied: aiosignal>=1.4.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from aiohttp<4.0.0,>=3.8.3->langchain_community) (1.4.0)\n",
52
+ "Requirement already satisfied: attrs>=17.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from aiohttp<4.0.0,>=3.8.3->langchain_community) (25.4.0)\n",
53
+ "Requirement already satisfied: frozenlist>=1.1.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from aiohttp<4.0.0,>=3.8.3->langchain_community) (1.8.0)\n",
54
+ "Requirement already satisfied: multidict<7.0,>=4.5 in e:\\sethu\\server\\.venv\\lib\\site-packages (from aiohttp<4.0.0,>=3.8.3->langchain_community) (6.7.0)\n",
55
+ "Requirement already satisfied: propcache>=0.2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from aiohttp<4.0.0,>=3.8.3->langchain_community) (0.4.1)\n",
56
+ "Requirement already satisfied: yarl<2.0,>=1.17.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from aiohttp<4.0.0,>=3.8.3->langchain_community) (1.22.0)\n",
57
+ "Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from dataclasses-json<0.7.0,>=0.6.7->langchain_community) (3.26.1)\n",
58
+ "Requirement already satisfied: typing-inspect<1,>=0.4.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from dataclasses-json<0.7.0,>=0.6.7->langchain_community) (0.9.0)\n",
59
+ "Requirement already satisfied: langchain-text-splitters<2.0.0,>=1.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-classic<2.0.0,>=1.0.0->langchain_community) (1.0.0)\n",
60
+ "Requirement already satisfied: pydantic<3.0.0,>=2.7.4 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-classic<2.0.0,>=1.0.0->langchain_community) (2.12.5)\n",
61
+ "Requirement already satisfied: jsonpatch<2.0.0,>=1.33.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.1->langchain_community) (1.33)\n",
62
+ "Requirement already satisfied: packaging<26.0.0,>=23.2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.1->langchain_community) (25.0)\n",
63
+ "Requirement already satisfied: typing-extensions<5.0.0,>=4.7.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.1->langchain_community) (4.15.0)\n",
64
+ "Requirement already satisfied: uuid-utils<1.0,>=0.12.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.1->langchain_community) (0.12.0)\n",
65
+ "Requirement already satisfied: jsonpointer>=1.9 in e:\\sethu\\server\\.venv\\lib\\site-packages (from jsonpatch<2.0.0,>=1.33.0->langchain-core<2.0.0,>=1.0.1->langchain_community) (3.0.0)\n",
66
+ "Requirement already satisfied: httpx<1,>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.1.125->langchain_community) (0.28.1)\n",
67
+ "Requirement already satisfied: orjson>=3.9.14 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.1.125->langchain_community) (3.11.5)\n",
68
+ "Requirement already satisfied: requests-toolbelt>=1.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.1.125->langchain_community) (1.0.0)\n",
69
+ "Requirement already satisfied: zstandard>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.1.125->langchain_community) (0.25.0)\n",
70
+ "Requirement already satisfied: anyio in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.1.125->langchain_community) (4.12.0)\n",
71
+ "Requirement already satisfied: certifi in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.1.125->langchain_community) (2025.11.12)\n",
72
+ "Requirement already satisfied: httpcore==1.* in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.1.125->langchain_community) (1.0.9)\n",
73
+ "Requirement already satisfied: idna in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.1.125->langchain_community) (3.11)\n",
74
+ "Requirement already satisfied: h11>=0.16 in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpcore==1.*->httpx<1,>=0.23.0->langsmith<1.0.0,>=0.1.125->langchain_community) (0.16.0)\n",
75
+ "Requirement already satisfied: annotated-types>=0.6.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-classic<2.0.0,>=1.0.0->langchain_community) (0.7.0)\n",
76
+ "Requirement already satisfied: pydantic-core==2.41.5 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-classic<2.0.0,>=1.0.0->langchain_community) (2.41.5)\n",
77
+ "Requirement already satisfied: typing-inspection>=0.4.2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-classic<2.0.0,>=1.0.0->langchain_community) (0.4.2)\n",
78
+ "Requirement already satisfied: python-dotenv>=0.21.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic-settings<3.0.0,>=2.10.1->langchain_community) (1.2.1)\n",
79
+ "Requirement already satisfied: charset_normalizer<4,>=2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests<3.0.0,>=2.32.5->langchain_community) (3.4.4)\n",
80
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests<3.0.0,>=2.32.5->langchain_community) (2.6.2)\n",
81
+ "Requirement already satisfied: greenlet>=1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from SQLAlchemy<3.0.0,>=1.4.0->langchain_community) (3.3.0)\n",
82
+ "Requirement already satisfied: mypy-extensions>=0.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7.0,>=0.6.7->langchain_community) (1.1.0)\n",
83
+ "Requirement already satisfied: langgraph<1.1.0,>=1.0.2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain) (1.0.5)\n",
84
+ "Requirement already satisfied: langgraph-checkpoint<4.0.0,>=2.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langgraph<1.1.0,>=1.0.2->langchain) (3.0.1)\n",
85
+ "Requirement already satisfied: langgraph-prebuilt<1.1.0,>=1.0.2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langgraph<1.1.0,>=1.0.2->langchain) (1.0.5)\n",
86
+ "Requirement already satisfied: langgraph-sdk<0.4.0,>=0.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langgraph<1.1.0,>=1.0.2->langchain) (0.3.0)\n",
87
+ "Requirement already satisfied: xxhash>=3.5.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langgraph<1.1.0,>=1.0.2->langchain) (3.6.0)\n",
88
+ "Requirement already satisfied: ormsgpack>=1.12.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langgraph-checkpoint<4.0.0,>=2.1.0->langgraph<1.1.0,>=1.0.2->langchain) (1.12.0)\n"
89
+ ]
90
+ }
91
+ ],
92
+ "source": [
93
+ "!pip install langchain_community langchain"
94
+ ]
95
+ },
96
+ {
97
+ "cell_type": "code",
98
+ "execution_count": 3,
99
+ "metadata": {},
100
+ "outputs": [
101
+ {
102
+ "name": "stdout",
103
+ "output_type": "stream",
104
+ "text": [
105
+ "Requirement already satisfied: pypdf in e:\\sethu\\server\\.venv\\lib\\site-packages (6.4.1)\n"
106
+ ]
107
+ }
108
+ ],
109
+ "source": [
110
+ "!pip install pypdf"
111
+ ]
112
+ },
113
+ {
114
+ "cell_type": "code",
115
+ "execution_count": 4,
116
+ "metadata": {},
117
+ "outputs": [
118
+ {
119
+ "name": "stdout",
120
+ "output_type": "stream",
121
+ "text": [
122
+ "Requirement already satisfied: langchain-huggingface in e:\\sethu\\server\\.venv\\lib\\site-packages (1.2.0)\n",
123
+ "Requirement already satisfied: sentence-transformers in e:\\sethu\\server\\.venv\\lib\\site-packages (5.2.0)\n",
124
+ "Requirement already satisfied: tf-keras in e:\\sethu\\server\\.venv\\lib\\site-packages (2.15.0)\n",
125
+ "Requirement already satisfied: huggingface-hub<1.0.0,>=0.33.4 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-huggingface) (0.36.0)\n",
126
+ "Requirement already satisfied: langchain-core<2.0.0,>=1.2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-huggingface) (1.2.0)\n",
127
+ "Requirement already satisfied: tokenizers<1.0.0,>=0.19.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-huggingface) (0.22.1)\n",
128
+ "Requirement already satisfied: filelock in e:\\sethu\\server\\.venv\\lib\\site-packages (from huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (3.20.0)\n",
129
+ "Requirement already satisfied: fsspec>=2023.5.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (2025.12.0)\n",
130
+ "Requirement already satisfied: packaging>=20.9 in e:\\sethu\\server\\.venv\\lib\\site-packages (from huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (25.0)\n",
131
+ "Requirement already satisfied: pyyaml>=5.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (6.0.3)\n",
132
+ "Requirement already satisfied: requests in e:\\sethu\\server\\.venv\\lib\\site-packages (from huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (2.32.5)\n",
133
+ "Requirement already satisfied: tqdm>=4.42.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (4.67.1)\n",
134
+ "Requirement already satisfied: typing-extensions>=3.7.4.3 in e:\\sethu\\server\\.venv\\lib\\site-packages (from huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (4.15.0)\n",
135
+ "Requirement already satisfied: jsonpatch<2.0.0,>=1.33.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (1.33)\n",
136
+ "Requirement already satisfied: langsmith<1.0.0,>=0.3.45 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (0.4.59)\n",
137
+ "Requirement already satisfied: pydantic<3.0.0,>=2.7.4 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (2.12.5)\n",
138
+ "Requirement already satisfied: tenacity!=8.4.0,<10.0.0,>=8.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (9.1.2)\n",
139
+ "Requirement already satisfied: uuid-utils<1.0,>=0.12.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (0.12.0)\n",
140
+ "Requirement already satisfied: jsonpointer>=1.9 in e:\\sethu\\server\\.venv\\lib\\site-packages (from jsonpatch<2.0.0,>=1.33.0->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (3.0.0)\n",
141
+ "Requirement already satisfied: httpx<1,>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (0.28.1)\n",
142
+ "Requirement already satisfied: orjson>=3.9.14 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (3.11.5)\n",
143
+ "Requirement already satisfied: requests-toolbelt>=1.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (1.0.0)\n",
144
+ "Requirement already satisfied: zstandard>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (0.25.0)\n",
+ "Requirement already satisfied: anyio in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (4.12.0)\n",
+ "Requirement already satisfied: certifi in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (2025.11.12)\n",
+ "Requirement already satisfied: httpcore==1.* in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (1.0.9)\n",
+ "Requirement already satisfied: idna in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (3.11)\n",
+ "Requirement already satisfied: h11>=0.16 in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpcore==1.*->httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (0.16.0)\n",
+ "Requirement already satisfied: annotated-types>=0.6.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (0.7.0)\n",
+ "Requirement already satisfied: pydantic-core==2.41.5 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (2.41.5)\n",
+ "Requirement already satisfied: typing-inspection>=0.4.2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-core<2.0.0,>=1.2.0->langchain-huggingface) (0.4.2)\n",
+ "Requirement already satisfied: transformers<6.0.0,>=4.41.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from sentence-transformers) (4.57.3)\n",
+ "Requirement already satisfied: torch>=1.11.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from sentence-transformers) (2.9.1)\n",
+ "Requirement already satisfied: scikit-learn in e:\\sethu\\server\\.venv\\lib\\site-packages (from sentence-transformers) (1.8.0)\n",
+ "Requirement already satisfied: scipy in e:\\sethu\\server\\.venv\\lib\\site-packages (from sentence-transformers) (1.16.3)\n",
+ "Requirement already satisfied: numpy>=1.17 in e:\\sethu\\server\\.venv\\lib\\site-packages (from transformers<6.0.0,>=4.41.0->sentence-transformers) (2.3.5)\n",
+ "Requirement already satisfied: regex!=2019.12.17 in e:\\sethu\\server\\.venv\\lib\\site-packages (from transformers<6.0.0,>=4.41.0->sentence-transformers) (2025.11.3)\n",
+ "Requirement already satisfied: safetensors>=0.4.3 in e:\\sethu\\server\\.venv\\lib\\site-packages (from transformers<6.0.0,>=4.41.0->sentence-transformers) (0.7.0)\n",
+ "Requirement already satisfied: charset_normalizer<4,>=2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests->huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (3.4.4)\n",
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests->huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (2.6.2)\n",
+ "Requirement already satisfied: sympy>=1.13.3 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch>=1.11.0->sentence-transformers) (1.14.0)\n",
+ "Requirement already satisfied: networkx>=2.5.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch>=1.11.0->sentence-transformers) (3.6.1)\n",
+ "Requirement already satisfied: jinja2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch>=1.11.0->sentence-transformers) (3.1.6)\n",
+ "Requirement already satisfied: setuptools in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch>=1.11.0->sentence-transformers) (80.9.0)\n",
+ "Requirement already satisfied: mpmath<1.4,>=1.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from sympy>=1.13.3->torch>=1.11.0->sentence-transformers) (1.3.0)\n",
+ "Requirement already satisfied: colorama in e:\\sethu\\server\\.venv\\lib\\site-packages (from tqdm>=4.42.1->huggingface-hub<1.0.0,>=0.33.4->langchain-huggingface) (0.4.6)\n",
+ "Requirement already satisfied: MarkupSafe>=2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from jinja2->torch>=1.11.0->sentence-transformers) (3.0.3)\n",
+ "Requirement already satisfied: joblib>=1.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from scikit-learn->sentence-transformers) (1.5.2)\n",
+ "Requirement already satisfied: threadpoolctl>=3.2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from scikit-learn->sentence-transformers) (3.6.0)\n"
+ ]
+ }
+ ],
+ "source": [
+ "!pip install langchain-huggingface sentence-transformers tf-keras"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Requirement already satisfied: pip in e:\\sethu\\server\\.venv\\lib\\site-packages (25.3)\n",
+ "Requirement already satisfied: setuptools in e:\\sethu\\server\\.venv\\lib\\site-packages (80.9.0)\n",
+ "Requirement already satisfied: wheel in e:\\sethu\\server\\.venv\\lib\\site-packages (0.45.1)\n"
+ ]
+ }
+ ],
+ "source": [
+ "!pip install --upgrade pip setuptools wheel"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ " error: subprocess-exited-with-error\n",
+ " \n",
+ " × Preparing metadata (pyproject.toml) did not run successfully.\n",
+ " │ exit code: 1\n",
+ " ╰─> [21 lines of output]\n",
+ " + E:\\Sethu\\server\\.venv\\Scripts\\python.exe C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-kadb7ou_\\numpy_7dc8dc27f6d24a66bbea55a3e557354c\\vendored-meson\\meson\\meson.py setup C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-kadb7ou_\\numpy_7dc8dc27f6d24a66bbea55a3e557354c C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-kadb7ou_\\numpy_7dc8dc27f6d24a66bbea55a3e557354c\\.mesonpy-rwuwf6md -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md --native-file=C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-kadb7ou_\\numpy_7dc8dc27f6d24a66bbea55a3e557354c\\.mesonpy-rwuwf6md\\meson-python-native-file.ini\n",
+ " The Meson build system\n",
+ " Version: 1.2.99\n",
+ " Source dir: C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-kadb7ou_\\numpy_7dc8dc27f6d24a66bbea55a3e557354c\n",
+ " Build dir: C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-kadb7ou_\\numpy_7dc8dc27f6d24a66bbea55a3e557354c\\.mesonpy-rwuwf6md\n",
+ " Build type: native build\n",
+ " Project name: NumPy\n",
+ " Project version: 1.26.4\n",
+ " WARNING: Failed to activate VS environment: Could not find C:\\Program Files (x86)\\Microsoft Visual Studio\\Installer\\vswhere.exe\n",
+ " \n",
+ " ..\\meson.build:1:0: ERROR: Unknown compiler(s): [['icl'], ['cl'], ['cc'], ['gcc'], ['clang'], ['clang-cl'], ['pgcc']]\n",
+ " The following exception(s) were encountered:\n",
+ " Running `icl \"\"` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `cl /?` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `cc --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `gcc --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `clang --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `clang-cl /?` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `pgcc --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " \n",
+ " A full log can be found at C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-kadb7ou_\\numpy_7dc8dc27f6d24a66bbea55a3e557354c\\.mesonpy-rwuwf6md\\meson-logs\\meson-log.txt\n",
+ " [end of output]\n",
+ " \n",
+ " note: This error originates from a subprocess, and is likely not a problem with pip.\n",
+ "error: metadata-generation-failed\n",
+ "\n",
+ "× Encountered error while generating package metadata.\n",
+ "╰─> numpy\n",
+ "\n",
+ "note: This is an issue with the package mentioned above, not pip.\n",
+ "hint: See above for details.\n"
+ ]
+ }
+ ],
+ "source": [
+ "!pip install -qU \"langchain-chroma>=0.1.2\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Requirement already satisfied: langchain-groq in e:\\sethu\\server\\.venv\\lib\\site-packages (1.1.1)\n",
+ "Requirement already satisfied: groq<1.0.0,>=0.30.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-groq) (0.37.1)\n",
+ "Requirement already satisfied: langchain-core<2.0.0,>=1.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-groq) (1.2.0)\n",
+ "Requirement already satisfied: anyio<5,>=3.5.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from groq<1.0.0,>=0.30.0->langchain-groq) (4.12.0)\n",
+ "Requirement already satisfied: distro<2,>=1.7.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from groq<1.0.0,>=0.30.0->langchain-groq) (1.9.0)\n",
+ "Requirement already satisfied: httpx<1,>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from groq<1.0.0,>=0.30.0->langchain-groq) (0.28.1)\n",
+ "Requirement already satisfied: pydantic<3,>=1.9.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from groq<1.0.0,>=0.30.0->langchain-groq) (2.12.5)\n",
+ "Requirement already satisfied: sniffio in e:\\sethu\\server\\.venv\\lib\\site-packages (from groq<1.0.0,>=0.30.0->langchain-groq) (1.3.1)\n",
+ "Requirement already satisfied: typing-extensions<5,>=4.10 in e:\\sethu\\server\\.venv\\lib\\site-packages (from groq<1.0.0,>=0.30.0->langchain-groq) (4.15.0)\n",
+ "Requirement already satisfied: idna>=2.8 in e:\\sethu\\server\\.venv\\lib\\site-packages (from anyio<5,>=3.5.0->groq<1.0.0,>=0.30.0->langchain-groq) (3.11)\n",
+ "Requirement already satisfied: certifi in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->groq<1.0.0,>=0.30.0->langchain-groq) (2025.11.12)\n",
+ "Requirement already satisfied: httpcore==1.* in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->groq<1.0.0,>=0.30.0->langchain-groq) (1.0.9)\n",
+ "Requirement already satisfied: h11>=0.16 in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpcore==1.*->httpx<1,>=0.23.0->groq<1.0.0,>=0.30.0->langchain-groq) (0.16.0)\n",
+ "Requirement already satisfied: jsonpatch<2.0.0,>=1.33.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.1.0->langchain-groq) (1.33)\n",
+ "Requirement already satisfied: langsmith<1.0.0,>=0.3.45 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.1.0->langchain-groq) (0.4.59)\n",
+ "Requirement already satisfied: packaging<26.0.0,>=23.2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.1.0->langchain-groq) (25.0)\n",
+ "Requirement already satisfied: pyyaml<7.0.0,>=5.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.1.0->langchain-groq) (6.0.3)\n",
+ "Requirement already satisfied: tenacity!=8.4.0,<10.0.0,>=8.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.1.0->langchain-groq) (9.1.2)\n",
+ "Requirement already satisfied: uuid-utils<1.0,>=0.12.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.1.0->langchain-groq) (0.12.0)\n",
+ "Requirement already satisfied: jsonpointer>=1.9 in e:\\sethu\\server\\.venv\\lib\\site-packages (from jsonpatch<2.0.0,>=1.33.0->langchain-core<2.0.0,>=1.1.0->langchain-groq) (3.0.0)\n",
+ "Requirement already satisfied: orjson>=3.9.14 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.1.0->langchain-groq) (3.11.5)\n",
+ "Requirement already satisfied: requests-toolbelt>=1.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.1.0->langchain-groq) (1.0.0)\n",
+ "Requirement already satisfied: requests>=2.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.1.0->langchain-groq) (2.32.5)\n",
+ "Requirement already satisfied: zstandard>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.1.0->langchain-groq) (0.25.0)\n",
+ "Requirement already satisfied: annotated-types>=0.6.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3,>=1.9.0->groq<1.0.0,>=0.30.0->langchain-groq) (0.7.0)\n",
+ "Requirement already satisfied: pydantic-core==2.41.5 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3,>=1.9.0->groq<1.0.0,>=0.30.0->langchain-groq) (2.41.5)\n",
+ "Requirement already satisfied: typing-inspection>=0.4.2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3,>=1.9.0->groq<1.0.0,>=0.30.0->langchain-groq) (0.4.2)\n",
+ "Requirement already satisfied: charset_normalizer<4,>=2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.1.0->langchain-groq) (3.4.4)\n",
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.1.0->langchain-groq) (2.6.2)\n"
+ ]
+ }
+ ],
+ "source": [
+ "!pip install langchain-groq"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "!pip install -qU langchain-community faiss-cpu"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Requirement already satisfied: torch in e:\\sethu\\server\\.venv\\lib\\site-packages (2.9.1)\n",
+ "Requirement already satisfied: torchvision in e:\\sethu\\server\\.venv\\lib\\site-packages (0.24.1)\n",
+ "Requirement already satisfied: torchaudio in e:\\sethu\\server\\.venv\\lib\\site-packages (2.9.1)\n",
+ "Requirement already satisfied: filelock in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch) (3.20.0)\n",
+ "Requirement already satisfied: typing-extensions>=4.10.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch) (4.15.0)\n",
+ "Requirement already satisfied: sympy>=1.13.3 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch) (1.14.0)\n",
+ "Requirement already satisfied: networkx>=2.5.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch) (3.6.1)\n",
+ "Requirement already satisfied: jinja2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch) (3.1.6)\n",
+ "Requirement already satisfied: fsspec>=0.8.5 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch) (2025.12.0)\n",
+ "Requirement already satisfied: setuptools in e:\\sethu\\server\\.venv\\lib\\site-packages (from torch) (80.9.0)\n",
+ "Requirement already satisfied: numpy in e:\\sethu\\server\\.venv\\lib\\site-packages (from torchvision) (2.3.5)\n",
+ "Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from torchvision) (12.0.0)\n",
+ "Requirement already satisfied: mpmath<1.4,>=1.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from sympy>=1.13.3->torch) (1.3.0)\n",
+ "Requirement already satisfied: MarkupSafe>=2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from jinja2->torch) (3.0.3)\n"
+ ]
+ }
+ ],
+ "source": [
+ "!pip install torch torchvision torchaudio\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Collecting langchain-chroma\n",
+ " Using cached langchain_chroma-1.1.0-py3-none-any.whl.metadata (1.9 kB)\n",
+ "Requirement already satisfied: langchain-groq in e:\\sethu\\server\\.venv\\lib\\site-packages (1.1.1)\n",
+ "Collecting chromadb<2.0.0,>=1.3.5 (from langchain-chroma)\n",
+ " Using cached chromadb-1.3.7-cp39-abi3-win_amd64.whl.metadata (7.3 kB)\n",
+ "Requirement already satisfied: langchain-core<2.0.0,>=1.1.3 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-chroma) (1.2.0)\n",
+ "Requirement already satisfied: numpy>=2.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-chroma) (2.3.5)\n",
+ "Collecting build>=1.0.3 (from chromadb<2.0.0,>=1.3.5->langchain-chroma)\n",
+ " Using cached build-1.3.0-py3-none-any.whl.metadata (5.6 kB)\n",
+ "Requirement already satisfied: pydantic>=1.9 in e:\\sethu\\server\\.venv\\lib\\site-packages (from chromadb<2.0.0,>=1.3.5->langchain-chroma) (2.12.5)\n",
+ "Collecting pybase64>=1.4.1 (from chromadb<2.0.0,>=1.3.5->langchain-chroma)\n",
+ " Using cached pybase64-1.4.3-cp314-cp314-win_amd64.whl.metadata (9.1 kB)\n",
+ "Collecting uvicorn>=0.18.3 (from uvicorn[standard]>=0.18.3->chromadb<2.0.0,>=1.3.5->langchain-chroma)\n",
+ " Using cached uvicorn-0.38.0-py3-none-any.whl.metadata (6.8 kB)\n",
+ "Collecting posthog<6.0.0,>=2.4.0 (from chromadb<2.0.0,>=1.3.5->langchain-chroma)\n",
+ " Using cached posthog-5.4.0-py3-none-any.whl.metadata (5.7 kB)\n",
+ "Requirement already satisfied: typing-extensions>=4.5.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from chromadb<2.0.0,>=1.3.5->langchain-chroma) (4.15.0)\n",
+ "INFO: pip is looking at multiple versions of chromadb to determine which version is compatible with other requirements. This could take a while.\n",
+ "Collecting chromadb<2.0.0,>=1.3.5 (from langchain-chroma)\n",
+ " Using cached chromadb-1.3.6-cp39-abi3-win_amd64.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-1.3.5-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ "Collecting langchain-chroma\n",
+ " Using cached langchain_chroma-1.0.0-py3-none-any.whl.metadata (1.9 kB)\n",
+ "Collecting chromadb<2.0.0,>=1.0.20 (from langchain-chroma)\n",
+ " Using cached chromadb-1.3.4-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.3.3-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ "INFO: pip is still looking at multiple versions of chromadb to determine which version is compatible with other requirements. This could take a while.\n",
+ " Using cached chromadb-1.3.2-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.3.0-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.2.2-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.2.1-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.2.0-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ "INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. See https://pip.pypa.io/warnings/backtracking for guidance. If you want to abort this run, press Ctrl + C.\n",
+ " Using cached chromadb-1.1.1-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.1.0-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.0.21-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.0.20-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ "Collecting langchain-chroma\n",
+ " Using cached langchain_chroma-0.2.6-py3-none-any.whl.metadata (1.1 kB)\n",
+ " Using cached langchain_chroma-0.2.5-py3-none-any.whl.metadata (1.1 kB)\n",
+ "Collecting chromadb>=1.0.9 (from langchain-chroma)\n",
+ " Using cached chromadb-1.0.19-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.0.18-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.0.17-cp39-abi3-win_amd64.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-1.0.16-cp39-abi3-win_amd64.whl.metadata (7.5 kB)\n",
+ " Using cached chromadb-1.0.15-cp39-abi3-win_amd64.whl.metadata (7.1 kB)\n",
+ " Using cached chromadb-1.0.13-cp39-abi3-win_amd64.whl.metadata (7.1 kB)\n",
+ "Collecting posthog>=2.4.0 (from chromadb>=1.0.9->langchain-chroma)\n",
+ " Using cached posthog-7.0.1-py3-none-any.whl.metadata (6.0 kB)\n",
+ "Collecting chromadb>=1.0.9 (from langchain-chroma)\n",
+ " Using cached chromadb-1.0.12-cp39-abi3-win_amd64.whl.metadata (7.0 kB)\n",
+ "Collecting fastapi==0.115.9 (from chromadb>=1.0.9->langchain-chroma)\n",
+ " Using cached fastapi-0.115.9-py3-none-any.whl.metadata (27 kB)\n",
+ "Collecting chromadb>=1.0.9 (from langchain-chroma)\n",
+ " Using cached chromadb-1.0.11-cp39-abi3-win_amd64.whl.metadata (7.0 kB)\n",
+ " Using cached chromadb-1.0.10-cp39-abi3-win_amd64.whl.metadata (7.0 kB)\n",
+ " Using cached chromadb-1.0.9-cp39-abi3-win_amd64.whl.metadata (7.0 kB)\n",
+ "Collecting langchain-chroma\n",
+ " Using cached langchain_chroma-0.2.4-py3-none-any.whl.metadata (1.1 kB)\n",
+ " Using cached langchain_chroma-0.2.3-py3-none-any.whl.metadata (1.1 kB)\n",
+ "Collecting chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0 (from langchain-chroma)\n",
+ " Using cached chromadb-0.6.3-py3-none-any.whl.metadata (6.8 kB)\n",
+ "Collecting chroma-hnswlib==0.7.6 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached chroma_hnswlib-0.7.6.tar.gz (32 kB)\n",
+ " Installing build dependencies: started\n",
+ " Installing build dependencies: finished with status 'done'\n",
+ " Getting requirements to build wheel: started\n",
+ " Getting requirements to build wheel: finished with status 'done'\n",
+ " Preparing metadata (pyproject.toml): started\n",
+ " Preparing metadata (pyproject.toml): finished with status 'done'\n",
+ "Collecting fastapi>=0.95.2 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached fastapi-0.124.4-py3-none-any.whl.metadata (30 kB)\n",
+ "Collecting chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0 (from langchain-chroma)\n",
+ " Using cached chromadb-0.6.2-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.6.1-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.6.0-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.23-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.21-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.20-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.18-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.17-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.16-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.15-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.13-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.3-py3-none-any.whl.metadata (6.8 kB)\n",
+ "Requirement already satisfied: requests>=2.28 in e:\\sethu\\server\\.venv\\lib\\site-packages (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma) (2.32.5)\n",
+ "Collecting chroma-hnswlib==0.7.3 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached chroma-hnswlib-0.7.3.tar.gz (31 kB)\n",
+ " Installing build dependencies: started\n",
+ " Installing build dependencies: finished with status 'done'\n",
+ " Getting requirements to build wheel: started\n",
+ " Getting requirements to build wheel: finished with status 'done'\n",
+ " Preparing metadata (pyproject.toml): started\n",
+ " Preparing metadata (pyproject.toml): finished with status 'done'\n",
+ "Collecting chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0 (from langchain-chroma)\n",
+ " Using cached chromadb-0.5.2-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.1-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.5.0-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.24-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.23-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.22-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.21-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.20-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.19-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.18-py3-none-any.whl.metadata (7.4 kB)\n",
+ " Using cached chromadb-0.4.17-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.16-py3-none-any.whl.metadata (7.3 kB)\n",
+ " Using cached chromadb-0.4.15-py3-none-any.whl.metadata (7.2 kB)\n",
+ " Using cached chromadb-0.4.14-py3-none-any.whl.metadata (7.0 kB)\n",
+ " Using cached chromadb-0.4.13-py3-none-any.whl.metadata (7.0 kB)\n",
+ " Using cached chromadb-0.4.12-py3-none-any.whl.metadata (7.0 kB)\n",
+ "Collecting pydantic<2.0,>=1.9 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached pydantic-1.10.24-py3-none-any.whl.metadata (154 kB)\n",
+ "Collecting fastapi<0.100.0,>=0.95.2 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached fastapi-0.99.1-py3-none-any.whl.metadata (23 kB)\n",
+ "Collecting chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0 (from langchain-chroma)\n",
+ " Using cached chromadb-0.4.11-py3-none-any.whl.metadata (7.0 kB)\n",
+ " Using cached chromadb-0.4.10-py3-none-any.whl.metadata (7.0 kB)\n",
+ " Using cached chromadb-0.4.9-py3-none-any.whl.metadata (7.0 kB)\n",
+ "Collecting chroma-hnswlib==0.7.2 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached chroma-hnswlib-0.7.2.tar.gz (31 kB)\n",
+ " Installing build dependencies: started\n",
+ " Installing build dependencies: finished with status 'done'\n",
+ " Getting requirements to build wheel: started\n",
+ " Getting requirements to build wheel: finished with status 'done'\n",
+ " Preparing metadata (pyproject.toml): started\n",
+ " Preparing metadata (pyproject.toml): finished with status 'done'\n",
+ "Collecting chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0 (from langchain-chroma)\n",
+ " Using cached chromadb-0.4.8-py3-none-any.whl.metadata (6.9 kB)\n",
+ " Using cached chromadb-0.4.7-py3-none-any.whl.metadata (6.9 kB)\n",
+ " Using cached chromadb-0.4.6-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.4.5-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.4.4-py3-none-any.whl.metadata (6.8 kB)\n",
+ " Using cached chromadb-0.4.3-py3-none-any.whl.metadata (6.9 kB)\n",
+ "Collecting pandas>=1.3 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached pandas-2.3.3-cp314-cp314-win_amd64.whl.metadata (19 kB)\n",
+ "Collecting chroma-hnswlib==0.7.1 (from chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0->langchain-chroma)\n",
+ " Using cached chroma-hnswlib-0.7.1.tar.gz (30 kB)\n",
+ " Installing build dependencies: started\n",
+ " Installing build dependencies: finished with status 'done'\n",
+ " Getting requirements to build wheel: started\n",
+ " Getting requirements to build wheel: finished with status 'done'\n",
+ " Preparing metadata (pyproject.toml): started\n",
+ " Preparing metadata (pyproject.toml): finished with status 'done'\n",
+ "Collecting chromadb!=0.5.10,!=0.5.11,!=0.5.12,!=0.5.4,!=0.5.5,!=0.5.7,!=0.5.9,<0.7.0,>=0.4.0 (from langchain-chroma)\n",
+ " Using cached chromadb-0.4.2-py3-none-any.whl.metadata (6.9 kB)\n",
+ " Using cached chromadb-0.4.1-py3-none-any.whl.metadata (6.9 kB)\n",
+ " Using cached chromadb-0.4.0-py3-none-any.whl.metadata (6.9 kB)\n",
+ "Collecting langchain-chroma\n",
+ " Using cached langchain_chroma-0.2.2-py3-none-any.whl.metadata (1.3 kB)\n",
+ "Collecting langchain-core!=0.3.0,!=0.3.1,!=0.3.10,!=0.3.11,!=0.3.12,!=0.3.13,!=0.3.14,!=0.3.2,!=0.3.3,!=0.3.4,!=0.3.5,!=0.3.6,!=0.3.7,!=0.3.8,!=0.3.9,<0.4.0,>=0.2.43 (from langchain-chroma)\n",
+ " Using cached langchain_core-0.3.80-py3-none-any.whl.metadata (3.2 kB)\n",
+ "Collecting numpy<2.0.0,>=1.26.2 (from langchain-chroma)\n",
+ " Using cached numpy-1.26.4.tar.gz (15.8 MB)\n",
+ " Installing build dependencies: started\n",
+ " Installing build dependencies: finished with status 'done'\n",
+ " Getting requirements to build wheel: started\n",
+ " Getting requirements to build wheel: finished with status 'done'\n",
+ " Installing backend dependencies: started\n",
+ " Installing backend dependencies: finished with status 'done'\n",
+ " Preparing metadata (pyproject.toml): started\n",
+ " Preparing metadata (pyproject.toml): finished with status 'error'\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ " error: subprocess-exited-with-error\n",
+ " \n",
+ " × Preparing metadata (pyproject.toml) did not run successfully.\n",
+ " │ exit code: 1\n",
+ " ╰─> [21 lines of output]\n",
+ " + E:\\Sethu\\server\\.venv\\Scripts\\python.exe C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-felxd0x6\\numpy_aa9d24349cad45e09f0869ed93e74c3e\\vendored-meson\\meson\\meson.py setup C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-felxd0x6\\numpy_aa9d24349cad45e09f0869ed93e74c3e C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-felxd0x6\\numpy_aa9d24349cad45e09f0869ed93e74c3e\\.mesonpy-wq6ry_x4 -Dbuildtype=release -Db_ndebug=if-release -Db_vscrt=md --native-file=C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-felxd0x6\\numpy_aa9d24349cad45e09f0869ed93e74c3e\\.mesonpy-wq6ry_x4\\meson-python-native-file.ini\n",
+ " The Meson build system\n",
+ " Version: 1.2.99\n",
+ " Source dir: C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-felxd0x6\\numpy_aa9d24349cad45e09f0869ed93e74c3e\n",
+ " Build dir: C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-felxd0x6\\numpy_aa9d24349cad45e09f0869ed93e74c3e\\.mesonpy-wq6ry_x4\n",
+ " Build type: native build\n",
+ " Project name: NumPy\n",
+ " Project version: 1.26.4\n",
+ " WARNING: Failed to activate VS environment: Could not find C:\\Program Files (x86)\\Microsoft Visual Studio\\Installer\\vswhere.exe\n",
+ " \n",
+ " ..\\meson.build:1:0: ERROR: Unknown compiler(s): [['icl'], ['cl'], ['cc'], ['gcc'], ['clang'], ['clang-cl'], ['pgcc']]\n",
+ " The following exception(s) were encountered:\n",
+ " Running `icl \"\"` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `cl /?` gave \"[WinError 2] The system cannot find the file specified\"\n",
+ " Running `cc --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
530
+ " Running `gcc --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
531
+ " Running `clang --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
532
+ " Running `clang-cl /?` gave \"[WinError 2] The system cannot find the file specified\"\n",
533
+ " Running `pgcc --version` gave \"[WinError 2] The system cannot find the file specified\"\n",
534
+ " \n",
535
+ " A full log can be found at C:\\Users\\User\\AppData\\Local\\Temp\\pip-install-felxd0x6\\numpy_aa9d24349cad45e09f0869ed93e74c3e\\.mesonpy-wq6ry_x4\\meson-logs\\meson-log.txt\n",
536
+ " [end of output]\n",
537
+ " \n",
538
+ " note: This error originates from a subprocess, and is likely not a problem with pip.\n",
539
+ "error: metadata-generation-failed\n",
540
+ "\n",
541
+ "× Encountered error while generating package metadata.\n",
542
+ "╰─> numpy\n",
543
+ "\n",
544
+ "note: This is an issue with the package mentioned above, not pip.\n",
545
+ "hint: See above for details.\n"
546
+ ]
547
+ }
548
+ ],
549
+ "source": [
550
+ "!pip install langchain-chroma langchain-groq"
551
+ ]
552
+ },
553
+ {
554
+ "cell_type": "code",
555
+ "execution_count": 11,
556
+ "metadata": {},
557
+ "outputs": [
558
+ {
559
+ "name": "stdout",
560
+ "output_type": "stream",
561
+ "text": [
562
+ "Requirement already satisfied: langchain-text-splitters in e:\\sethu\\server\\.venv\\lib\\site-packages (1.0.0)\n",
563
+ "Requirement already satisfied: langchain-core<2.0.0,>=1.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-text-splitters) (1.2.0)\n",
564
+ "Requirement already satisfied: jsonpatch<2.0.0,>=1.33.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (1.33)\n",
565
+ "Requirement already satisfied: langsmith<1.0.0,>=0.3.45 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (0.4.59)\n",
566
+ "Requirement already satisfied: packaging<26.0.0,>=23.2.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (25.0)\n",
567
+ "Requirement already satisfied: pydantic<3.0.0,>=2.7.4 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (2.12.5)\n",
568
+ "Requirement already satisfied: pyyaml<7.0.0,>=5.3.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (6.0.3)\n",
569
+ "Requirement already satisfied: tenacity!=8.4.0,<10.0.0,>=8.1.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (9.1.2)\n",
570
+ "Requirement already satisfied: typing-extensions<5.0.0,>=4.7.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (4.15.0)\n",
571
+ "Requirement already satisfied: uuid-utils<1.0,>=0.12.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (0.12.0)\n",
572
+ "Requirement already satisfied: jsonpointer>=1.9 in e:\\sethu\\server\\.venv\\lib\\site-packages (from jsonpatch<2.0.0,>=1.33.0->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (3.0.0)\n",
573
+ "Requirement already satisfied: httpx<1,>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (0.28.1)\n",
574
+ "Requirement already satisfied: orjson>=3.9.14 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (3.11.5)\n",
575
+ "Requirement already satisfied: requests-toolbelt>=1.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (1.0.0)\n",
576
+ "Requirement already satisfied: requests>=2.0.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (2.32.5)\n",
577
+ "Requirement already satisfied: zstandard>=0.23.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (0.25.0)\n",
578
+ "Requirement already satisfied: anyio in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (4.12.0)\n",
579
+ "Requirement already satisfied: certifi in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (2025.11.12)\n",
580
+ "Requirement already satisfied: httpcore==1.* in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (1.0.9)\n",
581
+ "Requirement already satisfied: idna in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (3.11)\n",
582
+ "Requirement already satisfied: h11>=0.16 in e:\\sethu\\server\\.venv\\lib\\site-packages (from httpcore==1.*->httpx<1,>=0.23.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (0.16.0)\n",
583
+ "Requirement already satisfied: annotated-types>=0.6.0 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (0.7.0)\n",
584
+ "Requirement already satisfied: pydantic-core==2.41.5 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (2.41.5)\n",
585
+ "Requirement already satisfied: typing-inspection>=0.4.2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from pydantic<3.0.0,>=2.7.4->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (0.4.2)\n",
586
+ "Requirement already satisfied: charset_normalizer<4,>=2 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (3.4.4)\n",
587
+ "Requirement already satisfied: urllib3<3,>=1.21.1 in e:\\sethu\\server\\.venv\\lib\\site-packages (from requests>=2.0.0->langsmith<1.0.0,>=0.3.45->langchain-core<2.0.0,>=1.0.0->langchain-text-splitters) (2.6.2)\n"
588
+ ]
589
+ }
590
+ ],
591
+ "source": [
592
+ "!pip install langchain-text-splitters"
593
+ ]
594
+ },
595
+ {
596
+ "cell_type": "markdown",
597
+ "metadata": {},
598
+ "source": []
599
+ },
600
+ {
601
+ "cell_type": "code",
602
+ "execution_count": null,
603
+ "metadata": {},
604
+ "outputs": [],
605
+ "source": [
606
+ "import os\n",
607
+ "from dotenv import load_dotenv\n",
608
+ "\n",
609
+ "load_dotenv()\n",
610
+ "\n",
611
+ "\n",
612
+ "GROQ_API_KEY = os.getenv('GROQ_API_KEY')  # read the Groq API key from the environment (.env); the LLM client needs it"
613
+ ]
614
+ },
615
+ {
616
+ "cell_type": "markdown",
617
+ "metadata": {},
618
+ "source": []
619
+ },
620
+ {
621
+ "cell_type": "code",
622
+ "execution_count": null,
623
+ "metadata": {},
624
+ "outputs": [
625
+ {
626
+ "name": "stderr",
627
+ "output_type": "stream",
628
+ "text": [
629
+ "e:\\Sethu\\server\\.venv\\Lib\\site-packages\\langchain_core\\_api\\deprecation.py:26: UserWarning: Core Pydantic V1 functionality isn't compatible with Python 3.14 or greater.\n",
630
+ " from pydantic.v1.fields import FieldInfo as FieldInfoV1\n",
631
+ "e:\\Sethu\\server\\.venv\\Lib\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
632
+ " from .autonotebook import tqdm as notebook_tqdm\n"
633
+ ]
634
+ },
635
+ {
636
+ "data": {
637
+ "text/plain": [
638
+ "33"
639
+ ]
640
+ },
641
+ "execution_count": 13,
642
+ "metadata": {},
643
+ "output_type": "execute_result"
644
+ }
645
+ ],
646
+ "source": [
647
+ "from langchain_community.document_loaders import PyPDFDirectoryLoader\n",
648
+ "\n",
649
+ "lpdf = PyPDFDirectoryLoader(  # load every PDF found in the KB folder\n",
650
+ " \"KB\"\n",
651
+ ")\n",
652
+ "\n",
653
+ "loaded_docs = lpdf.load()\n",
654
+ "len(loaded_docs)  # number of loaded Documents (one per PDF page, not one per file)"
655
+ ]
656
+ },
657
+ {
658
+ "cell_type": "code",
659
+ "execution_count": null,
660
+ "metadata": {},
661
+ "outputs": [
662
+ {
663
+ "data": {
664
+ "text/plain": [
665
+ "87"
666
+ ]
667
+ },
668
+ "execution_count": 14,
669
+ "metadata": {},
670
+ "output_type": "execute_result"
671
+ }
672
+ ],
673
+ "source": [
674
+ "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
675
+ "\n",
676
+ "tsplitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)\n",
677
+ "splitted_text = tsplitter.split_documents(loaded_docs)  # split the pages into overlapping chunks\n",
678
+ "\n",
679
+ "len(splitted_text)  # number of chunks produced by the splitter"
680
+ ]
681
+ },
682
+ {
683
+ "cell_type": "code",
684
+ "execution_count": null,
685
+ "metadata": {},
686
+ "outputs": [
687
+ {
688
+ "name": "stderr",
689
+ "output_type": "stream",
690
+ "text": [
691
+ "C:\\Users\\User\\AppData\\Local\\Temp\\ipykernel_20848\\2708684703.py:6: LangChainDeprecationWarning: The class `HuggingFaceBgeEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the `langchain-huggingface package and should be used instead. To use it run `pip install -U `langchain-huggingface` and import as `from `langchain_huggingface import HuggingFaceEmbeddings``.\n",
692
+ " bge_embeddings = HuggingFaceBgeEmbeddings(\n"
693
+ ]
694
+ }
695
+ ],
696
+ "source": [
697
+ "from langchain_community.embeddings import HuggingFaceBgeEmbeddings\n",
698
+ "\n",
699
+ "model_name = \"BAAI/bge-base-en\"  # embedding model that turns each chunk into a dense vector\n",
700
+ "encode_kwargs = {'normalize_embeddings': True}\n",
701
+ "\n",
702
+ "bge_embeddings = HuggingFaceBgeEmbeddings(\n",
703
+ " model_name=model_name,\n",
704
+ " model_kwargs={'device': 'cpu'},\n",
705
+ " encode_kwargs=encode_kwargs  # apply the normalization setting defined above\n",
706
+ ")"
707
+ ]
708
+ },
709
+ {
710
+ "cell_type": "code",
711
+ "execution_count": null,
712
+ "metadata": {},
713
+ "outputs": [],
714
+ "source": [
715
+ "from langchain_community.vectorstores import FAISS\n",
716
+ "\n",
717
+ "vector_store = FAISS.from_documents(  # embed the chunks and build a FAISS index from them\n",
718
+ " documents=splitted_text,\n",
719
+ " embedding=bge_embeddings\n",
720
+ ")\n",
721
+ "\n",
722
+ "persist_directory = 'chroma_vector_db'  # note: despite the name, this directory holds a FAISS index\n",
723
+ "\n",
724
+ "vector_store.save_local(persist_directory)"
725
+ ]
726
+ },
727
+ {
728
+ "cell_type": "code",
729
+ "execution_count": null,
730
+ "metadata": {},
731
+ "outputs": [],
732
+ "source": [
733
+ "new_vector_store = FAISS.load_local(  # reload the saved FAISS index from disk\n",
734
+ " persist_directory, bge_embeddings, allow_dangerous_deserialization=True\n",
735
+ ")"
736
+ ]
737
+ },
738
+ {
739
+ "cell_type": "code",
740
+ "execution_count": null,
741
+ "metadata": {},
742
+ "outputs": [],
743
+ "source": [
744
+ "retriever = new_vector_store.as_retriever(search_kwargs={\"k\": 5})  # return the 5 most similar chunks for each query"
745
+ ]
746
+ },
747
+ {
748
+ "cell_type": "code",
749
+ "execution_count": null,
750
+ "metadata": {},
751
+ "outputs": [],
752
+ "source": [
753
+ "from langchain_core.prompts import ChatPromptTemplate, PromptTemplate\n",
754
+ "from langchain_core.runnables import RunnablePassthrough\n",
755
+ "\n",
756
+ "from langchain_core.output_parsers import StrOutputParser\n",
757
+ "from langchain_core.output_parsers import JsonOutputParser\n",
758
+ "\n",
759
+ "from langchain_groq import ChatGroq\n",
760
+ "\n",
761
+ "opmodel = ChatGroq(  # chat model served through the Groq API\n",
762
+ " model=\"openai/gpt-oss-20b\",  # gpt-oss-20b model hosted on Groq\n",
763
+ " )\n",
764
+ "\n",
765
+ "rag_context_agent_prompt = PromptTemplate(  # prompt that asks the LLM to extract structured facts from the retrieved context\n",
766
+ " template=\"\"\"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n",
767
+ "You are the Context Retrieval Agent for Legal/Immigration Assistance. \n",
768
+ "Your job is ONLY to extract and organize relevant information from the retrieved CONTEXT. \n",
769
+ "You do NOT generate advice, legal guidance, recommendations, or solutions. \n",
770
+ "You only filter, extract, and structure raw factual information.\n",
771
+ "\n",
772
+ "Your task:\n",
773
+ "1. Read all retrieved CONTEXT.\n",
774
+ "2. Extract information about:\n",
775
+ " - target user group (e.g., migrants, asylum seekers)\n",
776
+ " - challenges or pain points\n",
777
+ " - existing digital tools or services\n",
778
+ " - legal or administrative procedures\n",
779
+ " - gaps, inefficiencies, or barriers\n",
780
+ "3. Remove irrelevant content.\n",
781
+ "4. Do NOT rewrite, interpret, or suggest solutions. Only include raw context fragments.\n",
782
+ "5. Do NOT add any new information.\n",
783
+ "\n",
784
+ "Return the extracted information in this structure:\n",
785
+ "\n",
786
+ "**Target User Group:**\n",
787
+ "- Extract exact mentions from context.\n",
788
+ "\n",
789
+ "**Challenges / Pain Points:**\n",
790
+ "- Bullet points of challenges or difficulties described.\n",
791
+ "\n",
792
+ "**Existing Digital Tools / Services:**\n",
793
+ "- List tools or services exactly as mentioned.\n",
794
+ "\n",
795
+ "**Legal / Administrative Procedures Mentioned:**\n",
796
+ "- Extract exact lines mentioning procedures.\n",
797
+ "\n",
798
+ "**Gaps or Barriers:**\n",
799
+ "- Include lines describing inefficiencies, limitations, or obstacles.\n",
800
+ "\n",
801
+ "**Contextual Quotes / Examples (raw context only):**\n",
802
+ "- Include verbatim context lines that illustrate challenges or processes.\n",
803
+ "\n",
804
+ "If no relevant information is found for any section, write:\n",
805
+ "\"Not available in retrieved context.\"\n",
806
+ "\n",
807
+ "If nothing matches at all, write:\n",
808
+ "\"No relevant information found in retrieved context.\"\n",
809
+ "\n",
810
+ "If something in the CONTEXT does not mention immigration challenges or user groups, say:\n",
811
+ "\"There is no mention of immigration challenges for the specified user group.\"\n",
812
+ "\n",
813
+ "<|eot_id|><|start_header_id|>user<|end_header_id|>\n",
814
+ "CONTEXT:\n",
815
+ "{context}\n",
816
+ "\n",
817
+ "Extract only the relevant pieces and return the structured summary above.\n",
818
+ "\n",
819
+ "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n",
820
+ "\"\"\",\n",
821
+ " input_variables=[\"context\"],\n",
822
+ ")"
823
+ ]
824
+ },
825
+ {
826
+ "cell_type": "code",
827
+ "execution_count": null,
828
+ "metadata": {},
829
+ "outputs": [
830
+ {
831
+ "name": "stdout",
832
+ "output_type": "stream",
833
+ "text": [
834
+ "**Target User Group:**\n",
835
+ "- newcomers\n",
836
+ "- people moving to Canada\n",
837
+ "- family members (of newcomers)\n",
838
+ "\n",
839
+ "**Challenges / Pain Points:**\n",
840
+ "- “Moving to Canada may come with opportunities and challenges. It is important to begin preparing for the challenges you may face on arrival in Canada.”\n",
841
+ "- “You will need to collect important documents, arrange where you will stay after arrival, get clothing for the weather, practice the language and other preparations.”\n",
842
+ "- “Collect all important original documents that belong to you and your family members. You may need to translate these documents into English or French depending on where you settle.”\n",
843
+ "- “BEGIN THE FOREIGN CREDENTIAL RECOGNITION PROCESS… Depending on your field of work, this process can take months or even years, and could come with associated costs.”\n",
844
+ "- “Apply for your Social Insurance Number (SIN). It is needed to work in Canada and access government services and benefits. You can get a temporary SIN for a short period. Do not give your SIN to anyone unless you are sure there is a legal reason. If someone asks for your SIN, ask why they need it.”\n",
845
+ "- “Driving in Canada. Learn about provincial or territorial laws to get your driver’s license, or get an International Driving Permit.”\n",
846
+ "- “Improve your English or French. There are many language training options and some have no cost.”\n",
847
+ "- “Make connections and build relationships in your community. Your community can help with your settlement.”\n",
848
+ "\n",
849
+ "**Existing Digital Tools / Services:**\n",
850
+ "- “About pre-arrival services” webpage (web page is available in English and French)\n",
851
+ "- “Crossing the border: documents you need” webpage\n",
852
+ "- “Personal Effects Accounting Document” webpage\n",
853
+ "- “Find free newcomer services near you” webpage (web page is available in English and French)\n",
854
+ "- “Social Insurance Number– Overview” webpage (web page is available in English and French)\n",
855
+ "- “Language Training Options for Newcomers to Canada” video (available in English and French)\n",
856
+ "- “Your Rights and Freedoms in Canada” webpage (web page is available in English and French)\n",
857
+ "\n",
858
+ "**Legal / Administrative Procedures Mentioned:**\n",
859
+ "- “Apply for your Social Insurance Number (SIN).”\n",
860
+ "- “You can get a temporary SIN for a short period.”\n",
861
+ "- “Driving in Canada. Learn about provincial or territorial laws to get your driver’s license, or get an International Driving Permit.”\n",
862
+ "- “BEGIN THE FOREIGN CREDENTIAL RECOGNITION PROCESS… you will need a special licence or certificate before you can work in those jobs.”\n",
863
+ "- “Collect all important original documents that belong to you and your family members. You may need to translate these documents into English or French depending on where you settle.”\n",
864
+ "- “Gather documents such as passports, diplomas, language proficiency scores, vaccination records, medical records, dental and eye records, driver’s licences, birth and death certificates, marriage or divorce certificates, proof of work experience.”\n",
865
+ "\n",
866
+ "**Gaps or Barriers:**\n",
867
+ "- “Depending on your field of work, this process can take months or even years, and could come with associated costs.”\n",
868
+ "- “You may need to translate these documents into English or French depending on where you settle.”\n",
869
+ "- “You will need a special licence or certificate before you can work in those jobs.”\n",
870
+ "- “Do not give your SIN to anyone unless you are sure there is a legal reason.”\n",
871
+ "\n",
872
+ "**Contextual Quotes / Examples (raw context only):**\n",
873
+ "- “Moving to Canada may come with opportunities and challenges. It is important to begin preparing for the challenges you may face on arrival in Canada.”\n",
874
+ "- “You will need to collect important documents, arrange where you will stay after arrival, get clothing for the weather, practice the language and other preparations.”\n",
875
+ "- “Collect all important original documents that belong to you and your family members. You may need to translate these documents into English or French depending on where you settle.”\n",
876
+ "- “BEGIN THE FOREIGN CREDENTIAL RECOGNITION PROCESS In Canada, some jobs are regulated, which means you will need a special licence or certificate before you can work in those jobs. It is better to start the process of having your credentials recognized as early as possible, to work in a regulated profession in Canada. You can start this process even before you leave your country. Depending on your field of work, this process can take months or even years, and could come with associated costs.”\n",
877
+ "- “Apply for your Social Insurance Number (SIN). It is needed to work in Canada and access government services and benefits. You can get a temporary SIN for a short period. Do not give your SIN to anyone unless you are sure there is a legal reason. If someone asks for your SIN, ask why they need it.”\n",
878
+ "- “Improve your English or French. There are many language training options and some have no cost. To learn more, consult Language Training Options for Newcomers to Canada (video is available in English and French) or visit the settlement provider organization in your community.”\n",
879
+ "- “Driving in Canada. Learn about provincial or territorial laws to get your driver’s license, or get an International Driving Permit.”\n",
880
+ "- “Learn about Canadian laws and your rights and civic responsibilities as a resident of Canada. To learn more, consult: Your Rights and Freedoms in Canada (web page is available in English and French).”\n"
881
+ ]
882
+ }
883
+ ],
884
+ "source": [
885
+ "rag_prompt_chain = rag_context_agent_prompt | opmodel | StrOutputParser()  # prompt -> LLM -> plain string\n",
886
+ "\n",
887
+ "QUESTION = \"Extract all relevant information about the challenges, tools, and barriers for migrants navigating immigration processes to Canada.\"\n",
888
+ "CONTEXT = retriever.invoke(QUESTION)  # retrieve the top-k chunks for the question\n",
889
+ "\n",
890
+ "result = rag_prompt_chain.invoke({\"context\":CONTEXT})  # run the chain on the retrieved context\n",
891
+ "\n",
892
+ "print(result)"
893
+ ]
894
+ },
895
+ {
896
+ "cell_type": "code",
897
+ "execution_count": null,
898
+ "metadata": {},
899
+ "outputs": [],
900
+ "source": []
901
+ }
902
+ ],
903
+ "metadata": {
904
+ "kernelspec": {
905
+ "display_name": ".venv",
906
+ "language": "python",
907
+ "name": "python3"
908
+ },
909
+ "language_info": {
910
+ "codemirror_mode": {
911
+ "name": "ipython",
912
+ "version": 3
913
+ },
914
+ "file_extension": ".py",
915
+ "mimetype": "text/x-python",
916
+ "name": "python",
917
+ "nbconvert_exporter": "python",
918
+ "pygments_lexer": "ipython3",
919
+ "version": "3.14.2"
920
+ }
921
+ },
922
+ "nbformat": 4,
923
+ "nbformat_minor": 2
924
+ }
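
The last notebook cell passes the retriever's result (a list of document objects) straight into the `{context}` slot of the prompt, so the model sees the Python representation of that list rather than clean text. A minimal sketch of one way to flatten the chunks first is shown below. It is an illustration, not the author's code: `Doc` is a hypothetical stand-in for a LangChain `Document`, `format_docs` is an assumed helper name, and `KB/guide.pdf` is a made-up file path.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs, separator="\n\n---\n\n"):
    """Join retrieved chunks into one plain-text context string.

    Each chunk is prefixed with its index, source, and page so the
    model (and a human reader) can tell the chunks apart.
    """
    parts = []
    for i, doc in enumerate(docs, start=1):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "?")
        header = f"[{i}] source={source} page={page}"
        parts.append(f"{header}\n{doc.page_content.strip()}")
    return separator.join(parts)


# Made-up sample chunks; the file name is an assumption for illustration.
docs = [
    Doc("Apply for your Social Insurance Number (SIN).",
        {"source": "KB/guide.pdf", "page": 3}),
    Doc("Improve your English or French.  ",
        {"source": "KB/guide.pdf", "page": 4}),
]
context = format_docs(docs)
print(context)
```

In the notebook, the corresponding call would presumably be `rag_prompt_chain.invoke({"context": format_docs(CONTEXT)})`, assuming the retrieved items expose `page_content` and `metadata` the way LangChain documents do.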