Sarah Bentley committed
Commit abf7d79 · 0 Parent(s)

Initial commit
.gitignore ADDED
@@ -0,0 +1,5 @@
+ venv/
+ __pycache__/
+ .env
+ *.pyc
+ .ipynb_checkpoints/
README.md ADDED
@@ -0,0 +1,124 @@
+ # Boston Public School Selection Chatbot
+
+ This is a skeleton repo for designing a school-choice chatbot. Feel free to change it however you'd like! The end goal: build the chatbot and upload it to a Hugging Face Space. Instructions for doing so are below.
+
+ ## Setup
+
+ 1. Install the required dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 2. Get access to the LLaMA model:
+ - Visit [Hugging Face](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
+ - Request access to the LLaMA 2 model
+ - Once approved, log in to Hugging Face:
+ ```bash
+ huggingface-cli login
+ ```
+
+ 3. Run the chatbot:
+ ```bash
+ python app.py
+ ```
+
+ ## Deploying to Hugging Face
+
+ To deploy your chatbot as a free web interface using Hugging Face Spaces:
+
+ 1. Create a Hugging Face Space:
+ - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
+ - Click "Create new Space"
+ - Choose a name for your space (e.g., "boston-school-chatbot")
+ - Select "Gradio" as the SDK
+ - Choose "CPU" as the hardware (free tier)
+ - Make it "Public" so others can use your chatbot
+
+ 2. Prepare your files:
+ Your repository should already have all the needed files:
+ ```
+ 6.so41-midterm/
+ ├── README.md          # Description of your chatbot
+ ├── app.py             # Your Gradio interface
+ ├── requirements.txt   # Already set up with needed dependencies
+ └── src/               # Your implementation files
+ ```
+
+ 3. Push your code to the Space:
+ ```bash
+ git init
+ git add .
+ git commit -m "Initial commit"
+ git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
+ git push -u origin main
+ ```
+
+ 4. Important Free Tier Considerations:
+ - Use the TinyLlama model (already configured in `src/model.py`)
+ - Free CPU Spaces have a 2GB RAM limit
+ - Responses might be slower than in local testing
+ - The interface might queue requests when multiple users access it
+
+ 5. After Deployment:
+ - Your chatbot will be available at: `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`
+ - Anyone can use it through their web browser
+ - You can update the deployment anytime by pushing changes:
+ ```bash
+ git add .
+ git commit -m "Update chatbot"
+ git push
+ ```
+
+ 6. Troubleshooting:
+ - Check the Space's logs if the chatbot isn't working
+ - Make sure you're using the TinyLlama model
+ - Verify the chatbot works locally before deploying
+ - Remember the free tier has limited resources
+
+ Your chatbot should now be accessible to anyone through their web browser!
+
+ ## Repository Organization
+
+ ```
+ boston-school-chatbot/
+ ├── app.py                           # Gradio web interface - implement the chat function
+ ├── requirements.txt                 # Python dependencies
+ ├── chatbot_development.ipynb        # Notebook for developing and testing your chatbot
+ ├── chatbot_conversation_example.txt # Example conversation we might want to have with this chatbot
+ └── src/
+     ├── model.py                     # Model loading/saving (already implemented)
+     └── chat.py                      # SchoolChatbot class (implement this)
+ ```
+
+ ### Key Files:
+
+ - **app.py**: Creates the web interface using Gradio. You only need to implement the `chat` function that generates responses.
+
+ - **model.py**: Handles loading and saving of LLaMA models. This is already implemented.
+
+ - **chat.py**: Contains the `SchoolChatbot` class, where you'll implement:
+   - `format_prompt`: Format user input into proper prompts
+   - `get_response`: Generate responses using the model
+
+ - **chatbot_development.ipynb**: Jupyter notebook for:
+   - Loading and testing your model
+   - Experimenting with the chatbot
+   - Trying different approaches
+   - Testing responses before deployment
+
+ ### What You Need to Implement:
+
+ 1. In `chat.py`:
+    - Complete the `SchoolChatbot` class methods
+    - Design how the chatbot formats prompts
+    - Implement response generation
+
+ 2. In `app.py`:
+    - Implement the `chat` function to work with Gradio
+    - The rest of the file is already set up
+
+ 3. Use `chatbot_development.ipynb` to:
+    - Develop and test your implementation
+    - Try different approaches
+    - Verify everything works before deployment
app.py ADDED
@@ -0,0 +1,82 @@
+ """
+ Gradio Web Interface for Boston School Chatbot
+
+ This script creates a web interface for your chatbot using Gradio.
+ You only need to implement the chat function.
+
+ Key Features:
+ - Creates a web UI for your chatbot
+ - Handles conversation history
+ - Provides example questions
+ - Can be deployed to Hugging Face Spaces
+
+ Example Usage:
+     # Run locally:
+     python app.py
+
+     # Access in browser:
+     # http://localhost:7860
+ """
+
+ import gradio as gr
+ from src.model import load_model
+ from src.chat import SchoolChatbot
+
+ def create_chatbot():
+     """
+     Creates and configures the chatbot interface.
+     """
+     model, tokenizer = load_model()
+     chatbot = SchoolChatbot(model, tokenizer)
+
+     def chat(message, history):
+         """
+         TODO: Generate a response for the current message in a Gradio chat interface.
+
+         This function is called by Gradio's ChatInterface every time a user sends a message.
+         You only need to generate and return the assistant's response - Gradio handles the
+         chat display and history management automatically.
+
+         Args:
+             message (str): The current message from the user
+             history (list): List of previous message pairs, where each pair is
+                 [user_message, assistant_message]
+                 Example:
+                     [
+                         ["What schools offer Spanish?", "The Hernandez School..."],
+                         ["Where is it located?", "The Hernandez School is in Roxbury..."]
+                     ]
+
+         Returns:
+             str: The assistant's response to the current message.
+
+         Note:
+             - Gradio automatically:
+                 - Displays the user's message
+                 - Displays your returned response
+                 - Updates the chat history
+                 - Maintains the chat interface
+             - You only need to:
+                 - Generate an appropriate response to the current message
+                 - Return that response as a string
+         """
+         # TODO: Generate and return response
+         pass
+
+     # Create Gradio interface
+     demo = gr.ChatInterface(
+         chat,
+         title="Boston Public School Selection Assistant",
+         description="Ask me anything about Boston public schools!",
+         examples=[
+             "What schools in Jamaica Plain offer Spanish programs?",
+             "How do I schedule a tour of the Hernandez School?"
+         ]
+     )
+
+     return demo
+
+ if __name__ == "__main__":
+     demo = create_chatbot()
+     demo.launch()
chatbot_conversation_example.txt ADDED
@@ -0,0 +1,71 @@
+ User: Hi, I'm looking for a public school in Boston for my child who will be starting kindergarten next year.
+
+ Bot: Hello! I'd be happy to help you find the right public school in Boston. To better assist you, could you tell me:
+ 1. Which neighborhood do you live in?
+ 2. Do you have any specific preferences (like language programs, special education services, or after-school programs)?
+ 3. Would you prefer a traditional public school or a charter school?
+
+ User: We live in Jamaica Plain, and I'm interested in schools with strong language programs. My child is already showing interest in learning Spanish.
+
+ Bot: Great! I can help you with that. There are several excellent options in Jamaica Plain and nearby areas. Let me suggest a few schools that might be a good fit:
+
+ 1. The Hernandez K-8 School in Roxbury (about 1.5 miles from JP) offers a Spanish-English dual language program starting from kindergarten. They have a strong academic reputation and diverse student body.
+
+ 2. The Mozart Elementary School in JP has a Spanish language immersion program and is known for its strong arts integration.
+
+ 3. The Mission Hill K-8 School in Roxbury offers Spanish language classes and has a progressive education model.
+
+ Would you like to know more about any of these schools? I can provide details about:
+ - Class sizes
+ - Test scores
+ - Special programs
+ - Parent reviews
+ - Transportation options
+
+ User: Could you tell me more about the Hernandez School? I've heard good things about it.
+
+ Bot: Certainly! The Hernandez K-8 School has several notable features:
+
+ - Their dual language program is one of the strongest in the district, with 50% of instruction in Spanish and 50% in English
+ - Average class size is 22 students
+ - They offer free breakfast and lunch programs
+ - The school has a strong parent-teacher organization
+ - They provide after-school programs until 6 PM
+ - The school is accessible by public transportation (Orange Line and several bus routes)
+
+ Would you like to know about:
+ 1. Their enrollment process?
+ 2. How to schedule a school tour?
+ 3. Their academic performance metrics?
+ 4. Specific programs for kindergarten students?
+
+ User: Yes, how do I schedule a tour?
+
+ Bot: You can schedule a tour of the Hernandez School in two ways:
+
+ 1. Call the main office at (617) 635-XXXX during school hours (8:00 AM - 4:00 PM)
+ 2. Visit the Boston Public Schools website and use their online tour scheduling system
+
+ Tours are typically offered on Tuesday and Thursday mornings at 9:00 AM. They last about 45 minutes and include:
+ - A classroom visit
+ - Meeting with the principal
+ - Q&A session
+ - Overview of the dual language program
+
+ Would you like me to provide the exact contact information for scheduling a tour?
+
+ User: Yes, please.
+
+ Bot: Here's the contact information for the Hernandez School:
+
+ Main Office: (617) 635-XXXX
+ Email: hernandez@bostonpublicschools.org
+ Address: 61 School Street, Roxbury, MA 02119
+
+ The school's website is: www.bostonpublicschools.org/hernandez
+
+ Is there anything else you'd like to know about the school or other options in the area?
+
+ User: No, that's very helpful! Thank you.
+
+ Bot: You're welcome! Good luck with your school search. If you have any more questions about Boston public schools or need help with the enrollment process, feel free to ask. Have a great day!
chatbot_development.ipynb ADDED
@@ -0,0 +1,159 @@
+ {
+   "cells": [
+     {
+       "cell_type": "markdown",
+       "metadata": {},
+       "source": [
+         "# Chatbot Development\n",
+         "\n",
+         "Use this notebook to load the model and then initialize, update, and test the chatbot."
+       ]
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {},
+       "source": [
+         "### Setup and Imports"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "execution_count": null,
+       "metadata": {},
+       "outputs": [],
+       "source": [
+         "import torch\n",
+         "from huggingface_hub import login\n",
+         "\n",
+         "from src.model import load_model, save_model\n",
+         "from src.chat import SchoolChatbot"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "execution_count": null,
+       "metadata": {},
+       "outputs": [],
+       "source": [
+         "\"\"\"\n",
+         "TODO: Add your Hugging Face token\n",
+         "Options:\n",
+         "1. Use login() and enter the token when prompted\n",
+         "2. Set the environment variable HUGGINGFACE_TOKEN\n",
+         "3. Pass the token directly (not recommended for shared notebooks)\n",
+         "\"\"\"\n",
+         "\n",
+         "# login()  # Uncomment this line and add your token\n"
+       ]
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {},
+       "source": [
+         "### Load model and tokenizer"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "execution_count": null,
+       "metadata": {},
+       "outputs": [],
+       "source": [
+         "\"\"\"\n",
+         "Load the model using functions from model.py\n",
+         "Note: This might take a few minutes depending on your hardware\n",
+         "\"\"\"\n",
+         "\n",
+         "model, tokenizer = load_model()\n",
+         "\n",
+         "# Test model loading\n",
+         "print(\"Model loaded:\", type(model))\n",
+         "print(\"Tokenizer loaded:\", type(tokenizer))\n"
+       ]
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {},
+       "source": [
+         "### Initialize and test chatbot"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "execution_count": null,
+       "metadata": {},
+       "outputs": [],
+       "source": [
+         "\"\"\"\n",
+         "Create a chatbot instance using chat.py\n",
+         "\"\"\"\n",
+         "chatbot = SchoolChatbot(model, tokenizer)"
+       ]
+     },
+     {
+       "cell_type": "code",
+       "execution_count": null,
+       "metadata": {},
+       "outputs": [],
+       "source": [
+         "\"\"\"\n",
+         "Test generating some responses from the chatbot\n",
+         "\"\"\"\n",
+         "test_questions = [\n",
+         "    \"What schools in Jamaica Plain offer Spanish programs?\",\n",
+         "    \"How do I schedule a tour of the Hernandez School?\"\n",
+         "]\n",
+         "\n",
+         "for question in test_questions:\n",
+         "    print(f\"\\nQuestion: {question}\")\n",
+         "    response = chatbot.get_response(question)\n",
+         "    print(f\"Response: {response}\")\n"
+       ]
+     },
+     {
+       "cell_type": "markdown",
+       "metadata": {},
+       "source": [
+         "# TODO: Update pre-trained Llama to be a school choice chatbot\n",
+         "\n",
+         "This part is up to you! You might want to fine-tune the model, write a really good system prompt, use RAG, feed it Boston school-choice data somehow, etc. Be creative! You can also do this in another script and then evaluate the model here."
+       ]
+     },
+     {
+       "cell_type": "code",
+       "execution_count": null,
+       "metadata": {},
+       "outputs": [],
+       "source": [
+         "# If you update the model, you can use the `save_model` function from model.py to save the new model\n",
+         "save_model(model, tokenizer)\n"
+       ]
+     }
+   ],
+   "metadata": {
+     "kernelspec": {
+       "display_name": "Python 3",
+       "language": "python",
+       "name": "python3"
+     },
+     "language_info": {
+       "name": "python",
+       "version": "3.11.6"
+     }
+   },
+   "nbformat": 4,
+   "nbformat_minor": 2
+ }
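For the notebook's final TODO (system prompt, RAG, etc.), here is a toy retrieval-augmented prompting sketch. The school "facts" are illustrative strings lifted from this repo's example conversation, not verified BPS data, and `retrieve_facts` / `build_prompt` are hypothetical names; a real version would swap the word-overlap scorer for embedding search.

```python
# Toy RAG-style prompting: keyword-match a few hand-written facts and
# prepend the best hits to the prompt. Facts below are illustrative only.

SCHOOL_FACTS = [
    "The Hernandez K-8 School in Roxbury offers a Spanish-English dual language program.",
    "The Mozart Elementary School in Jamaica Plain has a Spanish immersion program.",
    "The Mission Hill K-8 School in Roxbury offers Spanish language classes.",
]

def retrieve_facts(question, facts=SCHOOL_FACTS, top_k=2):
    # Score each fact by word overlap with the question (a crude stand-in
    # for real embedding similarity search).
    q_words = set(question.lower().split())
    scored = sorted(facts, key=lambda f: -len(q_words & set(f.lower().split())))
    return scored[:top_k]

def build_prompt(question):
    context = "\n".join(retrieve_facts(question))
    return (
        "You are a helpful assistant for Boston school choice.\n"
        f"Known facts:\n{context}\n"
        f"User: {question}\nAssistant:"
    )
```

The resulting string could then be fed to `model.generate` in place of a bare question, so the model answers from the supplied facts rather than from memory.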
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ torch>=2.0.0
+ transformers>=4.30.0
+ datasets>=2.12.0
+ accelerate>=0.20.0
+ sentencepiece>=0.1.99
+ gradio>=3.40.0
+ bitsandbytes>=0.41.0  # needed for the 4-bit quantization in src/model.py
src/chat.py ADDED
@@ -0,0 +1,62 @@
+ class SchoolChatbot:
+     """
+     This class is extra scaffolding around a model. Modify this class to specify how the model receives prompts and generates responses.
+
+     Example usage:
+         model, tokenizer = load_model()
+         chatbot = SchoolChatbot(model, tokenizer)
+         response = chatbot.get_response("What schools offer Spanish programs?")
+     """
+
+     def __init__(self, model, tokenizer):
+         """
+         Initialize the chatbot with a model and tokenizer.
+         You don't need to modify this method.
+         """
+         self.model = model
+         self.tokenizer = tokenizer
+
+     def format_prompt(self, user_input):
+         """
+         TODO: Implement this method to format the user's input into a proper prompt.
+
+         This method should:
+         1. Add any necessary system context or instructions
+         2. Format the user's input appropriately
+         3. Add any special tokens or formatting the model expects
+
+         Args:
+             user_input (str): The user's question about Boston schools
+
+         Returns:
+             str: A formatted prompt ready for the model
+
+         Example prompt format:
+             "You are a helpful assistant that specializes in Boston schools...
+             User: {user_input}
+             Assistant:"
+         """
+         pass
+
+     def get_response(self, user_input):
+         """
+         TODO: Implement this method to generate responses to user questions.
+
+         This method should:
+         1. Use format_prompt() to prepare the input
+         2. Generate a response using the model
+         3. Clean up and return the response
+
+         Args:
+             user_input (str): The user's question about Boston schools
+
+         Returns:
+             str: The chatbot's response
+
+         Implementation tips:
+         - Use self.tokenizer to convert text to tokens
+         - Use self.model.generate() for text generation
+         - Consider parameters like temperature and max_length
+         - Clean up the response before returning it
+         """
+         pass
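A hedged sketch of those two TODO methods, using a plain instruction-style prompt. The class name `SchoolChatbotSketch`, the system text, and the generation parameters are all assumptions for illustration; for TinyLlama-Chat you would likely prefer `tokenizer.apply_chat_template` over a hand-built prompt.

```python
# Illustrative sketch only -- prompt wording and generation parameters are
# assumptions, not the repo's prescribed implementation.

class SchoolChatbotSketch:
    SYSTEM = "You are a helpful assistant that specializes in Boston public schools."

    def __init__(self, model=None, tokenizer=None):
        self.model = model
        self.tokenizer = tokenizer

    def format_prompt(self, user_input):
        # Simple instruction-style prompt; chat models often do better with
        # their own chat template (tokenizer.apply_chat_template).
        return f"{self.SYSTEM}\nUser: {user_input}\nAssistant:"

    def get_response(self, user_input):
        prompt = self.format_prompt(user_input)
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        output_ids = self.model.generate(
            **inputs,
            max_new_tokens=256,
            do_sample=True,
            temperature=0.7,
        )
        # Decode only the newly generated tokens, not the echoed prompt.
        new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
        return self.tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Slicing off the prompt tokens before decoding is the "clean up" step the docstring hints at: `generate` returns the prompt plus the continuation, and returning both would echo the user's question back at them.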
src/model.py ADDED
@@ -0,0 +1,86 @@
+ """
+ This module handles loading and saving of LLaMA models with efficient quantization.
+ This is already implemented and ready to use -- you don't need to modify this file.
+
+ Key Features:
+ - Loads LLaMA models from Hugging Face or local storage
+ - Implements 4-bit quantization for memory efficiency
+ - Provides save/load functionality for model persistence
+ - Handles model loading errors gracefully
+
+ Example Usage:
+     from model import load_model, save_model
+
+     # Load a model (will download if not found locally; the model is
+     # selected via MODEL_NAME below)
+     model, tokenizer = load_model()
+
+     # Save the model after making changes
+     save_model(model, tokenizer)
+ """
+
+ import os
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ import torch
+
+ # Choose a model
+ MODEL_NAME = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # Change this to your preferred model
+ # Other options:
+ # MODEL_NAME = "meta-llama/Llama-2-7b-chat-hf"
+ # MODEL_NAME = "openlm-research/open_llama_3b"
+
+ # Path to save and load models
+ MODEL_SAVE_PATH = "models/school_chatbot"
+
+
+ def save_model(model, tokenizer, save_directory=MODEL_SAVE_PATH):
+     """
+     Save the model and tokenizer to a local directory.
+     """
+     # Create the directory if it doesn't exist
+     os.makedirs(save_directory, exist_ok=True)
+
+     # Save model and tokenizer
+     model.save_pretrained(save_directory)
+     tokenizer.save_pretrained(save_directory)
+
+     print(f"Model and tokenizer saved to {save_directory}")
+
+
+ def load_model():
+     """
+     Load the model with 4-bit quantization.
+     """
+     try:
+         # Use quantization to reduce memory usage
+         # (bitsandbytes 4-bit loading expects a CUDA GPU; on CPU-only
+         # hardware you may need to load without quantization_config)
+         quantization_config = BitsAndBytesConfig(
+             load_in_4bit=True,                     # Enable 4-bit quantization
+             bnb_4bit_compute_dtype=torch.float16,  # Compute dtype
+             bnb_4bit_quant_type="nf4",             # Normalized float 4 format
+             bnb_4bit_use_double_quant=True         # Use nested quantization
+         )
+
+         if os.path.exists(MODEL_SAVE_PATH):
+             print("Loading quantized model from local storage...")
+             tokenizer = AutoTokenizer.from_pretrained(MODEL_SAVE_PATH)
+             model = AutoModelForCausalLM.from_pretrained(
+                 MODEL_SAVE_PATH,
+                 quantization_config=quantization_config,
+                 device_map="auto"
+             )
+         else:
+             print("Downloading and quantizing model from Hugging Face...")
+             tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+             model = AutoModelForCausalLM.from_pretrained(
+                 MODEL_NAME,
+                 quantization_config=quantization_config,
+                 device_map="auto"
+             )
+             # Save for future use
+             save_model(model, tokenizer)
+
+         return model, tokenizer
+
+     except Exception as e:
+         print(f"Error loading model: {e}")
+         return None, None