AI-Talent-Force Claude Sonnet 4.5 committed on
Commit e1c2a9e · 0 Parent(s):

Add 4-bit quantization and audioop-lts for Python 3.13


- Use BitsAndBytesConfig for 4-bit quantization to fit in GPU memory
- Load LoRA adapter from HuggingFace model repo
- Add audioop-lts dependency for Python 3.13 compatibility
- Gradio 6.5.1 with minimal dependency conflicts

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
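Back-of-envelope arithmetic behind the "fit in GPU memory" bullet (an illustrative estimate only; real usage adds quantization constants, activations, and the KV cache on top of the weights):

```python
# Rough weight-memory estimate for a 30B-parameter model
# (illustrative; ignores quantization constants, activations, KV cache).
params = 30e9
bf16_gb = params * 2 / 1e9   # bf16: 2 bytes per parameter
nf4_gb = params * 0.5 / 1e9  # NF4: 4 bits per parameter

print(f"bf16: {bf16_gb:.0f} GB, nf4: {nf4_gb:.0f} GB")  # bf16: 60 GB, nf4: 15 GB
```

At roughly 15 GB of weights, the NF4-quantized model fits on a single Spaces GPU where the bf16 weights alone would not.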

Files changed (5)
  1. .gitattributes +6 -0
  2. .gitignore +12 -0
  3. README.md +65 -0
  4. app.py +153 -0
  5. requirements.txt +9 -0
.gitattributes ADDED
@@ -0,0 +1,6 @@
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.gguf filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,12 @@
+ ceo-voice-lora/
+ .DS_Store
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ *.so
+ *.egg
+ *.egg-info/
+ dist/
+ build/
README.md ADDED
@@ -0,0 +1,65 @@
+ ---
+ title: CEO AI Executive
+ emoji: 🎯
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 6.5.1
+ app_file: app.py
+ pinned: false
+ license: mit
+ models:
+ - unsloth/qwen3-30b-a3b
+ tags:
+ - chatbot
+ - lora
+ - qwen3
+ - fine-tuning
+ ---
+
+ # CEO AI Executive 🎯
+
+ An AI chatbot that responds like your CEO, trained on their blog posts and writings.
+
+ ## Features
+
+ - 💬 Natural conversation interface
+ - 🧠 Powered by a fine-tuned Qwen3-30B model
+ - 🎨 Clean and intuitive UI
+ - ⚡ Fast response generation
+
+ ## How It Works
+
+ This application uses:
+ 1. **Base Model**: Qwen3-30B (unsloth/qwen3-30b-a3b)
+ 2. **Fine-tuning**: LoRA adapter trained on the CEO's blog posts
+ 3. **Interface**: Gradio chatbot UI
+
+ The model has been fine-tuned to capture the CEO's:
+ - Writing style and tone
+ - Perspectives on business and leadership
+ - Communication patterns
+ - Domain expertise
+
+ ## Usage
+
+ Type your question in the chat box and the AI will respond in the CEO's voice and style.
+
+ ### Example Questions
+
+ - "What's your vision for the company?"
+ - "How do you approach leadership?"
+ - "What are your thoughts on innovation?"
+ - "Can you share your perspective on team building?"
+ - "What drives your business strategy?"
+
+ ## Technical Details
+
+ - **Model**: Qwen3-30B with LoRA fine-tuning
+ - **Framework**: Transformers, PEFT
+ - **Interface**: Gradio 6.5.1
+ - **Hardware**: GPU-accelerated (Hugging Face Spaces GPU)
+
+ ## Disclaimer
+
+ This AI is trained on historical writings and represents patterns learned from the CEO's public content. Responses should not be considered official statements or advice from the actual CEO.
app.py ADDED
@@ -0,0 +1,153 @@
+ import gradio as gr
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+ from peft import PeftModel
+ import spaces
+
+ # Model configuration
+ BASE_MODEL = "unsloth/qwen3-30b-a3b"
+ LORA_ADAPTER_PATH = "AI-Talent-Force/ceo-voice-lora-qwen3-30b"
+
+ # Load model and tokenizer
+ @spaces.GPU
+ def load_model():
+     """Load the base model and apply the LoRA adapter."""
+     print("Loading tokenizer...")
+     tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
+
+     print("Loading base model...")
+     # Use 4-bit NF4 quantization so the 30B model fits in GPU memory
+     quantization_config = BitsAndBytesConfig(
+         load_in_4bit=True,
+         bnb_4bit_compute_dtype=torch.bfloat16,
+         bnb_4bit_use_double_quant=True,
+         bnb_4bit_quant_type="nf4"
+     )
+
+     model = AutoModelForCausalLM.from_pretrained(
+         BASE_MODEL,
+         quantization_config=quantization_config,
+         device_map="auto",
+         trust_remote_code=True
+     )
+
+     print("Loading LoRA adapter...")
+     model = PeftModel.from_pretrained(model, LORA_ADAPTER_PATH)
+     model.eval()
+
+     print("Model loaded successfully!")
+     return model, tokenizer
+
+ # Initialize model and tokenizer
+ print("Initializing CEO AI Executive...")
+ model, tokenizer = load_model()
+
+ @spaces.GPU
+ def chat_with_ceo(message, history):
+     """
+     Chat function that responds in the CEO's voice.
+     Args:
+         message: the user's current message
+         history: list of previous [user_msg, bot_msg] pairs
+     """
+     # Build conversation context from the chat history
+     conversation = []
+     for user_msg, bot_msg in history:
+         conversation.append({"role": "user", "content": user_msg})
+         conversation.append({"role": "assistant", "content": bot_msg})
+
+     conversation.append({"role": "user", "content": message})
+
+     # Apply the model's chat template
+     prompt = tokenizer.apply_chat_template(
+         conversation,
+         tokenize=False,
+         add_generation_prompt=True
+     )
+
+     # Tokenize
+     inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=4096)
+     inputs = {k: v.to(model.device) for k, v in inputs.items()}
+
+     # Generate response
+     with torch.no_grad():
+         outputs = model.generate(
+             **inputs,
+             max_new_tokens=512,
+             temperature=0.7,
+             top_p=0.9,
+             do_sample=True,
+             repetition_penalty=1.1,
+             # Fall back to EOS if the tokenizer defines no pad token
+             pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
+             eos_token_id=tokenizer.eos_token_id
+         )
+
+     # Decode only the newly generated tokens
+     response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
+     # Return the updated history so the Chatbot component renders the full conversation
+     return history + [[message, response]]
+
+ # Create Gradio interface
+ with gr.Blocks(theme=gr.themes.Soft()) as demo:
+     gr.Markdown(
+         """
+         # 🎯 CEO AI Executive
+
+         Chat with an AI trained on your CEO's writing style and thoughts.
+         Ask questions about business strategy, leadership, technology, or any topic your CEO writes about.
+
+         **Note:** This AI responds based on patterns learned from the CEO's blog posts and writings.
+         """
+     )
+
+     chatbot = gr.Chatbot(
+         height=500,
+         label="Chat with CEO AI",
+         show_label=True,
+         avatar_images=(None, "🎯")
+     )
+
+     with gr.Row():
+         msg = gr.Textbox(
+             label="Your Message",
+             placeholder="Ask me anything...",
+             show_label=False,
+             scale=4
+         )
+         submit = gr.Button("Send", variant="primary", scale=1)
+
+     with gr.Row():
+         clear = gr.Button("Clear Chat")
+
+     gr.Examples(
+         examples=[
+             "What's your vision for the company?",
+             "How do you approach leadership?",
+             "What are your thoughts on innovation?",
+             "Can you share your perspective on team building?",
+             "What drives your business strategy?"
+         ],
+         inputs=msg,
+         label="Example Questions"
+     )
+
+     gr.Markdown(
+         """
+         ---
+         ### About This AI
+         This chatbot uses a fine-tuned Qwen3-30B language model trained on the CEO's blog posts and writings.
+         It attempts to replicate their writing style, thinking patterns, and perspectives on various topics.
+         """
+     )
+
+     # Event handlers
+     msg.submit(chat_with_ceo, inputs=[msg, chatbot], outputs=chatbot)
+     submit.click(chat_with_ceo, inputs=[msg, chatbot], outputs=chatbot)
+     clear.click(lambda: None, None, chatbot, queue=False)
+
+     # Clear message box after submission
+     msg.submit(lambda: "", None, msg)
+     submit.click(lambda: "", None, msg)
+
+ if __name__ == "__main__":
+     demo.queue()
+     demo.launch()
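The history-to-messages conversion inside `chat_with_ceo` can be exercised in isolation, with no model or GPU needed (the `build_conversation` helper below is a hypothetical extraction of that loop, not part of the committed code):

```python
def build_conversation(message, history):
    # Flatten [user_msg, bot_msg] pairs into the chat-template message list
    conversation = []
    for user_msg, bot_msg in history:
        conversation.append({"role": "user", "content": user_msg})
        conversation.append({"role": "assistant", "content": bot_msg})
    # The new user turn goes last, before add_generation_prompt appends
    # the assistant header
    conversation.append({"role": "user", "content": message})
    return conversation

msgs = build_conversation("What's next?", [["Hi", "Hello!"]])
print([m["role"] for m in msgs])  # ['user', 'assistant', 'user']
```

The resulting list is exactly the structure `tokenizer.apply_chat_template` expects.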
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ gradio==6.5.1
+ transformers>=4.50.0
+ torch==2.5.1
+ peft==0.18.1
+ accelerate==1.2.1
+ safetensors==0.4.5
+ spaces==0.30.3
+ bitsandbytes==0.45.0
+ audioop-lts
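The `audioop-lts` pin exists because the stdlib `audioop` module was removed in Python 3.13 (PEP 594); the backport reinstalls it under the original import name, so dependencies that still import it keep working unchanged. A minimal sanity check:

```python
# On Python 3.13+ this import only resolves with audioop-lts installed;
# on earlier versions audioop is still part of the stdlib.
import audioop

# Double the amplitude of one little-endian 16-bit sample (value 1 -> 2).
louder = audioop.mul(b"\x01\x00", 2, 2)
print(louder)  # b'\x02\x00'
```

Because the backport reuses the `audioop` module name, no code changes are needed anywhere else in the dependency tree.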