Capitaller committed on
Commit d438efa · verified · 1 Parent(s): d7cca55

Update README.md

Files changed (1)
  1. README.md +60 -20
README.md CHANGED
@@ -1,31 +1,71 @@
1
  ---
 
 
2
  tags:
3
- - gguf
4
- - llama.cpp
5
  - unsloth
6
- - vision-language-model
 
 
 
 
7
  ---
8
 
9
- # gemma_4E4B-it_finetune : GGUF
10
 
11
- This model was finetuned and converted to GGUF format using [Unsloth](https://github.com/unslothai/unsloth).
12
 
13
- **Example usage**:
14
- - For text only LLMs: `llama-cli -hf Capitaller/gemma_4E4B-it_finetune --jinja`
15
- - For multimodal models: `llama-mtmd-cli -hf Capitaller/gemma_4E4B-it_finetune --jinja`
16
 
17
- ## Available Model files:
18
- - `gemma-4-e4b-it.Q8_0.gguf`
19
- - `gemma-4-e4b-it.F16-mmproj.gguf`
 
 
 
20
 
21
- ## ⚠️ Ollama Note for Vision Models
22
- **Important:** Ollama currently does not support separate mmproj files for vision models.
23
 
24
- To create an Ollama model from this vision model:
25
- 1. Place the `Modelfile` in the same directory as the finetuned bf16 merged model
26
- 3. Run: `ollama create model_name -f ./Modelfile`
27
- (Replace `model_name` with your desired name)
28
 
29
- This will create a unified bf16 model that Ollama can use.
30
- This was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
31
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
1
  ---
2
+ base_model: google/gemma-4-e4b-it
3
+ library_name: transformers
4
  tags:
 
 
5
  - unsloth
6
+ - instruction-tuning
7
+ - aws
8
+ - agentic-ai
9
+ - lora
10
+ - conversational
11
  ---
12
 
13
+ # Gemma-4-E4B-it Agentic AI on AWS (Instruct)
14
 
15
+ This model is a supervised fine-tuned (SFT) version of Google's instruction-tuned [Gemma-4-E4B-it](https://huggingface.co/google/gemma-4-e4b-it). It is specialized as a conversational assistant that answers architectural questions about Agentic AI systems, frameworks, and protocols on Amazon Web Services (AWS).
16
 
17
+ The model was fine-tuned using **[Unsloth](https://github.com/unslothai/unsloth)** to enhance its ability to reason about AWS architectures and provide actionable, structured guidance based on official AWS prescriptive documentation.
 
 
18
 
19
+ ## Model Details
20
+ * **Base Model:** `google/gemma-4-e4b-it` (via `unsloth/gemma-4-E4B-it`)
21
+ * **Training Type:** Supervised Fine-Tuning (Instruction/Chat)
22
+ * **Domain Focus:** AWS Architecture, Agentic AI, Frameworks, and Protocols (MCP, etc.)
23
+ * **Language:** English
24
+ * **Library:** Unsloth / Hugging Face Transformers
25
 
26
+ ## Dataset
27
+ The model was trained on instruction-response pairs sourced from **AWS Prescriptive Guidance: Agentic AI frameworks, platforms, protocols, and tools on AWS**. Training targeted concise answers and technical, context-aware AWS architecture advice grounded in that guidance.
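The exact schema of the training pairs is not given here. As a minimal sketch, assuming each record carries hypothetical `instruction` and `response` fields, one pair could be mapped to the chat-message layout that chat templates consume. The helper name `pair_to_messages` and the sample record are illustrative only and are not taken from the actual dataset.

```python
import json


def pair_to_messages(record):
    """Convert one instruction-response record into chat messages.

    The field names "instruction" and "response" are assumptions made
    for this sketch; the real dataset schema is not documented here.
    """
    return [
        {"role": "user", "content": record["instruction"].strip()},
        {"role": "assistant", "content": record["response"].strip()},
    ]


# Placeholder record written for this example, not real training data.
example = {
    "instruction": "  What does MCP stand for?  ",
    "response": "MCP stands for Model Context Protocol.",
}

messages = pair_to_messages(example)
print(json.dumps(messages, indent=2))
```

A list in this shape is what `tokenizer.apply_chat_template` expects as its `messages` argument.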
28
 
29
+ ## Training Configuration
30
+ Because the starting checkpoint is already instruction-tuned, it already handles conversational flow. LoRA adapters were applied only to the attention and MLP projection layers, which adapts the model's persona and domain knowledge while limiting the risk of catastrophic forgetting.
 
 
31
 
32
+ * **Method:** PEFT / LoRA
33
+ * **LoRA Rank (r):** 16 (a common choice for instruction tuning)
34
+ * **LoRA Alpha:** 16 (or 32; with `r = 16` this gives a LoRA scaling factor `alpha / r` of 1 or 2)
35
+ * **Target Modules:** Attention and MLP modules (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`)
36
+ * **Precision:** 4-bit quantization (QLoRA) during training
37
+ * **Optimizer:** Paged AdamW 8-bit
38
+
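To make the rank and alpha settings concrete: in standard LoRA the low-rank update is scaled by `alpha / r`, and each adapted weight matrix of shape `d_out x d_in` gains `r * (d_in + d_out)` trainable parameters. The sketch below only does that arithmetic. The layer dimensions are hypothetical placeholders chosen for illustration, not the actual Gemma-4-E4B shapes, and the function names are made up for this example.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Scaling factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one adapted linear layer.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


print(lora_scaling(16, 16))  # 1.0
print(lora_scaling(32, 16))  # 2.0

# Hypothetical 2048 x 2048 projection, used only to show the arithmetic.
print(lora_trainable_params(2048, 2048, 16))  # 65536
```

With `r` fixed at 16, choosing alpha 32 instead of 16 doubles the magnitude of the adapter update without changing the number of trainable parameters.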
39
+ ## How to Use
40
+ Because this is an instruction-tuned model, you **must use the standard Gemma chat template** when querying it, for example through `tokenizer.apply_chat_template`. You can then interact with it like a standard chatbot.
41
+
42
+ You can load it using Transformers or Unsloth:
43
+
44
+ ```python
45
+ from transformers import AutoModelForCausalLM, AutoTokenizer
46
+
47
+ model_name = "Capitaller/gemma_4E4B-it_finetune"
48
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
49
+ model = AutoModelForCausalLM.from_pretrained(model_name)
50
+
51
+ # Ensure you use the proper chat format
52
+ messages = [
53
+     {"role": "user", "content": "How should I design an Agentic AI architecture on AWS that uses the Model Context Protocol (MCP)?"},
54
+ ]
55
+
56
+ inputs = tokenizer.apply_chat_template(
57
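+     messages,
58
+     add_generation_prompt=True,
59
+     tokenize=True, return_dict=True,  # mapping with input_ids and attention_mask
60
+     return_tensors="pt",
61
+ )
62
+
63
+ outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)  # temperature only applies when sampling
64
+ response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)  # drop the echoed prompt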
65
+ print(response)
66
+ ```
67
+
68
+ ### Prompting Tips
69
+ This model is tuned to give instructional answers to direct, specific questions.
70
+ * Ask clear, technical questions: `"What are the recommended AWS compute platforms for hosting an MCP server?"`
71
+ * Request specific structures: `"Write a brief step-by-step guide on securing an AI agent communication channel on AWS."`
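The two tips above can be combined in a small helper that builds the `messages` list for a single-turn query. This is a sketch under stated assumptions: `build_messages` and its `output_format` parameter are hypothetical names introduced here, and the wording of the appended format instruction is only one possible phrasing.

```python
def build_messages(question, output_format=None):
    """Build a single-turn chat message list for the model.

    question:      a clear, technical question.
    output_format: optional description of the desired answer structure,
                   e.g. "a brief step-by-step guide".
    """
    content = question.strip()
    if output_format:
        content = f"{content}\n\nFormat the answer as {output_format.strip()}."
    return [{"role": "user", "content": content}]


# A direct technical question.
plain = build_messages(
    "What are the recommended AWS compute platforms for hosting an MCP server?"
)

# The same kind of request with an explicit structure.
structured = build_messages(
    "How do I secure an AI agent communication channel on AWS?",
    output_format="a brief step-by-step guide",
)

print(plain[0]["content"])
print(structured[0]["content"])
```

Either list can be passed as the `messages` argument of `tokenizer.apply_chat_template`.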