Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ tags:
 ---
 # Gemma 3 4B – Claude Edition
 
-Gemma 3 4B (Claude Edition) is a fine-tuned version of the Gemma 3 model, trained on the Claude dataset to enhance its English writing style. The goal of this release is to produce outputs that are more natural, creative, and coherent across a wide range of use cases.
+[Gemma 3 4B](https://huggingface.co/google/gemma-3-4b-it) ([Claude](https://claude.ai/) Edition) is a fine-tuned version of the Gemma 3 model, trained on the Claude dataset to enhance its English writing style. The goal of this release is to produce outputs that are more natural, creative, and coherent across a wide range of use cases.
 
 ## Overview
 This variant benefits from Claude’s diverse English-language text and code examples, improving fluency and expressiveness while maintaining the stable performance Gemma models are known for.
@@ -25,11 +25,14 @@ This variant benefits from Claude’s diverse English-language text and code exa
 - Conversational AI and chatbots
 
 ## Limitations
-- The model may generate inaccurate or outdated information. Always double-check important details before using outputs in production.
+- The model may generate inaccurate or outdated information. **Always double-check important details before using outputs in production.**
+- Can still give verbose or redundant output.
+- Capable of basic chain-of-thought reasoning but not the long DeepSeek style reasoning.
+- May not understand some prompts or long conversations well.
 - Built-in content filters may limit creativity or restrict certain topics.
 - Non-English translations are tuned for natural-sounding English rather than strict literal accuracy.
 - The model is not specialized for math or code generation.
-- Visual and multimodal functions
+- Visual and multimodal functions were not tested.
 
 ## Training Data
 1. [`agentlans/claude`](https://huggingface.co/datasets/agentlans/claude) dataset, `sample_k100000` configuration with LoRA rank 16, alpha 32, and NEFTune 5
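
The Training Data entry lists LoRA rank 16, alpha 32, and NEFTune 5 without saying what those numbers imply. The sketch below is one way to read them, assuming the standard formulations: LoRA scales its low-rank update by alpha / rank, and NEFTune adds uniform noise to the embeddings bounded by alpha / sqrt(sequence_length * hidden_dim). The README does not name the training framework, so `FineTuneConfig` and its field names are hypothetical and used only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters named in the README."""

    dataset: str = "agentlans/claude"
    dataset_config: str = "sample_k100000"
    lora_rank: int = 16
    lora_alpha: int = 32
    neftune_noise_alpha: float = 5.0

    def lora_scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / rank.
        return self.lora_alpha / self.lora_rank

    def neftune_noise_bound(self, seq_len: int, hidden_dim: int) -> float:
        # NEFTune samples noise from Uniform(-1, 1) and scales it by
        # alpha / sqrt(seq_len * hidden_dim), so this is the magnitude bound.
        return self.neftune_noise_alpha / math.sqrt(seq_len * hidden_dim)


config = FineTuneConfig()
print("LoRA scaling factor:", config.lora_scaling())
print("NEFTune bound (seq_len=4, hidden_dim=4):", config.neftune_noise_bound(4, 4))
```

Under these assumptions the LoRA update is applied at twice its raw magnitude, and the NEFTune noise bound shrinks as sequences get longer or embeddings wider. The 4 by 4 shape in the usage lines is a toy value chosen for easy arithmetic, not a model dimension.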