Update README.md
README.md CHANGED
tags:
- coder
- code
- microcoder
---

# Microcoder 1.5B

**Microcoder 1.5B** is a code-focused language model fine-tuned from [Qwen 2.5 Coder 1.5B Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) using LoRA (Low-Rank Adaptation) on curated code datasets. It is designed for code generation, completion, and instruction-following tasks in a lightweight, efficient package.

---

## Model Details

| Property         | Value                                               |
|------------------|-----------------------------------------------------|
| **Base Model**   | Qwen 2.5 Coder 1.5B Instruct                        |
| **Fine-tuning**  | LoRA                                                |
| **Parameters**   | ~1.5B                                               |
| **License**      | BSD 3-Clause                                        |
| **Language**     | English (primary); code in multiple programming languages |
| **Task**         | Code generation, completion, instruction following |

---

## Benchmarks

| Benchmark | Metric | Score      |
|-----------|--------|------------|
| HumanEval | pass@1 | **59.15%** |

> HumanEval results were obtained using the model in **GGUF format** with **Q5_K_M quantization**. Results may vary slightly with other formats or quantization levels.
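
For reference, a Q5_K_M GGUF build can be run locally with `llama-cpp-python`. This is a minimal sketch, assuming a local file named `microcoder-1.5b-Q5_K_M.gguf` (a hypothetical filename; match it to the actual GGUF artifact you download):

```python
from llama_cpp import Llama

# Hypothetical local filename; the published GGUF artifact may be named differently.
llm = Llama(model_path="microcoder-1.5b-Q5_K_M.gguf", n_ctx=4096)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```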

---

## Usage

> **Important:** You must use `apply_chat_template` when formatting inputs. Passing raw text directly to the tokenizer will produce incorrect results.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "your-org/microcoder-1.5b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {
        "role": "user",
        "content": "Write a Python function that returns the nth Fibonacci number."
    }
]

# Required: format the conversation with the model's chat template (see note above).
input_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
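
Note that `outputs[0]` contains the prompt tokens followed by the reply, so the snippet above prints both. To print only the generated reply, slice off the prompt before decoding:

```python
# Keep only the tokens generated after the prompt.
prompt_len = inputs["input_ids"].shape[1]
reply = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
print(reply)
```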

---

## Training Details

Microcoder 1.5B was fine-tuned using LoRA on top of Qwen 2.5 Coder 1.5B Instruct. The training focused on code-heavy datasets covering multiple programming languages and problem-solving scenarios, aiming to improve instruction-following and code correctness at a small model scale.
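
The exact training configuration is not published here, but a typical LoRA setup over this base model with the `peft` library looks roughly like the following. All hyperparameters shown are illustrative assumptions, not the values actually used for Microcoder 1.5B:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-1.5B-Instruct")

# Illustrative LoRA settings; the real rank/alpha/target modules used for
# Microcoder 1.5B are not documented in this repository.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights train
```

With a setup like this, only the adapter weights are updated during training, which keeps memory requirements low enough to fine-tune a 1.5B model on a single consumer GPU.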

---

## Credits

- **Model credits:** see [`MODEL_CREDITS.md`](./MODEL_CREDITS.md)
- **Dataset credits:** see [`DATASET_CREDITS.md`](./DATASET_CREDITS.md)

---

## License

The Microcoder 1.5B model weights and associated code in this repository are released under the **BSD 3-Clause License**. See [`LICENSE`](./LICENSE) for details.

Note that the base model (Qwen 2.5 Coder 1.5B Instruct) and the datasets used for fine-tuning are subject to their own respective licenses, as detailed in the credit files above.

---

## Notice

The documentation files in this repository (including `README.md`, `MODEL_CREDITS.md`, `DATASET_CREDITS.md`, and other `.md` files) were generated with the assistance of an AI language model.