--- |
|
|
tags: |
|
|
- computer-engineering |
|
|
- llama-3 |
|
|
- 1b |
|
|
- lora |
|
|
- 8bit |
|
|
license: llama3.2 |
|
|
license_link: https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE |
|
|
base_model: |
|
|
- meta-llama/Llama-3.2-1B |
|
|
datasets: |
|
|
- wikitext-2-raw-v1 |
|
|
- computer-engineering-corpus |
|
|
--- |
|
|
|
|
|
**Specialized 1B Parameter Model for Computer Engineering** |
|
|
*Fine-tuned with LoRA on 8-bit quantized Llama-3.2-1B*
|
|
|
|
|
<div align="center"> |
|
|
<a href="https://github.com/IrfanUruchi/Llama-3.2-1B-ComputerEngineeringLLM"> |
|
|
<img src="https://img.shields.io/badge/🔗_GitHub-Repo-181717?style=for-the-badge&logo=github" alt="GitHub"> |
|
|
</a> |
|
|
<a href="https://huggingface.co/Irfanuruchi/Llama-3.2-1B-Computer-Engineering-LLM"> |
|
|
<img src="https://img.shields.io/badge/🤗_HuggingFace-Model_Repo-FFD21F?style=for-the-badge" alt="HuggingFace"> |
|
|
</a> |
|
|
<br> |
|
|
<img src="https://img.shields.io/badge/Model_Size-1B_parameters-blue" alt="Model Size"> |
|
|
<img src="https://img.shields.io/badge/Quantization-8bit-green" alt="Quantization"> |
|
|
<img src="https://img.shields.io/badge/Adapter-LoRA-orange" alt="Adapter"> |
|
|
<img src="https://img.shields.io/badge/Context-8k-lightgrey" alt="Context"> |
|
|
</div> |
|
|
|
|
|
--- |
|
|
|
|
|
## 🛠️ Technical Specifications |
|
|
|
|
|
### Architecture |
|
|
| Component | Specification | |
|
|
|------------------------|---------------------------------| |
|
|
| Base Model              | Meta-Llama-3.2-1B               |
|
|
| Hidden Size | 2048 | |
|
|
| Layers | 16 | |
|
|
| Attention Heads | 32 | |
|
|
| Quantization | 8-bit via BitsAndBytes | |
|
|
| Fine-Tuning Method | LoRA (Low-Rank Adaptation) | |
|
|
| Tokenizer Vocabulary | 128,256 tokens | |
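
As a quick sanity check on the figures above, the per-head dimension follows directly from the hidden size and attention-head count:

```python
# Figures taken from the architecture table
hidden_size = 2048
num_attention_heads = 32

# Dimension each attention head operates on
head_dim = hidden_size // num_attention_heads
print(head_dim)  # → 64
```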
|
|
|
|
|
|
|
|
### Training Data |
|
|
- Wikitext-2-raw-v1 (General knowledge) |
|
|
- Custom computer engineering corpus:
  - Hardware design principles
  - Processor architectures
  - Embedded systems documentation
|
|
|
|
|
--- |
|
|
|
|
|
## Installation and Usage
|
|
|
|
|
|
|
|
### Option 1: From Hugging Face Hub (Recommended) |
|
|
|
|
|
```python
# Model repository on the Hugging Face Hub
model_id = "Irfanuruchi/Llama-3.2-1B-Computer-Engineering-LLM"
```
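
A minimal loading sketch using the standard `transformers` auto classes (downloading the weights requires a Hugging Face account that has accepted the Llama 3.2 license on the Hub):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Irfanuruchi/Llama-3.2-1B-Computer-Engineering-LLM"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```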
|
|
|
|
|
### Option 2: Local Installation (Git LFS Required) |
|
|
|
|
|
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Replace with your local path
model_path = "./Llama-3.2-1B-ComputerEngineeringLLM"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
|
|
|
|
|
*Recommended Generation Config*
|
|
|
|
|
```python
# Assumes `model` and `tokenizer` have been loaded as shown above;
# the prompt is illustrative
prompt = "Explain the difference between Harvard and von Neumann architectures."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
## License Compliance
|
|
|
|
|
This model is governed by the Llama 3.2 Community License. Key requirements: |
|
|
|
|
|
- Non-commercial use only
- Attribution to Meta required
- Cannot be used to train other LLMs

**Attribution Notice:**

> "Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc."
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Specialized for computer engineering (general performance may vary)
- Occasional repetition in outputs
- Requires prompt engineering for optimal results
- Knowledge cutoff: January 2025
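
Since the model benefits from prompt engineering, a simple template helper can keep prompts consistent. The wording below is an illustrative assumption, not a documented format for this model:

```python
def build_prompt(question: str) -> str:
    """Wrap a computer-engineering question in a simple instruction template."""
    return (
        "You are an expert in computer engineering.\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt("What does a memory controller do?")
print(prompt)
```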
|
|
|
|
|
--- |
|
|
|
|
|
## Citation |
|
|
|
|
|
|
|
|
If you use this model in academic research, please cite:
|
|
|
|
|
```bibtex |
|
|
@misc{llama3.2-1b-eng-2025, |
|
|
title = {Llama-3.2-1B-Computer-Engineering-LLM}, |
|
|
author = {Irfanuruchi}, |
|
|
year = {2025}, |
|
|
publisher = {Hugging Face}, |
|
|
url = {https://huggingface.co/Irfanuruchi/Llama-3.2-1B-Computer-Engineering-LLM}, |
|
|
} |
|
|
``` |
|
|
|