BriskFO_Coderv1 / README.md
Abel252's picture
Upload PEFT/LoRA adapter (300 steps fine-tuned)
1237a4b verified
---
language:
- en
license: apache-2.0
tags:
- pytorch
- peft
- lora
- code-generation
- deepseek-coder
- fine-tuned
datasets:
- custom-code-dataset
model-index:
- name: BriskFO_Coderv1
results: []
---
# BriskFO_Coderv1
## Model Description
This is a **PEFT/LoRA adapter** fine-tuned on DeepSeek Coder 1.3B Instruct model. It was trained for 300 steps on a custom code generation dataset.
## Model Type
This is a **PEFT (Parameter-Efficient Fine-Tuning)** model, specifically using **LoRA (Low-Rank Adaptation)**. It contains only the adapter weights, not the full model.
## Training Details
- **Base Model**: `deepseek-ai/deepseek-coder-1.3b-instruct`
- **Training Steps**: 300
- **Learning Rate**: 2e-4
- **Batch Size**: 16
- **Gradient Accumulation**: 4
- **Sequence Length**: 34958
- **Training Method**: PEFT/LoRA
## Files
This repository contains:
- `adapter_model.bin` / `adapter_model.safetensors` - LoRA adapter weights
- `adapter_config.json` - PEFT configuration
- `tokenizer.json`, `tokenizer_config.json` - Tokenizer files
- `special_tokens_map.json` - Special tokens mapping
## Usage
### Installation
```bash
pip install transformers peft accelerate torch
```
### Loading the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel, PeftConfig
# Load the base model
base_model_id = "deepseek-ai/deepseek-coder-1.3b-instruct"
adapter_model_id = "abel252/BriskFO_Coderv1"
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(adapter_model_id)
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
base_model_id,
torch_dtype="auto",
device_map="auto"
)
# Load PEFT adapter
model = PeftModel.from_pretrained(base_model, adapter_model_id)
# For inference, you can merge the adapter with the base model (optional)
# model = model.merge_and_unload()
```
### Inference Example
```python
# Prepare input
prompt = "Write a Python function to calculate fibonacci numbers"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate
outputs = model.generate(
**inputs,
max_new_tokens=256,
temperature=0.7,
top_p=0.95,
do_sample=True
)
# Decode
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## License
This model is released under the Apache 2.0 license.
## Acknowledgments
- Base model: [DeepSeek Coder](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct)
- Fine-tuning framework: [PEFT](https://github.com/huggingface/peft)