---
language:
- en
license: apache-2.0
tags:
- pytorch
- peft
- lora
- code-generation
- deepseek-coder
- fine-tuned
datasets:
- custom-code-dataset
model-index:
- name: BriskFO_Coderv1
  results: []
---

# BriskFO_Coderv1

## Model Description

This is a **PEFT/LoRA adapter** fine-tuned from the `deepseek-ai/deepseek-coder-1.3b-instruct` model. It was trained for 300 steps on a custom code-generation dataset.

## Model Type

This is a **PEFT (Parameter-Efficient Fine-Tuning)** model, specifically a **LoRA (Low-Rank Adaptation)** adapter. It contains only the adapter weights, not the full base model.

## Training Details

- **Base Model**: `deepseek-ai/deepseek-coder-1.3b-instruct`
- **Training Steps**: 300
- **Learning Rate**: 2e-4
- **Batch Size**: 16
- **Gradient Accumulation Steps**: 4
- **Sequence Length**: 34958
- **Training Method**: PEFT/LoRA (see the configuration sketch below)
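
The LoRA hyperparameters themselves (rank, alpha, dropout, target modules) are not recorded in this card. The following is a minimal sketch of how a comparable run could be configured with `peft` and `transformers`; the `LoraConfig` values and output path are illustrative assumptions, and only the hyperparameters listed above come from this card.

```python
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Load the base model to be adapted
base_model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-coder-1.3b-instruct",
    torch_dtype="auto",
    device_map="auto",
)

# LoRA settings: these values are assumptions, not documented in this card
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)

# Hyperparameters below are taken from the Training Details list above
training_args = TrainingArguments(
    output_dir="briskfo-coder-lora",   # illustrative path
    max_steps=300,
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
)
```

Passing `model`, `training_args`, and the custom dataset to a `transformers.Trainer` (or `trl.SFTTrainer`) would complete the setup.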

## Files

This repository contains:

- `adapter_model.bin` / `adapter_model.safetensors` - LoRA adapter weights
- `adapter_config.json` - PEFT configuration (a quick way to inspect it is sketched below)
- `tokenizer.json`, `tokenizer_config.json` - Tokenizer files
- `special_tokens_map.json` - Special tokens mapping
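
As a quick sanity check, the settings in `adapter_config.json` can be read through `PeftConfig` without downloading any weights:

```python
from peft import PeftConfig

# Fetch and parse adapter_config.json from the Hub
config = PeftConfig.from_pretrained("abel252/BriskFO_Coderv1")
print(config.base_model_name_or_path)  # should report the base model listed above
print(config.peft_type)                # should report LORA
```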

## Usage

### Installation

```bash
pip install transformers peft accelerate torch
```

### Loading the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "deepseek-ai/deepseek-coder-1.3b-instruct"
adapter_model_id = "abel252/BriskFO_Coderv1"

# Load the tokenizer shipped with the adapter repository
tokenizer = AutoTokenizer.from_pretrained(adapter_model_id)

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the PEFT adapter to the base model
model = PeftModel.from_pretrained(base_model, adapter_model_id)

# For inference, you can optionally merge the adapter into the base model
# model = model.merge_and_unload()
```
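
If you want to serve the model without a runtime `peft` dependency, a common pattern is to merge the adapter into the base weights and save a standalone checkpoint. A short sketch; the output directory name is illustrative:

```python
# Merge the LoRA weights into the base model and persist a standalone copy
merged = model.merge_and_unload()
merged.save_pretrained("briskfo-coder-merged")
tokenizer.save_pretrained("briskfo-coder-merged")
```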

### Inference Example

```python
# Prepare the input prompt
prompt = "Write a Python function to calculate fibonacci numbers"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a completion
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.95,
    do_sample=True
)

# Decode and print the generated text
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
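
Because the base model is instruct-tuned, its tokenizer ships with a chat template, and prompting through `apply_chat_template` typically matches the training format better than a raw string does. A sketch, assuming the tokenizer in this repository retains the base model's template:

```python
# Format the prompt with the instruct model's chat template
messages = [
    {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```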

## License

This model is released under the Apache 2.0 license.

## Acknowledgments

- Base model: [DeepSeek Coder](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct)
- Fine-tuning framework: [PEFT](https://github.com/huggingface/peft)