--- language: - en license: apache-2.0 tags: - pytorch - peft - lora - code-generation - deepseek-coder - fine-tuned datasets: - custom-code-dataset model-index: - name: BriskFO_Coderv1 results: [] --- # BriskFO_Coderv1 ## Model Description This is a **PEFT/LoRA adapter** fine-tuned on DeepSeek Coder 1.3B Instruct model. It was trained for 300 steps on a custom code generation dataset. ## Model Type This is a **PEFT (Parameter-Efficient Fine-Tuning)** model, specifically using **LoRA (Low-Rank Adaptation)**. It contains only the adapter weights, not the full model. ## Training Details - **Base Model**: `deepseek-ai/deepseek-coder-1.3b-instruct` - **Training Steps**: 300 - **Learning Rate**: 2e-4 - **Batch Size**: 16 - **Gradient Accumulation**: 4 - **Sequence Length**: 34958 - **Training Method**: PEFT/LoRA ## Files This repository contains: - `adapter_model.bin` / `adapter_model.safetensors` - LoRA adapter weights - `adapter_config.json` - PEFT configuration - `tokenizer.json`, `tokenizer_config.json` - Tokenizer files - `special_tokens_map.json` - Special tokens mapping ## Usage ### Installation ```bash pip install transformers peft accelerate torch ``` ### Loading the Model ```python from transformers import AutoTokenizer, AutoModelForCausalLM from peft import PeftModel, PeftConfig # Load the base model base_model_id = "deepseek-ai/deepseek-coder-1.3b-instruct" adapter_model_id = "abel252/BriskFO_Coderv1" # Load tokenizer tokenizer = AutoTokenizer.from_pretrained(adapter_model_id) # Load base model base_model = AutoModelForCausalLM.from_pretrained( base_model_id, torch_dtype="auto", device_map="auto" ) # Load PEFT adapter model = PeftModel.from_pretrained(base_model, adapter_model_id) # For inference, you can merge the adapter with the base model (optional) # model = model.merge_and_unload() ``` ### Inference Example ```python # Prepare input prompt = "Write a Python function to calculate fibonacci numbers" inputs = tokenizer(prompt, return_tensors="pt").to(model.device) # Generate outputs = model.generate( **inputs, max_new_tokens=256, temperature=0.7, top_p=0.95, do_sample=True ) # Decode response = tokenizer.decode(outputs[0], skip_special_tokens=True) print(response) ``` ## License This model is released under the Apache 2.0 license. ## Acknowledgments - Base model: [DeepSeek Coder](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct) - Fine-tuning framework: [PEFT](https://github.com/huggingface/peft)