---
base_model: BSC-LT/salamandra-7b-instruct
library_name: peft
pipeline_tag: text-generation
language:
- ca
- es
- eu
- gl
- en
tags:
- base_model:adapter:BSC-LT/salamandra-7b-instruct
- lora
- peft
- transformers
- salamandra
- instruction-tuning
- chain-of-thought
---

# Salamandra CoT LoRA Adapter

This repository contains a **LoRA (Low-Rank Adaptation) fine-tuned adapter** for [BSC-LT/salamandra-7b-instruct](https://huggingface.co/BSC-LT/salamandra-7b-instruct). The adapter improves structured reasoning and instruction-following behavior while keeping the original Salamandra model weights frozen.

---

## Model Description

- **Base model:** [BSC-LT/salamandra-7b-instruct](https://huggingface.co/BSC-LT/salamandra-7b-instruct)
- **Architecture:** Decoder-only Transformer
- **Fine-tuning method:** LoRA (PEFT)
- **Framework:** 🤗 Transformers + PEFT
- **Adapter type:** Adapter-only (base model not included)

> **Note:** This repository **does not contain the base model weights**. You must load the base model to use this adapter.

---

## Training Details

- **Training type:** Supervised fine-tuning (SFT)
- **Objective:** Improve step-by-step reasoning (chain-of-thought style)
- **Checkpoint:** 4850
- **Quantization during training:** 4-bit (QLoRA)
- **Tokenizer:** Salamandra tokenizer (using a custom chat template)

### Framework Versions

- **PEFT:** 0.17.1

---

## Intended Use

This adapter is intended for:

- Instruction-following tasks
- Reasoning-intensive applications
- Multilingual use, prioritising Spanish, the co-official languages of Spain (Catalan, Basque, Galician), and English

---

## Limitations

- This adapter inherits all limitations of the Salamandra base model.
- Outputs may contain incorrect or incomplete reasoning.
- Generated chain-of-thought content should not be assumed to be factual.

> **Status:** This is exploratory work, still under development, aimed at understanding current state-of-the-art techniques for teaching models to reason.
Most of the outputs are still incorrect; however, encouraging generalization signs have been observed. For example, prompting the model in Galician makes it reason in Galician. Current and future work focuses on improving performance with GSPO.

## Usage

You can load this model using `peft` and `transformers`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Set up the model IDs
base_model_id = "BSC-LT/salamandra-7b-instruct"
adapter_id = "luisibear/salamandra-cot-lora"

# 2. Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# 3. Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# 4. Load the LoRA adapter on top of the frozen base model
model = PeftModel.from_pretrained(base_model, adapter_id)

# 5. Inference example
messages = [
    {"role": "user", "content": "Explain the logic behind the Pythagorean theorem step by step."}
]

# Apply the chat template (ensure your tokenizer has the correct template set)
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
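To see why the adapter can be distributed without the base weights, it helps to look at what LoRA actually stores: a pair of low-rank matrices per adapted layer whose product is added to the frozen base weight, scaled by `alpha / r`. The sketch below uses toy dimensions (`d=8`, `r=2`, `alpha=16`) chosen for illustration only; they are not the hyperparameters of this adapter.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 8, 2, 16           # hidden size, LoRA rank, LoRA scaling (illustrative values)

W = rng.normal(size=(d, d))      # frozen base weight (stays in the base checkpoint)
A = rng.normal(size=(r, d))      # trainable down-projection (shipped in the adapter)
B = np.zeros((d, r))             # trainable up-projection, zero-initialised

# Effective weight at inference time: W' = W + (alpha / r) * B @ A
W_eff = W + (alpha / r) * (B @ A)

# With B initialised to zero, the adapter starts as a no-op on the base model,
# so training only has to learn a low-rank delta on top of W.
assert np.allclose(W_eff, W)
```

Because only `A` and `B` are trained and saved, the adapter repository stays small relative to the 7B base model, which is why the base model must be downloaded separately.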