---
base_model: BSC-LT/salamandra-7b-instruct
library_name: peft
pipeline_tag: text-generation
language:
- ca
- es
- eu
- gl
- en
tags:
- base_model:adapter:BSC-LT/salamandra-7b-instruct
- lora
- peft
- transformers
- salamandra
- instruction-tuning
- chain-of-thought
---

# Salamandra CoT LoRA Adapter

This repository contains a **LoRA (Low-Rank Adaptation) fine-tuned adapter** for [BSC-LT/salamandra-7b-instruct](https://huggingface.co/BSC-LT/salamandra-7b-instruct). The adapter improves structured reasoning and instruction-following behavior while keeping the original Salamandra model weights frozen.

---

## Model Description

- **Base model:** [BSC-LT/salamandra-7b-instruct](https://huggingface.co/BSC-LT/salamandra-7b-instruct)
- **Architecture:** Decoder-only Transformer
- **Fine-tuning method:** LoRA (PEFT)
- **Framework:** 🤗 Transformers + PEFT
- **Adapter type:** Adapter-only (base model not included)

> **Note:** This repository **does not contain the base model weights**. You must load the base model to use this adapter.

---

## Training Details

- **Training type:** Supervised fine-tuning (SFT)
- **Objective:** Improve step-by-step reasoning (chain-of-thought style)
- **Checkpoint:** 4850
- **Quantization during training:** 4-bit (QLoRA)
- **Tokenizer:** Salamandra tokenizer (using a custom chat template)

### Framework Versions

- **PEFT:** 0.17.1

---

## Intended Use

This adapter is intended for:

- Instruction-following tasks
- Reasoning-intensive applications
- Multilingual use, prioritising Spanish, the co-official languages of Spain (Catalan, Basque, Galician), and English

---

## Limitations

- This adapter inherits all limitations of the Salamandra base model.
- Outputs may contain incorrect or incomplete reasoning.
- Generated chain-of-thought content should not be assumed to be factual.

> **Status:** This is exploratory work, still under development, aimed at understanding current state-of-the-art techniques for teaching models to reason.
Most of the outputs are still incorrect; however, encouraging generalization signs have been observed. For example, prompting the model in Galician makes it reason in Galician. Current and future work focuses on improving performance with GSPO.

## Usage

You can load this model using `peft` and `transformers`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Set up the model IDs
base_model_id = "BSC-LT/salamandra-7b-instruct"
adapter_id = "luisibear/salamandra-cot-lora"

# 2. Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# 3. Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# 4. Load the LoRA adapter on top of the frozen base model
model = PeftModel.from_pretrained(base_model, adapter_id)

# 5. Inference example
messages = [
    {"role": "user", "content": "Explain the logic behind the Pythagorean theorem step by step."}
]

# Apply the chat template (ensure your tokenizer has the correct template set)
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
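To see why the adapter can be distributed without the base weights, it helps to look at what LoRA actually stores: a pair of low-rank matrices per adapted layer whose product is added to the frozen base weight, scaled by `alpha / r`. The sketch below uses toy dimensions (`d=8`, `r=2`, `alpha=16`) chosen for illustration only; they are not the hyperparameters of this adapter.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 8, 2, 16           # hidden size, LoRA rank, LoRA scaling (illustrative values)

W = rng.normal(size=(d, d))      # frozen base weight (stays in the base checkpoint)
A = rng.normal(size=(r, d))      # trainable down-projection (shipped in the adapter)
B = np.zeros((d, r))             # trainable up-projection, zero-initialised

# Effective weight at inference time: W' = W + (alpha / r) * B @ A
W_eff = W + (alpha / r) * (B @ A)

# With B initialised to zero, the adapter starts as a no-op on the base model,
# so training only has to learn a low-rank delta on top of W.
assert np.allclose(W_eff, W)
```

Because only `A` and `B` are trained and saved, the adapter repository stays small relative to the 7B base model, which is why the base model must be downloaded separately.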