---
license: apache-2.0
language:
- en
- fr
pipeline_tag: text-generation
tags:
- lora
- peft
- multi-task
- sentiment-analysis
- translation
- tinyllama
- adapter-fusion
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
---
# LoRA Fusion: IMDB Sentiment + EN-FR Translation on TinyLlama-1.1B
This repository contains a **fully merged multi-task model** created by **sequentially fusing two independently trained LoRA adapters** into a single TinyLlama-1.1B base model.
The final model supports **multiple tasks within one unified set of weights**, without requiring PEFT or LoRA adapters at inference time.
---
## 🧠 Tasks Supported
The model is capable of performing the following tasks via prompt-based inference:
- 🧠 **Sentiment Analysis**
Binary sentiment classification (positive / negative) trained on the **IMDB movie review dataset**.
- 🌍 **English → French Translation**
Neural machine translation trained on **OPUS-100 (EN-FR)** data.
---
## 🔧 How This Model Was Built
### Base Model
- **TinyLlama-1.1B**
```
TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
```
### Independent LoRA Adapters
Two LoRA adapters were trained **independently**, each specializing in a single task:
1. **IMDB Sentiment Analysis LoRA**
```
BEncoderRT/IMDB-Sentiment-LoRA-TinyLlama-1.1B
```
2. **English → French Translation LoRA**
```
BEncoderRT/EN-FR-Translation-LoRA-TinyLlama-1.1B
```
### Fusion Method: Sequential LoRA Merge
The final model was created using **sequential LoRA fusion**:
1. Load the frozen TinyLlama base model
2. Merge the sentiment analysis LoRA into the base model
3. Treat the merged model as a new base
4. Merge the translation LoRA into the updated base
5. Export the final merged weights
This process uses `merge_and_unload()` from **PEFT**, resulting in a **standard `LlamaForCausalLM` model**.
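For reference, here is a minimal sketch of how this sequential merge can be reproduced with PEFT. It follows the five steps above; the output directory name is illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"

# 1. Load the frozen TinyLlama base model
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")

# 2. Merge the sentiment-analysis LoRA into the base weights
model = PeftModel.from_pretrained(model, "BEncoderRT/IMDB-Sentiment-LoRA-TinyLlama-1.1B")
model = model.merge_and_unload()

# 3-4. Treat the merged model as the new base and merge the translation LoRA
model = PeftModel.from_pretrained(model, "BEncoderRT/EN-FR-Translation-LoRA-TinyLlama-1.1B")
model = model.merge_and_unload()

# 5. Export the fully merged weights (a standard LlamaForCausalLM, no adapters)
model.save_pretrained("lora-fusion-merged")
AutoTokenizer.from_pretrained(base_id).save_pretrained("lora-fusion-merged")
```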
> ⚠️ **Important**
> This repository does **NOT** contain LoRA adapters.
> It contains a **fully merged model** and should **NOT** be loaded with `PeftModel`.
---
## 🧠 Architecture
```
               ┌──────────────────────────┐
               │  TinyLlama-1.1B Base LM  │
               │   (Frozen Parameters)    │
               └────────────┬─────────────┘
                            │
             ┌──────────────┴──────────────┐
             │                             │
┌──────────────────────────┐  ┌──────────────────────────┐
│  Sentiment LoRA Adapter  │  │ Translation LoRA Adapter │
│          (IMDB)          │  │        (EN → FR)         │
└──────────────────────────┘  └──────────────────────────┘
  set_adapter("sentiment")     set_adapter("translation")
```
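The `set_adapter(...)` labels above refer to the pre-merge setup, in which both adapters sit on the frozen base and are switched at runtime. A minimal PEFT sketch of that setup (the adapter names are illustrative) could look like this:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"
)

# Attach both task adapters to the same frozen base model
model = PeftModel.from_pretrained(
    base, "BEncoderRT/IMDB-Sentiment-LoRA-TinyLlama-1.1B", adapter_name="sentiment"
)
model.load_adapter("BEncoderRT/EN-FR-Translation-LoRA-TinyLlama-1.1B", adapter_name="translation")

# Switch the active adapter depending on the task
model.set_adapter("sentiment")    # sentiment-analysis prompts
model.set_adapter("translation")  # translation prompts
```

The published model skips this runtime switching entirely: both adapters are already merged into the weights.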
---
## Usage Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "BEncoderRT/LoRA_Fusion_IMDB-Sentiment_EN-FR"

# TinyLlama has no dedicated pad token, so reuse the EOS token
tokenizer = AutoTokenizer.from_pretrained(repo_id)
tokenizer.pad_token = tokenizer.eos_token

# The merged model loads directly with transformers; no PEFT required
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    device_map="auto",
    torch_dtype="auto"
)
model.eval()
```
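Printing the model (for example with `print(model)`) confirms that the checkpoint is a plain `LlamaForCausalLM` with no PEFT wrappers, since the LoRA weights are already folded into the base layers: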
```
LlamaForCausalLM(
  (model): LlamaModel(
    (embed_tokens): Embedding(32000, 2048)
    (layers): ModuleList(
      (0-21): 22 x LlamaDecoderLayer(
        (self_attn): LlamaAttention(
          (q_proj): Linear(in_features=2048, out_features=2048, bias=False)
          (k_proj): Linear(in_features=2048, out_features=256, bias=False)
          (v_proj): Linear(in_features=2048, out_features=256, bias=False)
          (o_proj): Linear(in_features=2048, out_features=2048, bias=False)
        )
        (mlp): LlamaMLP(
          (gate_proj): Linear(in_features=2048, out_features=5632, bias=False)
          (up_proj): Linear(in_features=2048, out_features=5632, bias=False)
          (down_proj): Linear(in_features=5632, out_features=2048, bias=False)
          (act_fn): SiLUActivation()
        )
        (input_layernorm): LlamaRMSNorm((2048,), eps=1e-05)
        (post_attention_layernorm): LlamaRMSNorm((2048,), eps=1e-05)
      )
    )
    (norm): LlamaRMSNorm((2048,), eps=1e-05)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=2048, out_features=32000, bias=False)
)
```
```python
def generate(prompt, max_new_tokens=32):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Note: without do_sample=True, decoding is greedy and `temperature` is
    # ignored (hence the warning in the example output below).
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=0.3,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```
```python
print(generate(
"### Task: Sentiment Analysis\n### Review:\nThis movie was amazing.\n### Answer:\n",
8
))
```
```
The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
### Task: Sentiment Analysis
### Review:
This movie was amazing.
### Answer:
positive
```
```python
print(generate(
"### Task: Translation (English to French)\n### English:\nI love deep learning.\n### French:\n",
32
))
```
```
### Task: Translation (English to French)
### English:
I love deep learning.
### French:
Je tiens à la deep learning.
```