---
language:
- en
tags:
- mistral
- lora
- adapter
- fine-tuned
- politics
- conversational
license: mit
datasets:
- rohanrao/joe-biden-tweets
- christianlillelund/joe-biden-2020-dnc-speech
base_model: mistralai/Mistral-7B-Instruct-v0.2
library_name: peft
---
# 🇺🇸 Biden Mistral Adapter 🇺🇸

> *"Look, folks, this adapter, it's about our common purpose, our shared values. That's no joke."*

This LoRA adapter for [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) is fine-tuned to emulate Joe Biden's distinctive speaking style, discourse patterns, and policy framing. The model captures the measured cadence, personal anecdotes, and characteristic expressions associated with the 46th U.S. President.
## ✨ Model Details

| Feature | Description |
|---------|-------------|
| **Base Model** | [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) |
| **Architecture** | LoRA adapter (Low-Rank Adaptation) |
| **LoRA Rank** | 16 |
| **Language** | English |
| **Training Focus** | Biden's communication style, rhetoric, and response patterns |
| **Merged Adapters** | Combines style and identity LoRA weights from:<br>- nnat03/biden-mistral-adapter (original adapter)<br>- ./identity-adapters/biden-identity-adapter |
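Conceptually, merging the style and identity adapters means each target weight matrix receives the sum of both low-rank updates (W' = W + ΔW_style + ΔW_identity). A toy sketch of that arithmetic, with made-up 2×2 matrices standing in for the real projection weights:

```python
# Toy 2x2 matrices stand in for a real weight and two LoRA deltas; values are made up.
def add_matrices(*mats):
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(m[i][j] for m in mats) for j in range(cols)] for i in range(rows)]

base     = [[1.0, 0.0], [0.0, 1.0]]   # pretrained weight W
style_dw = [[0.1, 0.0], [0.0, 0.1]]   # update contributed by the style adapter
ident_dw = [[0.0, 0.2], [0.2, 0.0]]   # update contributed by the identity adapter

merged = add_matrices(base, style_dw, ident_dw)
print(merged)
```

In practice this merge is done by PEFT over the full set of target modules; the sketch only illustrates the additivity that makes combining two adapters possible.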
## 🎯 Intended Use

<div align="center">
<table>
<tr>
<td align="center">📚 <b>Education</b></td>
<td align="center">🔬 <b>Research</b></td>
<td align="center">🎭 <b>Creative</b></td>
</tr>
<tr>
<td>Political discourse analysis</td>
<td>Rhetoric pattern studies</td>
<td>Interactive simulations</td>
</tr>
</table>
</div>
## 📊 Training Data

This model was trained on carefully curated datasets that capture authentic speech patterns:

- 📱 [Biden tweets dataset (2007-2020)](https://www.kaggle.com/datasets/rohanrao/joe-biden-tweets) - an extensive collection capturing everyday communication
- 🎤 [Biden 2020 DNC speech dataset](https://www.kaggle.com/datasets/christianlillelund/joe-biden-2020-dnc-speech) - formal oratorical patterns

These datasets were processed into a specialized instruction format to optimize learning of distinctive speech patterns.
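The exact preprocessing script is not included in this card, but a minimal sketch of the kind of instruction formatting described above (the standard Mistral-Instruct `[INST]` template; the prompt and response strings are hypothetical examples, not real dataset records):

```python
def to_instruction_format(prompt: str, response: str) -> str:
    """Wrap a (prompt, response) pair in the Mistral-Instruct [INST] template."""
    return f"<s>[INST] {prompt} [/INST] {response}</s>"

# Hypothetical training pair; the real dataset fields are not shown here
sample = to_instruction_format(
    "What's your message to young voters?",
    "Look, folks, your generation is the best educated in our history.",
)
print(sample)
```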
## ⚙️ Technical Specifications

### Training Configuration

```
🔧 Framework: Hugging Face Transformers + PEFT
📊 Optimization: 4-bit quantization
🧠 LoRA Config: r=16, alpha=64, dropout=0.05
🎛️ Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
```
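As a quick sanity check on what this configuration implies, a back-of-the-envelope sketch (pure Python; the 4096 hidden size is Mistral-7B's, but exact totals depend on each target module's true shape):

```python
# LoRA wraps each target projection with two low-rank matrices:
# A (r x d_in) and B (d_out x r), adding r * (d_in + d_out) parameters per module.
def lora_params(d_in: int, d_out: int, r: int = 16) -> int:
    return r * (d_in + d_out)

r, alpha = 16, 64
scaling = alpha / r                      # the low-rank update is scaled by alpha/r
q_proj_extra = lora_params(4096, 4096)   # q_proj in Mistral-7B is 4096x4096

print(scaling)       # 4.0
print(q_proj_extra)  # 131072
```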
### Training Parameters

```
📦 Batch size: 4
🔄 Gradient accumulation: 4
📈 Learning rate: 2e-4
🔁 Epochs: 3
📉 LR scheduler: cosine
⚡ Optimizer: paged_adamw_8bit
🧮 Precision: BF16
```
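With gradient accumulation, the parameters above give an effective batch size of 4 × 4 = 16 sequences per optimizer step. A small sketch of that arithmetic plus the cosine schedule (standard half-cosine decay; the step counts are illustrative, not taken from the actual run):

```python
import math

batch_size, grad_accum = 4, 4
effective_batch = batch_size * grad_accum  # 16 sequences per optimizer step

def cosine_lr(step: int, total_steps: int, peak_lr: float = 2e-4) -> float:
    """Half-cosine decay from peak_lr down to 0 over total_steps."""
    return peak_lr * 0.5 * (1 + math.cos(math.pi * step / total_steps))

print(effective_batch)        # 16
print(cosine_lr(0, 1000))     # peak_lr at the start of training
print(cosine_lr(1000, 1000))  # decays to ~0 by the end
```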
## ⚠️ Limitations and Biases

- This model mimics a speaking style but doesn't guarantee factual accuracy
- While emulating Biden's rhetoric, it doesn't represent his actual views
- May reproduce biases present in the training data
- Not suitable for production applications without additional fact-checking
## 💻 Usage

Run this code to load the adapter on top of the Mistral-7B-Instruct-v0.2 base model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit quantization
base_model_id = "mistralai/Mistral-7B-Instruct-v0.2"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Apply the adapter
model = PeftModel.from_pretrained(model, "nnat03/biden-mistral-adapter")

# Generate a response
prompt = "What's your vision for America's future?"
input_text = f"<s>[INST] {prompt} [/INST]"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# max_new_tokens bounds only the generation, not the prompt + generation
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("[/INST]")[-1].strip())
```
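The final line above strips the echoed prompt from the decoded text, since the model's output includes the input by default. As a standalone sketch (the decoded string is a hypothetical example):

```python
def extract_response(decoded: str) -> str:
    """Keep only the generated text after the last [/INST] marker."""
    return decoded.split("[/INST]")[-1].strip()

# Hypothetical decoded output: the echoed prompt, then the generation
decoded = "[INST] What's your vision for America's future? [/INST] Look, folks, here's the deal."
print(extract_response(decoded))  # Look, folks, here's the deal.
```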
## 📝 Citation

If you use this model in your research, please cite:

```bibtex
@misc{nnat03-biden-mistral-adapter,
  author = {nnat03},
  title = {Biden Mistral Adapter},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/nnat03/biden-mistral-adapter}}
}
```
## 🌟 Ethical Considerations

This model is created for educational and research purposes. It attempts to mimic the speaking style of a public figure but does not represent that person's actual views or statements. Use responsibly.

---

<div align="center">
<p><b>Framework version:</b> PEFT 0.15.0</p>
<p>Made with ❤️ for NLP research and education</p>
</div>