---
library_name: transformers
tags:
- text-generation-inference
license: apache-2.0
language:
- en
base_model:
- amd/Instella-3B-Instruct
pipeline_tag: text-generation
---

# **Instella-3B-Instruct-Abliterated**

> The Instella models are text-only, autoregressive transformer-based LMs with 3 billion parameters. Architecture-wise, Instella comprises 36 decoder layers, each with 32 attention heads. These models support a sequence length of up to 4,096 tokens and have a vocabulary size of ~50,000 tokens using the OLMo tokenizer. During both pre-training and fine-tuning, we utilized FlashAttention-2, Torch Compile, and bfloat16 mixed-precision training to reduce memory usage, leading to computational speedups and optimal resource utilization. To balance inter-node memory efficiency and intra-node communication overhead within our cluster, we employed fully sharded data parallelism (FSDP) with hybrid sharding: model parameters, gradients, and optimizer states are sharded within a node and replicated across nodes.

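The bfloat16 choice above largely determines the memory footprint. Below is a back-of-the-envelope sketch of why sharding optimizer state via FSDP matters at this scale; it assumes exactly 3.0 × 10⁹ parameters (the true count differs slightly) and a typical mixed-precision Adam layout of roughly 16 extra bytes per parameter, neither of which is stated in this card:

```python
# Rough memory estimate for a 3B-parameter model in bfloat16.
# Assumption: exactly 3.0e9 parameters (approximate, not the exact count).
PARAMS = 3_000_000_000
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each weight in 2 bytes

weights_gb = PARAMS * BYTES_PER_PARAM_BF16 / 1024**3
print(f"Weights alone for inference: ~{weights_gb:.1f} GiB")

# Assumption: mixed-precision Adam adds roughly 16 bytes/param
# (fp32 master weights + two fp32 Adam moments + bf16 gradients),
# which is why FSDP shards gradients and optimizer states across GPUs.
train_gb = PARAMS * (BYTES_PER_PARAM_BF16 + 16) / 1024**3
print(f"Rough unsharded training footprint: ~{train_gb:.1f} GiB")
```

Under these assumptions, the unsharded training state is roughly 9× the inference weights, which motivates the hybrid-sharding FSDP setup described above.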
### Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "prithivMLmods/Instella-3B-Instruct-abliterated"

tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", trust_remote_code=True)

# Format the conversation with the model's chat template and tokenize it
prompt = [{"role": "user", "content": "What are the benefits of open-source AI research?"}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

# Sample up to 1,024 new tokens
tokens = model.generate(
    inputs.to(model.device),
    max_new_tokens=1024,
    temperature=0.8,
    do_sample=True
)

print(tokenizer.decode(tokens[0], skip_special_tokens=False))
```

> Overall, Instella-3B-Instruct excels at instruction-following and multi-turn QA tasks such as TruthfulQA, GPQA, IFEval, and MT-Bench, and remains highly competitive with existing state-of-the-art open-weight models on other knowledge-recall and math benchmarks, despite being trained on significantly fewer tokens.