---
license: apache-2.0
datasets:
- HuggingFaceFW/finewiki
metrics:
- accuracy
base_model:
- PaddlePaddle/PaddleOCR-VL
new_version: OpenTrouter/Trouter-Terminus-20b
pipeline_tag: text-generation
library_name: adapter-transformers
tags:
- agent
- code
---

# Trouter-20B

<div align="center">

*A powerful 20-billion-parameter language model for advanced natural language processing*

[Model Card](https://huggingface.co/Trouter-Library/Trouter-20B) | [Documentation](./USAGE_GUIDE.md)

</div>

---

## Table of Contents

- [Overview](#overview)
- [Key Features](#key-features)
- [Quick Start](#quick-start)
- [Model Details](#model-details)
- [Performance](#performance)
- [Use Cases](#use-cases)
- [System Requirements](#system-requirements)
- [Training Details](#training-details)
- [Limitations & Bias](#limitations--bias)
- [License](#license)
- [Citation](#citation)
- [Acknowledgments](#acknowledgments)

## Overview

Trouter-20B is a state-of-the-art decoder-only transformer language model with 20 billion parameters. Designed for versatility and performance, it excels at a wide range of natural language understanding and generation tasks including reasoning, question answering, creative writing, code generation, and conversational AI.

## Key Features

- **20B Parameters**: A strong balance between capability and computational cost
- **4K Context Length**: Processes and generates sequences within a 4096-token context window
- **Apache 2.0 License**: Fully open for commercial and research use
- **Optimized Architecture**: Efficient attention via Grouped Query Attention (GQA)
- **Multilingual Capable**: Strongest on English, with support for several other languages
- **Quantization Ready**: Compatible with 8-bit and 4-bit quantization for a reduced memory footprint
- **Chat Optimized**: Built-in chat template for conversational applications (see the sketch just below this list)
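
Because the model card advertises a built-in chat template, multi-turn prompts can be assembled with `apply_chat_template`. A minimal sketch, assuming the standard `role`/`content` message format and that the template accepts a system turn:

```python
from transformers import AutoTokenizer

model_id = "Trouter-Library/Trouter-20B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# A short conversation in the standard chat-message format.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what a decoder-only transformer does."},
]

# Render the conversation into a single prompt string using the model's
# chat template; add_generation_prompt appends the assistant turn marker.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```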

## Quick Start

### Installation

```bash
pip install "transformers>=4.38.0" "torch>=2.0.0" accelerate bitsandbytes
```

### Basic Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_id = "Trouter-Library/Trouter-20B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Generate text (do_sample=True is required for temperature to take effect)
prompt = "Explain the concept of neural networks:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Memory-Efficient Loading (4-bit)

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "Trouter-Library/Trouter-20B"

# Configure 4-bit (NF4) quantization with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto"
)
```
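
The 8-bit configuration mentioned in the Key Features and in the inference-speed table below can be loaded the same way. A minimal sketch using `BitsAndBytesConfig` in 8-bit mode (memory figures will vary by hardware):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_id = "Trouter-Library/Trouter-20B"

# 8-bit weight quantization roughly halves memory versus bf16.
bnb_config_8bit = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config_8bit,
    device_map="auto"
)
```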

For more detailed usage examples, see the [Usage Guide](./USAGE_GUIDE.md).

## Model Details

| Specification | Value |
|---------------|-------|
| **Parameters** | 20 billion |
| **Architecture** | Decoder-only Transformer |
| **Layers** | 48 |
| **Hidden Size** | 5120 |
| **Attention Heads** | 40 (8 KV heads with GQA) |
| **Context Length** | 4096 tokens |
| **Vocabulary Size** | 32,000 tokens |
| **Activation** | SiLU (Swish) |
| **Positional Encoding** | RoPE (Rotary Position Embedding) |
| **Normalization** | RMSNorm |
| **Precision** | BFloat16 |
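
These values can be cross-checked against the repository's configuration file. A minimal sketch, assuming a LLaMA-style config with the usual `num_attention_heads` / `num_key_value_heads` field names:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Trouter-Library/Trouter-20B")

# Print the architecture fields listed in the table above.
print("layers:          ", config.num_hidden_layers)
print("hidden size:     ", config.hidden_size)
print("attention heads: ", config.num_attention_heads)
print("kv heads (GQA):  ", config.num_key_value_heads)
print("context length:  ", config.max_position_embeddings)
print("vocab size:      ", config.vocab_size)
```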

## Performance

### Benchmark Results

| Benchmark | Score | Notes |
|-----------|-------|-------|
| MMLU (5-shot) | TBD | Multitask Language Understanding |
| HellaSwag | TBD | Commonsense Reasoning |
| TruthfulQA | TBD | Truthfulness & Accuracy |
| HumanEval | TBD | Code Generation |
| GSM8K | TBD | Mathematical Reasoning |
| BBH | TBD | BIG-Bench Hard |

*Benchmarks to be updated after comprehensive evaluation.*

### Inference Speed

| Configuration | Tokens/Second | Memory Usage |
|---------------|---------------|--------------|
| BF16 (A100 80GB) | ~XX tokens/s | ~40GB |
| 8-bit (A100 40GB) | ~XX tokens/s | ~20GB |
| 4-bit (RTX 4090) | ~XX tokens/s | ~10GB |
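
Throughput depends heavily on hardware, batch size, and quantization, so figures like those above are best reproduced locally. A minimal timing sketch, assuming `model` and `tokenizer` are already loaded as in the Quick Start:

```python
import time

prompt = "Explain the concept of neural networks:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Time a single greedy generation and report decode throughput.
start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = outputs.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/s")
```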

## Use Cases

### Recommended Uses

- **Text Generation**: Articles, stories, creative writing
- **Question Answering**: Information retrieval and explanation
- **Code Assistance**: Code completion, debugging, explanation
- **Summarization**: Document and conversation summarization
- **Translation**: Multi-language translation tasks
- **Dialogue Systems**: Chatbots and conversational AI
- **Content Analysis**: Sentiment analysis, classification
- **Educational Tools**: Tutoring and learning assistance

### Limitations

- May generate incorrect or nonsensical information (hallucinations)
- Not suitable for high-stakes decision making without human oversight
- Performance may vary on specialized or domain-specific tasks
- Requires careful prompt engineering for optimal results
- May reflect biases present in training data

### Out of Scope

- Real-time medical diagnosis or treatment recommendations
- Legal advice or binding interpretations
- Financial investment decisions
- Safety-critical systems without human verification
- Generating harmful, illegal, or unethical content

## System Requirements

### Minimum Requirements

- **GPU**: 24GB VRAM (with 4-bit quantization)
- **RAM**: 32GB system memory
- **Storage**: 50GB free space
- **CUDA**: 11.8 or higher

### Recommended Specifications

- **GPU**: A100 (40GB/80GB) or H100
- **RAM**: 64GB+ system memory
- **Storage**: 100GB+ SSD
- **Multi-GPU**: Supported via `device_map="auto"`
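
With multiple GPUs, `device_map="auto"` shards the weights across the visible devices; a per-device memory cap can also be passed when a GPU hosts other workloads. A minimal sketch, assuming two visible GPUs (the limits shown are illustrative, not requirements):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Trouter-Library/Trouter-20B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # Optional: cap how much memory the loader may plan to use per device.
    max_memory={0: "38GiB", 1: "38GiB", "cpu": "64GiB"},
)
```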

## Training Details

### Training Data

Trouter-20B was trained on a diverse corpus of high-quality text data including:

- Web documents and articles
- Books and academic papers
- Code repositories
- Conversational data
- Multilingual text

**Total Training Tokens**: [Specify total tokens]
**Data Mix**: [Provide breakdown of data sources]
**Cutoff Date**: January 2025

### Training Infrastructure

- **Framework**: PyTorch 2.0+ with FSDP
- **Hardware**: [Specify GPU cluster details]
- **Training Time**: [Specify duration]
- **Optimizer**: AdamW
- **Learning Rate**: Cosine schedule with warmup
- **Batch Size**: [Specify effective batch size]
- **Sequence Length**: 4096 tokens
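
For reference, the optimizer and schedule named in this list correspond to a standard PyTorch/Transformers setup. A minimal sketch with illustrative hyperparameters (the actual training values are not published here), assuming `model` is already loaded:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# Illustrative values only; not the original pre-training hyperparameters.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.1)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=1_000,
    num_training_steps=50_000,
)
```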

### Training Objective

Causal language modeling with next-token prediction using cross-entropy loss.
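
In Transformers this objective can be reproduced by passing the input ids as labels; the library shifts them internally and averages the token-level cross-entropy. A minimal sketch, assuming `model` and `tokenizer` from the Quick Start are loaded:

```python
import torch

text = "Neural networks learn hierarchical representations of data."
batch = tokenizer(text, return_tensors="pt").to(model.device)

# For causal LM training the labels are the input ids themselves; the model
# shifts them by one position before computing the cross-entropy loss.
with torch.no_grad():
    out = model(**batch, labels=batch["input_ids"])

print("cross-entropy loss:", out.loss.item())
print("perplexity:", torch.exp(out.loss).item())
```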

## Limitations & Bias

### Known Limitations

1. **Hallucinations**: May generate plausible-sounding but incorrect information
2. **Temporal Knowledge**: Training data cutoff is January 2025
3. **Mathematical Reasoning**: May struggle with complex multi-step calculations
4. **Multilingual Performance**: Optimized for English; other languages may have reduced quality
5. **Context Window**: Limited to 4096 tokens

### Bias Considerations

Like all large language models, Trouter-20B may exhibit biases including:

- Gender, racial, and cultural biases from training data
- A Western/English-centric perspective
- Potential stereotyping in generated content

**Mitigation Efforts**: We encourage users to:

- Implement appropriate content filtering
- Use diverse evaluation datasets
- Apply bias detection tools
- Provide human oversight for production deployments

## License

Trouter-20B is released under the **Apache 2.0 License**. You are free to:

- Use it commercially
- Modify and distribute it
- Use it privately
- Use it for patent purposes

See the [LICENSE](./LICENSE) file for full terms.

## Citation

If you use Trouter-20B in your research or applications, please cite:

```bibtex
@software{trouter20b2025,
  title={Trouter-20B: A 20 Billion Parameter Language Model},
  author={Trouter-Library},
  year={2025},
  month={10},
  url={https://huggingface.co/Trouter-Library/Trouter-20B},
  version={1.0},
  license={Apache-2.0}
}
```

## Acknowledgments

We thank the open-source community and the following projects that made this work possible:

- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [PyTorch](https://pytorch.org/)
- [LLaMA](https://ai.meta.com/llama/) for architecture inspiration
- [EleutherAI](https://www.eleuther.ai/) for evaluation frameworks

---

<div align="center">

**Built with ❤️ for the AI community**

[⬆ Back to Top](#trouter-20b)

</div>