ByteDance-Seed/Seed-X-Instruct-7B

	---
	license_name: openmdw
	license_link: LICENSE
	datasets:
	- fka/awesome
	metrics:
	- accuracy
	- character
	pipeline_tag: translation
	---

	## Introduction
	We are excited to introduce Seed-X, a powerful series of open-source multilingual translation language models comprising an instruction model, a reinforcement learning model, and a reward model. The series pushes the boundaries of translation capability at the 7-billion-parameter scale.
	We develop Seed-X as an accessible, off-the-shelf tool to support the community in advancing translation research and applications:
	* Exceptional translation capabilities: Seed-X exhibits state-of-the-art translation capabilities, on par with or outperforming ultra-large models such as Gemini-2.5, Claude-3.5, and GPT-4, as validated by human evaluations and automatic metrics.
	* Deployment and inference-friendly: With a compact 7B parameter count and the Mistral architecture, Seed-X offers outstanding translation performance in a lightweight and efficient package, ideal for deployment and inference.
	* Broad domain coverage: Seed-X excels on a highly challenging translation test set spanning diverse domains, including the internet, science and technology, office dialogues, e-commerce, biomedicine, finance, law, literature, and entertainment.
	![performance](imgs/model_comparsion.png)

	This repo contains the Seed-X-Instruct model, with the following features:
	* Type: Causal language model
	* Training Stage: Pretraining & Post-training
	* Support: Multilingual translation among 28 languages

	| Languages | Abbr. | Languages | Abbr. | Languages | Abbr. | Languages | Abbr. |
	| ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- |
	| Arabic | ar | French | fr | Malay | ms | Russian | ru |
	| Czech | cs | Croatian | hr | Norwegian Bokmal | nb | Swedish | sv |
	| Danish | da | Hungarian | hu | Dutch | nl | Thai | th |
	| German | de | Indonesian | id | Norwegian | no | Turkish | tr |
	| English | en | Italian | it | Polish | pl | Ukrainian | uk |
	| Spanish | es | Japanese | ja | Portuguese | pt | Vietnamese | vi |
	| Finnish | fi | Korean | ko | Romanian | ro | Chinese | zh |

	## Model Downloads
	| Model Name | Description | Download |
	| ----------- | ----------- | ----------- |
	| 👉 Seed-X-Instruct | Instruction-tuned for alignment with user intent. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-Instruct-7B) |
	| Seed-X-PPO | RL trained to boost translation capabilities. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-PPO-7B) |
	| Seed-X-RM | Reward model to evaluate the quality of translation. | 🤗 [Model](https://huggingface.co/ByteDance-Seed/Seed-X-RM-7B) |
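
The checkpoints in the table above can also be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the repository IDs come from the download links, while `REPO_IDS` and `download_model` are hypothetical names introduced here for illustration.

```python
# Repository IDs taken from the download links in the table above.
REPO_IDS = {
    "Seed-X-Instruct": "ByteDance-Seed/Seed-X-Instruct-7B",
    "Seed-X-PPO": "ByteDance-Seed/Seed-X-PPO-7B",
    "Seed-X-RM": "ByteDance-Seed/Seed-X-RM-7B",
}


def download_model(name, local_dir=None):
    """Hypothetical helper: download one checkpoint and return its local path."""
    # Imported lazily so the mapping stays usable without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_IDS[name], local_dir=local_dir)
```

The path returned by `download_model("Seed-X-Instruct")` could then serve as the `model_path` used in the Quickstart.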

	## Quickstart
	Here is a simple example demonstrating how to load the model and perform translation using `vllm`:
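	```python
	from vllm import LLM, SamplingParams

	# Assumption: path to a local copy of the checkpoint; adjust to your setup.
	model_path = "./ByteDance-Seed/Seed-X-Instruct-7B"

	model = LLM(model=model_path,
	            max_num_seqs=512,
	            tensor_parallel_size=8,  # number of GPUs the model is sharded across
	            enable_prefix_caching=True,
	            gpu_memory_utilization=0.95)

	# The trailing tag (here <zh>) marks the target language.
	messages = [
	    "Translate the following English sentence into Chinese:\nMay the force be with you <zh>",  # without CoT
	    "Translate the following English sentence into Chinese and explain it in detail:\nMay the force be with you <zh>",  # with CoT
	]

	# Assumption: greedy decoding with a 512-token budget; tune as needed.
	decoding_params = SamplingParams(temperature=0, max_tokens=512, skip_special_tokens=True)

	results = model.generate(messages, decoding_params)
	responses = [res.outputs[0].text.strip() for res in results]

	print(responses)
	```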
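
The two prompts in the example share a pattern: an instruction naming the source language, the text to translate, and a trailing target-language tag such as `<zh>`. The sketch below shows one way to assemble such prompts for any pair from the supported-language table. The helper `build_prompt` and the `LANGUAGES` mapping are hypothetical names, and the exact template wording (in particular the "into {target language}" phrase) is an assumption modeled on the Quickstart prompts rather than a documented specification.

```python
# Language tags and names copied from the supported-language table.
LANGUAGES = {
    "ar": "Arabic", "cs": "Czech", "da": "Danish", "de": "German",
    "en": "English", "es": "Spanish", "fi": "Finnish", "fr": "French",
    "hr": "Croatian", "hu": "Hungarian", "id": "Indonesian", "it": "Italian",
    "ja": "Japanese", "ko": "Korean", "ms": "Malay", "nb": "Norwegian Bokmal",
    "nl": "Dutch", "no": "Norwegian", "pl": "Polish", "pt": "Portuguese",
    "ro": "Romanian", "ru": "Russian", "sv": "Swedish", "th": "Thai",
    "tr": "Turkish", "uk": "Ukrainian", "vi": "Vietnamese", "zh": "Chinese",
}


def build_prompt(text, src, tgt, cot=False):
    """Hypothetical helper: build a translation prompt ending in the target tag."""
    for code in (src, tgt):
        if code not in LANGUAGES:
            raise ValueError(f"unsupported language code: {code!r}")
    # Assumed template; the CoT variant additionally asks for an explanation.
    suffix = " and explain it in detail" if cot else ""
    return (
        f"Translate the following {LANGUAGES[src]} sentence into "
        f"{LANGUAGES[tgt]}{suffix}:\n{text} <{tgt}>"
    )
```

For example, `build_prompt("May the force be with you", "en", "zh")` yields a prompt that ends with the `<zh>` tag, and passing `cot=True` produces the variant that asks for an explanation.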
	## Evaluation
	We evaluated Seed-X on a diverse set