	---
	license: apache-2.0
	pipeline_tag: text-generation
	library_name: mlx
	tags:
	- mlx
	base_model: ByteDance-Seed/Seed-OSS-36B-Instruct
	language:
	- en
	- zh
	---

	# catalystsec/Seed-OSS-36B-Instruct-4bit-DWQ

	This model was quantized to 4-bit with DWQ (distilled weight quantization) using mlx-lm version 0.27.1, with a BF16 model as the distillation teacher.

	| Learning Rate | Total Loss | KL Loss | Activation Loss | Improvement |
	|---------------|------------|---------|-----------------|-------------|
	| 2e-7 | 0.415 | 0.025 | 0.390 | 15.8% |
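
As a quick sanity check on the table, the reported total loss equals the sum of the KL and activation losses. The sketch below assumes the total is an unweighted sum of the two terms; the card does not say how it was computed, so treat this as an observation about the numbers rather than a description of the training objective.

```python
import math

# Values copied from the table above.
kl_loss = 0.025
activation_loss = 0.390
reported_total = 0.415

# Assumption: total loss is the plain (unweighted) sum of the two terms.
combined = kl_loss + activation_loss
matches = math.isclose(combined, reported_total, abs_tol=1e-9)

print(f"{combined:.3f}", matches)
```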

	## Use with mlx

	```bash
	pip install mlx-lm
	```

	```python
	from mlx_lm import load, generate

	model, tokenizer = load("catalystsec/Seed-OSS-36B-Instruct-4bit-DWQ")

	prompt = "hello"

	if tokenizer.chat_template is not None:
	    messages = [{"role": "user", "content": prompt}]
	    prompt = tokenizer.apply_chat_template(
	        messages, add_generation_prompt=True
	    )

	response = generate(model, tokenizer, prompt=prompt, verbose=True)
	```
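
The chat-template step in the example can be factored into a small helper. The following is a minimal sketch: `build_prompt` and `StubTokenizer` are hypothetical names introduced for illustration and are not part of mlx-lm. The stub only mimics the two tokenizer members the example relies on (`chat_template` and `apply_chat_template`), so the logic can be exercised without downloading the model.

```python
def build_prompt(tokenizer, user_text):
    """Return a prompt, applying the tokenizer's chat template when it has one."""
    if getattr(tokenizer, "chat_template", None) is None:
        # No template available: pass the raw text through unchanged.
        return user_text
    messages = [{"role": "user", "content": user_text}]
    return tokenizer.apply_chat_template(messages, add_generation_prompt=True)


class StubTokenizer:
    """Hypothetical stand-in for the real tokenizer, for illustration only."""

    def __init__(self, chat_template=None):
        self.chat_template = chat_template

    def apply_chat_template(self, messages, add_generation_prompt=False):
        # Toy format; the real template is defined by the model's tokenizer.
        text = "".join(f"<{m['role']}>{m['content']}" for m in messages)
        if add_generation_prompt:
            text += "<assistant>"
        return text


plain = build_prompt(StubTokenizer(), "hello")
templated = build_prompt(StubTokenizer(chat_template="stub"), "hello")
print(plain)
print(templated)
```

With the tokenizer returned by `load`, the result of `build_prompt(tokenizer, "hello")` can be passed as the `prompt` argument to `generate`, as in the example above.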