---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- claude
- conversational
- instruction-tuned
- multilingual
- reasoning
- open-source
datasets:
- Roman1111111/claude-opus-4.6-10000x
- Crownelius/Opus-4.6-Reasoning-3300x
- peteromallet/dataclaw-peteromallet
base_model:
- Qwen/Qwen3.5-9B
base_model_relation: finetune
---
| |
# Claude OSS 9B

> **Disclaimer:** This is **not** an official release by Anthropic.
> Claude OSS 9B is an independent open model project.

## Overview

Claude OSS 9B is a multilingual conversational language model designed to deliver a familiar, polished assistant experience with strong instruction following, stable identity behavior, and practical general-purpose usefulness.

The model was fine-tuned on **open-source datasets**, with a combined total of approximately **200,000 rows** collected from Hugging Face. The training mixture focused on assistant behavior, reasoning preservation, multilingual interaction, and stronger identity consistency.

Claude OSS 9B is intended for:

- general chat and assistant use
- multilingual interaction
- reasoning-oriented prompting
- writing and summarization
- lightweight coding help
- identity-consistent assistant behavior
- interaction in 200+ languages
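All of these use cases share the same interface: a conversation is a plain list of role/content messages, and turns can freely mix languages. A minimal illustrative sketch (the conversation content here is made up):

```python
# A multilingual, multi-turn conversation in the standard chat format.
# The content below is illustrative only.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Bonjour ! Peux-tu résumer ce texte ?"},  # French
    {"role": "assistant", "content": "Bien sûr, envoyez-moi le texte."},
    {"role": "user", "content": "Danke! Bitte antworte auf Deutsch."},    # German
]

# Every turn carries exactly the keys a chat template expects.
assert all(set(m) == {"role", "content"} for m in messages)
print(len(messages))  # → 4
```

This is the same `messages` structure passed to `apply_chat_template` in the Usage section.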

---

## Benchmarks

(Based on Qwen3.5 9B benchmark results.)

## Training Summary

Claude OSS 9B was fine-tuned on a curated open-source training mixture totaling roughly 200k rows from Hugging Face. The data mix emphasized:

- assistant-style conversations
- instruction following
- identity reinforcement
- multilingual prompts and answers
- reasoning preservation
- general usability tasks

## Usage

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "squ11z1/claude-oss-9b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

messages = [{"role": "user", "content": "Who are you?"}]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
)

inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the echoed prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```
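The final slice in the example above exists because `generate` on a causal LM returns the prompt tokens followed by the completion in one sequence; dropping the first `prompt_len` ids leaves only the new tokens. The same idea, sketched with plain lists:

```python
# model.generate returns prompt ids + newly generated ids as one sequence.
prompt_ids = [101, 5616, 2024]          # toy prompt token ids
generated = prompt_ids + [7, 8, 9, 10]  # toy generate() output

prompt_len = len(prompt_ids)
completion = generated[prompt_len:]     # keep only the newly generated tokens
print(completion)  # → [7, 8, 9, 10]
```

Decoding `completion` instead of the full sequence is what prevents the prompt from being echoed back in the printed answer.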

### GGUF / llama.cpp

```bash
./llama-cli -m claude-oss-9b-q4_k_m.gguf -p "Who are you?"
```