---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- claude
- conversational
- instruction-tuned
- multilingual
- reasoning
- open-source
datasets:
- Roman1111111/claude-opus-4.6-10000x
- Crownelius/Opus-4.6-Reasoning-3300x
- peteromallet/dataclaw-peteromallet
base_model:
- Qwen/Qwen3.5-9B
base_model_relation: finetune
---
| |
# Claude OSS 9B

> **Disclaimer:** This is **not** an official release by Anthropic.
> Claude OSS 9B is an independent open model project.

## Overview

Claude OSS 9B is a multilingual conversational language model designed to deliver a familiar, polished assistant experience with strong instruction following, stable identity behavior, and practical general-purpose usefulness.

The model was fine-tuned on **open-source datasets**, with a combined total of approximately **200,000 rows** collected from Hugging Face. The training mixture focused on assistant behavior, reasoning preservation, multilingual interaction, and stronger identity consistency.

Claude OSS 9B is intended for:

- general chat and assistant use
- multilingual interaction
- reasoning-oriented prompting
- writing and summarization
- lightweight coding help
- identity-consistent assistant behavior
- interaction in 200+ languages
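All of these use cases share the same interface: a conversation is a plain list of role/content messages, and turns can freely mix languages. A minimal illustrative sketch (the conversation content here is made up):

```python
# A multilingual, multi-turn conversation in the standard chat format.
# The content below is illustrative only.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Bonjour ! Peux-tu résumer ce texte ?"},  # French
    {"role": "assistant", "content": "Bien sûr, envoyez-moi le texte."},
    {"role": "user", "content": "Danke! Bitte antworte auf Deutsch."},    # German
]

# Every turn carries exactly the keys a chat template expects.
assert all(set(m) == {"role", "content"} for m in messages)
print(len(messages))  # → 4
```

This is the same `messages` structure passed to `apply_chat_template` in the Usage section.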

---

## Benchmarks

(Based on Qwen3.5 9B benchmark results.)

## Training Summary

Claude OSS 9B was fine-tuned on a curated open-source training mixture totaling roughly 200k rows from Hugging Face. The data mix emphasized:

- assistant-style conversations
- instruction following
- identity reinforcement
- multilingual prompts and answers
- reasoning preservation
- general usability tasks

## Usage

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "squ11z1/claude-oss-9b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

messages = [{"role": "user", "content": "Who are you?"}]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
)

inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the echoed prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```
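The final slice in the example above exists because `generate` on a causal LM returns the prompt tokens followed by the completion in one sequence; dropping the first `prompt_len` ids leaves only the new tokens. The same idea, sketched with plain lists:

```python
# model.generate returns prompt ids + newly generated ids as one sequence.
prompt_ids = [101, 5616, 2024]          # toy prompt token ids
generated = prompt_ids + [7, 8, 9, 10]  # toy generate() output

prompt_len = len(prompt_ids)
completion = generated[prompt_len:]     # keep only the newly generated tokens
print(completion)  # → [7, 8, 9, 10]
```

Decoding `completion` instead of the full sequence is what prevents the prompt from being echoed back in the printed answer.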

### GGUF / llama.cpp

```bash
./llama-cli -m claude-oss-9b-q4_k_m.gguf -p "Who are you?"
```