---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- claude
- conversational
- instruction-tuned
- multilingual
- reasoning
- open-source
datasets:
- Roman1111111/claude-opus-4.6-10000x
- Crownelius/Opus-4.6-Reasoning-3300x
- peteromallet/dataclaw-peteromallet
base_model:
- Qwen/Qwen3.5-9B
base_model_relation: finetune
---

# Claude OSS 9B

> **Disclaimer:** This is **not** an official release by Anthropic.
> Claude OSS 9B is an independent open model project.

![claudeoss9b](https://cdn-uploads.huggingface.co/production/uploads/67329d3f69fded92d56ab41a/wW3owxbkwKbjLvXLGh16Q.png)

## Overview

Claude OSS 9B is a multilingual conversational language model designed to deliver a familiar, polished assistant experience with strong instruction following, stable identity behavior, and practical general-purpose usefulness.

The model was fine-tuned on **open-source datasets**, with a combined total of approximately **200,000 rows** collected from Hugging Face. The training mixture focused on assistant behavior, reasoning preservation, multilingual interaction, and stronger identity consistency.

Claude OSS 9B is intended for:

- general chat and assistant use
- multilingual interaction (200+ languages)
- reasoning-oriented prompting
- writing and summarization
- lightweight coding help
- identity-consistent assistant behavior

---

## Benchmarks

![Image 01_50_05](https://cdn-uploads.huggingface.co/production/uploads/67329d3f69fded92d56ab41a/tjcrXVd4m5b8Q2yxZx7f8.png)

(Based on Qwen3.5 9B benchmark results.)

## Training Summary

Claude OSS 9B was fine-tuned on a curated open-source training mixture totaling roughly 200k rows from Hugging Face.
The data mix emphasized:

- assistant-style conversations
- instruction following
- identity reinforcement
- multilingual prompts and answers
- reasoning preservation
- general usability tasks

## Usage

### Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "squ11z1/claude-oss-9b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)

messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```

### GGUF / llama.cpp

```bash
./llama-cli -m claude-oss-9b-q4_k_m.gguf -p "Who are you?"
```
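The `apply_chat_template` call in the Transformers example above renders the `messages` list into the model's prompt format before tokenization. As a minimal sketch of what that step produces, here is a hand-rolled ChatML-style formatter. Note the `<|im_start|>`/`<|im_end|>` role tags are an assumption for illustration only; the actual template and special tokens are defined by the model's tokenizer configuration and may differ.

```python
# Illustrative only: a minimal ChatML-style chat formatter.
# The real template ships with the model's tokenizer and may
# use different special tokens.

def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into one prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [{"role": "user", "content": "Who are you?"}]
print(format_chatml(messages))
```

Inspecting the tokenizer's own output this way (e.g. with `tokenize=False`) is a quick sanity check that a fine-tuned model is being prompted in the format it was trained on.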