# Mysterious Coding Model
This repository contains a specialised AI model for agentic code generation and text generation tasks. The model is inspired by the GPT-OSS series (gpt-oss-20b and gpt-oss-120b) described in the corresponding paper. It is built on the open-source Llama architecture and fine-tuned for programming assistance, conversation, and multi-language support.
## Key Features

- Open source: released under the Apache-2.0 license.
- Text and code generation: supports code completion, bug fixing, refactoring, and documentation generation.
- Efficient storage: models are stored in the secure, fast `safetensors` format.
- Multiple precisions: includes base FP16 models, 8-bit quantised models, and MXFP4 (mixed-precision) variants.
- vLLM compatibility: compatible with the vLLM engine for high-throughput inference.
- Conversational: instruction-tuned for interactive coding assistance.
## Repository Structure

```
coding-model-repository/
├── README.md
├── .gitattributes              # Updated for safetensors
├── .gitignore
├── requirements.txt
├── model_index.json            # Safetensors model index
├── config.json                 # Coding model configuration
├── model_card.md               # Coding model documentation
│
├── models/
│   ├── library=safetensors/    # Main safetensors models directory
│   │   ├── base/
│   │   │   ├── model-00001-of-00003.safetensors
│   │   │   ├── model-00002-of-00003.safetensors
│   │   │   ├── model-00003-of-00003.safetensors
│   │   │   ├── model.safetensors.index.json
│   │   │   ├── config.json
│   │   │   ├── generation_config.json
│   │   │   └── tokenizer/
│   │   │       ├── tokenizer.json
│   │   │       ├── tokenizer_config.json
│   │   │       ├── vocab.json
│   │   │       ├── merges.txt
│   │   │       └── special_tokens_map.json
│   │   │
│   │   ├── quantized/
│   │   │   ├── 4bit/
│   │   │   │   ├── model.safetensors
│   │   │   │   └── quantization_config.json
│   │   │   ├── 8bit/
│   │   │   │   ├── model.safetensors
│   │   │   │   └── quantization_config.json
│   │   │   └── awq/
│   │   │       ├── model.safetensors
│   │   │       └── quant_config.json
│   │   │
│   │   ├── instruct/
│   │   │   ├── model.safetensors
│   │   │   ├── config.json
│   │   │   └── training_config.json
│   │   │
│   │   └── specialized/
│   │       ├── python-focused/
│   │       │   └── model.safetensors
│   │       ├── web-dev/
│   │       │   └── model.safetensors
│   │       ├── systems-programming/
│   │       │   └── model.safetensors
│   │       └── data-science/
│   │           └── model.safetensors
│   │
│   ├── adapters/               # Safetensors adapters
│   │   ├── lora/
│   │   │   ├── adapter_model.safetensors
│   │   │   └── adapter_config.json
│   │   ├── coding-specific/
│   │   │   ├── debugging-adapter.safetensors
│   │   │   ├── refactoring-adapter.safetensors
│   │   │   └── documentation-adapter.safetensors
│   │   └── language-specific/
│   │       ├── python-adapter.safetensors
│   │       ├── javascript-adapter.safetensors
│   │       ├── rust-adapter.safetensors
│   │       └── cpp-adapter.safetensors
│   │
│   └── merged/                 # Merged coding models
│       ├── code-instruct-merge/
│       │   └── model.safetensors
│       ├── multilang-merge/
│       │   └── model.safetensors
│       └── merge_recipes/
│           ├── coding_merge_v1.json
│           └── instruct_coding_merge.json
│
├── datasets/                   # Coding datasets
│   ├── training/
│   │   ├── code_samples/
│   │   ├── documentation/
│   │   └── problem_solutions/
│   ├── evaluation/
│   │   ├── humaneval/
│   │   ├── mbpp/
│   │   ├── codecontests/
│   │   └── custom_benchmarks/
│   └── instruction_tuning/
│       ├── code_alpaca/
│       ├── evol_instruct_code/
│       └── magicoder_data/
│
├── scripts/
│   ├── convert_to_safetensors.py   # Convert models to safetensors
│   ├── validate_safetensors.py     # Validate safetensors integrity
│   ├── quantize_coding_model.py    # Code-optimized quantization
│   ├── merge_coding_models.py      # Merge coding-specific models
│   ├── train_coding_adapter.py     # Train coding adapters
│   ├── evaluate_coding.py          # Code generation evaluation
│   └── benchmark_performance.py    # Performance benchmarks
│
├── evaluation/
│   ├── code_generation/
│   │   ├── python_eval.py
│   │   ├── javascript_eval.py
│   │   └── multilang_eval.py
│   ├── code_completion/
│   │   ├── completion_benchmark.py
│   │   └── context_accuracy.py
│   ├── code_understanding/
│   │   ├── bug_detection.py
│   │   ├── code_explanation.py
│   │   └── refactoring_suggestions.py
│   └── benchmarks/
│       ├── humaneval_results/
│       ├── mbpp_results/
│       └── custom_results/
│
├── tools/
│   ├── code_formatter.py
│   ├── syntax_validator.py
│   ├── dependency_analyzer.py
│   └── performance_profiler.py
│
└── docs/
    ├── coding_model_guide.md
    ├── safetensors_usage.md
    ├── evaluation_metrics.md
    └── api_reference.md
```
## Usage
To load the model and generate code using transformers and safetensors, run:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the safetensors model
model = AutoModelForCausalLM.from_pretrained(
    "likhonhfai/mysterious-coding-model",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("likhonhfai/mysterious-coding-model")

prompt = 'def fibonacci(n):\n    """Calculate the nth Fibonacci number"""\n'
# Move inputs to the same device the model was placed on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.95,
    temperature=0.1,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
For vLLM-based inference or to use quantised models (4-bit, 8-bit, or AWQ), explore the subdirectories under `models/quantized/` and see the scripts for quantisation and evaluation.
## Safetensors Format

All model weights are stored in the `.safetensors` format. This binary format provides:

- Security: loading the model doesn't execute arbitrary code.
- Speed: faster loading compared to pickle-based formats.
- Memory efficiency: supports lazy loading.
- Cross-platform compatibility: works across operating systems.
- Rich metadata: makes it easier to inspect and validate model shards.

Refer to `scripts/convert_to_safetensors.py` to convert PyTorch checkpoints into safetensors.
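For illustration, the on-disk layout can be reproduced with nothing but the standard library: an 8-byte little-endian header length, a JSON header describing each tensor's dtype, shape, and byte offsets, then the raw tensor buffer. The `write_safetensors` and `read_header` helpers below are hypothetical sketches for demonstration only; real files should be written and read with the `safetensors` package.

```python
import json
import struct

def write_safetensors(path, tensors):
    """Hypothetical helper. tensors: {name: (dtype, shape, raw_bytes)}."""
    header, buffer, offset = {}, b"", 0
    for name, (dtype, shape, data) in tensors.items():
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        buffer += data
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # u64 header size
        f.write(header_bytes)                          # JSON header
        f.write(buffer)                                # raw tensor bytes

def read_header(path):
    """Read only the JSON header; no tensor data is loaded."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(n))

# Two tiny FP32 "tensors" sharing the same 16 raw bytes
data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
write_safetensors("demo.safetensors",
                  {"w": ("F32", [2, 2], data), "b": ("F32", [4], data)})
hdr = read_header("demo.safetensors")
print(hdr["w"]["shape"])  # [2, 2]
```

Because the header sits at a fixed position before the data, tools can inspect shapes and dtypes (the "rich metadata" above) without reading any weights.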
## Quantisation

The `models/quantized/` directory contains 4-bit, 8-bit, and AWQ quantised versions of the model. These variants reduce memory requirements and accelerate inference with minimal impact on accuracy. See `scripts/quantize_coding_model.py` for details.
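As a rough illustration of what the quantised variants do, here is a minimal sketch of symmetric 8-bit quantisation of a weight vector. The values and helper names are made up for demonstration; the actual pipeline in `scripts/quantize_coding_model.py` may differ.

```python
def quantize_int8(weights):
    """Map floats into [-127, 127] using a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.52, -1.27, 0.003, 0.84]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)  # integers that fit in one byte each instead of 2-4 bytes
```

Storing one byte per weight plus a scale is where the memory savings come from; the reconstruction error is bounded by half the scale, which is why accuracy impact stays small.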
## Evaluation

Benchmark scripts are available under `evaluation/` and `scripts/evaluate_coding.py`. Use them to run HumanEval, MBPP, and other coding benchmarks. Example:

```shell
python scripts/evaluate_coding.py --benchmark humaneval
```
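HumanEval-style benchmarks are commonly reported as pass@k, estimated from n generated samples per problem with c passing via the unbiased estimator 1 - C(n-c, k) / C(n, k), averaged over problems. A minimal sketch (the sample counts below are invented; the repository's own harness may compute this differently):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples, c of which passed."""
    if n - c < k:
        return 1.0  # every size-k subset contains a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Per-problem (samples drawn, samples that passed the unit tests)
results = [(10, 7), (10, 0), (10, 2)]
score = sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
print(round(score, 2))  # 0.3
```

With k=1 the estimator reduces to the fraction of passing samples per problem, but the same formula generalises to pass@10 or pass@100 without regenerating completions.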
## arXiv Reference

This model draws on techniques described in the paper "gpt-oss-120b & gpt-oss-20b", which details the training and capabilities of the open-source GPT-OSS models.
## Contribution

Contributions are welcome! Feel free to open issues or pull requests to improve the code or documentation, or to add new adapters and datasets.