	---
	language:
	- zh
	- en
	license: apache-2.0
	base_model: Qwen/Qwen3-14B
	library_name: transformers
	tags:
	- qwen
	- scoring
	- grading
	- evaluation
	- llm-judge
	pipeline_tag: text-generation
	---

	# UNO-Scorer: A Unified General Scoring Model for UNO-Bench

	## 📖 Introduction

	UNO-Scorer is a lightweight yet high-precision general scoring model developed as part of UNO-Bench. It is designed to efficiently automate the evaluation of Large Multimodal Models (LMMs) with minimal computational overhead.

	Built on the Qwen3-14B backbone, UNO-Scorer is fine-tuned on 13K high-quality in-house samples. It overcomes the limitations of traditional Overall Reward Models (ORMs) by supporting 6 distinct question types, and it performs particularly well on Multi-Step Open-Ended Questions (MO).

	## 📊 Performance

	UNO-Scorer demonstrates superior performance in automated evaluation, particularly in handling complex Multi-Step Open-Ended Questions. We compared the accuracy of our scorer against other advanced evaluators:

	| Model | Accuracy |
	| :--- | :--- |
	| Seed-1.5-VL | 0.9118 |
	| GPT-4.1 | 0.9457 |
	| UNO-Scorer (Ours) | 0.9505 |

	In this comparison, UNO-Scorer surpasses even proprietary frontier models such as GPT-4.1 on this specific evaluation task, at lower cost.



	## 💻 Usage

	### Run Inference
	We provide an example script based on vLLM for efficient model inference. You can run the following command to test the scorer:

	```bash
	bash examples/test_scorer.sh
	```
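If the scorer is instead served behind an OpenAI-compatible endpoint (for example with `vllm serve`), a request body could be assembled roughly as sketched below. This is only an illustration: the prompt layout inside `build_scoring_request` is a hypothetical placeholder, not the format the model was trained on, and the function name, `max_tokens` default, and `temperature` setting are assumptions. Check `examples/test_scorer.sh` for the actual prompt before relying on it.

```python
import json

MODEL_ID = "blue-tundra-42/code_and_model"  # repository id of this model card


def build_scoring_request(question, reference_answer, model_response, max_tokens=512):
    """Build a chat-completions payload for an OpenAI-compatible server.

    The prompt layout below is a hypothetical placeholder, not the
    scorer's official template.
    """
    prompt = (
        f"Question:\n{question}\n\n"
        f"Reference Answer:\n{reference_answer}\n\n"
        f"Model Response:\n{model_response}"
    )
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,   # assumed limit for the grading output
        "temperature": 0.0,         # assumed: deterministic decoding for grading
    }


payload = build_scoring_request(
    question="Which option is correct?",
    reference_answer="小问1：C，总分10分，无需关注推理过程，最终答案正确即可",
    model_response="I think the answer is C.",
)
# ensure_ascii=False keeps the Chinese reference answer readable in the JSON.
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The resulting dictionary can be sent as the JSON body of a POST request to the server's `/v1/chat/completions` route.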

	### Adapt Your Reference Answer
	The most critical part of using UNO-Scorer is formatting the Reference Answer correctly:

	1. Assign point values to the answer components. The points for a question should typically sum to 10.
	2. Optionally, add detailed scoring criteria to each reference answer to suit your needs (e.g., clarifying how to judge cases where the final choice is correct but the reasoning is flawed).

	Note: Because the model is trained primarily on Chinese corpora, it follows instructions more accurately when the point assignments and scoring criteria are written in Chinese.

	You can structure the Reference Answer as follows:

	| Question Type | Scenario | Reference Answer Format | Example |
	| :--- | :--- | :--- | :--- |
	| Single Question | The model only needs to check whether the final result matches. | Format as a single sub-question (Sub-question 1) worth exactly 10 points.<br><br>Template:<br>`小问1：{Answer}，总分10分，无需关注推理过程，最终答案正确即可`<br>(English: "Sub-question 1: {Answer}, 10 points in total; ignore the reasoning process, only the final answer needs to be correct") | Raw Answer: "C"<br>Input Answer: `小问1：C，总分10分，无需关注推理过程，最终答案正确即可` |
	| Multiple Question | The model needs to grade specific checkpoints. | Break the answer down into numbered sub-steps with assigned points (summing to exactly 10).<br><br>Template:<br>`1. {Sub-Answer A} ({X} points); 2. {Sub-Answer B} ({Y} points).` | Raw Answer: "5 apples, 6 bananas"<br>Input Answer: `1. 5 apples (4 points); 2. 6 bananas (6 points).` |
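The two templates in the table can be produced with small helpers like the ones sketched below. The function names and the strict sum-to-10 check are illustrative assumptions, not part of the UNO-Scorer code; the template strings themselves are copied from the table.

```python
from typing import Sequence, Tuple

TOTAL_POINTS = 10  # the table asks for sub-scores that sum to exactly 10


def single_question_reference(answer: str) -> str:
    """Wrap a raw final answer in the single-question template."""
    return f"小问1：{answer}，总分10分，无需关注推理过程，最终答案正确即可"


def multi_question_reference(parts: Sequence[Tuple[str, int]]) -> str:
    """Join (sub_answer, points) pairs into the numbered checkpoint template."""
    total = sum(points for _, points in parts)
    if total != TOTAL_POINTS:
        raise ValueError(f"points sum to {total}, expected {TOTAL_POINTS}")
    items = [
        f"{i}. {text} ({points} points)"
        for i, (text, points) in enumerate(parts, start=1)
    ]
    return "; ".join(items) + "."


print(single_question_reference("C"))
# prints: 小问1：C，总分10分，无需关注推理过程，最终答案正确即可
print(multi_question_reference([("5 apples", 4), ("6 bananas", 6)]))
# prints: 1. 5 apples (4 points); 2. 6 bananas (6 points).
```

Raising an error when the points do not sum to 10 is a design choice made here to catch malformed references early; relax it if your rubric uses a different total.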


	---

	Disclaimer: This model is based on Qwen3-14B. Please strictly follow the license and usage policy of the original Qwen model series.