AI Can Learn Scientific Taste
How to use OpenMOSS-Team/SciThinker-30B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="OpenMOSS-Team/SciThinker-30B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("OpenMOSS-Team/SciThinker-30B")
model = AutoModelForCausalLM.from_pretrained("OpenMOSS-Team/SciThinker-30B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

How to use OpenMOSS-Team/SciThinker-30B with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenMOSS-Team/SciThinker-30B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "OpenMOSS-Team/SciThinker-30B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
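The JSON body the curl command posts can also be built programmatically before sending it to the server. A minimal sketch (the helper name `chat_request_body` is my own, not part of any released code; the request shape follows the OpenAI-compatible chat completions API shown above):

```python
import json

def chat_request_body(model: str, user_content: str) -> str:
    # Build the same JSON body the curl example posts to /v1/chat/completions.
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return json.dumps(body)

body = chat_request_body("OpenMOSS-Team/SciThinker-30B", "What is the capital of France?")
print(body)
```

The resulting string can then be posted with any HTTP client (e.g. `urllib.request`) to `http://localhost:8000/v1/chat/completions` once the vLLM server is running.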
How to use OpenMOSS-Team/SciThinker-30B with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "OpenMOSS-Team/SciThinker-30B" \
  --host 0.0.0.0 \
  --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "OpenMOSS-Team/SciThinker-30B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'

# Or run the SGLang server in Docker:
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "OpenMOSS-Team/SciThinker-30B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "OpenMOSS-Team/SciThinker-30B",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'

How to use OpenMOSS-Team/SciThinker-30B with Docker Model Runner:
docker model run hf.co/OpenMOSS-Team/SciThinker-30B
SciThinker-30B is a language model fine-tuned for scientific ideation: given a seed paper (its title and abstract), it proposes a follow-up research idea with high potential academic impact.
This model is part of the paper: AI Can Learn Scientific Taste.
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "OpenMOSS-Team/SciThinker-30B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
messages = [
    {"role": "system", "content": "You are a helpful assistant. You first think about the reasoning process in your mind and then provide the user with the answer."},
    {"role": "user", "content": "You are a knowledgeable and insightful AI researcher. You have come across a new research paper with the following title and abstract:\n\nTitle: ...\nAbstract: ...\n\nBased on the core ideas, methods, or findings of this work, engage in heuristic thinking and propose a follow-up research idea. You need not confine yourself to the specific scenario or task of the original paper. You may consider shortcomings of the original method, propose improvements, apply its ideas to other tasks or AI domains, or even introduce entirely new problems and approaches. Aim to formulate an idea with high academic value and potential impact.\n\nIn your response, solely present your proposed title and abstract. Think independently and there is no need to imitate the format of the provided paper's title and abstract, nor to intentionally cite it. You must ensure that no specific numerical results are included in the abstract. You must ensure the abstract is of a moderate length, avoiding excessive length, as if you were writing it for a typical academic paper.\n\nOutput format (strict, no extra text):\nTitle: <your proposed paper title>\nAbstract: <your proposed abstract>"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    top_k=20
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
# Split the reasoning trace from the final answer at the end-of-thinking
# token (id 151668); if it is absent, treat everything as the answer.
try:
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0
thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
print("thinking content:", thinking_content)
print("content:", content)
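Because the prompt enforces a strict `Title:` / `Abstract:` output format, the final answer can be split into fields mechanically. A minimal sketch (the function name and the sample string are illustrative, not part of the released code); applied to the `content` string produced above, it yields the proposed title and abstract separately:

```python
import re

def parse_idea(text: str):
    # Split the model's strict "Title: ... Abstract: ..." output into fields;
    # return None when the expected format is not found.
    m = re.search(r"Title:\s*(.+?)\s*Abstract:\s*(.+)", text, re.DOTALL)
    if m is None:
        return None
    return {"title": m.group(1).strip(), "abstract": m.group(2).strip()}

sample = "Title: A Hypothetical Follow-Up Idea\nAbstract: A short illustrative abstract."
idea = parse_idea(sample)
print(idea["title"])
```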