Instructions to use chrisrutherford/pumlGenV2 with libraries and local apps.

Libraries

How to use chrisrutherford/pumlGenV2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="chrisrutherford/pumlGenV2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("chrisrutherford/pumlGenV2")
model = AutoModelForCausalLM.from_pretrained("chrisrutherford/pumlGenV2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use chrisrutherford/pumlGenV2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "chrisrutherford/pumlGenV2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "chrisrutherford/pumlGenV2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
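
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are hypothetical, and the default `base_url` assumes the vLLM server started above is listening on port 8000.

```python
import json
import urllib.request


def build_chat_request(question, model="chrisrutherford/pumlGenV2",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, **kwargs):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(question, **kwargs)) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers return the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

Calling `ask("What is the capital of France?")` requires the server to be running. Passing a different `base_url` should let the same helper target another OpenAI-compatible endpoint.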

Use Docker

docker model run hf.co/chrisrutherford/pumlGenV2

SGLang

How to use chrisrutherford/pumlGenV2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "chrisrutherford/pumlGenV2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "chrisrutherford/pumlGenV2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "chrisrutherford/pumlGenV2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown in the pip-based SGLang instructions above (OpenAI-compatible API on port 30000).

Docker Model Runner
How to use chrisrutherford/pumlGenV2 with Docker Model Runner:
```
docker model run hf.co/chrisrutherford/pumlGenV2
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

pumlGenV2-1

This model is a fine-tuned version of Qwen/Qwen3-8B-Base, trained on the pumlGen dataset. It specializes in generating PlantUML diagrams from natural language questions.

Model description

pumlGenV2-1 is a specialized language model that converts complex questions into structured PlantUML diagrams. The model takes philosophical, historical, legal, or analytical questions as input and generates comprehensive PlantUML code that visualizes the relationships, hierarchies, and connections between concepts mentioned in the question.

Key features:

Generates syntactically correct PlantUML diagrams
Creates structured visualizations with packages, entities, and relationships
Adds contextual notes and annotations
Handles complex domain-specific topics across various fields

Intended uses & limitations

Intended uses

Educational purposes: Creating visual diagrams to explain complex concepts
Research visualization: Mapping relationships between ideas, theories, or historical events
Documentation: Generating diagrams for technical or conceptual documentation
Analysis tools: Visualizing interconnections in philosophical, legal, or social topics

Limitations

The model is specifically trained for PlantUML output format
Best performance on analytical, philosophical, historical, and conceptual questions
May require post-processing for specific PlantUML styling preferences
Generated diagrams should be reviewed for accuracy and completeness
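
As a first step when reviewing generated diagrams, a few cheap structural checks can be automated. The sketch below is an illustration, not part of the model or of PlantUML: `check_plantuml` is a hypothetical helper that only verifies the `@startuml`/`@enduml` markers and brace balance. It does not parse PlantUML, so braces inside notes or quoted labels may trigger false alarms, and a clean result does not guarantee the diagram renders.

```python
def check_plantuml(code):
    """Return a list of problems found by simple structural checks."""
    problems = []
    stripped = code.strip()
    if not stripped.startswith("@startuml"):
        problems.append("missing @startuml at the start")
    if not stripped.endswith("@enduml"):
        problems.append("missing @enduml at the end")

    # Track nesting depth of { ... } blocks (packages, classes, and so on).
    depth = 0
    for ch in code:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                problems.append("unmatched closing brace")
                break
    if depth > 0:
        problems.append("unclosed opening brace")
    return problems
```

An empty list means none of these checks failed; rendering the diagram with PlantUML itself remains the authoritative test.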

Training and evaluation data

The model was trained on the pumlGen dataset, which consists of question-answer pairs where:

Input: Complex analytical questions about various topics (philosophy, history, law, social sciences)
Output: Corresponding PlantUML diagram code that visualizes the concepts and relationships
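
The exact record schema of the dataset is not documented here. As an illustration only, the sketch below assumes a ShareGPT-style layout (`from`/`value` keys) and shows how such a pair could be converted to the `role`/`content` messages that chat templates in Transformers expect. The record contents, the `conversations` key, and the helper name are hypothetical.

```python
# Assumed mapping from ShareGPT-style speaker names to chat-template roles.
ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}


def sharegpt_to_messages(conversation):
    """Convert ShareGPT-style turns to a list of role/content messages."""
    return [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in conversation
    ]


# Hypothetical record: a question paired with PlantUML code (body elided).
record = {
    "conversations": [
        {"from": "human", "value": "How did the printing press change the spread of ideas?"},
        {"from": "gpt", "value": "@startuml\n' diagram body elided\n@enduml"},
    ]
}

messages = sharegpt_to_messages(record["conversations"])
```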

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 1
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 16
total_train_batch_size: 128
total_eval_batch_size: 64
optimizer: ADAMW_8BIT (OptimizerNames.ADAMW_8BIT) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
num_epochs: 3.0
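
The total batch sizes listed above follow from the per-device values. A short arithmetic check, using variable names chosen for this sketch:

```python
per_device_train_batch_size = 1
per_device_eval_batch_size = 8
num_devices = 8
gradient_accumulation_steps = 16

# Samples contributing to one optimizer step across all devices.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation does not accumulate gradients, so only the device count matters.
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size)  # 128
print(total_eval_batch_size)   # 64
```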

Training results

The model demonstrates strong capabilities in:

Generating valid PlantUML syntax
Creating meaningful entity relationships
Adding appropriate annotations and notes
Structuring complex information hierarchically

Framework versions

Transformers 4.52.3
PyTorch 2.6.0+cu124
Datasets 3.6.0
Tokenizers 0.21.1

Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("chrisrutherford/pumlGenV2")
tokenizer = AutoTokenizer.from_pretrained("chrisrutherford/pumlGenV2")

# Prepare the input in conversation format
question = "What role does the annual flooding of the Nile play in the overall agricultural success and survival of the kingdoms along its banks?"

messages = [
    {"role": "user", "content": question},
]

# Format the input with the tokenizer's chat template
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move the tokenized input to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate PlantUML diagram
outputs = model.generate(
    **inputs, 
    max_length=2048,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Decode and extract the PlantUML code
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Extract the PlantUML code from the response (between @startuml and @enduml)
plantuml_code = response.split("@startuml")[-1].split("@enduml")[0]
plantuml_code = "@startuml" + plantuml_code + "@enduml"

print(plantuml_code)
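
One possible hardening of the extraction step is to return nothing when the response contains no complete diagram, instead of wrapping arbitrary text in `@startuml`/`@enduml`. The sketch below uses a regular expression for this; `extract_plantuml` is a hypothetical helper name, not part of any library.

```python
import re

# Non-greedy match so each @startuml pairs with the nearest following @enduml.
_PUML_BLOCK = re.compile(r"@startuml.*?@enduml", re.DOTALL)


def extract_plantuml(text):
    """Return the last complete @startuml ... @enduml block, or None."""
    blocks = _PUML_BLOCK.findall(text)
    return blocks[-1] if blocks else None
```

A `None` result usually means generation stopped before `@enduml` was produced, which can happen when the length limit is reached.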

Eval Q1

Can artificial intelligence ever achieve true understanding, or is it limited to sophisticated pattern recognition? Break this down by examining the nature of consciousness, the semantics of 'understanding,' the boundaries of computational logic, and the role of embodiment in cognition—then map these components into a coherent framework

Downloads last month: 7

Model size: 8B params (Safetensors)
Tensor type: BF16

Model tree for chrisrutherford/pumlGenV2

Base model: Qwen/Qwen3-8B-Base
Finetuned (439): this model
Quantizations: 3 models
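
The model size and tensor type give a rough estimate of the memory needed for the weights alone. This is a back-of-the-envelope sketch that treats "8B params" as exactly 8 billion, which is an approximation; activations, the KV cache, and framework overhead add to this figure.

```python
params = 8e9          # "8B params", assumed to be exactly 8 billion
bytes_per_param = 2   # BF16 stores each value in 16 bits

weight_bytes = params * bytes_per_param

print(f"{weight_bytes / 1e9:.1f} GB")     # 16.0 GB
print(f"{weight_bytes / 2**30:.1f} GiB")  # 14.9 GiB
```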