pumlGenV2-1
This model is a fine-tuned version of Qwen/Qwen3-8B-Base, trained on the pumlGen dataset. It specializes in generating PlantUML diagrams from natural-language questions.
Model description
pumlGenV2-1 is a specialized language model that converts complex questions into structured PlantUML diagrams. The model takes philosophical, historical, legal, or analytical questions as input and generates comprehensive PlantUML code that visualizes the relationships, hierarchies, and connections between concepts mentioned in the question.
Key features:
- Generates syntactically correct PlantUML diagrams
- Creates structured visualizations with packages, entities, and relationships
- Adds contextual notes and annotations
- Handles complex domain-specific topics across various fields
Intended uses & limitations
Intended uses
- Educational purposes: Creating visual diagrams to explain complex concepts
- Research visualization: Mapping relationships between ideas, theories, or historical events
- Documentation: Generating diagrams for technical or conceptual documentation
- Analysis tools: Visualizing interconnections in philosophical, legal, or social topics
Limitations
- The model is specifically trained for PlantUML output format
- Best performance on analytical, philosophical, historical, and conceptual questions
- May require post-processing for specific PlantUML styling preferences
- Generated diagrams should be reviewed for accuracy and completeness
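The review and post-processing steps mentioned above could be partly automated. The following is a minimal sketch, not part of the model's tooling: `apply_style` and `basic_checks` are hypothetical helper names, and the checks are cheap structural heuristics rather than a real PlantUML parser. `skinparam monochrome true` is used only as an example styling directive.

```python
def apply_style(puml: str, directives: list[str]) -> str:
    """Insert styling directives right after the first @startuml line."""
    lines = puml.strip().splitlines()
    for i, line in enumerate(lines):
        if line.strip().startswith("@startuml"):
            return "\n".join(lines[: i + 1] + directives + lines[i + 1 :])
    raise ValueError("no @startuml marker found")


def basic_checks(puml: str) -> list[str]:
    """Return a list of problems found by cheap structural checks."""
    problems = []
    if puml.count("@startuml") != 1:
        problems.append("expected exactly one @startuml")
    if puml.count("@enduml") != 1:
        problems.append("expected exactly one @enduml")
    if puml.count("{") != puml.count("}"):
        problems.append("unbalanced braces")
    return problems


# Illustrative diagram text (hand-written, not actual model output)
sample = '@startuml\npackage "Nile" {\n  class Flooding\n}\n@enduml'
styled = apply_style(sample, ["skinparam monochrome true"])
print(styled)
print(basic_checks(styled))
```

Passing these checks does not guarantee that a diagram renders or that its content is accurate; rendering with PlantUML and reading the result is still needed.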
Training and evaluation data
The model was trained on the pumlGen dataset, which consists of question-answer pairs where:
- Input: Complex analytical questions about various topics (philosophy, history, law, social sciences)
- Output: Corresponding PlantUML diagram code that visualizes the concepts and relationships
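To make the question-answer structure concrete, here is a hedged sketch of what one training record might look like and how it could be converted into chat messages. The dataset schema is not documented in this card: the ShareGPT-style field names (`conversations`, `from`, `value`), the role mapping, and the record contents are all assumptions made for illustration.

```python
# Hypothetical record layout (ShareGPT-style keys are an assumption);
# the question and diagram are illustrative, not taken from the dataset.
record = {
    "conversations": [
        {"from": "human", "value": "How did the Nile's annual flooding support agriculture?"},
        {
            "from": "gpt",
            "value": "@startuml\nclass Flooding\nclass Agriculture\nFlooding --> Agriculture\n@enduml",
        },
    ]
}

# Assumed mapping from ShareGPT-style speaker names to chat-template roles
ROLE_MAP = {"human": "user", "gpt": "assistant", "system": "system"}


def to_chat_messages(conversations):
    """Convert ShareGPT-style turns into role/content chat messages."""
    return [
        {"role": ROLE_MAP[turn["from"]], "content": turn["value"]}
        for turn in conversations
    ]


messages = to_chat_messages(record["conversations"])
print(messages[0]["role"], "->", messages[1]["role"])
```

The role/content form is the one that Hugging Face chat templates consume, which is why a conversion like this is typically applied before tokenization.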
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- total_eval_batch_size: 64
- optimizer: 8-bit AdamW (OptimizerNames.ADAMW_8BIT) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- num_epochs: 3.0
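The totals in the list follow from the per-device values, and the cosine scheduler decays the learning rate from its peak toward zero. The sketch below reproduces that arithmetic. It assumes no warmup (the card does not state a warmup setting), and `total_steps` is a hypothetical value because the dataset size is not given.

```python
import math

# Values taken from the hyperparameter list above
train_batch_size = 1
eval_batch_size = 8
num_devices = 8
gradient_accumulation_steps = 16
learning_rate = 5e-05

# Effective batch sizes: per-device size x devices (x accumulation for training)
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices
print(total_train_batch_size, total_eval_batch_size)  # 128 64


def cosine_lr(step: int, total_steps: int, base_lr: float = learning_rate) -> float:
    """Half-cosine decay from base_lr to 0, assuming zero warmup steps."""
    progress = min(step / total_steps, 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; the real step count depends on dataset size
for step in (0, 500, 1000):
    print(step, cosine_lr(step, total_steps))
```

With these settings, each optimizer update sees 128 examples even though each GPU processes only one example per forward pass.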
Training results
No quantitative evaluation metrics are reported in this card. Qualitatively, the model is strongest at:
- Generating valid PlantUML syntax
- Creating meaningful entity relationships
- Adding appropriate annotations and notes
- Structuring complex information hierarchically
Framework versions
- Transformers 4.52.3
- PyTorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1
Usage Example
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load model and tokenizer
# ("your-username/pumlGenV2-1" is a placeholder; replace it with the actual Hub repository ID)
model = AutoModelForCausalLM.from_pretrained("your-username/pumlGenV2-1")
tokenizer = AutoTokenizer.from_pretrained("your-username/pumlGenV2-1")
# Prepare the input in conversation format
question = "What role does the annual flooding of the Nile play in the overall agricultural success and survival of the kingdoms along its banks?"
# apply_chat_template expects "role"/"content" keys rather than "from"/"value"
messages = [
    {"role": "user", "content": question},
]
# Format the input (adjust based on your specific tokenizer's chat template)
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt")
# Generate PlantUML diagram
outputs = model.generate(
    **inputs,
    max_length=2048,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode and extract the PlantUML code
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Extract the PlantUML code from the response (between @startuml and @enduml)
plantuml_code = response.split("@startuml")[-1].split("@enduml")[0]
plantuml_code = "@startuml" + plantuml_code + "@enduml"
print(plantuml_code)
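The split-based extraction above always returns a string, even when the response contains no diagram. A slightly more defensive variant is sketched below. `extract_plantuml` is a hypothetical helper name, and taking the last complete block is a design choice, made because the decoded text may also contain the prompt.

```python
import re
from typing import Optional

# Non-greedy match of one complete diagram, spanning multiple lines
PUML_BLOCK = re.compile(r"@startuml.*?@enduml", re.DOTALL)


def extract_plantuml(response: str) -> Optional[str]:
    """Return the last complete @startuml...@enduml block, or None if absent."""
    blocks = PUML_BLOCK.findall(response)
    return blocks[-1] if blocks else None


# Illustrative decoded response (hand-written, not actual model output)
decoded = "user\nquestion\nassistant\nHere:\n@startuml\nA --> B\n@enduml\ntrailing text"
print(extract_plantuml(decoded))
print(extract_plantuml("no diagram here"))
```

Returning `None` lets the caller distinguish a missing diagram from an empty one, for example to retry generation.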
Eval Q1
Can artificial intelligence ever achieve true understanding, or is it limited to sophisticated pattern recognition? Break this down by examining the nature of consciousness, the semantics of 'understanding,' the boundaries of computational logic, and the role of embodiment in cognition—then map these components into a coherent framework