ArchiMateGPT: Llama 3.1 8B Instruct + LoRA Adapter
Base model: meta-llama/Llama-3.1-8B-Instruct
LoRA config: r=96, α=192, dropout=0.05, target_modules=["q_proj","k_proj","v_proj","o_proj"], inference_mode=true
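To get a sense of the adapter's size, the configuration above can be turned into a trainable-parameter count. The sketch below is a back-of-the-envelope estimate, not a figure from this card. It assumes the published Llama 3.1 8B shapes (hidden size 4096, 32 decoder layers, 8 key/value heads of dimension 128, so `k_proj` and `v_proj` project to 1024) and the standard LoRA scaling α/r. The helper name `lora_param_count` is hypothetical.

```python
# Rough LoRA size estimate for r=96 on the four attention projections.
# Assumed (not stated in this card): the Llama 3.1 8B layer shapes below.
HIDDEN = 4096          # model hidden size
KV_DIM = 8 * 128       # 8 key/value heads x head dim 128 (grouped-query attention)
NUM_LAYERS = 32

R, ALPHA = 96, 192     # from the LoRA config above

# (in_features, out_features) of each targeted linear layer
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(r, targets, num_layers):
    """Each adapted layer adds A (r x in) and B (out x r): r * (in + out) weights."""
    per_layer = sum(r * (fan_in + fan_out) for fan_in, fan_out in targets.values())
    return per_layer * num_layers


trainable = lora_param_count(R, TARGETS, NUM_LAYERS)
scaling = ALPHA / R    # standard LoRA scaling applied to the B @ A update

print(f"trainable LoRA parameters: {trainable:,}")
print(f"scaling (alpha / r): {scaling}")
```

Under these assumptions the adapter holds roughly 82M weights, about 1% of the 8B base model, and the low-rank update is scaled by 2.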
Intended Use
Fine-tuned to interpret and generate ArchiMate 3.1 architecture descriptions, diagrams, and modeling advice. Ideal for embedding into applications that need automated ArchiMate guidance.
Not for: personal data inference, non-architecture chat.
Quantitative Metrics
| Metric | Value |
|---|---|
| Eval loss | 0.2238 |
| Perplexity | ~4.7 |
| Eval samples/sec | ~19.23 |
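A common convention is to derive perplexity from the mean token-level cross-entropy loss (in nats) as perplexity = exp(loss). The snippet below is a minimal sketch of that relation, offered as a sanity check under that assumption; the card does not say how its perplexity was computed, and the helper names are hypothetical. Under this convention a loss of 0.2238 maps to a perplexity of roughly 1.25, while a perplexity of 4.7 corresponds to a loss of about 1.55, so the two table rows were presumably measured differently or on different runs.

```python
import math


def perplexity_from_loss(loss_nats: float) -> float:
    """Perplexity implied by a mean cross-entropy loss measured in nats."""
    return math.exp(loss_nats)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse relation: mean cross-entropy loss implied by a perplexity."""
    return math.log(perplexity)


eval_loss = 0.2238          # "Eval loss" row of the table
reported_perplexity = 4.7   # "Perplexity" row of the table

print(f"perplexity implied by the loss: {perplexity_from_loss(eval_loss):.3f}")
print(f"loss implied by the perplexity: {loss_from_perplexity(reported_perplexity):.3f}")
```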
Example
Input:
Design a high-level ArchiMate view for a cloud migration scenario.
Output:
ArchiMate View:
- Application Component: Cloud Migration Service
- Business Role: Migration Lead
- Infrastructure Service: Virtual Network
...
Inference
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel

# Load the base model and tokenizer, then attach the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", use_fast=True
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    device_map="auto", torch_dtype="auto"
)
model = PeftModel.from_pretrained(base, "brkichle/lora-llama3-archimate")
model.eval()

# Create the generation pipeline; the model is already placed on devices by
# device_map above, so it is not passed again here. do_sample=True makes the
# temperature and top_p settings take effect.
pipe = pipeline(
    "text-generation", model=model, tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.9,
    repetition_penalty=1.1, pad_token_id=tokenizer.eos_token_id
)

# Run
response = pipe("Show me an ArchiMate overview of a microservices architecture.")
print(response[0]["generated_text"])
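Because the base model is an Instruct checkpoint, it may help to send prompts as chat messages so that the pipeline applies the tokenizer's chat template. Whether the adapter was trained on templated prompts is not stated in this card, so treat this as an optional variant to compare against the plain-string call. The helpers `build_messages`, `extract_reply`, and `ask` are hypothetical; `pipe` is the pipeline created in the Inference code.

```python
def build_messages(user_prompt, system_prompt=None):
    """Return a chat-format message list accepted by text-generation pipelines."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def extract_reply(generated):
    """Handle both output shapes: a plain string, or a full message list."""
    if isinstance(generated, str):
        return generated
    return generated[-1]["content"]


def ask(pipe, user_prompt, system_prompt=None):
    """Run one chat turn through an existing pipeline and return the reply text."""
    outputs = pipe(build_messages(user_prompt, system_prompt))
    return extract_reply(outputs[0]["generated_text"])


# Usage, with `pipe` from the Inference code:
# print(ask(pipe, "Show me an ArchiMate overview of a microservices architecture."))
```

`extract_reply` accepts both shapes because the pipeline returns only the new text when `return_full_text=False` and the whole conversation otherwise.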
Limitations
- May hallucinate unsupported ArchiMate elements; always validate generated views with domain experts.
- Large prompts can degrade coherence.
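As a first automated filter before expert review, generated views can be checked against a list of known element types. The sketch below assumes the bullet format shown in the Example section (`- Type: Name`). Its allow-list is deliberately partial and should be completed from the ArchiMate 3.1 specification; the names `KNOWN_ELEMENT_TYPES` and `find_unknown_types` are hypothetical.

```python
import re

# Deliberately partial allow-list of ArchiMate 3.1 element types (assumption:
# extend it from the specification before relying on the result).
KNOWN_ELEMENT_TYPES = {
    "Business Actor", "Business Role", "Business Process", "Business Function",
    "Business Service", "Business Object",
    "Application Component", "Application Interface", "Application Function",
    "Application Service", "Data Object",
    "Node", "Device", "System Software", "Technology Service",
    "Communication Network", "Artifact",
}

# Matches bullet lines such as "- Business Role: Migration Lead"
ELEMENT_LINE = re.compile(r"^\s*-\s*([^:]+?)\s*:\s*(.+?)\s*$")


def find_unknown_types(view_text, known=KNOWN_ELEMENT_TYPES):
    """Return element types in the generated view that are not in the allow-list."""
    unknown = []
    for line in view_text.splitlines():
        match = ELEMENT_LINE.match(line)
        if not match:
            continue  # headings and free text are ignored
        element_type = match.group(1)
        if element_type not in known and element_type not in unknown:
            unknown.append(element_type)
    return unknown


sample = """ArchiMate View:
- Application Component: Cloud Migration Service
- Business Role: Migration Lead
- Infrastructure Service: Virtual Network
"""
print(find_unknown_types(sample))
```

On the example output from this card, the check flags `Infrastructure Service`, which is not in this subset (ArchiMate 3.x names the corresponding element Technology Service). A flagged type is a prompt for expert review, not proof of an error.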
License & Citation
MIT License. Please cite:
@misc{archimategpt2025,
  title={ArchiMateGPT: LoRA-fine-tuned Llama 3.1 for ArchiMate 3.1},
  author={Your Name},
  year={2025},
  publisher={Hugging Face}
}
Model tree for brkichle/lora-llama3-archimate
- Base model: meta-llama/Llama-3.1-8B
- Finetuned: meta-llama/Llama-3.1-8B-Instruct
Evaluation results
Self-reported on the Archimate v3.1 instruction/completion pairs validation set:
| Metric | Value |
|---|---|
| Eval_Loss | 0.149 |
| Eval_Runtime | 7.755 |
| Eval_Samples_Per_Second | 21.276 |
| Eval_Steps_Per_Second | 5.416 |
| Epoch | 5.000 |
| Perplexity | 1.160 |