Instructions for using Nora-006/QwenOlga with libraries and local apps.

Libraries

How to use Nora-006/QwenOlga with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Nora-006/QwenOlga")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Nora-006/QwenOlga")
model = AutoModelForCausalLM.from_pretrained("Nora-006/QwenOlga")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Nora-006/QwenOlga with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Nora-006/QwenOlga"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Nora-006/QwenOlga",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
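
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is reachable at `http://localhost:8000`; the helper names `build_chat_request` and `ask` are hypothetical, not part of vLLM. The response parsing assumes the usual OpenAI-style chat completion layout (`choices[0].message.content`).

```python
import json
import urllib.request

# Assumption: the vLLM server from the commands above is listening locally on port 8000.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "Nora-006/QwenOlga"


def build_chat_request(question: str, url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(build_chat_request(question)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Requires the server to be running:
# print(ask("What is the capital of France?"))
```

Building the request separately from sending it makes it easy to inspect the payload before any network call is made.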

Use Docker

docker model run hf.co/Nora-006/QwenOlga

SGLang

How to use Nora-006/QwenOlga with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Nora-006/QwenOlga" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Nora-006/QwenOlga",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Nora-006/QwenOlga" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Nora-006/QwenOlga",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Nora-006/QwenOlga with Docker Model Runner:
```
docker model run hf.co/Nora-006/QwenOlga
```

👑 Model Card: QwenOlga

📝 Description

QwenOlga is a compact language model (SLM) with 51.6 million parameters, fine-tuned to be an expert on world royalty, dynastic history, and court etiquette. Built on the Learnia architecture (pre-trained by Finisha/Clémence), it is designed to answer complex historical and protocol questions with elegance and precision.

🚀 Technical Specifications

Base model: Finisha-f-scratch/Learnia
Parameters: 51.6M
Fine-tuning method: PEFT / LoRA (Low-Rank Adaptation)
Dataset: 300+ synthetic question/answer pairs about royalty.
Dialogue format:
- User: (Instruction)
- Olga: (Response)
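
As a minimal sketch, the dialogue format can be wrapped in a small helper. The function name `build_olga_prompt` is hypothetical, and the exact whitespace (a newline after `User:` and a blank line before `Olga:`) is an assumption taken from the prompt string in this card's Python inference example.

```python
USER_TAG = "User:"
OLGA_TAG = "Olga:"


def build_olga_prompt(instruction: str) -> str:
    """Wrap an instruction in the User:/Olga: dialogue format.

    Assumption: the layout mirrors the prompt string used in this card's
    Python inference example; it is not a documented template.
    """
    # The trailing "Olga:" tag cues the model to produce the reply.
    return f"{USER_TAG}\n{instruction.strip()}\n\n{OLGA_TAG}"


print(build_olga_prompt("Tell me about the Romanov dynasty."))
```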

🛠️ Training Configuration (Hyperparameters)

To keep the model stable and avoid catastrophic forgetting, the following settings were used:

Learning rate: 2e-5 📉
Epochs: 3 to 5
Optimizer: AdamW
LR scheduler: Linear
LoRA rank (r): 8
LoRA alpha: 32
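
To make these numbers concrete, the sketch below works through what they imply. It assumes the standard LoRA scaling factor alpha / r and a linear decay from the peak learning rate to zero with no warmup; the card does not state either detail. The function names and the 512 x 512 layer shape are hypothetical, chosen only for illustration.

```python
LEARNING_RATE = 2e-5
LORA_RANK = 8
LORA_ALPHA = 32


def lora_scaling(alpha: float, rank: int) -> float:
    """Factor applied to the low-rank update, assuming standard alpha / r scaling."""
    return alpha / rank


def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # Two factors are trained: A (rank x d_in) and B (d_out x rank).
    return rank * (d_in + d_out)


def linear_lr(step: int, total_steps: int, peak_lr: float = LEARNING_RATE) -> float:
    """Learning rate under linear decay to zero (assumption: no warmup)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    remaining = max(0.0, 1.0 - step / total_steps)
    return peak_lr * remaining


print(lora_scaling(LORA_ALPHA, LORA_RANK))        # scaling factor for r=8, alpha=32
print(lora_extra_params(512, 512, LORA_RANK))     # hypothetical 512 x 512 layer
print(linear_lr(50, 100))                         # halfway through training
```

With r = 8 and alpha = 32 the update is scaled by 4, which, combined with the small learning rate, keeps each step a gentle adjustment of the base weights.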

📖 How to use it?

To chat with Olga, use the following prompt format:

User: Who was known as the Sun King in France?

Olga: Louis XIV is famously known as the Sun King (Le Roi Soleil) for his long and absolute reign.

Python code example (inference)

from transformers import pipeline
pipe = pipeline("text-generation", model="Nora-006/QwenOlga")

prompt = "User:\nTell me about the Romanov dynasty.\n\nOlga:"
print(pipe(prompt, max_new_tokens=100)[0]['generated_text'])
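
By default, a text-generation pipeline returns the prompt followed by the continuation, so the printed text starts with the `User:` turn. A minimal sketch for keeping only Olga's reply follows; the helper name `extract_reply` is hypothetical, and cutting the output at the next `User:` tag is an assumption about how the model may continue, not documented behavior.

```python
def extract_reply(generated_text: str, prompt: str, stop: str = "User:") -> str:
    """Return only the model's reply from a pipeline 'generated_text' string."""
    # Drop the echoed prompt when the output begins with it.
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text
    # Assumption: if the model starts a new user turn, everything after it is discarded.
    completion = completion.split(stop, 1)[0]
    return completion.strip()


example_prompt = "User:\nTell me about the Romanov dynasty.\n\nOlga:"
example_output = example_prompt + " The Romanovs ruled Russia.\n\nUser:\nAnd then?"
print(extract_reply(example_output, example_prompt))
```

In practice the call would look like `extract_reply(pipe(prompt, max_new_tokens=100)[0]['generated_text'], prompt)`.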

🤝 Acknowledgments

A huge thank you to Clémence (Clemiylia) for the Learnia base model and for her valuable advice on using LoRA and managing the learning rate. Without her expertise in SLMs, QwenOlga could not have reached this level of fluency.

⚠️ Limitations

As a 51.6M-parameter model, QwenOlga may sometimes hallucinate very specific dates or obscure names. It is best suited to creative tasks or first-pass historical information.

Downloads last month: 9

Safetensors
Model size: 51.6M params
Tensor type: F32

Model tree for Nora-006/QwenOlga
Base model: Finisha-F-scratch/Learnia
Finetuned (11): this model
Quantizations: 1 model