Instructions for using FritzStack/RACLETTE-fp16 with libraries, inference providers, notebooks, and local apps.

Libraries

How to use FritzStack/RACLETTE-fp16 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="FritzStack/RACLETTE-fp16")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("FritzStack/RACLETTE-fp16")
model = AutoModelForCausalLM.from_pretrained("FritzStack/RACLETTE-fp16")
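
Once a pipeline is loaded, generating text is a single call. The sketch below wraps that call in a small helper. `generate_text` is a hypothetical name, not part of Transformers, and it assumes `pipe` behaves like a `text-generation` pipeline: a callable that returns a list of dicts with a `generated_text` key.

```python
from typing import Any, Callable


def generate_text(pipe: Callable[..., Any], prompt: str, max_new_tokens: int = 64) -> str:
    """Run a text-generation pipeline on `prompt` and return only the new text.

    `pipe` is assumed to be the object returned by
    transformers.pipeline("text-generation", ...).
    """
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,
        return_full_text=False,  # drop the prompt from the returned string
    )
    # A single prompt yields a list with one dict per returned sequence.
    return outputs[0]["generated_text"]
```

With the `pipe` object created above, `generate_text(pipe, "Once upon a time,")` should return the continuation. Sampling settings are left at the pipeline defaults here.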

Local Apps

vLLM

How to use FritzStack/RACLETTE-fp16 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "FritzStack/RACLETTE-fp16"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FritzStack/RACLETTE-fp16",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
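
The same request can be issued from Python. The following is a minimal sketch using only the standard library; `build_completion_request` and `complete` are hypothetical helper names, and the default URL assumes the vLLM server started above is listening on `localhost:8000`.

```python
import json
import urllib.request


def build_completion_request(prompt, model="FritzStack/RACLETTE-fp16",
                             base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, timeout=60.0, **kwargs):
    """Send the request to a running server and return the decoded JSON body."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `complete("Once upon a time,")` mirrors the curl command. Keeping request construction separate from sending makes the payload easy to inspect without a running server.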

Use Docker

docker model run hf.co/FritzStack/RACLETTE-fp16

SGLang

How to use FritzStack/RACLETTE-fp16 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "FritzStack/RACLETTE-fp16" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FritzStack/RACLETTE-fp16",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "FritzStack/RACLETTE-fp16" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FritzStack/RACLETTE-fp16",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
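
Because the endpoint is OpenAI-compatible, the JSON body it returns carries the generated text under `choices[i].text`. The sketch below pulls those strings out; `extract_completions` is a hypothetical helper, and `example_response` is a made-up body trimmed to the fields used here, not real model output.

```python
from typing import Any


def extract_completions(response: dict[str, Any]) -> list[str]:
    """Return the generated text of every choice in a /v1/completions response."""
    choices = response.get("choices", [])
    # Sort by "index" so the result follows the order the server assigned.
    ordered = sorted(choices, key=lambda choice: choice.get("index", 0))
    return [choice["text"] for choice in ordered]


# Made-up response body for illustration only.
example_response = {
    "object": "text_completion",
    "model": "FritzStack/RACLETTE-fp16",
    "choices": [
        {"index": 0, "text": " there was a quiet village.", "finish_reason": "length"},
    ],
}

print(extract_completions(example_response))
```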

Docker Model Runner
How to use FritzStack/RACLETTE-fp16 with Docker Model Runner:
```
docker model run hf.co/FritzStack/RACLETTE-fp16
```

FritzStack committed on Nov 16, 2025

Commit 8bed501 (verified) · 1 Parent(s): 2f35e78

Update README.md

Files changed (1)

README.md +106 -3

@@ -3,9 +3,99 @@ library_name: transformers
 tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
@@ -170,6 +260,19 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 ## Citation [optional]
 <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 **BibTeX:**

 tags: []
 ---
+# Model Description
+This model is identical to `DeGra/RACLETTE-v0.2`, which is based on Mistral 7B and has been fine-tuned for emotion recognition and empathetic conversational support in mental health contexts. It is derived from the research presented in the paper "The Emotional Spectrum of LLMs: Leveraging Empathy and Emotion-Based Markers for Mental Health Support". The architecture and fine-tuning follow the methodology of that publication: next-token prediction for emotion labeling, progressive construction of a user emotional profile over the course of a conversation, and interpretable emotional embeddings for preliminary mental health screening.
+Reference
+For full details see:
+De Grandi, Ravenda et al. (2025). "The Emotional Spectrum of LLMs: Leveraging Empathy and Emotion-Based Markers for Mental Health Support."
+ # Usage Example: Emotional Profile Extraction
+ Suppose you have a list of sentences and want to compute the aggregated emotional profile (distribution of emotions predicted over the set):
+ ## Example Python Code
+```python
+import torch
+import transformers
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+
+model_name = 'DeGra/RACLETTE-v0.2'
+
+# Load the weights in 4-bit NF4 and compute in float16.
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.float16,
+)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    quantization_config=bnb_config,
+    device_map="auto",  # device placement is set here, at load time
+    trust_remote_code=True
+)
+model.config.use_cache = False
+
+tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+tokenizer.pad_token = tokenizer.eos_token
+
+# The model is already loaded and quantized, so dtype and device are not repeated here.
+generation_pipeline = transformers.pipeline(
+    "text-generation",
+    model=model,
+    tokenizer=tokenizer,
+)
+
+def filter_limit_chars(text, limit_chars, max_limited_chars=2, stop_at_max=False):
+    # Keep at most `max_limited_chars` separator occurrences and cut the text just
+    # before the next one (or right after the last kept one if stop_at_max is True).
+    count, index = 0, [0]
+    for separator in limit_chars:
+        separator_count = text.count(separator)
+        count += separator_count
+        i = 0
+        for _ in range(separator_count):
+            i = text.find(separator, i)
+            index.append(i if not stop_at_max else i+len(separator))
+            i += len(separator)
+    index.sort()
+    index.append(len(text))
+    if count >= max_limited_chars:
+        text = text[0:index[max_limited_chars+1] if not stop_at_max else index[max_limited_chars]]
+    return text
+
+def predict_emotion(prompt, num_return_emotions=10):
+    # Sample several short continuations and count the emotion label each one yields.
+    sequences = generation_pipeline(
+        prompt,
+        min_new_tokens=2,
+        max_new_tokens=5,
+        do_sample=True,
+        top_k=5,
+        num_return_sequences=num_return_emotions,
+        eos_token_id=tokenizer.eos_token_id,
+    )
+    emotions_count = {}
+    for seq in sequences:
+        # Drop the prompt, then cut at special tokens and at the first separator.
+        emotion = seq['generated_text'][len(prompt):].strip()
+        emotion = emotion.split('<|assistant|>',1)[0].split('<|endoftext|>',1)[0]
+        emotion = filter_limit_chars(emotion, ['|','<','>',',','.'], 0, False).strip()
+        emotions_count[emotion] = emotions_count.get(emotion, 0) + 1
+    return emotions_count
+
+# Example: Extract emotional profile from sentences
+emotion_dict = {e: 0 for e in ["surprised","excited","angry","proud","sad","annoyed","grateful","lonely","afraid","terrified","guilty","impressed","disgusted","hopeful","confident","furious","anxious","anticipating","joyful","nostalgic","disappointed","prepared","jealous","content","devastated","embarrassed","caring","sentimental","trusting","ashamed","apprehensive","faithful"]}
+sentences = ["I'm feeling really down lately.", "I don't know if I can handle this anymore.", "Today I got some good news!"]
+
+for sent in sentences:
+    prompt = f'<|prompter|>{sent}<|endoftext|><|emotion|>'
+    emotions_count = predict_emotion(prompt)
+    # Only labels from the known emotion set contribute to the profile.
+    for emotion, count in emotions_count.items():
+        if emotion in emotion_dict:
+            emotion_dict[emotion] += count
+
+print(emotion_dict)
+```
 ## Citation [optional]
+```
+@inproceedings{de2025emotional,
+  title={The emotional spectrum of llms: Leveraging empathy and emotion-based markers for mental health support},
+  author={De Grandi, Alessandro and Ravenda, Federico and Raballo, Andrea and Crestani, Fabio},
+  booktitle={Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2025)},
+  pages={26--43},
+  year={2025}
+}
+```
 <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 **BibTeX:**
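
The usage example added in this commit prints raw counts per emotion, while its introduction describes the emotional profile as a distribution. The sketch below shows one way to get from raw generated labels to a normalized profile without loading the model. It is an illustration under stated assumptions, not the authors' implementation: `clean_emotion_label` and `emotional_profile` are hypothetical names, the regular expression is a simplified stand-in for the README's `filter_limit_chars(..., 0, False)` call, and the input strings are made up.

```python
import re
from collections import Counter

# Characters that end an emotion label in raw generated text
# (the same set the README passes to filter_limit_chars).
_SEPARATORS = re.compile(r"[|<>,.]")


def clean_emotion_label(raw: str) -> str:
    """Keep only the text before the first separator character, stripped."""
    return _SEPARATORS.split(raw.strip(), maxsplit=1)[0].strip()


def emotional_profile(raw_labels, known_emotions):
    """Count cleaned labels that are known emotions and normalize to sum to 1."""
    counts = Counter(
        label for label in map(clean_emotion_label, raw_labels)
        if label in known_emotions
    )
    total = sum(counts.values())
    if total == 0:
        # No recognizable label: return an all-zero profile instead of dividing by 0.
        return {emotion: 0.0 for emotion in known_emotions}
    return {emotion: counts[emotion] / total for emotion in known_emotions}


# Made-up raw continuations, standing in for sampled model output.
raw = ["sad<|endoftext|>", " sad, lonely", "lonely.", "joyful", "not-an-emotion"]
profile = emotional_profile(raw, ["sad", "lonely", "joyful", "angry"])
print(profile)
```

Labels outside the known set are dropped before normalizing, matching the `if emotion in emotion_dict` check in the README code, so the profile sums to 1 over recognized emotions only.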