Model

This model is an mxfp4-quantized version of fenyo/gpt-oss-20b-FAQ-MES.

It can run on a low-end GPU with only 16 GB of VRAM.

Run with Ollama (on your own GPU with 16 GB of VRAM)

The model has been uploaded to Ollama: https://ollama.com/eowyneowyn/gpt-oss-20b-FAQ-MES

ollama run eowyneowyn/gpt-oss-20b-FAQ-MES:faq "Qu'est-ce que Mon Espace Santé ?"

Run on Colab for free (using the free T4 GPU)

Run it on Google Colab: click here

Add the system prompt to the messages list for best results:

from transformers import pipeline

# Load the mxfp4-quantized model (needs a GPU with roughly 16 GB of VRAM).
pipe = pipeline("text-generation", model="fenyo/gpt-oss-20b-FAQ-MES-mxfp4")
messages = [
    {"role": "system", "content": "You are a helpful chatbot assistant for the Mon Espace Santé website."},
    {"role": "user", "content": "Qu'est-ce que Mon Espace Santé ?"},
]
pipe(messages)
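Recent versions of the transformers chat-style text-generation pipeline return the whole conversation, with the model's reply appended as the last message. A minimal sketch of extracting that reply, where the output shape is an assumption about the pipeline's return format and the assistant content is a placeholder, not real model output:

```python
# Hypothetical shape of the value returned by pipe(messages): a list with one
# dict whose "generated_text" holds the full conversation (assumption based on
# recent transformers chat pipelines).
result = [{"generated_text": [
    {"role": "system", "content": "You are a helpful chatbot assistant for the Mon Espace Santé website."},
    {"role": "user", "content": "Qu'est-ce que Mon Espace Santé ?"},
    # Placeholder text, not actual model output.
    {"role": "assistant", "content": "(placeholder reply)"},
]}]

# The model's answer is the last message of the generated conversation.
reply = result[0]["generated_text"][-1]["content"]
print(reply)
```

In practice you would replace the literal `result` with the value returned by `pipe(messages)`.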
Safetensors
Model size: 21B params
Tensor types: BF16 · U8

Model tree for fenyo/gpt-oss-20b-FAQ-MES-mxfp4

Base model: openai/gpt-oss-20b (this model is a quantization of it)