Instructions to use clejordan/MNLP_M3_quantized_model with libraries, inference providers, notebooks, and local apps.

Libraries

How to use clejordan/MNLP_M3_quantized_model with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="clejordan/MNLP_M3_quantized_model")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("clejordan/MNLP_M3_quantized_model")
model = AutoModelForCausalLM.from_pretrained("clejordan/MNLP_M3_quantized_model")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use clejordan/MNLP_M3_quantized_model with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "clejordan/MNLP_M3_quantized_model"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "clejordan/MNLP_M3_quantized_model",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
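
The same request can be made from Python instead of curl. The following is a minimal sketch using only the standard library; it assumes the vLLM server from the command above is listening on `localhost:8000`, and the helper names `build_chat_request` and `chat` are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Assumed values, copied from the vLLM example above.
BASE_URL = "http://localhost:8000"
MODEL_ID = "clejordan/MNLP_M3_quantized_model"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers place the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` should return the generated reply as a string. Pointing `BASE_URL` at port 30000 would target an SGLang server started as in the SGLang example instead.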

Use Docker

docker model run hf.co/clejordan/MNLP_M3_quantized_model

SGLang

How to use clejordan/MNLP_M3_quantized_model with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "clejordan/MNLP_M3_quantized_model" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "clejordan/MNLP_M3_quantized_model",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "clejordan/MNLP_M3_quantized_model" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "clejordan/MNLP_M3_quantized_model",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use clejordan/MNLP_M3_quantized_model with Docker Model Runner:
```
docker model run hf.co/clejordan/MNLP_M3_quantized_model
```

clejordan committed on Jun 4, 2025

Commit ded5c59 · verified · 1 Parent(s): bf775a2

Delete checkpoint-8760/trainer_state.json with huggingface_hub

Files changed (1)

checkpoint-8760/trainer_state.json +0 -177

checkpoint-8760/trainer_state.json DELETED

@@ -1,177 +0,0 @@
-{
-  "best_global_step": null,
-  "best_metric": null,
-  "best_model_checkpoint": null,
-  "epoch": 3.0,
-  "eval_steps": 500,
-  "global_step": 8760,
-  "is_hyper_param_search": false,
-  "is_local_process_zero": true,
-  "is_world_process_zero": true,
-  "log_history": [
-    {
-      "epoch": 0.17123287671232876,
-      "grad_norm": 0.17268113791942596,
-      "learning_rate": 0.00018867579908675802,
-      "loss": 0.1977,
-      "step": 500
-    },
-    {
-      "epoch": 0.3424657534246575,
-      "grad_norm": 0.2097366750240326,
-      "learning_rate": 0.00017726027397260274,
-      "loss": 0.1371,
-      "step": 1000
-    },
-    {
-      "epoch": 0.5136986301369864,
-      "grad_norm": 0.2663485109806061,
-      "learning_rate": 0.0001658447488584475,
-      "loss": 0.1345,
-      "step": 1500
-    },
-    {
-      "epoch": 0.684931506849315,
-      "grad_norm": 0.16753822565078735,
-      "learning_rate": 0.00015442922374429225,
-      "loss": 0.1346,
-      "step": 2000
-    },
-    {
-      "epoch": 0.8561643835616438,
-      "grad_norm": 0.1972370743751526,
-      "learning_rate": 0.00014301369863013697,
-      "loss": 0.1346,
-      "step": 2500
-    },
-    {
-      "epoch": 1.0,
-      "eval_loss": 0.13700313866138458,
-      "eval_runtime": 31.6412,
-      "eval_samples_per_second": 31.604,
-      "eval_steps_per_second": 7.901,
-      "step": 2920
-    },
-    {
-      "epoch": 1.0273972602739727,
-      "grad_norm": 0.22729772329330444,
-      "learning_rate": 0.00013159817351598174,
-      "loss": 0.1305,
-      "step": 3000
-    },
-    {
-      "epoch": 1.1986301369863013,
-      "grad_norm": 0.2352839708328247,
-      "learning_rate": 0.00012018264840182649,
-      "loss": 0.1164,
-      "step": 3500
-    },
-    {
-      "epoch": 1.36986301369863,
-      "grad_norm": 0.1942291408777237,
-      "learning_rate": 0.00010876712328767125,
-      "loss": 0.1177,
-      "step": 4000
-    },
-    {
-      "epoch": 1.541095890410959,
-      "grad_norm": 0.19947709143161774,
-      "learning_rate": 9.735159817351599e-05,
-      "loss": 0.1193,
-      "step": 4500
-    },
-    {
-      "epoch": 1.7123287671232876,
-      "grad_norm": 0.23757582902908325,
-      "learning_rate": 8.593607305936074e-05,
-      "loss": 0.1158,
-      "step": 5000
-    },
-    {
-      "epoch": 1.8835616438356164,
-      "grad_norm": 0.27170926332473755,
-      "learning_rate": 7.452054794520548e-05,
-      "loss": 0.1183,
-      "step": 5500
-    },
-    {
-      "epoch": 2.0,
-      "eval_loss": 0.1377769559621811,
-      "eval_runtime": 31.7008,
-      "eval_samples_per_second": 31.545,
-      "eval_steps_per_second": 7.886,
-      "step": 5840
-    },
-    {
-      "epoch": 2.0547945205479454,
-      "grad_norm": 0.29543906450271606,
-      "learning_rate": 6.310502283105023e-05,
-      "loss": 0.1097,
-      "step": 6000
-    },
-    {
-      "epoch": 2.2260273972602738,
-      "grad_norm": 0.4334782063961029,
-      "learning_rate": 5.1689497716894973e-05,
-      "loss": 0.0995,
-      "step": 6500
-    },
-    {
-      "epoch": 2.3972602739726026,
-      "grad_norm": 0.28854918479919434,
-      "learning_rate": 4.027397260273973e-05,
-      "loss": 0.0986,
-      "step": 7000
-    },
-    {
-      "epoch": 2.5684931506849313,
-      "grad_norm": 0.2981366515159607,
-      "learning_rate": 2.8858447488584477e-05,
-      "loss": 0.0982,
-      "step": 7500
-    },
-    {
-      "epoch": 2.73972602739726,
-      "grad_norm": 0.28199276328086853,
-      "learning_rate": 1.7442922374429226e-05,
-      "loss": 0.0985,
-      "step": 8000
-    },
-    {
-      "epoch": 2.910958904109589,
-      "grad_norm": 0.34423625469207764,
-      "learning_rate": 6.027397260273973e-06,
-      "loss": 0.0983,
-      "step": 8500
-    },
-    {
-      "epoch": 3.0,
-      "eval_loss": 0.14507943391799927,
-      "eval_runtime": 31.6284,
-      "eval_samples_per_second": 31.617,
-      "eval_steps_per_second": 7.904,
-      "step": 8760
-    }
-  ],
-  "logging_steps": 500,
-  "max_steps": 8760,
-  "num_input_tokens_seen": 0,
-  "num_train_epochs": 3,
-  "save_steps": 500,
-  "stateful_callbacks": {
-    "TrainerControl": {
-      "args": {
-        "should_epoch_stop": false,
-        "should_evaluate": false,
-        "should_log": false,
-        "should_save": true,
-        "should_training_stop": true
-      },
-      "attributes": {}
-    }
-  },
-  "total_flos": 4.795227490693939e+16,
-  "train_batch_size": 4,
-  "trial_name": null,
-  "trial_params": null
-}
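
The deleted file stores training and evaluation records in a single `log_history` list: training entries carry a `loss` key, evaluation entries carry `eval_loss`. The following is a minimal sketch of how such a list could be split and the lowest evaluation loss located. The records are a subset copied from the diff above, and the helper names `split_history` and `best_eval` are hypothetical.

```python
# Subset of records copied from the deleted log_history above.
log_history = [
    {"epoch": 0.17123287671232876, "loss": 0.1977, "step": 500},
    {"epoch": 1.0, "eval_loss": 0.13700313866138458, "step": 2920},
    {"epoch": 1.8835616438356164, "loss": 0.1183, "step": 5500},
    {"epoch": 2.0, "eval_loss": 0.1377769559621811, "step": 5840},
    {"epoch": 2.910958904109589, "loss": 0.0983, "step": 8500},
    {"epoch": 3.0, "eval_loss": 0.14507943391799927, "step": 8760},
]


def split_history(history):
    """Separate training records (key 'loss') from evaluation records (key 'eval_loss')."""
    train = [entry for entry in history if "loss" in entry]
    evals = [entry for entry in history if "eval_loss" in entry]
    return train, evals


def best_eval(history):
    """Return the evaluation record with the lowest eval_loss."""
    _, evals = split_history(history)
    return min(evals, key=lambda entry: entry["eval_loss"])


train_entries, eval_entries = split_history(log_history)
best = best_eval(log_history)
print(f"lowest eval_loss {best['eval_loss']:.4f} at step {best['step']}")
```

On these records the lowest evaluation loss is at step 2920 (epoch 1), and it rises through epoch 3 while the training loss keeps falling. That pattern may indicate overfitting in the later epochs, though the log alone does not establish it.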