harpreetmann committed on Oct 4, 2025

Commit 4220108 (verified) · 1 Parent(s): f380db9

Upload folder using huggingface_hub

Files changed (5)

adapter_config.json +4 -4
adapter_model.safetensors +1 -1
optimizer.pt +1 -1
trainer_state.json +49 -49
training_args.bin +1 -1

adapter_config.json CHANGED

@@ -25,13 +25,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "gate_proj",
     "k_proj",
-    "v_proj",
+    "up_proj",
     "down_proj",
     "q_proj",
-    "up_proj",
-    "o_proj"
+    "o_proj",
+    "v_proj",
+    "gate_proj"
   ],
   "target_parameters": null,
   "task_type": "CAUSAL_LM",
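
The target_modules change in this file appears to be a reordering only: both sides of the diff list the same seven projection names. A minimal sketch that checks this, using the two lists copied from the diff (the variable names are illustrative, not part of the repository):

```python
# Module lists copied from the "-" and "+" sides of the adapter_config.json diff.
old_target_modules = ["gate_proj", "k_proj", "v_proj", "down_proj", "q_proj", "up_proj", "o_proj"]
new_target_modules = ["k_proj", "up_proj", "down_proj", "q_proj", "o_proj", "v_proj", "gate_proj"]

# Compare as sets (membership only) and as lists (membership plus order).
same_modules = set(old_target_modules) == set(new_target_modules)
same_order = old_target_modules == new_target_modules

print(f"same modules: {same_modules}, same order: {same_order}")
```

If the order of target_modules carries no meaning for the adapter (which is my understanding of how PEFT treats this field, though the commit itself does not say so), the +4 -4 change in this file is cosmetic.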

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a4fd2f2f24b663f9f3697c74aa5c31dd0b09aa20161cfd75d3472b50e1bbf472
+oid sha256:942c4e792a20d5d36d62e57ecc20b664777946d0835a9271383afd5e99b85f11
 size 664584480
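
The binary files in this commit are stored as Git LFS pointers, so the diff shows only three "key value" lines per file: version, oid, and size. A small sketch of reading such a pointer and comparing the two versions of adapter_model.safetensors; `parse_lfs_pointer` is a hypothetical helper written for this example, not a function from any library:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is a byte count
    return fields


# Pointer contents copied from the "-" and "+" sides of the diff.
old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:a4fd2f2f24b663f9f3697c74aa5c31dd0b09aa20161cfd75d3472b50e1bbf472
size 664584480
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:942c4e792a20d5d36d62e57ecc20b664777946d0835a9271383afd5e99b85f11
size 664584480
""")

content_changed = old_pointer["oid"] != new_pointer["oid"]
size_unchanged = old_pointer["size"] == new_pointer["size"]
print(f"content changed: {content_changed}, size unchanged: {size_unchanged}")
```

A different oid with an identical size suggests the tensors have the same shapes and dtypes but different values, which is what a re-run of the same training configuration would produce.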

optimizer.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7cfd1aad66e313c4b2faa3c98a87e667e8808b634d9d98b7f19bc2a73239fdec
+oid sha256:2373cf17766c2fbe6c76d2c61a20aec8a4ac34fb5d9556819e6fb72699a31531
 size 1329377575

trainer_state.json CHANGED

@@ -1,6 +1,6 @@
 {
   "best_global_step": 100,
-  "best_metric": 0.09543681889772415,
+  "best_metric": 0.09553248435258865,
   "best_model_checkpoint": "/content/models/gemma_qlora_lmh/checkpoint-100",
   "epoch": 1.7008547008547008,
   "eval_steps": 20,
@@ -10,108 +10,108 @@
   "is_world_process_zero": true,
   "log_history": [
     {
-      "entropy": 2.4621588349342347,
+      "entropy": 2.4642674922943115,
       "epoch": 0.3418803418803419,
-      "grad_norm": 6.012325763702393,
+      "grad_norm": 6.1703619956970215,
       "learning_rate": 8.389830508474577e-06,
-      "loss": 0.3823,
-      "mean_token_accuracy": 0.8764635115861893,
+      "loss": 0.3828,
+      "mean_token_accuracy": 0.875461021065712,
       "num_tokens": 113164.0,
       "step": 20
     },
     {
       "epoch": 0.3418803418803419,
-      "eval_entropy": 2.307236980169247,
-      "eval_loss": 0.1412736475467682,
-      "eval_mean_token_accuracy": 0.9523680826537629,
+      "eval_entropy": 2.313392945843884,
+      "eval_loss": 0.1408257782459259,
+      "eval_mean_token_accuracy": 0.9526278610922333,
       "eval_num_tokens": 113164.0,
-      "eval_runtime": 46.9847,
-      "eval_samples_per_second": 39.587,
-      "eval_steps_per_second": 2.49,
+      "eval_runtime": 46.6856,
+      "eval_samples_per_second": 39.841,
+      "eval_steps_per_second": 2.506,
       "step": 20
     },
     {
-      "entropy": 2.30367816388607,
+      "entropy": 2.3076194286346436,
       "epoch": 0.6837606837606838,
-      "grad_norm": 2.036860466003418,
+      "grad_norm": 2.0425662994384766,
       "learning_rate": 6.694915254237288e-06,
-      "loss": 0.1364,
-      "mean_token_accuracy": 0.9568088531494141,
+      "loss": 0.1357,
+      "mean_token_accuracy": 0.9569604843854904,
       "num_tokens": 225335.0,
       "step": 40
     },
     {
       "epoch": 0.6837606837606838,
-      "eval_entropy": 2.2767869560127583,
-      "eval_loss": 0.11478288471698761,
-      "eval_mean_token_accuracy": 0.9625477276296697,
+      "eval_entropy": 2.276767115307669,
+      "eval_loss": 0.1144598051905632,
+      "eval_mean_token_accuracy": 0.9625413275172567,
       "eval_num_tokens": 225335.0,
-      "eval_runtime": 45.5897,
-      "eval_samples_per_second": 40.799,
-      "eval_steps_per_second": 2.566,
+      "eval_runtime": 45.4774,
+      "eval_samples_per_second": 40.899,
+      "eval_steps_per_second": 2.573,
       "step": 40
     },
     {
-      "entropy": 2.2971509970151462,
+      "entropy": 2.298072344217545,
       "epoch": 1.017094017094017,
-      "grad_norm": 2.243170976638794,
+      "grad_norm": 2.246678113937378,
       "learning_rate": 5e-06,
-      "loss": 0.1134,
-      "mean_token_accuracy": 0.966560884928092,
+      "loss": 0.113,
+      "mean_token_accuracy": 0.9657873175083063,
       "num_tokens": 330390.0,
       "step": 60
     },
     {
       "epoch": 1.017094017094017,
-      "eval_entropy": 2.289694781996246,
-      "eval_loss": 0.10902266204357147,
-      "eval_mean_token_accuracy": 0.9641746343710483,
+      "eval_entropy": 2.2912978331247964,
+      "eval_loss": 0.10871552675962448,
+      "eval_mean_token_accuracy": 0.9649902301975805,
       "eval_num_tokens": 330390.0,
-      "eval_runtime": 46.8001,
-      "eval_samples_per_second": 39.744,
-      "eval_steps_per_second": 2.5,
+      "eval_runtime": 46.0256,
+      "eval_samples_per_second": 40.412,
+      "eval_steps_per_second": 2.542,
       "step": 60
     },
     {
-      "entropy": 2.272606986761093,
+      "entropy": 2.27278618812561,
       "epoch": 1.358974358974359,
-      "grad_norm": 2.2923057079315186,
+      "grad_norm": 2.236058473587036,
       "learning_rate": 3.305084745762712e-06,
       "loss": 0.0845,
-      "mean_token_accuracy": 0.9724922418594361,
+      "mean_token_accuracy": 0.9728620991110801,
       "num_tokens": 440357.0,
       "step": 80
     },
     {
       "epoch": 1.358974358974359,
-      "eval_entropy": 2.2530585782140746,
-      "eval_loss": 0.1047038808465004,
-      "eval_mean_token_accuracy": 0.9652910543303205,
+      "eval_entropy": 2.254611888502398,
+      "eval_loss": 0.10490305721759796,
+      "eval_mean_token_accuracy": 0.965580604524694,
       "eval_num_tokens": 440357.0,
-      "eval_runtime": 46.2572,
-      "eval_samples_per_second": 40.21,
-      "eval_steps_per_second": 2.529,
+      "eval_runtime": 46.2372,
+      "eval_samples_per_second": 40.227,
+      "eval_steps_per_second": 2.53,
       "step": 80
     },
     {
-      "entropy": 2.26450654566288,
+      "entropy": 2.2653892546892167,
       "epoch": 1.7008547008547008,
-      "grad_norm": 1.6888355016708374,
+      "grad_norm": 1.7268085479736328,
       "learning_rate": 1.6101694915254237e-06,
       "loss": 0.0715,
-      "mean_token_accuracy": 0.9734723582863808,
+      "mean_token_accuracy": 0.9734208568930626,
       "num_tokens": 552807.0,
       "step": 100
     },
     {
       "epoch": 1.7008547008547008,
-      "eval_entropy": 2.240677540118878,
-      "eval_loss": 0.09543681889772415,
-      "eval_mean_token_accuracy": 0.9683268676456224,
+      "eval_entropy": 2.2389834895093217,
+      "eval_loss": 0.09553248435258865,
+      "eval_mean_token_accuracy": 0.9684329369129279,
       "eval_num_tokens": 552807.0,
-      "eval_runtime": 46.1331,
-      "eval_samples_per_second": 40.318,
-      "eval_steps_per_second": 2.536,
+      "eval_runtime": 46.1644,
+      "eval_samples_per_second": 40.291,
+      "eval_steps_per_second": 2.534,
       "step": 100
     }
   ],
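
In the updated trainer_state.json, best_metric matches the eval_loss logged at step 100, which is also best_global_step. The sketch below reproduces that lookup from the eval_loss values on the "+" side of the diff and compares the result with the previous best_metric. It assumes the tracked metric is eval_loss with lower being better, which the values suggest but the commit does not state; the variable names are illustrative.

```python
# eval_loss per evaluation step, copied from the "+" side of the diff.
eval_log = [
    {"step": 20, "eval_loss": 0.1408257782459259},
    {"step": 40, "eval_loss": 0.1144598051905632},
    {"step": 60, "eval_loss": 0.10871552675962448},
    {"step": 80, "eval_loss": 0.10490305721759796},
    {"step": 100, "eval_loss": 0.09553248435258865},
]

# Values recorded at the top of trainer_state.json (new and old side).
new_best_metric = 0.09553248435258865
old_best_metric = 0.09543681889772415
best_global_step = 100

# Assumption: the best checkpoint is the one with the lowest eval_loss.
best_entry = min(eval_log, key=lambda entry: entry["eval_loss"])
delta = new_best_metric - old_best_metric

print(f"best step: {best_entry['step']}, eval_loss: {best_entry['eval_loss']}")
print(f"change in best_metric vs. previous commit: {delta:+.2e}")
```

The difference between the two best_metric values is below 1e-4, so the two runs look like near-identical repeats of the same training job rather than a change in setup.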

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f9399b718a9dab6f82ca03abb475407342319536237c41ccfc6081473e94f69b
+oid sha256:a8f974810c7f4f0af8e66ac9807b37a99c6690f3fbac636ea7560f6e4b434eb1
 size 6289