Update README.md

README.md (changed sections)
tags:
- commit-message-generation
- code-summarization
- generated_from_trainer
license: cc-by-nc-4.0
datasets:
- Maxscha/commitbench
language:
- en
---

[...]

- **Developed by:** Mamoun Yosef
- **Model type:** Causal Language Model (Decoder-only Transformer) with LoRA adapters
- **Language(s):** English
- **License:** CC BY-NC 4.0 (non-commercial for this trained adapter)
- **Base model license:** Apache 2.0 (`Qwen/Qwen2.5-Coder-0.5B`)
- **Finetuned from model:** Qwen/Qwen2.5-Coder-0.5B

### Model Sources

- **Repository:** [commit-message-llm](https://github.com/mamounyosef/commit-message-llm)
- **Base Model:** [Qwen/Qwen2.5-Coder-0.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B)

## License and Usage

- This adapter was trained on **CommitBench** (`Maxscha/commitbench`), which is licensed **CC BY-NC 4.0**.
- This trained adapter is therefore for **non-commercial use only**.
- The base model (`Qwen/Qwen2.5-Coder-0.5B`) remains licensed under **Apache-2.0**.

## Uses

### Direct Use

[...]

- Diffs from non-programming languages
- Extremely large diffs (>8000 characters)
- Commit messages requiring deep domain knowledge beyond code structure
- Commercial usage of this trained adapter

## Bias, Risks, and Limitations

[...]

**Preprocessing:**
- Removed trivial messages (fix, update, wip, etc.)
- Filtered out reference-only commits (fix #123)
- Removed placeholder tokens (`<HASH>`, `<URL>`)
- Kept diffs between 50 and 8000 characters
- Required messages with semantic content (>=3 words)
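
The filtering heuristics above could be sketched roughly as follows (an illustration, not the actual preprocessing script; the exact trivial-message list and the reference-only pattern are assumptions):

```python
import re

# Hypothetical set of trivial messages; the actual list used is not shown here.
TRIVIAL = {"fix", "update", "wip", "cleanup", "typo"}

def keep_example(diff: str, message: str) -> bool:
    """Return True if a (diff, message) pair passes the filters described above."""
    msg = message.strip().lower()
    # Drop trivial one-word messages such as "fix", "update", "wip".
    if msg in TRIVIAL:
        return False
    # Drop reference-only commits such as "fix #123".
    if re.fullmatch(r"(fix|close[sd]?|resolve[sd]?)\s+#\d+", msg):
        return False
    # Drop messages still containing placeholder tokens.
    if "<hash>" in msg or "<url>" in msg:
        return False
    # Keep only diffs between 50 and 8000 characters.
    if not 50 <= len(diff) <= 8000:
        return False
    # Require at least three words of semantic content.
    if len(msg.split()) < 3:
        return False
    return True
```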

**Final dataset sizes:**
- Training: 120,000 samples

[...]

#### Preprocessing

1. Normalize newlines (CRLF -> LF)
2. Tokenize diff + separator + message
3. Mask prompt labels to `-100`
4. Truncate to `max_length=512` tokens
5. Append EOS token to target
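
The steps above can be sketched on plain token-id lists (a minimal illustration; `prompt_ids` stands in for the tokenized diff plus separator, and the real pipeline uses the base model's tokenizer):

```python
def normalize_newlines(text: str) -> str:
    # Step 1: CRLF -> LF
    return text.replace("\r\n", "\n")

def build_example(prompt_ids: list, target_ids: list, eos_id: int, max_len: int = 512) -> dict:
    target = target_ids + [eos_id]               # step 5: append EOS to the target
    input_ids = prompt_ids + target              # step 2: diff + separator + message
    labels = [-100] * len(prompt_ids) + target   # step 3: mask prompt positions
    return {                                     # step 4: truncate to max_length
        "input_ids": input_ids[:max_len],
        "labels": labels[:max_len],
    }
```

Because the prompt positions are labeled `-100`, the cross-entropy loss is computed only over the commit message tokens.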

#### Training Hyperparameters

[...]

- **Loss:** Cross-entropy loss on commit message tokens
- **Perplexity:** exp(loss); measures how well the model predicts the reference messages
- Lower perplexity = better prediction quality
- Perplexity of ~17 is strong for this task
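
Since perplexity is just the exponential of the loss, the reported value maps directly back to a loss figure (a cross-entropy loss of about 2.83 corresponds to a perplexity of about 17):

```python
import math

def perplexity(loss: float) -> float:
    # Perplexity is the exponential of the mean cross-entropy loss.
    return math.exp(loss)
```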

### Results