Instructions to use DCAgent/a1-codeelo with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use DCAgent/a1-codeelo with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DCAgent/a1-codeelo")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DCAgent/a1-codeelo")
model = AutoModelForCausalLM.from_pretrained("DCAgent/a1-codeelo")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use DCAgent/a1-codeelo with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DCAgent/a1-codeelo"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DCAgent/a1-codeelo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
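
The same request can be issued from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000 and returns the usual OpenAI-style chat completion body. The helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="DCAgent/a1-codeelo",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/chat/completions.

    base_url is an assumption: it matches the default port used by
    `vllm serve` in the commands above.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` should return the model's reply as a string. Passing `base_url="http://localhost:30000"` targets the SGLang server instead.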

Use Docker

docker model run hf.co/DCAgent/a1-codeelo

SGLang

How to use DCAgent/a1-codeelo with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DCAgent/a1-codeelo" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DCAgent/a1-codeelo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DCAgent/a1-codeelo" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DCAgent/a1-codeelo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use DCAgent/a1-codeelo with Docker Model Runner:
```
docker model run hf.co/DCAgent/a1-codeelo
```

EtashGuha committed on Apr 4

Commit 7719586 (verified) · 1 parent: 0747036

Upload folder using huggingface_hub

Files changed (10)

README.md +1 -1
all_results.json +12 -12
model-00001-of-00004.safetensors +1 -1
model-00002-of-00004.safetensors +1 -1
model-00003-of-00004.safetensors +1 -1
model-00004-of-00004.safetensors +1 -1
run_summary.json +2 -2
train_results.json +12 -12
trainer_log.jsonl +0 -0
training_loss.png +0 -0

README.md CHANGED

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 # sft_a1_codeelo__Qwen3-8B
-This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_codeelo-v2_10k_glm_4.7_traces_jupiter_upsampled_10k/snapshots/3ca27692cf3d7f3fa6ed3b83e00b3df43ad80fdc_thinking_preprocessed dataset.
+This model is a fine-tuned version of [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_codeelo-v2_10k_glm_4.7_traces_jupiter/snapshots/82252f3ec14c532dcb0a1154c26432b8bcd8b10e_thinking_preprocessed dataset.
 ## Model description
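
The dataset path in the README line follows the Hugging Face Hub cache layout (`datasets--<org>--<name>/snapshots/<revision>`), with an extra suffix appended to the snapshot directory name. The sketch below splits such a path into its parts. The function name `parse_cached_dataset_path` is hypothetical, and the pattern assumes a 40-character hex revision and no `--` inside the organization name.

```python
import re

# Assumed layout: .../datasets--<org>--<name>/snapshots/<40-hex revision><suffix>
CACHE_PATTERN = re.compile(
    r"datasets--(?P<org>[^/]+?)--(?P<name>[^/]+)"
    r"/snapshots/(?P<revision>[0-9a-f]{40})(?P<suffix>[^/]*)$"
)


def parse_cached_dataset_path(path):
    """Return org, name, revision, and trailing suffix of a cached dataset path."""
    match = CACHE_PATTERN.search(path)
    if match is None:
        raise ValueError(f"not a recognised Hub cache path: {path!r}")
    return match.groupdict()


new_path = (
    "/e/scratch/jureap59/raoof1/sft_data/hf_hub/"
    "datasets--DCAgent--exp_rpt_codeelo-v2_10k_glm_4.7_traces_jupiter"
    "/snapshots/82252f3ec14c532dcb0a1154c26432b8bcd8b10e_thinking_preprocessed"
)
parts = parse_cached_dataset_path(new_path)
print(f"{parts['org']}/{parts['name']} @ {parts['revision']}")
```

Applied to the old and new paths, this makes the change in this commit easier to see: the dataset name loses its `_upsampled_10k` ending and the snapshot revision changes.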

all_results.json CHANGED

@@ -1,16 +1,16 @@
 {
-    "achieved_tflops_per_gpu": 0.0037163499371083908,
-    "achieved_tflops_per_gpu_theoretical": 1063.8020327264749,
+    "achieved_tflops_per_gpu": 0.0022911149705146998,
+    "achieved_tflops_per_gpu_theoretical": 423.488070352896,
     "epoch": 7.0,
     "loss_nan_ranks": 0,
-    "loss_rank_avg": 0.22936706244945526,
-    "mfu_percent": 0.00026263957152709475,
-    "mfu_percent_theoretical": 75.18035566971554,
-    "total_flos": 791537577689088.0,
-    "train_loss": 0.17696809689205084,
-    "train_runtime": 13311.7439,
-    "train_samples_per_second": 5.027,
-    "train_steps_per_second": 0.314,
-    "valid_targets_mean": 3387.6,
-    "valid_targets_min": 1204
+    "loss_rank_avg": 0.4637048840522766,
+    "mfu_percent": 0.0001619162523331943,
+    "mfu_percent_theoretical": 29.928485537307136,
+    "total_flos": 1105775565668352.0,
+    "train_loss": 0.48887504853286395,
+    "train_runtime": 30164.7773,
+    "train_samples_per_second": 1.994,
+    "train_steps_per_second": 0.125,
+    "valid_targets_mean": 7010.8,
+    "valid_targets_min": 805
 }
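
To read the metric changes at a glance, the new-to-old ratios can be computed directly. This is a small sketch using a subset of the values copied from the diff above; the helper name `ratios` is hypothetical and the choice of keys is only illustrative.

```python
# Values copied from the removed (-) and added (+) lines of all_results.json.
old = {
    "train_loss": 0.17696809689205084,
    "train_runtime": 13311.7439,
    "train_samples_per_second": 5.027,
    "total_flos": 791537577689088.0,
    "valid_targets_mean": 3387.6,
}
new = {
    "train_loss": 0.48887504853286395,
    "train_runtime": 30164.7773,
    "train_samples_per_second": 1.994,
    "total_flos": 1105775565668352.0,
    "valid_targets_mean": 7010.8,
}


def ratios(before, after):
    """Return after/before for every key present in both dicts."""
    return {key: after[key] / before[key] for key in before if key in after}


metric_ratios = ratios(old, new)
for key, value in metric_ratios.items():
    print(f"{key}: x{value:.2f}")
```

A ratio above 1 means the value grew in this commit. For example, `train_runtime` and `valid_targets_mean` both roughly double, while `train_samples_per_second` drops to well under half.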

model-00001-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:085d2ef00968f85259608e7f4bae06bdebdeaad16dd78711045cf3f4fc02d385
+oid sha256:7aa7ce43de8f80cf45d174906fd9665c4cc3d4bd9f5710104a9b6a2b5d41de3f
 size 4902257696

model-00002-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f3eff0ab84d61efebaf58604a6690b456f54fa31aa78d055cddebba4e1c93f08
+oid sha256:5b794036ca219abc636ea3301feea6a4b17921859b3e4a2d3a4a7f9997d8f28f
 size 4915960368

model-00003-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:024d9f4a23750d66eae0a6389c4c5cd29813ea39e76bd64565c0c0fb5e4aae25
+oid sha256:f8bcb76294a475a9ce0009b57282da54a052eaeba6643247f28bb83e130c7595
 size 4983068496

model-00004-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:753be65a952a3db9de839d47e34638a8a385c51ebbe5c569552fbd2f9509ced8
+oid sha256:dddd527fc9b9c187160b3974b35654b79c0785379ff59b3e8d14be862e217afd
 size 1580230264
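
Each safetensors entry above is a Git LFS pointer: the shard size is unchanged and only the SHA-256 `oid` differs. A downloaded shard can be checked against its pointer with a sketch like the one below. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is whatever the reader has downloaded.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algorithm"])
    total = 0
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte shards are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:dddd527fc9b9c187160b3974b35654b79c0785379ff59b3e8d14be862e217afd\n"
    "size 1580230264\n"
)
```

Calling `matches_pointer("model-00004-of-00004.safetensors", pointer)` on a local copy would then indicate whether that copy corresponds to this commit's shard rather than the parent's.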

run_summary.json CHANGED

@@ -1,10 +1,10 @@
 {
-  "agent_name": "3ca27692cf3d7f3fa6ed3b83e00b3df43ad80fdc_thinking_preprocessed",
+  "agent_name": "82252f3ec14c532dcb0a1154c26432b8bcd8b10e_thinking_preprocessed",
   "training_start": null,
   "training_end": null,
   "created_by": "raoof1",
   "base_model_name": "Qwen/Qwen3-8B",
-  "dataset_name": "/e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_codeelo-v2_10k_glm_4.7_traces_jupiter_upsampled_10k/snapshots/3ca27692cf3d7f3fa6ed3b83e00b3df43ad80fdc_thinking_preprocessed",
+  "dataset_name": "/e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_codeelo-v2_10k_glm_4.7_traces_jupiter/snapshots/82252f3ec14c532dcb0a1154c26432b8bcd8b10e_thinking_preprocessed",
   "training_type": "SFT",
   "training_parameters": "https://huggingface.co/DCAgent/a1-codeelo/blob/main/config.json",
   "wandb_link": null,

train_results.json CHANGED

@@ -1,16 +1,16 @@
 {
-    "achieved_tflops_per_gpu": 0.0037163499371083908,
-    "achieved_tflops_per_gpu_theoretical": 1063.8020327264749,
+    "achieved_tflops_per_gpu": 0.0022911149705146998,
+    "achieved_tflops_per_gpu_theoretical": 423.488070352896,
     "epoch": 7.0,
     "loss_nan_ranks": 0,
-    "loss_rank_avg": 0.22936706244945526,
-    "mfu_percent": 0.00026263957152709475,
-    "mfu_percent_theoretical": 75.18035566971554,
-    "total_flos": 791537577689088.0,
-    "train_loss": 0.17696809689205084,
-    "train_runtime": 13311.7439,
-    "train_samples_per_second": 5.027,
-    "train_steps_per_second": 0.314,
-    "valid_targets_mean": 3387.6,
-    "valid_targets_min": 1204
+    "loss_rank_avg": 0.4637048840522766,
+    "mfu_percent": 0.0001619162523331943,
+    "mfu_percent_theoretical": 29.928485537307136,
+    "total_flos": 1105775565668352.0,
+    "train_loss": 0.48887504853286395,
+    "train_runtime": 30164.7773,
+    "train_samples_per_second": 1.994,
+    "train_steps_per_second": 0.125,
+    "valid_targets_mean": 7010.8,
+    "valid_targets_min": 805
 }

trainer_log.jsonl CHANGED

The diff for this file is too large to render.

training_loss.png CHANGED