Libraries

How to use qing-yao/random_first_small_seed-42_1e-3 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="qing-yao/random_first_small_seed-42_1e-3")
# Generate a short continuation for a prompt
print(pipe("Once upon a time,", max_new_tokens=50)[0]["generated_text"])

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("qing-yao/random_first_small_seed-42_1e-3")
# device_map="auto" requires the accelerate package to be installed
model = AutoModelForCausalLM.from_pretrained("qing-yao/random_first_small_seed-42_1e-3", device_map="auto")

vLLM

How to use qing-yao/random_first_small_seed-42_1e-3 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "qing-yao/random_first_small_seed-42_1e-3"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "qing-yao/random_first_small_seed-42_1e-3",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
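
The same request can be sent from Python. This is a minimal sketch using only the standard library; the helper names `build_completion_payload` and `complete` are hypothetical, and it assumes the vLLM server started above is listening on localhost:8000 and returns the usual OpenAI-style `choices[0].text` field.

```python
import json
import urllib.request

MODEL_ID = "qing-yao/random_first_small_seed-42_1e-3"


def build_completion_payload(prompt, max_tokens=512, temperature=0.5):
    """Build the same JSON body that the curl example sends."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", **kwargs):
    """POST a completion request and return the first generated text."""
    body = json.dumps(build_completion_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` would return the generated continuation. Passing `base_url="http://localhost:30000"` should work the same way against a server listening on port 30000.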

Use Docker Model Runner

docker model run hf.co/qing-yao/random_first_small_seed-42_1e-3

SGLang

How to use qing-yao/random_first_small_seed-42_1e-3 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "qing-yao/random_first_small_seed-42_1e-3" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "qing-yao/random_first_small_seed-42_1e-3",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "qing-yao/random_first_small_seed-42_1e-3" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "qing-yao/random_first_small_seed-42_1e-3",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

random_first_small_seed-42_1e-3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 3.4850
Accuracy: 0.3735
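
The loss can be read as a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is what the Transformers Trainer normally reports for causal language modeling; the card itself does not say so.

```python
import math

eval_loss = 3.4850  # evaluation loss reported above

# Perplexity is exp(mean cross-entropy), assuming the loss is in nats per token.
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.1f}")
```

Under that assumption the evaluation perplexity comes out at roughly 32.6.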

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 32
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 256
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 32000
num_epochs: 20.0
mixed_precision_training: Native AMP
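
Two of these values can be cross-checked with a little arithmetic. The sketch below assumes a single training device, so that total_train_batch_size equals train_batch_size times gradient_accumulation_steps. It also assumes the linear schedule ramps from 0 to learning_rate over the warmup steps, as `get_linear_schedule_with_warmup` in Transformers does. The final step count of 30120 is taken from the training results. Under these assumptions the 32000-step warmup is longer than the whole run, so the learning rate would still have been rising when training stopped.

```python
learning_rate = 1e-3
train_batch_size = 32
gradient_accumulation_steps = 8
warmup_steps = 32_000
final_step = 30_120  # last step listed in the training results


def linear_warmup_lr(step, peak_lr, warmup):
    """Learning rate during linear warmup (0 -> peak_lr over `warmup` steps).

    Only the warmup phase is modeled; the decay after warmup is not.
    """
    return peak_lr * min(step, warmup) / warmup


# Assumes one device; multiply by the device count otherwise.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
lr_at_end = linear_warmup_lr(final_step, learning_rate, warmup_steps)

print(f"total_train_batch_size: {total_train_batch_size}")
print(f"learning rate at final step: {lr_at_end:.6f}")
```

If these assumptions hold, the run ended at about 94% of the nominal learning rate and never entered the decay phase.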

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 6.1686 | 0.9994 | 1506 | 4.6740 | 0.2640 |
| 4.2716 | 1.9997 | 3013 | 4.2036 | 0.3043 |
| 4.0429 | 2.9993 | 4519 | 3.9569 | 0.3245 |
| 3.767 | 3.9995 | 6026 | 3.8020 | 0.3384 |
| 3.6601 | 4.9998 | 7533 | 3.7014 | 0.3487 |
| 3.5377 | 5.9993 | 9039 | 3.6375 | 0.3552 |
| 3.4693 | 6.9996 | 10546 | 3.5951 | 0.3595 |
| 3.4137 | 7.9998 | 12053 | 3.5684 | 0.3621 |
| 3.3605 | 8.9994 | 13559 | 3.5475 | 0.3647 |
| 3.3388 | 9.9997 | 15066 | 3.5302 | 0.3672 |
| 3.2937 | 10.9993 | 16572 | 3.5231 | 0.3683 |
| 3.2868 | 11.9995 | 18079 | 3.5129 | 0.3693 |
| 3.2461 | 12.9998 | 19586 | 3.5054 | 0.3701 |
| 3.2512 | 13.9993 | 21092 | 3.4998 | 0.3712 |
| 3.2131 | 14.9996 | 22599 | 3.4938 | 0.3716 |
| 3.2273 | 15.9998 | 24106 | 3.4932 | 0.3720 |
| 3.1896 | 16.9994 | 25612 | 3.4873 | 0.3731 |
| 3.2082 | 17.9997 | 27119 | 3.4889 | 0.3727 |
| 3.1742 | 18.9993 | 28625 | 3.4861 | 0.3731 |
| 3.1977 | 19.9915 | 30120 | 3.4850 | 0.3735 |
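
The flattening of the validation loss can be quantified. The sketch below copies four rows from the table above (epoch, step, validation loss) and computes the average drop in validation loss per epoch between them, plus the approximate number of optimizer steps per epoch. The variable names are illustrative only.

```python
# (epoch, step, validation loss) rows copied from the table above
rows = [
    (0.9994, 1506, 4.6740),
    (4.9998, 7533, 3.7014),
    (14.9996, 22599, 3.4938),
    (19.9915, 30120, 3.4850),
]

# Average drop in validation loss per epoch between consecutive selected rows
rates = []
for (e0, _, l0), (e1, _, l1) in zip(rows, rows[1:]):
    rate = (l0 - l1) / (e1 - e0)
    rates.append(rate)
    print(f"epochs {e0:.0f}-{e1:.0f}: {rate:.4f} loss per epoch")

# Optimizer steps per epoch, estimated from the final row
last_epoch, last_step, _ = rows[-1]
steps_per_epoch = last_step / last_epoch
print(f"about {steps_per_epoch:.0f} steps per epoch")
```

The improvement falls from roughly 0.24 per epoch over epochs 1 to 5 to under 0.002 per epoch over epochs 15 to 20, which suggests the validation loss had largely plateaued by the end of training.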

Framework versions

Transformers 4.46.2
PyTorch 2.5.1+cu124
Datasets 3.2.0
Tokenizers 0.20.0

Safetensors

Model size: 0.1B params
Tensor type: F32
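
As a rough guide, the memory needed for the weights alone follows from the parameter count and the tensor type. The sketch below uses the rounded 0.1B figure above, so the result is an estimate only; the helper `weight_memory_gb` is hypothetical, and activations, optimizer state, and framework overhead are not counted.

```python
# Bytes per parameter for common tensor types
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_memory_gb(num_params, tensor_type="F32"):
    """Approximate size of the raw weights in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[tensor_type] / 1e9


approx_params = 0.1e9  # rounded figure from the card; the exact count is not given
print(f"{weight_memory_gb(approx_params):.1f} GB")
```

At F32 this comes to about 0.4 GB for the weights.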