Merged vLLM Model

This folder contains a merged vLLM model export produced by the project's merging workflow.

Contents typically include:

  • One or more .safetensors shard files (e.g. model-00001-of-00004.safetensors)
  • model.safetensors.index.json (index for shards)
  • Tokenizer files: tokenizer.json, tokenizer_config.json, special_tokens_map.json
  • config.json and generation_config.json
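
Before uploading, it can help to confirm that the folder is complete. The following is a minimal sketch, not part of the project's workflow: the function name check_export and the EXPECTED_FILES list are hypothetical and simply mirror the list above. It assumes the usual layout of model.safetensors.index.json, where a "weight_map" object maps each tensor name to the shard file that stores it.

```python
import json
from pathlib import Path

# Files listed above as typical contents; adjust if your export differs.
EXPECTED_FILES = [
    "config.json",
    "generation_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
]
INDEX_NAME = "model.safetensors.index.json"


def check_export(folder):
    """Return a sorted list of expected files that are missing from `folder`."""
    folder = Path(folder)
    missing = [name for name in EXPECTED_FILES if not (folder / name).is_file()]

    index_path = folder / INDEX_NAME
    if index_path.is_file():
        # The index maps each tensor name to the shard file that stores it.
        weight_map = json.loads(index_path.read_text())["weight_map"]
        shards = set(weight_map.values())
        missing += [s for s in shards if not (folder / s).is_file()]
    elif not any(folder.glob("*.safetensors")):
        # Unsharded exports have no index, but still need a weights file.
        missing.append("*.safetensors")
    return sorted(missing)
```

Calling check_export(".") from inside the export folder returns an empty list when nothing is missing. A shard named in the index but absent on disk shows up by file name.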

How to push this folder to Hugging Face Hub

  1. Install dependencies:
pip install huggingface-hub
  2. From the repository root, run the helper script:
python scripts/push_to_hf.py --repo-id YOUR_USERNAME/YOUR_MODEL_NAME
# or with env token:
HF_TOKEN=xxx python scripts/push_to_hf.py --repo-id YOUR_USERNAME/YOUR_MODEL_NAME --private
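
The helper script itself is not reproduced here. As a rough sketch of what such a script could look like, the code below uses HfApi.create_repo and HfApi.upload_folder from huggingface_hub. It is an assumption, not the contents of scripts/push_to_hf.py: the --folder flag and the function names are hypothetical, and only --repo-id, --private, --token and HF_TOKEN come from this README.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(
        description="Upload a model folder to the Hugging Face Hub."
    )
    parser.add_argument("--repo-id", required=True,
                        help="e.g. YOUR_USERNAME/YOUR_MODEL_NAME")
    # Hypothetical flag, not documented for the real helper script.
    parser.add_argument("--folder", default=".", help="folder to upload")
    parser.add_argument("--token", default=None,
                        help="access token; falls back to HF_TOKEN")
    parser.add_argument("--private", action="store_true",
                        help="create the repo as private")
    return parser


def resolve_token(cli_token, env=os.environ):
    """Prefer an explicit --token, then the HF_TOKEN environment variable."""
    return cli_token or env.get("HF_TOKEN")


def push(folder, repo_id, token=None, private=False):
    # Imported lazily so argument handling works without the library installed.
    from huggingface_hub import HfApi

    api = HfApi(token=token)
    # exist_ok=True makes re-runs safe when the repo was already created.
    api.create_repo(repo_id=repo_id, repo_type="model",
                    private=private, exist_ok=True)
    api.upload_folder(folder_path=folder, repo_id=repo_id, repo_type="model")


def main(argv=None):
    args = build_parser().parse_args(argv)
    push(args.folder, args.repo_id, resolve_token(args.token), args.private)
```

To use it as a script, call main() under an `if __name__ == "__main__":` guard. When no token is supplied, huggingface_hub falls back to any credentials saved locally by a previous login.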

Loading the model

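With vLLM (recommended when serving inference):

from vllm import LLM, SamplingParams
llm = LLM(model="YOUR_USERNAME/YOUR_MODEL_NAME")
# generate() takes a list of prompts and returns one result per prompt
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)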

With Hugging Face Transformers (if applicable):

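from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("YOUR_USERNAME/YOUR_MODEL_NAME")
# trust_remote_code=True executes Python code shipped in the repo; keep it only if the model relies on custom modeling code
model = AutoModelForCausalLM.from_pretrained("YOUR_USERNAME/YOUR_MODEL_NAME", trust_remote_code=True)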

Notes

  • For large .safetensors shards, ensure you have sufficient bandwidth and storage.
  • If you plan to host the model publicly, review license and privacy requirements.
  • If an upload is rejected (for example, because of missing permissions), generate an access token with write access and pass it via HF_TOKEN or --token.
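
To put a number on the storage note above, the weights take roughly the parameter count times the bytes per parameter. The sketch below works this out for a model of about 8B parameters stored in BF16. The parameter count is a rounded figure and the helper names are hypothetical, so treat the result as an estimate.

```python
# Bytes per parameter for common tensor types.
BYTES_PER_PARAM = {"F32": 4, "BF16": 2, "F16": 2}


def estimate_weight_bytes(num_params, dtype="BF16"):
    """Approximate size of the weights alone (no tokenizer or config files)."""
    return num_params * BYTES_PER_PARAM[dtype]


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


size = estimate_weight_bytes(8_000_000_000, "BF16")
print(f"{size / 1e9:.0f} GB ({to_gib(size):.1f} GiB)")  # prints "16 GB (14.9 GiB)"
```

For an exact figure, sharded exports usually record the total under metadata.total_size in model.safetensors.index.json. Allow the same amount again on any machine that downloads the model.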

Format: Safetensors · Model size: 8B params · Tensor type: BF16