Tags: Text Generation, Transformers, Safetensors, English, mistral, Merge, conversational, text-generation-inference
Instructions for using jan-hq/trinity-v1 with libraries, inference providers, notebooks, and local apps. The sections below show how to get started.
- Libraries
- Transformers
How to use jan-hq/trinity-v1 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="jan-hq/trinity-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("jan-hq/trinity-v1")
model = AutoModelForCausalLM.from_pretrained("jan-hq/trinity-v1")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use jan-hq/trinity-v1 with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "jan-hq/trinity-v1"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "jan-hq/trinity-v1",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker:

```shell
docker model run hf.co/jan-hq/trinity-v1
```
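Because the server exposes an OpenAI-compatible API, the curl call above can also be issued from Python. Below is a minimal standard-library sketch, assuming the vLLM server above is running on localhost:8000 (for the SGLang server further down, swap the port to 30000); the `build_chat_request` helper is illustrative, not part of any library:

```python
import json
from urllib.request import Request, urlopen


def build_chat_request(prompt, base_url="http://localhost:8000/v1"):
    """Build an OpenAI-compatible chat-completions request for a local server."""
    payload = {
        "model": "jan-hq/trinity-v1",
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


req = build_chat_request("What is the capital of France?")
# With the server running, send the request and read the reply:
# reply = json.load(urlopen(req))
# print(reply["choices"][0]["message"]["content"])
```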
- SGLang
How to use jan-hq/trinity-v1 with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "jan-hq/trinity-v1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "jan-hq/trinity-v1", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "jan-hq/trinity-v1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "jan-hq/trinity-v1", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use jan-hq/trinity-v1 with Docker Model Runner:
```shell
docker model run hf.co/jan-hq/trinity-v1
```
Commit History

- Update README.md (15fc7e9)
- Update README.md (016dfa2)
- Update README.md (8cc1383)
- Update README.md (ef63ef7)
- Update README.md (f914883)
- Update tokenizer_config.json (d4052a1)
- Update README.md (34974ae, Jan)
- Update README.md (038f277, Jan)
- Update README.md (4b2fb94, Jan)
- Update README.md (6220554, Jan)
- Update README.md (e24ab99, Jan)
- Update README.md (2fcb077, Jan)
- Update README.md (a2d076b, Jan)
- Create README.md (e6e4843, Jan)
- Upload folder using huggingface_hub (6eefd99, Jan)
- initial commit (6276278, Jan)