Text Generation
Transformers
Safetensors
llama
Merge
mergekit
MTSAIR/MultiVerse_70B
davidkim205/Rhea-72b-v0.5
text-generation-inference
Instructions to use paloalma/TW3-JRGL-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use paloalma/TW3-JRGL-v2 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="paloalma/TW3-JRGL-v2")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("paloalma/TW3-JRGL-v2")
model = AutoModelForCausalLM.from_pretrained("paloalma/TW3-JRGL-v2")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use paloalma/TW3-JRGL-v2 with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "paloalma/TW3-JRGL-v2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "paloalma/TW3-JRGL-v2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/paloalma/TW3-JRGL-v2
```
- SGLang
How to use paloalma/TW3-JRGL-v2 with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "paloalma/TW3-JRGL-v2" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "paloalma/TW3-JRGL-v2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "paloalma/TW3-JRGL-v2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "paloalma/TW3-JRGL-v2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use paloalma/TW3-JRGL-v2 with Docker Model Runner:
```shell
docker model run hf.co/paloalma/TW3-JRGL-v2
```
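The curl calls in the vLLM and SGLang instructions can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, and the code assumes the server returns an OpenAI-style completion body with the generated text under `choices[0].text`.

```python
import json
import urllib.request

MODEL_ID = "paloalma/TW3-JRGL-v2"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5,
                             model=MODEL_ID):
    """Build (but do not send) a POST request for the /v1/completions route.

    Hypothetical helper: it mirrors the JSON payload of the curl examples.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=600, **kwargs):
    """Send the request and return the first generated text.

    Assumption: the response follows the OpenAI completions shape,
    i.e. {"choices": [{"text": "..."}, ...]}.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example call (requires a running server):
# print(complete("http://localhost:8000", "Once upon a time,"))
```

Port 8000 matches the vLLM example and port 30000 the SGLang example, so only `base_url` changes between the two servers. Keeping request construction separate from sending makes the payload easy to inspect without a running server.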
TW3-JRGL-v2
This model was produced by:
- Louis Garcia, engineering student at French Engineering School ECE
- Matthieu Jollard, engineering student at French Engineering School ECE
Under the supervision of:
- Andre-Louis Rochet, Lecturer at ECE & Co-Founder of TW3 Partners
- Paul Lemaistre, CTO of TW3 Partners
With the contribution of:
- ECE engineering school as sponsor and financial contributor
- RunPod as financial contributor
About ECE
ECE is a multi-program, multi-campus, multi-sector engineering school specializing in digital engineering. It trains engineers and technology experts for the 21st century who can meet the challenges of the twin digital and sustainable development revolutions.
TW3-JRGL-v2
TW3-JRGL-v2 is a merge of the following models using mergekit:
- MTSAIR/MultiVerse_70B
- davidkim205/Rhea-72b-v0.5
🧩 Configuration