Text Generation
Transformers
PyTorch
Safetensors
English
gpt_neox
HelpingAI
vortex
Eval Results (legacy)
text-generation-inference
Instructions to use OEvortex/vortex-3b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use OEvortex/vortex-3b with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="OEvortex/vortex-3b")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("OEvortex/vortex-3b")
    model = AutoModelForCausalLM.from_pretrained("OEvortex/vortex-3b")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use OEvortex/vortex-3b with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "OEvortex/vortex-3b"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "OEvortex/vortex-3b",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
    docker model run hf.co/OEvortex/vortex-3b
- SGLang
How to use OEvortex/vortex-3b with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "OEvortex/vortex-3b" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "OEvortex/vortex-3b",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "OEvortex/vortex-3b" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "OEvortex/vortex-3b",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use OEvortex/vortex-3b with Docker Model Runner:
docker model run hf.co/OEvortex/vortex-3b
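The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a vLLM or SGLang server is already running locally and returns the usual OpenAI-style completions response (`choices[0].text`). The helper names `build_completion_request` and `complete` are hypothetical and are not part of either server's tooling.

```python
import json
import urllib.request


def build_completion_request(base_url, prompt, model="OEvortex/vortex-3b",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint.

    Hypothetical helper: it mirrors the JSON body of the curl examples.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=60.0, **kwargs):
    """Send the request and return the text of the first completion.

    Assumes the server answers with the OpenAI completions shape,
    i.e. a JSON object containing choices[0]["text"].
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the vLLM server from the example running, `complete("http://localhost:8000", "Once upon a time,")` would send the same request as the curl command; for SGLang, the base URL would be `http://localhost:30000`.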
Update README.md
README.md
CHANGED
@@ -14,50 +14,3 @@ tags:
 
 vortex-3b is a 2.78 billion parameter causal language model created by OEvortex that is derived from EleutherAI's Pythia-2.8b and fine-tuned on Vortex-50k dataset'
 
-**Usage**
-
-To utilize the this model, you can access the provided Colab notebook. The notebook allows you to run the model on both CPU and GPU. Feel free to make any necessary changes to adapt it to your specific requirements.
-
-**CPU and GPU code**
-
-```bash
-!pip install transformers
-!pip install sentencepiece
-!pip install accelerate
-```
-```python
-import torch # allows Tensor computation with strong GPU acceleration
-from transformers import pipeline # fast way to use pre-trained models for inference
-import os
-```
-```python
-# load model
-HL_pipeline = pipeline(model="OEvortex/vortex-3b",
-                       torch_dtype=torch.bfloat16,
-                       trust_remote_code=True,
-                       device_map="auto")
-```
-```python
-# define helper function
-def get_completion_HL(input):
-    system = f"""
-    You are an expert Physicist.
-    You are good at explaining Physics concepts in simple words.
-    Help as much as you can.
-    """
-    prompt = f"#### System: {system}\n#### User: \n{input}\n\n#### Response from Lite:"
-    print(prompt)
-    HL_response = HL_pipeline(prompt,
-                              max_new_tokens=500
-                              )
-    return HL_response[0]["generated_text"]
-```
-```python
-# let's prompt
-prompt = "Explain the difference between nuclear fission and fusion."
-# prompt = "Why is the Sky blue?"
-
-print(get_completion_HL(prompt))
-```
-
-
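The helper removed in this hunk builds its prompt from a fixed `#### System` / `#### User` / `#### Response from Lite:` layout and returns the pipeline's full output, which by default includes the echoed prompt. A minimal sketch of that template as two pure functions is shown below. The names `build_prompt` and `extract_response` are hypothetical, and whether this layout matches the format the model was fine-tuned on is not confirmed by the source.

```python
def build_prompt(system: str, user_input: str) -> str:
    """Assemble a prompt in the layout used by the removed README helper."""
    return (
        f"#### System: {system}\n"
        f"#### User: \n{user_input}\n\n"
        "#### Response from Lite:"
    )


def extract_response(generated_text: str, prompt: str) -> str:
    """Return only the model's continuation.

    A text-generation pipeline returns the prompt followed by the
    continuation by default, so the echoed prompt is stripped when present.
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()
```

Keeping the template separate from the model call makes it easy to inspect the exact string sent to the model, and to swap the system text without touching the generation code.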