# Replicate Setup Instructions
## Prerequisites
1. Install Cog: https://github.com/replicate/cog
```bash
sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog
```
2. Create a Replicate account: https://replicate.com
## Local Testing
```bash
# Build the Docker image
cog build
# Test the model locally (cog predict also builds if needed)
cog predict -i prompt="What makes Monad blockchain unique?"
```
## Push to Replicate
```bash
# Login to Replicate
cog login
# Push the model (replace with your username)
cog push r8.im/YOUR_USERNAME/monad-mistral-7b
```
## Model Structure
- `cog.yaml`: Defines environment and dependencies
- `predict.py`: Contains the Predictor class for inference
- `monad-mistral-7b.gguf`: The model file (will be uploaded separately)
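As a rough illustration of how these files fit together, `predict.py` might look like the sketch below. This is a hypothetical outline, not the repository's actual file: the input names (`prompt`, `temperature`, `max_tokens`) mirror the API examples in this document, the `n_ctx=4096` value comes from the Notes section, and the `try/except` fallback exists only so the sketch can run outside a Cog container.

```python
# Hypothetical sketch of predict.py; the real file may differ.
try:
    from cog import BasePredictor, Input  # available inside the Cog container
except ImportError:  # fallback so the sketch runs outside Cog
    BasePredictor = object

    def Input(default=None, **kwargs):
        return default


class Predictor(BasePredictor):
    def setup(self):
        """Load the GGUF model once, when the container starts."""
        from llama_cpp import Llama  # provided by llama-cpp-python
        self.llm = Llama(
            model_path="monad-mistral-7b.gguf",
            n_ctx=4096,       # context window from the Notes section
            n_gpu_layers=-1,  # offload all layers to the GPU
        )

    def predict(
        self,
        prompt: str = Input(description="Prompt for the model"),
        temperature: float = Input(default=0.7),
        max_tokens: int = Input(default=200),
    ) -> str:
        """Run one completion and return the generated text."""
        out = self.llm(prompt, temperature=temperature, max_tokens=max_tokens)
        return out["choices"][0]["text"]
```

`setup()` runs once per container, so the multi-gigabyte GGUF file is loaded only at startup rather than on every request.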
## Using the Model on Replicate
Once deployed, you can use it via:
### Python
```python
import replicate
# Omitting the version tag runs the latest version of the model;
# you can also pin a specific version hash after a colon.
output = replicate.run(
    "YOUR_USERNAME/monad-mistral-7b",
    input={
        "prompt": "Explain Monad's parallel execution",
        "temperature": 0.7,
        "max_tokens": 200
    }
)
print(output)
```
### cURL
```bash
# The API requires a concrete version hash (shown on your model's
# Replicate page), not "latest".
curl -s -X POST \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "MODEL_VERSION_ID",
    "input": {
      "prompt": "What is Monad?"
    }
  }' \
  https://api.replicate.com/v1/predictions
```
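The predictions API is asynchronous: the POST above returns a prediction object whose `urls.get` endpoint you poll until `status` reaches a terminal state. A minimal standard-library sketch of that flow (assumes `REPLICATE_API_TOKEN` is set and `version_id` is a real version hash from your model page):

```python
import json
import os
import time
import urllib.request

API = "https://api.replicate.com/v1/predictions"


def run_prediction(version_id, prompt, timeout=300):
    """Create a prediction and poll it until it finishes (or times out)."""
    token = os.environ["REPLICATE_API_TOKEN"]
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps(
        {"version": version_id, "input": {"prompt": prompt}}
    ).encode()
    req = urllib.request.Request(API, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        prediction = json.load(resp)
    # Poll the prediction's own URL until it reaches a terminal state.
    deadline = time.time() + timeout
    while prediction["status"] not in ("succeeded", "failed", "canceled"):
        if time.time() > deadline:
            raise TimeoutError("prediction did not finish in time")
        time.sleep(2)
        poll = urllib.request.Request(
            prediction["urls"]["get"], headers=headers
        )
        with urllib.request.urlopen(poll) as resp:
            prediction = json.load(resp)
    return prediction
```

On success, the returned object's `output` field holds the generated text.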
## Notes
- The GGUF file needs to be included in the model package
- Replicate will automatically handle GPU allocation
- The model uses llama-cpp-python for efficient GGUF inference
- Context window is set to 4096 tokens