# Replicate Setup Instructions

## Prerequisites

1. Install Cog: https://github.com/replicate/cog

```bash
sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_`uname -s`_`uname -m`
sudo chmod +x /usr/local/bin/cog
```

2. Create a Replicate account: https://replicate.com
## Local Testing

```bash
# Build the Docker image
cog build

# Test the model locally (cog predict also builds automatically if needed)
cog predict -i prompt="What makes Monad blockchain unique?"
```
## Push to Replicate

```bash
# Log in to Replicate
cog login

# Push the model (replace YOUR_USERNAME with your Replicate username)
cog push r8.im/YOUR_USERNAME/monad-mistral-7b
```
## Model Structure

- `cog.yaml`: Defines the environment and dependencies
- `predict.py`: Contains the Predictor class for inference
- `monad-mistral-7b.gguf`: The model file (uploaded separately)
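For context, a Mistral-family GGUF model typically expects prompts wrapped in the `[INST] ... [/INST]` instruction template before they reach llama-cpp-python. The helper below is a hypothetical sketch of the formatting step `predict.py` might perform; `build_prompt` is not part of Cog or Replicate, and the exact template your model expects depends on how it was fine-tuned.

```python
# Hypothetical helper illustrating the Mistral instruction format that
# predict.py might apply before passing text to llama-cpp-python.
def build_prompt(user_prompt: str, system_prompt: str = "") -> str:
    """Wrap a user prompt in Mistral's [INST] ... [/INST] template."""
    if system_prompt:
        return f"<s>[INST] {system_prompt}\n\n{user_prompt} [/INST]"
    return f"<s>[INST] {user_prompt} [/INST]"

print(build_prompt("What is Monad?"))
# → <s>[INST] What is Monad? [/INST]
```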
## Using the Model on Replicate

Once deployed, you can use it via:

### Python

```python
import replicate

# Omitting the version suffix runs the latest published version;
# ":latest" is not a valid version identifier.
output = replicate.run(
    "YOUR_USERNAME/monad-mistral-7b",
    input={
        "prompt": "Explain Monad's parallel execution",
        "temperature": 0.7,
        "max_tokens": 200
    }
)
print(output)
```
### cURL

```bash
# The REST API requires a specific model version ID (find it on your
# model's Versions page on Replicate); "latest" is not accepted here.
curl -s -X POST \
  -H "Authorization: Bearer $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "MODEL_VERSION_ID",
    "input": {
      "prompt": "What is Monad?"
    }
  }' \
  https://api.replicate.com/v1/predictions
```
## Notes

- The GGUF file needs to be included in the model package
- Replicate automatically handles GPU allocation
- The model uses llama-cpp-python for efficient GGUF inference
- The context window is set to 4096 tokens
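Because the context window is 4096 tokens, the prompt plus `max_tokens` must fit inside it or generation will be truncated. The sketch below uses a rough ~4-characters-per-token heuristic as a quick sanity check; this is an approximation only, and exact counts require the model's actual tokenizer.

```python
# Rough sanity check that a request fits the 4096-token context window.
# The 4-chars-per-token estimate is a heuristic, not a real token count.
CONTEXT_WINDOW = 4096

def fits_context(prompt: str, max_tokens: int, n_ctx: int = CONTEXT_WINDOW) -> bool:
    """True if the estimated prompt tokens plus max_tokens fit in n_ctx."""
    est_prompt_tokens = len(prompt) // 4 + 1
    return est_prompt_tokens + max_tokens <= n_ctx

print(fits_context("What is Monad?", 200))  # → True
```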