
Replicate Setup Instructions

Prerequisites

  1. Install Cog: https://github.com/replicate/cog

    sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_`uname -s`_`uname -m`
    sudo chmod +x /usr/local/bin/cog
    
  2. Create a Replicate account: https://replicate.com
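The install command above builds the release filename from `uname`, so the same line works across operating systems and CPU architectures. To preview which asset it will fetch on your machine (the example outputs below are illustrative, not from this repository):

```shell
# The Cog download URL embeds OS and architecture via `uname`,
# producing names like cog_Linux_x86_64 or cog_Darwin_arm64.
echo "cog_$(uname -s)_$(uname -m)"
```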

Local Testing

# Test the model locally
cog predict -i prompt="What makes Monad blockchain unique?"

# Build the Docker image
cog build

Push to Replicate

# Login to Replicate
cog login

# Push the model (replace with your username)
cog push r8.im/YOUR_USERNAME/monad-mistral-7b

Model Structure

  • cog.yaml: Defines environment and dependencies
  • predict.py: Contains the Predictor class for inference
  • monad-mistral-7b.gguf: The model file (will be uploaded separately)
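For reference, a minimal cog.yaml for serving a GGUF model with llama-cpp-python might look like the sketch below. The Python version and package pin are assumptions, not values taken from this repository; check the actual cog.yaml before deploying.

```yaml
build:
  gpu: true                          # request GPU hardware on Replicate
  python_version: "3.11"             # assumed; match your cog.yaml
  python_packages:
    - "llama-cpp-python==0.2.90"     # assumed version pin
predict: "predict.py:Predictor"      # points Cog at the Predictor class
```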

Using the Model on Replicate

Once deployed, you can use it via:

Python

import replicate  # requires REPLICATE_API_TOKEN to be set in the environment

output = replicate.run(
    "YOUR_USERNAME/monad-mistral-7b",  # runs the latest version; append :VERSION_HASH to pin one
    input={
        "prompt": "Explain Monad's parallel execution",
        "temperature": 0.7,
        "max_tokens": 200
    }
)
print(output)

cURL

# Replace MODEL_VERSION_ID with the version hash shown on your model's Replicate page
curl -s -X POST \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "MODEL_VERSION_ID",
    "input": {
      "prompt": "What is Monad?"
    }
  }' \
  https://api.replicate.com/v1/predictions

The API responds immediately with a prediction object; poll the URL in its urls.get field until the status is "succeeded".

Notes

  • The GGUF file needs to be included in the model package
  • Replicate will automatically handle GPU allocation
  • The model uses llama-cpp-python for efficient GGUF inference
  • Context window is set to 4096 tokens