# Replicate Setup Instructions
## Prerequisites
1. Install Cog: https://github.com/replicate/cog
```bash
sudo curl -o /usr/local/bin/cog -L https://github.com/replicate/cog/releases/latest/download/cog_`uname -s`_`uname -m`
sudo chmod +x /usr/local/bin/cog
```
2. Create a Replicate account: https://replicate.com
## Local Testing
```bash
# Build the Docker image
cog build

# Test the model locally
cog predict -i prompt="What makes Monad blockchain unique?"
```
## Push to Replicate
```bash
# Login to Replicate
cog login
# Push the model (replace with your username)
cog push r8.im/YOUR_USERNAME/monad-mistral-7b
```
## Model Structure
- `cog.yaml`: Defines environment and dependencies
- `predict.py`: Contains the Predictor class for inference
- `monad-mistral-7b.gguf`: The model file (will be uploaded separately)
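A minimal `cog.yaml` for serving a GGUF model with llama-cpp-python might look like the sketch below. The package version pin is illustrative, not taken from this repository; match it to the environment you actually tested:

```yaml
build:
  gpu: true                # let Replicate schedule the model on GPU hardware
  python_version: "3.11"
  python_packages:
    - "llama-cpp-python==0.2.90"  # illustrative pin; use your tested version
predict: "predict.py:Predictor"   # points Cog at the Predictor class
```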
## Using the Model on Replicate
Once deployed, you can call the model from the Python client or the HTTP API:
### Python
```python
import replicate

output = replicate.run(
    # Optionally pin a version: "YOUR_USERNAME/monad-mistral-7b:<version-id>"
    "YOUR_USERNAME/monad-mistral-7b",
    input={
        "prompt": "Explain Monad's parallel execution",
        "temperature": 0.7,
        "max_tokens": 200,
    },
)
print(output)
```
### cURL
```bash
curl -s -X POST \
  -H "Authorization: Token $REPLICATE_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "version": "MODEL_VERSION_ID",
    "input": {
      "prompt": "What is Monad?"
    }
  }' \
  https://api.replicate.com/v1/predictions
```

Replace `MODEL_VERSION_ID` with the version hash shown on your model's Replicate page; the predictions API requires a specific version id and does not accept `latest`.
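The predictions endpoint is asynchronous: the POST returns a prediction object whose `status` you poll (via the URL in `urls.get`) until it reaches a terminal state. A minimal sketch of inspecting that response shape (the JSON below is a hand-written illustration, not a captured API reply):

```python
import json

# Illustrative response body; real replies contain additional fields.
response_body = '''
{
  "id": "abc123",
  "status": "starting",
  "urls": {
    "get": "https://api.replicate.com/v1/predictions/abc123",
    "cancel": "https://api.replicate.com/v1/predictions/abc123/cancel"
  },
  "output": null
}
'''

prediction = json.loads(response_body)

# Poll prediction["urls"]["get"] until the status is terminal.
TERMINAL = {"succeeded", "failed", "canceled"}
done = prediction["status"] in TERMINAL

print(prediction["status"], done)  # → starting False
```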
## Notes
- The GGUF file needs to be included in the model package
- Replicate will automatically handle GPU allocation
- The model uses llama-cpp-python for efficient GGUF inference
- Context window is set to 4096 tokens