Update README.md
README.md CHANGED

````diff
@@ -1,35 +1,94 @@
 ---
-title:
-emoji:
 colorFrom: blue
-colorTo:
 sdk: docker
 app_port: 7860
 ---
-#
-##
-Accepts a JSON payload with a `text` field containing a sentence with `[MASK]` tokens.
-Returns a list of top 5 predictions for each masked position.
-  -H "Content-Type: application/json" \
-  -d '{"text": "The quick brown fox jumps over the [MASK] dog."}' \
-  https://brendon-ai-faq.hf.space/predict
-```
````

---
title: Ollama API
emoji: 🦙
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---

# Ollama Model API

A REST API for running Ollama models on Hugging Face Spaces.

## Features

- 🦙 Run Ollama models via REST API
- 🔄 Model management (pull, list, delete)
- 💬 Chat completions
- 🎛️ Configurable parameters (temperature, top_p, etc.)
- 📊 Health monitoring

## API Endpoints

### Health Check

- `GET /health` - Check if the service is running
- `GET /models` - List available models
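
For a quick smoke test, both endpoints can be hit with plain GET requests. The exact response bodies depend on the Space's implementation, so the comments below are assumptions:

```bash
# Should return HTTP 200 with a small JSON status payload if the service is up
curl "https://your-space.hf.space/health"

# Lists the models currently available on the Space
curl "https://your-space.hf.space/models"
```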

### Model Management

- `POST /models/pull` - Pull a model from the Ollama registry
- `DELETE /models/{model_name}` - Delete a model
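
Pulling is demonstrated under Usage Examples below; deleting is a plain `DELETE` request with the model name in the path (the model name here is only an illustration):

```bash
# Remove a previously pulled model to free disk space on the Space
curl -X DELETE "https://your-space.hf.space/models/llama2:7b"
```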

### Chat & Completions

- `POST /chat` - Chat with a model
- `POST /generate` - Generate a text completion

## Usage Examples

### Pull a Model

```bash
curl -X POST "https://your-space.hf.space/models/pull" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2:7b"}'
```
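
Pulling can take several minutes (see Notes below). One way to confirm the pull finished is to list the available models afterwards:

```bash
# The pulled model should now appear in the list
curl "https://your-space.hf.space/models"
```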

### Chat with Model

```bash
curl -X POST "https://your-space.hf.space/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
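
Multi-turn conversations are expressed by sending the prior turns in the `messages` array. Whether additional roles such as `system` are honored depends on the model and the underlying Ollama version, so treat this as a sketch with illustrative message content:

```bash
# Follow-up request that carries the earlier exchange as context
curl -X POST "https://your-space.hf.space/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"},
      {"role": "assistant", "content": "Doing well - how can I help?"},
      {"role": "user", "content": "Summarize what Ollama does in one sentence."}
    ]
  }'
```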

### Generate Text

```bash
curl -X POST "https://your-space.hf.space/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "prompt": "The future of AI is",
    "max_tokens": 100
  }'
```
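
The Features list mentions configurable parameters such as `temperature` and `top_p`. Assuming they are accepted as top-level fields alongside `prompt` (an assumption based on that list, not a documented schema), a request might look like:

```bash
# Lower temperature for more deterministic output; top_p restricts nucleus sampling
curl -X POST "https://your-space.hf.space/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral:7b",
    "prompt": "Write a haiku about GPUs.",
    "max_tokens": 60,
    "temperature": 0.2,
    "top_p": 0.9
  }'
```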

## Supported Models

This setup supports any model available in the Ollama registry:

- `llama2:7b`, `llama2:13b`
- `mistral:7b`
- `codellama:7b`
- `vicuna:7b`
- And many more...

## Interactive Documentation

Once deployed, visit `/docs` for interactive API documentation.

## Notes

- Model pulling may take several minutes depending on model size
- Larger models require more memory and may not work on the free tier
- The first inference may be slower as the model loads into memory

## Resource Requirements

- **Small models (7B)**: 8GB+ RAM recommended
- **Medium models (13B)**: 16GB+ RAM recommended
- **Large models (70B+)**: 32GB+ RAM required

Consider using smaller models like `llama2:7b` or `mistral:7b` for better performance on limited resources.