Setup Instructions
Initial Setup
Clone the repository:
git clone <your-repo-url>
cd Agent_kit

Install dependencies:

pip install -r requirements.txt

Merge the model (one-time):

python scripts/merge_model.py \
  --base-model "Allanatrix/Nexa_Sci_distilled_Falcon-10B" \
  --adapter-path models/adapter_model.safetensors \
  --output-dir models/merged \
  --torch-dtype bfloat16

Note: If you don't have the adapter weights, the model will load directly from HuggingFace.
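The fallback described in the note can be sketched as a small path-resolution helper. This is illustrative only: `resolve_model_path` is a hypothetical name, not a function in this repo; the Hub model id is taken from the merge command above.

```python
from pathlib import Path

# Hub id from the merge command above; used as the fallback source.
HUB_MODEL_ID = "Allanatrix/Nexa_Sci_distilled_Falcon-10B"

def resolve_model_path(merged_dir: str = "models/merged") -> str:
    """Prefer locally merged weights; otherwise return the Hub model id."""
    merged = Path(merged_dir)
    if merged.is_dir() and any(merged.iterdir()):
        return str(merged)
    return HUB_MODEL_ID

print(resolve_model_path())  # Hub id unless models/merged exists and is non-empty
```

If you skipped the merge step, this resolves to the Hub id and downloading happens on first load.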
Running with Docker
Prerequisites
- Docker and Docker Compose
- NVIDIA Container Toolkit for GPU support
Quick Start
docker-compose up --build
This starts:
- Model server (port 8001)
- Tool server (port 8000)
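Before running the agent, you can poll both servers until they answer. This is a hedged sketch: it assumes the servers are reachable on the ports above and uses FastAPI's default `/docs` route as a liveness check; substitute a dedicated health endpoint if the project defines one.

```python
import time
import urllib.error
import urllib.request

def wait_for_server(url: str, attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll `url` until it answers (any HTTP status) or attempts run out."""
    for _ in range(attempts):
        try:
            urllib.request.urlopen(url, timeout=2)
            return True
        except urllib.error.HTTPError:
            return True  # server answered, even if with an error status
        except (urllib.error.URLError, OSError):
            time.sleep(delay)
    return False

# Ports from docker-compose above; "/docs" is FastAPI's default docs page.
if wait_for_server("http://127.0.0.1:8001/docs", attempts=3) and \
   wait_for_server("http://127.0.0.1:8000/docs", attempts=3):
    print("both servers are up")
```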
Run Agent
docker-compose run --rm agent python examples/demo_agent.py \
--prompt "Your prompt here"
Running Manually
Three Terminal Setup
Terminal 1 - Model Server:
cd Agent_kit
source .venv/bin/activate
uvicorn agent.model_server:app --host 0.0.0.0 --port 8001
Terminal 2 - Tool Server:
cd Agent_kit
source .venv/bin/activate
uvicorn tools.server:app --host 0.0.0.0 --port 8000
Terminal 3 - Agent:
cd Agent_kit
source .venv/bin/activate
# Enable remote model in config.yaml
# Set: model_server.enabled: true
python examples/demo_agent.py --prompt "Your prompt here"
Configuration
Edit agent/config.yaml:
- model_server.enabled: Set to true to use the remote model server
- model_server.base_url: Model server URL (default: http://127.0.0.1:8001)
- tool_server.base_url: Tool server URL (default: http://127.0.0.1:8000)
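Putting those keys together, the relevant portion of agent/config.yaml might look roughly like this (the exact surrounding structure in your checkout may differ):

```yaml
model_server:
  enabled: true                     # use the remote model server
  base_url: http://127.0.0.1:8001   # model server URL
tool_server:
  base_url: http://127.0.0.1:8000   # tool server URL
```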
Testing
Test model server connection:
python examples/test_model_server.py
Test simple generation:
python examples/simple_test.py
Troubleshooting
See README.md for troubleshooting tips.