title: GodSpeed
emoji: 🚀
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false

Godspeed

Enterprise Knowledge Copilot is a fully open-source, locally-compliant, agentic RAG platform that unifies internal and live external knowledge into a single cited, validated answer engine, purpose-built for IT enterprises operating under GDPR and India's DPDP Act.


Prerequisites

| Tool | Version | Notes |
| --- | --- | --- |
| Python | 3.11+ | 3.12 recommended |
| pnpm | 9+ | npm i -g pnpm |
| Docker | 24+ | for Qdrant, Redis, Neo4j |
| Node.js | 20+ | required by pnpm |

1 - Clone and configure

git clone https://github.com/samyuktha2004/Godspeed.git
cd Godspeed
cp .env.example .env

Open .env and fill in every <...> placeholder.
Minimum required keys for a first run:

GOOGLE_API_KEY=          # Gemini API key; all LLM calls route here
NEO4J_PASSWORD=          # choose any password; must match NEO4J_AUTH in the docker run command in step 2
SUPABASE_URL=            # your Supabase project URL
SUPABASE_KEY=            # service-role key (not anon) for backend writes
REDIS_URL=redis://localhost:6379/0
QDRANT_HOST=localhost
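
Before the first run, it can help to confirm that no placeholder was missed. The following is a minimal sketch and not part of the repository: it assumes `.env` uses plain `KEY=value` lines with optional trailing `#` comments, and the helper names `parse_env` and `find_unset_keys` are hypothetical.

```python
import re

# Keys listed above as the minimum for a first run.
REQUIRED_KEYS = [
    "GOOGLE_API_KEY",
    "NEO4J_PASSWORD",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "REDIS_URL",
    "QDRANT_HOST",
]

# A value that is still an unfilled <...> placeholder.
PLACEHOLDER = re.compile(r"^<.*>$")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and full-line comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment (assumes values never contain " #").
        value = re.split(r"\s+#", value, maxsplit=1)[0].strip()
        values[key.strip()] = value
    return values


def find_unset_keys(text: str, required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are missing, empty, or still a <...> placeholder."""
    values = parse_env(text)
    return [
        key
        for key in required
        if not values.get(key) or PLACEHOLDER.match(values[key])
    ]
```

Calling `find_unset_keys(open(".env", encoding="utf-8").read())` from the repository root should return an empty list once every required key is filled in.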

For the frontend:

cp frontend/.env.example frontend/.env
# defaults (localhost:8000) are fine for local dev; no changes needed

2 - Start infrastructure

docker run -d --name qdrant  -p 6333:6333 qdrant/qdrant:latest
docker run -d --name redis   -p 6379:6379 redis:7-alpine
docker run -d --name neo4j   \
  -p 7474:7474 -p 7687:7687  \
  -e NEO4J_AUTH=neo4j/<your-NEO4J_PASSWORD> \
  neo4j:5

Wait ~10 seconds for Neo4j to finish its first-boot initialisation before continuing.
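
Instead of a fixed wait, the Bolt port can be polled until it accepts connections. This is a rough sketch using only the standard library; the function name `wait_for_port` is hypothetical, and an open TCP port is only an approximation of "Neo4j is ready", so a short extra pause may still be needed.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; retry after a short pause.
            time.sleep(interval)
    return False
```

For the containers started above, `wait_for_port("localhost", 7687)` checks Neo4j's Bolt port; the same call with 6333 or 6379 covers Qdrant and Redis.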


3 - Backend

# Install Python deps
pip install -r requirements.txt

# Download spaCy English model (required by chunking pipeline)
python -m spacy download en_core_web_sm

Start the API server

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Swagger UI: http://localhost:8000/docs

Start the Celery worker (ingestion tasks)

Open a second terminal:

celery -A src.celery_app worker -Q critical,default,polling -l info

Start Celery beat (periodic Confluence sync, optional)

Open a third terminal:

celery -A ingestion.jobs.celery_app beat --loglevel=info

4 - Frontend

cd frontend
pnpm install
pnpm dev

App: http://localhost:3000


5 - Verify the stack

Health check

curl -s http://localhost:8000/health | python -m json.tool

Post a query and stream the SSE response

curl -sN -X POST http://localhost:8000/agent/query \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -d '{"query":"What services does the auth team own?","team_id":"default","session_id":"test-001"}' \
  | while IFS= read -r line; do echo "$line"; done

Expected event sequence: plan_ready → agent_started (×N) → agent_done (×N) → synthesis_started → answer_chunk (×M) → guardrail_result → done
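
To check the sequence programmatically, the stream can be parsed into events. The sketch below is an illustration only: it assumes the server names each event with a standard SSE `event:` field followed by `data:` lines, which this README does not confirm, and it reads the order above literally (all `agent_started` events before any `agent_done`). The names `parse_sse` and `follows_expected_order` are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Event names in the order listed above.
EXPECTED_ORDER = [
    "plan_ready",
    "agent_started",
    "agent_done",
    "synthesis_started",
    "answer_chunk",
    "guardrail_result",
    "done",
]


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from SSE lines; a blank line ends each event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # events without data are not dispatched
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def follows_expected_order(names: Iterable[str]) -> bool:
    """True if the event names never move backwards through EXPECTED_ORDER."""
    position = 0
    for name in names:
        if name not in EXPECTED_ORDER:
            return False
        index = EXPECTED_ORDER.index(name)
        if index < position:
            return False
        position = index
    return True
```

If agents run concurrently and their start and done events interleave, the order check would need to be relaxed for those two event types.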

Fetch knowledge graph nodes

curl -s "http://localhost:8000/graph/nodes?team_id=default&limit=20" | python -m json.tool

Traverse the graph from a seed entity

curl -s -X POST http://localhost:8000/graph/traverse \
  -H "Content-Type: application/json" \
  -d '{"entity_id":"<node-id-from-above>","depth":2,"team_id":"default"}' \
  | python -m json.tool

Stream graph via WebSocket

# Requires: npm i -g wscat
wscat -c "ws://localhost:8000/graph/stream?team_id=default"

Trigger a manual file ingest

curl -s -X POST http://localhost:8000/ingest \
  -F "file=@/path/to/document.pdf" \
  -F "team_id=default" \
  | python -m json.tool

6 - Webhook setup (optional)

Each webhook endpoint verifies an HMAC-SHA256 signature. Generate secrets with:

python -c "import secrets; print(secrets.token_hex(32))"

Set the generated value in .env (JIRA_WEBHOOK_SECRET, CONFLUENCE_WEBHOOK_SECRET, etc.), then register the corresponding URL in the Atlassian / GitHub / Slack admin:

| Source | Endpoint |
| --- | --- |
| Jira | POST /webhooks/jira |
| Confluence | POST /webhooks/confluence |
| GitHub | POST /webhooks/github |
| Slack | POST /webhooks/slack |
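
For reference, HMAC-SHA256 verification of a webhook body generally looks like the sketch below. It is not the repository's implementation: the function names are hypothetical, and each provider has its own header name and signing scheme (GitHub, for example, sends the digest as `sha256=<hex>`, while Slack signs a versioned string that includes a timestamp), so treat this as the generic pattern only.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received: str) -> bool:
    """Compare the received signature to the expected one in constant time."""
    # Some providers prefix the digest, e.g. GitHub's "sha256=<hex>".
    received = received.removeprefix("sha256=")
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The signature must be computed over the raw request bytes, before any JSON parsing, and `hmac.compare_digest` avoids leaking information through comparison timing.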

7 - Project layout

Godspeed/
├── main.py                  # FastAPI entry point
├── agent/                   # LangGraph multi-agent orchestration
├── graph_store/             # Neo4j knowledge graph (extractor, writer, reader, API)
├── ingestion/               # Ingest pipeline + Celery jobs
├── src/
│   ├── confluence_agent/
│   ├── file_agent/
│   ├── jira_agent/
│   └── celery_app.py
├── toolsforgitnotionslack/  # GitHub / Slack / Notion agent
├── frontend/                # React 18 + TanStack Router + Vite
├── Docs/                    # Architecture, API contracts, TODO list
├── requirements.txt
└── .env.example

8 - Key environment variables reference

| Variable | Default | Purpose |
| --- | --- | --- |
| GOOGLE_API_KEY | (none) | Gemini API, all LLM calls |
| PLANNER_MODEL | gemini-2.5-pro | Query planning agent |
| SYNTHESISER_MODEL | gemini-2.5-pro | Answer synthesis |
| SUMMARISER_MODEL | gemini-2.5-flash | Document summarisation |
| GUARDRAIL_MODEL | gemini-2.5-flash | Output safety check |
| GRAPH_EXTRACTION_MODEL | gemini-2.5-flash | Neo4j entity extraction |
| NEO4J_URI | bolt://localhost:7687 | Graph store |
| QDRANT_HOST | localhost | Vector store |
| REDIS_URL | redis://localhost:6379/0 | Celery broker + cache |
| SUPABASE_URL | (none) | Metadata storage |
| TEAM_ID | default | RBAC team scope |

Full variable list: .env.example
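
As an illustration of how these defaults combine with the environment, the sketch below resolves each variable from a mapping and falls back to the default listed above. How the repository actually loads its configuration is not shown in this README, so the `resolve_settings` helper is hypothetical.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the reference table above.
DEFAULTS = {
    "PLANNER_MODEL": "gemini-2.5-pro",
    "SYNTHESISER_MODEL": "gemini-2.5-pro",
    "SUMMARISER_MODEL": "gemini-2.5-flash",
    "GUARDRAIL_MODEL": "gemini-2.5-flash",
    "GRAPH_EXTRACTION_MODEL": "gemini-2.5-flash",
    "NEO4J_URI": "bolt://localhost:7687",
    "QDRANT_HOST": "localhost",
    "REDIS_URL": "redis://localhost:6379/0",
    "TEAM_ID": "default",
}

# Variables the table lists without a default.
NO_DEFAULT = ["GOOGLE_API_KEY", "SUPABASE_URL"]


def resolve_settings(environ: Mapping[str, str] = os.environ) -> dict[str, Optional[str]]:
    """Merge environment values over the defaults; unset or empty values fall back."""
    settings: dict[str, Optional[str]] = {
        key: environ.get(key) or default for key, default in DEFAULTS.items()
    }
    for key in NO_DEFAULT:
        # No fallback exists, so an unset key resolves to None.
        settings[key] = environ.get(key) or None
    return settings
```

Any key that resolves to `None` here has to be supplied in `.env` before the backend can use it.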


Docs