# REST API Guide - RAG Template

A complete REST API for the RAG Template, built with FastAPI.

---

## Overview

The REST API provides programmatic access to the RAG system, with endpoints for:

- Document ingestion (raw text or file upload)
- RAG queries
- Document management
- System statistics
- Health checks

**Base URL**: `http://localhost:8000/api/v1`

**Interactive documentation**: `http://localhost:8000/api/docs`

---
## Authentication

All endpoints except `/health` require authentication with an API key.

### Configure API Keys

In the `.env` file:

```bash
API_KEYS=key1,key2,key3
```

### Use an API Key

Include this header in every request:

```
X-API-Key: your_api_key_here
```
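
For clients that do not use the SDK, the header can be attached with the Python standard library. This is a minimal sketch: `build_request` is a hypothetical helper, not part of the project, and it only constructs the request so that it runs without a server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # base URL given in this guide


def build_request(path, api_key, payload=None):
    """Build (but do not send) a request carrying the X-API-Key header.

    Hypothetical helper for illustration; a JSON payload turns it into a POST.
    """
    headers = {"X-API-Key": api_key}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


req = build_request("/stats", api_key="your_api_key_here")
print(req.full_url, req.get_method())
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Keeping the header logic in one helper means a missing key shows up in a single place rather than in every call site.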

---

## Start the Server

### Development Mode

```bash
python api_server.py
```

### Production Mode

```bash
uvicorn src.api:app --host 0.0.0.0 --port 8000 --workers 4
```

### With Docker

```bash
docker run -p 8000:8000 -e DATABASE_URL=... -e API_KEYS=... rag-template
```

---

## Endpoints

### GET /api/v1/health

System health check.

**Authentication**: Not required

**Response**:

```json
{
  "status": "healthy",
  "timestamp": "2026-01-23T10:30:00",
  "database": "healthy",
  "embeddings": "healthy",
  "version": "1.6.0"
}
```
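
A monitoring script might reduce this payload to a list of failing components. The sketch below assumes that any value other than `"healthy"` signals a problem; this guide only documents the healthy case, and `unhealthy_components` is a hypothetical helper.

```python
def unhealthy_components(health):
    """Return the names of components whose reported state is not "healthy"."""
    components = ("status", "database", "embeddings")
    return [name for name in components if health.get(name) != "healthy"]


sample = {
    "status": "healthy",
    "timestamp": "2026-01-23T10:30:00",
    "database": "healthy",
    "embeddings": "healthy",
    "version": "1.6.0",
}
print(unhealthy_components(sample))
```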
### POST /api/v1/ingest

Ingests text into the system.

**Request Body**:

```json
{
  "text": "Document content...",
  "title": "Document Title",
  "chunk_size": 1000,
  "chunk_overlap": 200,
  "strategy": "recursive",
  "metadata": {
    "document_type": "TXT",
    "tags": ["tech", "ai"],
    "security_level": "public"
  }
}
```

**Response**:

```json
{
  "document_id": 123,
  "num_chunks": 15,
  "message": "Document ingested successfully",
  "metadata": {...}
}
```
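
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-size sliding window. It is not the `recursive` strategy used by the server, and it assumes both values are measured in characters, which this guide does not state.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


text = "".join(str(i % 10) for i in range(2600))
chunks = chunk_text(text, chunk_size=1000, chunk_overlap=200)
print(len(chunks), [len(c) for c in chunks])
```

Each window starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next.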
### POST /api/v1/upload

Uploads and ingests a file (PDF or TXT).

**Request**: `multipart/form-data`

- `file`: The file to upload
- `chunk_size`: (optional) Chunk size
- `chunk_overlap`: (optional) Overlap between chunks
- `strategy`: (optional) Chunking strategy

**Response**: Similar to the `/ingest` response
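
For reference, this is roughly what a `multipart/form-data` body for this endpoint looks like when built by hand with the standard library. `encode_multipart` is a hypothetical helper; the field names come from the list above, while the file name and contents are made up for the example.

```python
import uuid


def encode_multipart(fields, filename, file_bytes, content_type="application/octet-stream"):
    """Encode form fields plus one file part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode(),
        ]
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        file_bytes,
        f"--{boundary}--".encode(),  # closing boundary
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type_header = encode_multipart(
    {"chunk_size": 1000, "chunk_overlap": 200},
    filename="notes.txt",
    file_bytes=b"Plain text to ingest.",
    content_type="text/plain",
)
print(content_type_header)
```

The returned header value goes into `Content-Type`, alongside `X-API-Key`, when the body is posted to `/api/v1/upload`.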
### POST /api/v1/query

Runs a RAG query.

**Request Body**:

```json
{
  "query": "What is RAG?",
  "top_k": 5,
  "temperature": 0.3,
  "max_tokens": 512,
  "model": "huggingface",
  "filters": {
    "document_type": "PDF",
    "tags": ["tech"]
  }
}
```

**Response**:

```json
{
  "query": "What is RAG?",
  "response": "RAG is Retrieval-Augmented Generation...",
  "contexts": [
    {
      "content": "Relevant context...",
      "similarity": 0.92,
      "document_id": 123
    }
  ],
  "metadata": {
    "num_contexts": 5,
    "model": "huggingface",
    "temperature": 0.3,
    "max_tokens": 512
  }
}
```
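
A caller often wants to show only the strongest sources behind an answer. The sketch below filters the `contexts` array by a similarity threshold; `top_sources` and the 0.8 cutoff are illustrative choices, and the extra contexts in the sample are invented to make the filtering visible.

```python
def top_sources(result, min_similarity=0.8):
    """Return (document_id, similarity) pairs at or above the threshold, best first."""
    hits = [
        (ctx["document_id"], ctx["similarity"])
        for ctx in result.get("contexts", [])
        if ctx["similarity"] >= min_similarity
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


result = {
    "query": "What is RAG?",
    "response": "RAG is Retrieval-Augmented Generation...",
    "contexts": [
        {"content": "Loosely related...", "similarity": 0.75, "document_id": 124},
        {"content": "Relevant context...", "similarity": 0.92, "document_id": 123},
        {"content": "Also relevant...", "similarity": 0.85, "document_id": 125},
    ],
}
print(top_sources(result))
```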
### GET /api/v1/documents

Lists the documents in the system.

**Query Parameters**:

- `limit`: (optional) Maximum number of documents (default: 100)
- `offset`: (optional) Pagination offset (default: 0)
- `session_id`: (optional) Filter by session_id

**Response**:

```json
[
  {
    "id": 123,
    "title": "Document 1",
    "content": "Content...",
    "chunk_count": 15,
    "created_at": "2026-01-23T10:30:00",
    "metadata": {...}
  }
]
```
### DELETE /api/v1/documents/{document_id}

Deletes a document from the system.

**Path Parameters**:

- `document_id`: ID of the document

**Response**:

```json
{
  "message": "Document deleted successfully",
  "document_id": 123
}
```
### GET /api/v1/stats

Returns system statistics.

**Response**:

```json
{
  "database": {
    "total_documents": 150,
    "total_chunks": 2500,
    "avg_chunks_per_doc": 16.67
  },
  "metadata": {
    "total": 150,
    "by_type": {"PDF": 100, "TXT": 50},
    "by_security": {"public": 120, "internal": 30}
  },
  "timestamp": "2026-01-23T10:30:00"
}
```

---

## Using the Python SDK

### Installation

```bash
pip install -e .  # Install locally
```

### Basic Usage

```python
from sdk import RAGClient

# Create the client
client = RAGClient(
    base_url="http://localhost:8000",
    api_key="your_api_key"
)

# Health check
health = client.health_check()
print(health)

# Ingest text
result = client.ingest_text(
    text="Document content...",
    title="My Document",
    metadata={"tags": ["tech", "ai"]}
)
print(f"Document ID: {result['document_id']}")

# Upload a file
result = client.upload_file("document.pdf")
print(f"Chunks: {result['num_chunks']}")

# Query
response = client.query(
    query="What is RAG?",
    top_k=5,
    filters={"tags": ["tech"]}
)
print(response['response'])

# List documents
docs = client.list_documents(limit=10)
for doc in docs:
    print(f"{doc['id']}: {doc['title']}")

# Delete a document
client.delete_document(123)

# Statistics
stats = client.get_stats()
print(stats)
```

---

## Usage Examples

### Example 1: Ingestion Pipeline

```python
from pathlib import Path

from sdk import RAGClient

client = RAGClient(api_key="my_key")

# Ingest every PDF in a directory
docs_dir = Path("./documents")
for file in docs_dir.glob("*.pdf"):
    result = client.upload_file(str(file))
    print(f"Ingested {file.name}: {result['num_chunks']} chunks")
```

### Example 2: Simple Chatbot

```python
from sdk import RAGClient

client = RAGClient(api_key="my_key")

while True:
    query = input("You: ")
    if query.lower() in ["quit", "exit"]:
        break
    response = client.query(query, top_k=5)
    print(f"Bot: {response['response']}\n")
```

### Example 3: Filtered Search

```python
from sdk import RAGClient

client = RAGClient(api_key="my_key")

# Search only public documents tagged tech and ai
response = client.query(
    query="How do embeddings work?",
    filters={
        "security_level": "public",
        "tags": ["tech", "ai"]
    }
)
print(response['response'])
print(f"Contexts used: {response['metadata']['num_contexts']}")
```

---

## Using cURL

### Health Check

```bash
curl http://localhost:8000/api/v1/health
```

### Ingest Text

```bash
curl -X POST http://localhost:8000/api/v1/ingest \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_key" \
  -d '{
    "text": "Document content",
    "title": "Title"
  }'
```

### Query

```bash
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your_key" \
  -d '{
    "query": "What is RAG?",
    "top_k": 5
  }'
```

### List Documents

```bash
# Quote the URL so the shell does not try to expand the "?"
curl "http://localhost:8000/api/v1/documents?limit=10" \
  -H "X-API-Key: your_key"
```

---

## Rate Limiting

The API does not implement rate limiting by default. For production, consider one of the following:

- **Nginx**: with `limit_req_zone`
- **Traefik**: with its rate-limiting middleware
- **Cloudflare**: rate limiting at the CDN
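
The tools above typically rely on token-bucket or leaky-bucket style algorithms. As a rough illustration of the idea only (it is not part of this API and not a substitute for a proxy-level limiter), a token bucket can be sketched in a few lines. The `TokenBucket` class and its injectable `clock` are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Deterministic demo using a fake clock instead of real time.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(3)]  # third request exceeds the burst
fake_now[0] += 1.0  # one second later, one token has been refilled
after_wait = bucket.allow()
print(burst, after_wait)
```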

---

## Errors

### Status Codes

- `200`: Success
- `400`: Bad Request (invalid parameters)
- `401`: Unauthorized (invalid or missing API key)
- `404`: Not Found (resource not found)
- `500`: Internal Server Error

### Error Format

```json
{
  "detail": "Error message here"
}
```
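
Client code can turn this format into exceptions. The sketch below is one possible approach: `APIError` and `raise_for_error` are hypothetical names, and treating every 2xx status as success is an assumption, since this guide only lists `200`.

```python
import json


class APIError(Exception):
    """Carries the HTTP status and the server's `detail` message."""

    def __init__(self, status, detail):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


def raise_for_error(status, body):
    """Raise APIError for non-2xx responses, using the `detail` field when present."""
    if 200 <= status < 300:
        return
    try:
        detail = json.loads(body).get("detail", body)
    except (ValueError, AttributeError):
        detail = body  # body was not a JSON object; keep the raw text
    raise APIError(status, detail)


try:
    raise_for_error(401, '{"detail": "Invalid API key"}')
except APIError as err:
    print(err.status, err.detail)
```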

---

## Performance

### Benchmarks

Measured on a local machine (M1 Pro, 16GB RAM):

| Endpoint | Average Time | Notes |
|----------|--------------|-------|
| /health | <10ms | Very fast |
| /ingest | 500-2000ms | Depends on document size |
| /query | 200-1000ms | Depends on the chosen LLM |
| /documents | <100ms | Paginated |

### Optimizations

1. **Embedding cache**: Enabled automatically
2. **Connection pooling**: Use PgBouncer or Supabase
3. **Workers**: Run multiple Uvicorn workers in production
4. **Async**: Endpoints are async by default

---

## Production Deployment

### Docker Compose

```yaml
version: '3.8'
services:
  api:
    build: .
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql://...
      - HF_TOKEN=...
      - API_KEYS=key1,key2
    command: uvicorn src.api:app --host 0.0.0.0 --port 8000 --workers 4
```

### Environment Variables

```bash
# API Config
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=4
API_RELOAD=false
API_KEYS=key1,key2,key3

# Database
DATABASE_URL=postgresql://...

# LLM
HF_TOKEN=...
```
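
One way to read these variables into typed settings is sketched below. `APISettings` and `load_settings` are hypothetical, the defaults simply mirror the example values above, and the project's actual configuration code may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class APISettings:
    host: str
    port: int
    workers: int
    reload: bool
    api_keys: tuple


def load_settings(env=os.environ):
    """Parse the API_* variables from a mapping (os.environ by default)."""
    raw_keys = env.get("API_KEYS", "")
    return APISettings(
        host=env.get("API_HOST", "0.0.0.0"),
        port=int(env.get("API_PORT", "8000")),
        workers=int(env.get("API_WORKERS", "4")),
        reload=env.get("API_RELOAD", "false").strip().lower() == "true",
        # Comma-separated list; blank entries and stray spaces are dropped.
        api_keys=tuple(k.strip() for k in raw_keys.split(",") if k.strip()),
    )


settings = load_settings({"API_PORT": "8000", "API_KEYS": "key1, key2,,key3"})
print(settings)
```

Passing a plain dict, as in the last lines, makes the parsing easy to test without touching the real environment.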

---

## Security

### Best Practices

1. **HTTPS**: Always use HTTPS in production
2. **API Keys**: Generate strong keys and rotate them regularly
3. **Rate Limiting**: Put rate limiting in front of the API
4. **CORS**: Restrict CORS to the origins that need access
5. **Input Validation**: Handled automatically by Pydantic
6. **Logs**: Monitor access logs
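
For item 2, strong keys can be generated with Python's `secrets` module, and candidate keys can be compared in constant time with `hmac.compare_digest`. Both helpers below are hypothetical sketches and do not describe how the server actually validates keys.

```python
import hmac
import secrets


def generate_api_key(num_bytes=32):
    """Return a URL-safe random key built from num_bytes of OS-level randomness."""
    return secrets.token_urlsafe(num_bytes)


def is_valid_key(candidate, allowed_keys):
    """Check the candidate against every allowed key using constant-time comparison."""
    matches = [
        hmac.compare_digest(candidate.encode("utf-8"), key.encode("utf-8"))
        for key in allowed_keys
    ]
    return any(matches)


new_key = generate_api_key()
print(len(new_key))
print(is_valid_key(new_key, ["key1", new_key]))
```

A generated key would then be added to `API_KEYS`, with the old one removed once clients have switched over.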

---

## Troubleshooting

### API does not start

Check that:

- PostgreSQL is running
- `DATABASE_URL` is correct
- Port 8000 is available

### Authentication errors

Check that:

- The API key is configured in `.env`
- The `X-API-Key` header is present
- The key is correct

### Slow queries

Check that:

- The database indexes have been created
- The embedding cache is active
- The LLM model is not too large

---

## Next Steps

1. Implement rate limiting
2. Add OAuth2 authentication
3. Build a monitoring dashboard
4. Publish the SDK to PyPI
5. Add webhooks for events

---

## Resources

- [FastAPI documentation](https://fastapi.tiangolo.com/)
- [Uvicorn documentation](https://www.uvicorn.org/)
- [OpenAPI/Swagger](http://localhost:8000/api/docs)
- [ReDoc](http://localhost:8000/api/redoc)