---
title: IriusRiskTestChallenge
emoji: 🚀
colorFrom: green
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
short_description: LLM backend for IriusRisk Tech challenge
---
# IriusRisk test challenge

This project implements a FastAPI service that uses LangChain and LangGraph to generate text with the `SmolLM2-1.7B-Instruct` model from HuggingFace. I chose this model so that I could deploy it on a free CPU-only backend from Hugging Face for this test. The API protects against abuse with API Key authentication and rate limiting.

## API URLs

- **Production**: `https://maximofn-iriusrisktestchallenge.hf.space`
- **Local Development**: `http://localhost:7860`

## Main Features

- 🤖 Text generation using SmolLM2-1.7B-Instruct
- 📝 Text summarization capabilities
- 🔑 API Key authentication
- ⚡ Rate limiting for abuse protection
- 🔄 Conversation thread support
- 📚 Interactive documentation with Swagger and ReDoc

## Configuration

### Environment Variables

For local deployment, create a `.env` file in the project root with the following variables:

```env
API_KEY="your_secret_api_key"
```
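
As a rough illustration of how the application side might read this variable, the sketch below looks up `API_KEY` in the environment and fails early when it is missing. The helper name `load_api_key` is hypothetical and not taken from the project. The quote stripping is a defensive assumption, since some tools may pass the surrounding quotes from a `.env` file through literally.

```python
import os


def load_api_key(env=None, var_name="API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    `load_api_key` is a hypothetical helper, not part of the project.
    """
    env = os.environ if env is None else env
    # Drop whitespace and any surrounding double quotes left over from .env
    value = env.get(var_name, "").strip().strip('"')
    if not value:
        raise RuntimeError(f"{var_name} is not set; add it to .env or the Space secrets")
    return value
```

Failing at startup, rather than on the first authenticated request, makes a missing secret easier to spot in the container logs.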

## Deployment

### In HuggingFace Spaces

This project is designed to run in HuggingFace Spaces. To configure it:

1. Create a new Space in HuggingFace using the blank Docker SDK template
2. Add all the project files to the Space
3. Set `API_KEY` in the Space's secrets

### Local Deployment

For local deployment without Docker:

1. Clone this repository
2. Create the `.env` file with your `API_KEY`
3. Install the dependencies:

```bash
pip install -r requirements.txt
```

### Local Docker Deployment

For local Docker deployment:

1. Clone the repository
2. Create the `.env` file with your `API_KEY`
3. Build the Docker image:

```bash
docker build -t iriusrisk-test-challenge .
```

4. Run the Docker container. The port mapping assumes the container listens on 7860, the default for Docker Spaces and the port used by the local URLs in this document:

```bash
docker run -p 7860:7860 --env-file .env iriusrisk-test-challenge
```

## Local Execution

```bash
uvicorn app:app --reload
```

Uvicorn binds to port 8000 by default, so the API will be available at `http://localhost:8000`. To match the local URLs used in the rest of this document, pass `--port 7860`.

## Local Docker Execution

```bash
docker run -p 7860:7860 --env-file .env iriusrisk-test-challenge
```

Assuming the container listens on port 7860 (the default for Docker Spaces), the API will be available at `http://localhost:7860`.

## Endpoints

### GET `/`

Welcome endpoint that returns a greeting message.

- Rate limit: 10 requests per minute

### POST `/generate`

Endpoint to generate text using the language model.

- Rate limit: 5 requests per minute
- Requires API Key authentication

**Request parameters:**

```json
{
  "query": "Your question here",
  "thread_id": "optional_thread_identifier",
  "system_prompt": "optional_system_prompt"
}
```

### POST `/summarize`

Endpoint to summarize text using the language model.

- Rate limit: 5 requests per minute
- Requires API Key authentication

**Request parameters:**

```json
{
  "text": "Text to summarize",
  "thread_id": "optional_thread_identifier",
  "max_length": 200
}
```
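
On the client side, the two request bodies above could be modeled with small dataclasses so that optional fields are only sent when set. This is a minimal sketch: the class names and the `to_json` helper are hypothetical, and treating `max_length` as optional is an assumption based on the default used in the Python examples, not something the endpoint description states.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GenerateRequest:
    """Body for POST /generate (field names taken from the JSON above)."""
    query: str
    thread_id: Optional[str] = None
    system_prompt: Optional[str] = None


@dataclass
class SummarizeRequest:
    """Body for POST /summarize (max_length assumed optional)."""
    text: str
    thread_id: Optional[str] = None
    max_length: Optional[int] = None


def to_json(request) -> str:
    """Serialize a request, dropping optional fields that were left unset."""
    payload = {key: value for key, value in asdict(request).items() if value is not None}
    return json.dumps(payload)
```

For example, `to_json(GenerateRequest(query="What is FastAPI?"))` produces a body containing only the `query` field, the same body used in the `curl` examples.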

## Authentication

The API uses API Key authentication. You must include your API Key in the `X-API-Key` header of every request to a protected endpoint.

Example:

```bash
# Production
curl -X POST "https://maximofn-iriusrisktestchallenge.hf.space/generate" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?"}'

# Local development
curl -X POST "http://localhost:7860/generate" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?"}'
```
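
To make the server-side behaviour concrete, here is a framework-independent sketch of how an `X-API-Key` value might be checked. It is not the project's actual code: `check_api_key` is a hypothetical helper, and the status codes simply follow the ones listed under Error Handling (401 for a missing key, 403 for an invalid one).

```python
import hmac
from typing import Optional, Tuple


def check_api_key(provided: Optional[str], expected: str) -> Tuple[int, str]:
    """Return (status_code, detail) for the value of the X-API-Key header."""
    if not provided:
        return 401, "API Key not provided"
    # compare_digest takes the same time whether or not the keys share a prefix,
    # which avoids leaking information about the expected key through timing.
    if not hmac.compare_digest(provided.encode(), expected.encode()):
        return 403, "Invalid API Key"
    return 200, "OK"
```

In a FastAPI application, a check like this would typically sit in a dependency that raises an `HTTPException` with the returned status code.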

## Rate Limiting

To protect the API against abuse, the following limits are enforced:

- Endpoint `/`: 10 requests per minute
- Endpoint `/generate`: 5 requests per minute
- Endpoint `/summarize`: 5 requests per minute

When a limit is exceeded, the API returns a 429 (Too Many Requests) error.
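
The limits above can be pictured as a sliding window per client. The sketch below is only an illustration of that idea and makes several assumptions: this document does not say which limiter the project uses, the class name `SlidingWindowLimiter` is hypothetical, and keying the counter by a client identifier such as an IP address is a guess.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without waiting
        self._hits = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def allow(self, client_id):
        """Return True if the request is accepted, False if it should get a 429."""
        now = self.clock()
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

With `SlidingWindowLimiter(5)`, the sixth call to `allow` for the same client within a minute returns `False`, which corresponds to the 429 response for `/generate` and `/summarize`.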

## API Documentation

The interactive API documentation is available at:

- Swagger UI:
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/docs`
  - Local: `http://localhost:7860/docs`
- ReDoc:
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/redoc`
  - Local: `http://localhost:7860/redoc`

## Error Handling

The API includes error handling for the following situations:

- Error 401: API Key not provided
- Error 403: Invalid API Key
- Error 429: Rate limit exceeded
- Error 500: Internal server error

## Code Examples

### Python

Here are some examples of how to use the API with Python:

#### Text Generation

```python
import requests

# API configuration
API_URL = "https://maximofn-iriusrisktestchallenge.hf.space"  # Production URL
# API_URL = "http://localhost:7860"  # Local development URL
API_KEY = "your_api_key"  # Replace with your API key

# Headers for authentication
headers = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}


# Generate text
def generate_text(query, thread_id="default", system_prompt=None):
    url = f"{API_URL}/generate"
    data = {
        "query": query,
        "thread_id": thread_id
    }
    # Add system prompt if provided
    if system_prompt:
        data["system_prompt"] = system_prompt
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["generated_text"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {e}")
        return None


# Example usage
query = "What are the main features of Python?"
result = generate_text(query)
if result:
    print("Response:", result)

# Example with custom thread and system prompt
result = generate_text(
    query="Explain object-oriented programming",
    thread_id="programming_tutorial",
    system_prompt="You are a programming teacher. Explain concepts in simple terms."
)
```

#### Text Summarization

```python
import requests

# Reuses API_URL and headers from the Text Generation example


# Summarize text
def summarize_text(text, max_length=200, thread_id="default"):
    url = f"{API_URL}/summarize"
    data = {
        "text": text,
        "max_length": max_length,
        "thread_id": thread_id
    }
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["summary"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {e}")
        return None


# Example usage
text_to_summarize = """
Python is a high-level, interpreted programming language created by Guido van Rossum
and released in 1991. Python's design philosophy emphasizes code readability with
the use of significant whitespace. Its language constructs and object-oriented
approach aim to help programmers write clear, logical code for small and large-scale projects.
"""
summary = summarize_text(text_to_summarize, max_length=50)
if summary:
    print("Summary:", summary)
```

#### Error Handling Example

```python
import requests

# Reuses API_URL and headers from the Text Generation example


def make_api_request(endpoint, data):
    url = f"{API_URL}/{endpoint}"
    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            print("Rate limit exceeded. Please wait before making more requests.")
        elif response.status_code in (401, 403):
            print("Authentication error. Please check your API key.")
        else:
            print(f"Error {response.status_code}: {response.text}")
        return None
    except requests.exceptions.ConnectionError:
        print("Connection error. Please check if the API server is running.")
    except Exception as e:
        print(f"Unexpected error: {e}")
    # Reached only after an exception was caught
    return None
```

These examples show how to:

- Make requests to different endpoints
- Handle authentication with API keys
- Process successful responses
- Handle various types of errors
- Use optional parameters like `thread_id` and `system_prompt`

Remember to:

- Replace `API_URL` with your actual API endpoint
- Set your API key in the headers
- Handle rate limiting by implementing appropriate delays between requests
- Implement proper error handling for your use case
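
One possible way to add delays between rate-limited requests is to retry with exponential backoff whenever the API answers 429. The helper below is a sketch under stated assumptions: `call_with_backoff` is a hypothetical name, the delays (1 s, 2 s, 4 s) are arbitrary, and it only relies on the response object exposing `status_code`, as `requests` responses do.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute. The last response is returned either way.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Wait 1x, 2x, 4x ... base_delay, but not after the final attempt
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return response
```

With the examples above, this could be used as `call_with_backoff(lambda: requests.post(url, json=data, headers=headers))`. Because the limits are defined per minute, a longer `base_delay` may be more appropriate in practice.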