---
title: IriusRiskTestChallenge
emoji: 🚀
colorFrom: green
colorTo: indigo
sdk: docker
pinned: false
license: apache-2.0
short_description: LLM backend for IriusRisk Tech challenge
---

# IriusRisk test challenge

This project implements a FastAPI service that uses LangChain and LangGraph to generate text with the `SmolLM2-1.7B-Instruct` model from HuggingFace. I chose this model so that I could deploy it on a free CPU-only backend from Hugging Face for this test.

The API includes security features such as API Key authentication and rate limiting to protect against abuse.

## API URLs

- **Production**: `https://maximofn-iriusrisktestchallenge.hf.space`
- **Local Development**: `http://localhost:7860`

## Main Features

- 🤖 Text generation using SmolLM2-1.7B-Instruct
- 📝 Text summarization capabilities
- 🔑 API Key authentication
- ⚡ Rate limiting for abuse protection
- 🔄 Conversation thread support
- 📚 Interactive documentation with Swagger and ReDoc

## Configuration

### Environment Variables

For local deployment, create a `.env` file in the project root with the following variables:

```env
API_KEY="your_secret_api_key"
```

## Deployment

### In HuggingFace Spaces

This project is designed to run in HuggingFace Spaces. To configure it:

1. Create a new Space in HuggingFace using the blank Docker SDK
2. Add all the files to the Space
3. Configure the `API_KEY` in the Space's environment secrets

### Local Deployment

For local deployment:

1. Clone this repository
2. Create the `.env` file with your `API_KEY`
3. Install the dependencies:

```bash
pip install -r requirements.txt
```

### Local Docker Deployment

For local Docker deployment:

1. Clone the repository
2. Create the `.env` file with your `API_KEY`
3. Build the Docker image:

```bash
docker build -t iriusrisk-test-challenge .
```

4. Run the Docker container:

```bash
docker run -p 8000:8000 --env-file .env iriusrisk-test-challenge
```

## Local Execution

```bash
uvicorn app:app --reload
```

The API will be available at `http://localhost:8000`.

## Local Docker Execution

```bash
docker run -p 8000:8000 --env-file .env iriusrisk-test-challenge
```

The API will be available at `http://localhost:8000`.

## Endpoints

### GET `/`

Welcome endpoint that returns a greeting message.

- Rate limit: 10 requests per minute

### POST `/generate`

Endpoint to generate text using the language model.

- Rate limit: 5 requests per minute
- Requires API Key authentication

**Request parameters:**

```json
{
  "query": "Your question here",
  "thread_id": "optional_thread_identifier",
  "system_prompt": "optional_system_prompt"
}
```

### POST `/summarize`

Endpoint to summarize text using the language model.

- Rate limit: 5 requests per minute
- Requires API Key authentication

**Request parameters:**

```json
{
  "text": "Text to summarize",
  "thread_id": "optional_thread_identifier",
  "max_length": 200
}
```

## Authentication

The API uses API Key authentication. Include your API Key in the `X-API-Key` header of every request to a protected endpoint.

Example:

```bash
# Production
curl -X POST "https://maximofn-iriusrisktestchallenge.hf.space/generate" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?"}'

# Local development
curl -X POST "http://localhost:7860/generate" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is FastAPI?"}'
```

## Rate Limiting

To protect the API against abuse, the following limits are enforced:

- Endpoint `/`: 10 requests per minute
- Endpoint `/generate`: 5 requests per minute
- Endpoint `/summarize`: 5 requests per minute

When these limits are exceeded, the API returns a 429 (Too Many Requests) error.
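
A client can react to a 429 response by waiting and retrying. The following is a minimal sketch and not part of this project: `post_with_retry`, its parameters, and the doubling delay schedule are illustrative assumptions. `send` stands for any zero-argument callable that performs the request and returns an object with a `status_code` attribute, such as a `requests.Response`.

```python
import time


def post_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out.

    Hypothetical helper: the delay doubles after each 429 response
    (base_delay, 2 * base_delay, 4 * base_delay, ...).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Rate limited: wait before the next attempt, unless this was the last one.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    # Every attempt was rate limited; hand the last 429 response back to the caller.
    return response
```

With `requests`, this could be called as `post_with_retry(lambda: requests.post(url, json=data, headers=headers))`. Because the limits are counted per minute, a `base_delay` of several seconds is probably more useful in practice than the default shown here.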
## API Documentation

The interactive API documentation is available at:

- Swagger UI:
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/docs`
  - Local: `http://localhost:7860/docs`
- ReDoc:
  - Production: `https://maximofn-iriusrisktestchallenge.hf.space/redoc`
  - Local: `http://localhost:7860/redoc`

## Error Handling

The API includes error handling for the following situations:

- Error 401: API Key not provided
- Error 403: Invalid API Key
- Error 429: Rate limit exceeded
- Error 500: Internal server error

## Code Examples

### Python

Here are some examples of how to use the API with Python:

#### Text Generation

```python
import requests

# API configuration
API_URL = "https://maximofn-iriusrisktestchallenge.hf.space"  # Production URL
# API_URL = "http://localhost:7860"  # Local development URL
API_KEY = "your_api_key"  # Replace with your API key

# Headers for authentication
headers = {
    "X-API-Key": API_KEY,
    "Content-Type": "application/json"
}

# Generate text
def generate_text(query, thread_id="default", system_prompt=None):
    url = f"{API_URL}/generate"

    data = {
        "query": query,
        "thread_id": thread_id
    }

    # Add system prompt if provided
    if system_prompt:
        data["system_prompt"] = system_prompt

    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["generated_text"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
query = "What are the main features of Python?"
result = generate_text(query)
if result:
    print("Response:", result)

# Example with custom thread and system prompt
result = generate_text(
    query="Explain object-oriented programming",
    thread_id="programming_tutorial",
    system_prompt="You are a programming teacher. Explain concepts in simple terms."
)
```

#### Text Summarization

```python
import requests

# Reuses API_URL and headers from the Text Generation example

# Summarize text
def summarize_text(text, max_length=200, thread_id="default"):
    url = f"{API_URL}/summarize"

    data = {
        "text": text,
        "max_length": max_length,
        "thread_id": thread_id
    }

    try:
        response = requests.post(url, json=data, headers=headers)
        if response.status_code == 200:
            result = response.json()
            return result["summary"]
        else:
            print(f"Error: {response.status_code}")
            print(f"Details: {response.text}")
            return None
    except Exception as e:
        print(f"Request failed: {str(e)}")
        return None

# Example usage
text_to_summarize = """
Python is a high-level, interpreted programming language created by Guido van Rossum
and released in 1991. Python's design philosophy emphasizes code readability with the
use of significant whitespace. Its language constructs and object-oriented approach
aim to help programmers write clear, logical code for small and large-scale projects.
"""

summary = summarize_text(text_to_summarize, max_length=50)
if summary:
    print("Summary:", summary)
```

#### Error Handling Example

```python
import requests

# Reuses API_URL and headers from the Text Generation example

def make_api_request(endpoint, data):
    url = f"{API_URL}/{endpoint}"

    try:
        response = requests.post(url, json=data, headers=headers)

        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            print("Rate limit exceeded. Please wait before making more requests.")
        elif response.status_code in (401, 403):
            print("Authentication error. Please check your API key.")
        else:
            print(f"Error {response.status_code}: {response.text}")

        return None

    except requests.exceptions.ConnectionError:
        print("Connection error. Please check if the API server is running.")
    except Exception as e:
        print(f"Unexpected error: {str(e)}")

    return None
```

These examples show how to:

- Make requests to different endpoints
- Handle authentication with API keys
- Process successful responses
- Handle various types of errors
- Use optional parameters like `thread_id` and `system_prompt`

Remember to:

- Replace `API_URL` with your actual API endpoint
- Set your API key in the headers
- Handle rate limiting by implementing appropriate delays between requests
- Implement proper error handling for your use case
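
To avoid hard-coding the key in client scripts, the headers can be built from an environment variable. This is a sketch under assumptions: `build_headers` is a hypothetical helper, and reusing the `API_KEY` variable name from the `.env` file on the client side is a convenience, not something the API requires.

```python
import os


def build_headers(env=None):
    """Return request headers with the API key read from the environment.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    by a plain dict, which makes the function easy to test.
    """
    env = os.environ if env is None else env
    api_key = env.get("API_KEY", "").strip()
    if not api_key:
        # Fail early on the client instead of sending a request without a key.
        raise RuntimeError("API_KEY is not set")
    return {"X-API-Key": api_key, "Content-Type": "application/json"}
```

The returned dictionary can be passed as the `headers` argument of `requests.post`, in place of a hard-coded one.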