# app.py – Nano AI backend for a Hugging Face Space
# ✅ NEW endpoint: router.huggingface.co (March 2026 update!)
# ✅ CORS enabled for GitHub Pages
# ✅ Qwen2.5-1.5B-Instruct via the Hugging Face Inference API
# ✅ System prompt: "Du bist Nano AI"
# ✅ Full error handling & logging

from flask import Flask, request, jsonify
from flask_cors import CORS
import requests
import os
import datetime

# ============================================
# APP INIT
# ============================================
app = Flask(__name__)
CORS(app)  # ✅ Important: allows requests from GitHub Pages!

# ============================================
# CONFIG – Qwen2.5-1.5B-Instruct via Inference API
# ============================================
MODEL_ID = "Qwen/Qwen2.5-1.5B-Instruct"

# ✅ NEW ENDPOINT (Hugging Face update, March 2026!)
API_URL = f"https://router.huggingface.co/hf-inference/models/{MODEL_ID}"

# HF token from an environment variable (set it as a Secret in the Space!)
HF_TOKEN = os.getenv("HF_TOKEN", "")
HEADERS = {"Authorization": f"Bearer {HF_TOKEN}"} if HF_TOKEN else {}

print(f"✅ Nano AI Backend ready – Using: {MODEL_ID}")
print(f"🔗 API URL: {API_URL}")
print(f"🔐 HF Token: {'Set' if HF_TOKEN else 'NOT SET - Add as Secret!'}")


# ============================================
# ✅ POST /api/generate – main endpoint
# ============================================
@app.route("/api/generate", methods=["POST"])
def generate():
    """
    AI generate endpoint.
    Expects: {"prompt": "User message", "language": "de"}
    Returns: {"text": "AI response", "status": "success"}
    """
    try:
        # Parse the JSON body
        data = request.get_json()
        if not data:
            return jsonify({"error": "JSON body required", "status": "error"}), 400

        user_prompt = data.get("prompt", "").strip()
        language = data.get("language", "de")

        if not user_prompt:
            return jsonify({"error": "Prompt is required", "status": "error"}), 400

        # Language mapping for the system prompt
        lang_names = {
            "de": "Deutsch", "en": "English", "fr": "Français",
            "es": "Español", "it": "Italiano", "ru": "Русский",
            "tr": "Türkçe", "pl": "Polski", "nl": "Nederlands",
            "pt": "Português",
        }
        lang_name = lang_names.get(language, "Deutsch")

        # ✅ System prompt: "Du bist Nano AI"
        system_prompt = f"""Du bist Nano AI, ein hilfreicher und intelligenter KI-Assistent.
Antworte immer auf {lang_name}.
Sei freundlich, präzise und hilfreich."""

        # Format for Qwen Instruct (ChatML style)
        prompt = f"""<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{user_prompt}<|im_end|>
<|im_start|>assistant
"""

        # Call the Hugging Face Inference API on the new endpoint
        payload = {
            "inputs": prompt,
            "parameters": {
                "max_new_tokens": 512,
                "temperature": 0.7,
                "top_p": 0.9,
                "do_sample": True,
                "return_full_text": False,
                "stop": ["<|im_end|>", "<|endoftext|>"],
            },
        }

        response = requests.post(API_URL, headers=HEADERS, json=payload, timeout=30)

        # Handle model loading (503)
        if response.status_code == 503:
            return jsonify({
                "text": "⏳ Das Model wird gerade geladen. Bitte warte 30 Sekunden und versuche es erneut.",
                "status": "loading",
                "model": MODEL_ID,
            }), 503

        # Handle other errors
        if not response.ok:
            error_msg = response.text[:200] if response.text else "Unknown error"
            return jsonify({
                "error": f"API Error {response.status_code}: {error_msg}",
                "status": "error",
            }), response.status_code

        # Parse the successful response
        result = response.json()

        # Extract the generated text (the Inference API returns a list of dicts)
        if isinstance(result, list) and len(result) > 0 and "generated_text" in result[0]:
            ai_text = result[0]["generated_text"].strip()
        elif isinstance(result, dict) and "generated_text" in result:
            ai_text = result["generated_text"].strip()
        else:
            ai_text = "❌ Unerwartete Antwort vom AI-Server."

        # Clean up the response (remove any remaining special tokens)
        ai_text = ai_text.replace("<|im_end|>", "").replace("<|endoftext|>", "").strip()

        return jsonify({
            "text": ai_text,
            "status": "success",
            "model": MODEL_ID,
            "language": language,
        })

    except requests.exceptions.Timeout:
        return jsonify({
            "error": "⏱️ Timeout: Der AI-Server antwortet zu langsam. Bitte versuche es erneut.",
            "status": "timeout",
        }), 504

    except requests.exceptions.ConnectionError:
        return jsonify({
            "error": "🔌 Verbindungsfehler: Kann den AI-Server nicht erreichen.",
            "status": "connection_error",
        }), 502

    except Exception as e:
        print(f"❌ Unexpected Error: {str(e)}")
        return jsonify({
            "error": f"❌ Interner Fehler: {str(e)}",
            "status": "error",
        }), 500


# ============================================
# ✅ GET /api/generate – info endpoint (for testing)
# ============================================
@app.route("/api/generate", methods=["GET"])
def generate_info():
    """Info endpoint for testing – returns the API status."""
    return jsonify({
        "status": "ok",
        "message": "Nano AI Backend is running",
        "model": MODEL_ID,
        "endpoint": "/api/generate",
        "method": "POST",
        "expected_body": {"prompt": "Your message", "language": "de"},
        "cors_enabled": True,
    })


# ============================================
# ✅ GET /health – health check endpoint
# ============================================
@app.route("/health", methods=["GET"])
def health():
    """Health check endpoint for monitoring."""
    return jsonify({
        "status": "healthy",
        "service": "Nano AI Backend",
        "version": "1.0.0",
        "model": MODEL_ID,
        "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
    })


# ============================================
# ✅ GET / – root endpoint (info page)
# ============================================
@app.route("/", methods=["GET"])
def root():
    """Root endpoint – shows the API info in the browser."""
    return jsonify({
        "name": "Nano AI Backend",
        "version": "1.0.0",
        "description": "AI Chat Backend using Qwen2.5-1.5B-Instruct via Hugging Face Inference API",
        "endpoints": {
            "POST /api/generate": "Send a message to get an AI response",
            "GET /api/generate": "Get API info",
            "GET /health": "Health check",
        },
        "example_request": {
            "url": "/api/generate",
            "method": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": {"prompt": "Hallo, wer bist du?", "language": "de"},
        },
        "github": "https://github.com/thenano-ai/Nano-AI",
        "note": "This backend is designed for use with the Nano AI Chat frontend on GitHub Pages",
    })


# ============================================
# ✅ Error handlers
# ============================================
@app.errorhandler(404)
def not_found(e):
    return jsonify({"error": "Endpoint not found", "status": "404"}), 404


@app.errorhandler(405)
def method_not_allowed(e):
    return jsonify({"error": "Method not allowed", "status": "405"}), 405


@app.errorhandler(500)
def internal_error(e):
    return jsonify({"error": "Internal server error", "status": "500"}), 500


# ============================================
# ✅ App entry point
# ============================================
if __name__ == "__main__":
    # Get the port from the environment (Hugging Face Spaces sets PORT)
    port = int(os.environ.get("PORT", 7860))
    print(f"🚀 Starting Nano AI Backend on port {port}...")
    print(f"🌐 Local URL: http://localhost:{port}")
    print(f"🔗 Health Check: http://localhost:{port}/health")

    # Run the app
    app.run(host="0.0.0.0", port=port, debug=False)
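
# ============================================
# Example request against the POST /api/generate endpoint above.
# This is a sketch: the Space hostname is a placeholder (not taken from
# this file) – substitute your own Space URL.
#
#   curl -X POST https://YOUR-SPACE.hf.space/api/generate \
#        -H "Content-Type: application/json" \
#        -d '{"prompt": "Hallo, wer bist du?", "language": "de"}'
#
# Successful response shape:
#   {"text": "...", "status": "success",
#    "model": "Qwen/Qwen2.5-1.5B-Instruct", "language": "de"}
# ============================================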