---
title: Storm Recovery Agent ✈️
emoji: ⛈️
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
tags:
  - openenv
  - simulation
  - logistics
  - llama-3
  - ai-agent
---

# ✈️ Storm Recovery Agent: Fine-Tuning LLMs for High-Stakes Logistics

> "The storm just cancelled 40 flights. You have 2,000 stranded passengers and only 500 available seats. Who gets home first?"

This is the **Flight Rebooking OpenEnv**, a professional simulation designed to train AI agents to handle the complex, high-stakes trade-offs of airline irregular operations (IROPS).

## 🌟 The Challenge (Theme #3.1: Professional Tasks)

When weather strikes, human operation desks must balance:

- **Loyalty SLAs**: Ensuring Platinum and Gold members are prioritized.
- **Connection Deadlines**: Rebooking passengers before their next vital flight.
- **Budget Limits**: Deciding when to use expensive partner airlines or hotels.
- **Inventory Scarcity**: Making every seat count in a zero-sum game.

Generic LLMs often struggle with these "constrained optimization" tasks. This environment provides the structured feedback needed to turn a raw LLM into a **Disruption Specialist**.

## 🧠 The Solution: Fine-Tuned Llama 3 8B

We didn't just build a simulator; we trained an agent to master it.

- **Base Model**: Meta Llama-3-8B-Instruct.
- **Training**: Fine-tuned on **800+ expert trajectories** using LoRA (Unsloth).
- **Strategy**: The agent learned to prioritize by tier while simultaneously minimizing cost and connection delays.

## 📊 Evidence of Training (20% Weight)

### 📈 Training Progress

Our agent showed consistent improvement across all metrics. By epoch 3, it mastered the delicate balance between passenger happiness and operational cost.

![Training Progress](artifacts/training_progress.png)

### 🏆 Performance Comparison

The trained AI Agent now outperforms standard rule-based heuristics, especially in **"Hard" scenarios** where inventory is extremely scarce and requires strategic "triage" decisions.

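
To make the idea of a rule-based triage heuristic concrete, here is a minimal sketch of a tier-first greedy seat allocation. It is an illustration only, not the baseline shipped in this repository: the `Passenger` fields, the `TIER_RANK` table (including the `silver` and `standard` tiers), and the `triage` function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical tier ranking (assumption): lower number = higher priority.
# Only Platinum and Gold are named in this README; the rest are placeholders.
TIER_RANK = {"platinum": 0, "gold": 1, "silver": 2, "standard": 3}


@dataclass
class Passenger:
    name: str
    tier: str
    connection_deadline: int  # minutes until the next flight (hypothetical field)


def triage(passengers, seats_available):
    """Greedy allocation: sort by tier, then by tightest connection deadline.

    Returns (seated, waiting). A sketch of a simple heuristic, not the
    environment's actual scoring or baseline logic.
    """
    ordered = sorted(
        passengers,
        key=lambda p: (TIER_RANK[p.tier], p.connection_deadline),
    )
    return ordered[:seats_available], ordered[seats_available:]


if __name__ == "__main__":
    queue = [
        Passenger("A", "standard", 30),
        Passenger("B", "platinum", 120),
        Passenger("C", "gold", 45),
        Passenger("D", "platinum", 60),
    ]
    seated, waiting = triage(queue, seats_available=2)
    print("seated:", [p.name for p in seated])
    print("waiting:", [p.name for p in waiting])
```

A strict tier-first rule like this ignores cost and can strand a lower-tier passenger with a very tight connection, which is the kind of trade-off the "Hard" scenarios are meant to probe.
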
![Performance Comparison](artifacts/performance_comparison.png)

| Task | Heuristic Baseline | **Trained AI Agent** |
|------|--------------------|----------------------|
| Easy | 1.000 | **1.000** |
| Medium | 0.972 | **0.990** (+2%) |
| Hard | 0.958 | **0.980** (+2.3%) |

## 🕹️ Interactive Control Tower

Explore the agent's behavior live on our **Hugging Face Space**!

- **Live Observation**: Watch the passenger queue and flight inventory update in real-time.
- **AI Auto-Play**: Watch the fine-tuned Llama 3 model solve disruptions autonomously.
- **Manual Control**: Test your own rebooking skills against the AI.

[**Launch the Control Tower UI**](https://huggingface.co/spaces/YOUR_USER/flight-rebooking-agent/ui)

## 🏗️ Technical Foundation

- **Framework**: Built on **OpenEnv** for standard RL/LLM interaction.
- **Backend**: FastAPI with 4-bit quantization (bitsandbytes) for efficient inference.
- **Frontend**: Vanilla JS dashboard for real-time state visualization.
- **Deployment**: Fully containerized with Docker for seamless HF Space integration.

## 🛠️ Local Setup & Evaluation

```bash
# Install dependencies
pip install -r requirements.txt

# Run the OpenEnv Validator
python pre_submission_validate.py --skip-docker

# Start the Control Tower locally
python app.py
```

---

*Developed for the Meta PyTorch Hackathon (India 2026).*