Spaces:
Build error
Build error
File size: 3,394 Bytes
9753ee2 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 | ---
title: Storm Recovery Agent ✈️
emoji: ⛈️
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
tags:
- openenv
- simulation
- logistics
- llama-3
- ai-agent
---
# ✈️ Storm Recovery Agent: Fine-Tuning LLMs for High-Stakes Logistics
> "The storm just cancelled 40 flights. You have 2,000 stranded passengers and only 500 available seats. Who gets home first?"
This is the **Flight Rebooking OpenEnv**, a professional simulation designed to train AI agents to handle the complex, high-stakes trade-offs of airline irregular operations (IROPS).
## 🌟 The Challenge (Theme #3.1: Professional Tasks)
When weather strikes, human operation desks must balance:
- **Loyalty SLAs**: Ensuring Platinum and Gold members are prioritized.
- **Connection Deadlines**: Rebooking passengers before their next vital flight.
- **Budget Limits**: Deciding when to use expensive partner airlines or hotels.
- **Inventory Scarcity**: Making every seat count in a zero-sum game.
Generic LLMs often struggle with these "constrained optimization" tasks. This environment provides the structured feedback needed to turn a raw LLM into a **Disruption Specialist**.
## 🧠 The Solution: Fine-Tuned Llama 3 8B
We didn't just build a simulator; we trained an agent to master it.
- **Base Model**: Meta Llama-3-8B-Instruct.
- **Training**: Fine-tuned on **800+ expert trajectories** using LoRA (Unsloth).
- **Strategy**: The agent learned to prioritize by tier while simultaneously minimizing cost and connection delays.
## 📊 Evidence of Training (20% Weight)
### 📈 Training Progress
Our agent showed consistent improvement across all metrics. By epoch 3, it mastered the delicate balance between passenger happiness and operational cost.

### 🏆 Performance Comparison
The trained AI Agent now outperforms standard rule-based heuristics, especially in **"Hard" scenarios** where inventory is extremely scarce and requires strategic "triage" decisions.

| Task | Heuristic Baseline | **Trained AI Agent** |
|------|--------------------|----------------------|
| Easy | 1.000 | **1.000** |
| Medium | 0.972 | **0.990** (+2%) |
| Hard | 0.958 | **0.980** (+2.3%) |
## 🕹️ Interactive Control Tower
Explore the agent's behavior live on our **Hugging Face Space**!
- **Live Observation**: Watch the passenger queue and flight inventory update in real-time.
- **AI Auto-Play**: Watch the fine-tuned Llama 3 model solve disruptions autonomously.
- **Manual Control**: Test your own rebooking skills against the AI.
[**Launch the Control Tower UI**](https://huggingface.co/spaces/YOUR_USER/flight-rebooking-agent/ui)
## 🏗️ Technical Foundation
- **Framework**: Built on **OpenEnv** for standard RL/LLM interaction.
- **Backend**: FastAPI with 4-bit quantization (bitsandbytes) for efficient inference.
- **Frontend**: Vanilla JS dashboard for real-time state visualization.
- **Deployment**: Fully containerized with Docker for seamless HF Space integration.
## 🛠️ Local Setup & Evaluation
```bash
# Install dependencies
pip install -r requirements.txt
# Run the OpenEnv Validator
python pre_submission_validate.py --skip-docker
# Start the Control Tower locally
python app.py
```
---
*Developed for the Meta PyTorch Hackathon (India 2026).*
|