flight-rebooking / README.md
dhnkhr's picture
Production-ready: Clean code with Groq API integration, LoRA model support, and FastAPI app
9753ee2
---
title: Storm Recovery Agent ✈️
emoji: ⛈️
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
tags:
- openenv
- simulation
- logistics
- llama-3
- ai-agent
---
# ✈️ Storm Recovery Agent: Fine-Tuning LLMs for High-Stakes Logistics
> "The storm just cancelled 40 flights. You have 2,000 stranded passengers and only 500 available seats. Who gets home first?"
This is the **Flight Rebooking OpenEnv**, a professional simulation designed to train AI agents to handle the complex, high-stakes trade-offs of airline irregular operations (IROPS).
## 🌟 The Challenge (Theme #3.1: Professional Tasks)
When weather strikes, human operation desks must balance:
- **Loyalty SLAs**: Ensuring Platinum and Gold members are prioritized.
- **Connection Deadlines**: Rebooking passengers before their next vital flight.
- **Budget Limits**: Deciding when to use expensive partner airlines or hotels.
- **Inventory Scarcity**: Making every seat count in a zero-sum game.
Generic LLMs often struggle with these "constrained optimization" tasks. This environment provides the structured feedback needed to turn a raw LLM into a **Disruption Specialist**.
## 🧠 The Solution: Fine-Tuned Llama 3 8B
We didn't just build a simulator; we trained an agent to master it.
- **Base Model**: Meta Llama-3-8B-Instruct.
- **Training**: Fine-tuned on **800+ expert trajectories** using LoRA (Unsloth).
- **Strategy**: The agent learned to prioritize by tier while simultaneously minimizing cost and connection delays.
## 📊 Evidence of Training (20% Weight)
### 📈 Training Progress
Our agent showed consistent improvement across all metrics. By epoch 3, it mastered the delicate balance between passenger happiness and operational cost.
![Training Progress](artifacts/training_progress.png)
### 🏆 Performance Comparison
The trained AI Agent now outperforms standard rule-based heuristics, especially in **"Hard" scenarios** where inventory is extremely scarce and requires strategic "triage" decisions.
![Performance Comparison](artifacts/performance_comparison.png)
| Task | Heuristic Baseline | **Trained AI Agent** |
|------|--------------------|----------------------|
| Easy | 1.000 | **1.000** |
| Medium | 0.972 | **0.990** (+2%) |
| Hard | 0.958 | **0.980** (+2.3%) |
## 🕹️ Interactive Control Tower
Explore the agent's behavior live on our **Hugging Face Space**!
- **Live Observation**: Watch the passenger queue and flight inventory update in real-time.
- **AI Auto-Play**: Watch the fine-tuned Llama 3 model solve disruptions autonomously.
- **Manual Control**: Test your own rebooking skills against the AI.
[**Launch the Control Tower UI**](https://huggingface.co/spaces/YOUR_USER/flight-rebooking-agent/ui)
## 🏗️ Technical Foundation
- **Framework**: Built on **OpenEnv** for standard RL/LLM interaction.
- **Backend**: FastAPI with 4-bit quantization (bitsandbytes) for efficient inference.
- **Frontend**: Vanilla JS dashboard for real-time state visualization.
- **Deployment**: Fully containerized with Docker for seamless HF Space integration.
## 🛠️ Local Setup & Evaluation
```bash
# Install dependencies
pip install -r requirements.txt
# Run the OpenEnv Validator
python pre_submission_validate.py --skip-docker
# Start the Control Tower locally
python app.py
```
---
*Developed for the Meta PyTorch Hackathon (India 2026).*