---
title: Storm Recovery Agent ✈️
emoji: ⛈️
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
tags:
  - openenv
  - simulation
  - logistics
  - llama-3
  - ai-agent
---

# ✈️ Storm Recovery Agent: Fine-Tuning LLMs for High-Stakes Logistics

> "The storm just cancelled 40 flights. You have 2,000 stranded passengers and only 500 available seats. Who gets home first?"

This is the **Flight Rebooking OpenEnv**, a professional simulation designed to train AI agents to handle the complex, high-stakes trade-offs of airline irregular operations (IROPS).

## 🌟 The Challenge (Theme #3.1: Professional Tasks)
When weather strikes, human operation desks must balance:
- **Loyalty SLAs**: Ensuring Platinum and Gold members are prioritized.
- **Connection Deadlines**: Rebooking passengers before their next vital flight.
- **Budget Limits**: Deciding when to use expensive partner airlines or hotels.
- **Inventory Scarcity**: Making every seat count in a zero-sum game.

Generic LLMs often struggle with these "constrained optimization" tasks. This environment provides the structured feedback needed to turn a raw LLM into a **Disruption Specialist**.
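To make the trade-offs above concrete, here is a minimal sketch of a greedy triage rule of the kind a rule-based desk might apply. It is not the environment's actual baseline: the `Passenger` and `Flight` classes, the `TIER_RANK` ordering, and the `greedy_rebook` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical tier ranking (assumption): lower number = higher priority.
TIER_RANK = {"platinum": 0, "gold": 1, "silver": 2, "standard": 3}


@dataclass
class Passenger:
    name: str
    tier: str
    connection_deadline: int  # latest acceptable arrival, minutes from now


@dataclass
class Flight:
    flight_id: str
    arrival: int   # arrival time, minutes from now
    seats: int     # seats still available
    cost: float    # per-seat cost; partner airlines cost more


def greedy_rebook(passengers, flights, budget):
    """Serve passengers by tier, then tightest deadline, on the cheapest feasible flight."""
    remaining = {f.flight_id: f.seats for f in flights}
    assignments = {}
    queue = sorted(passengers, key=lambda p: (TIER_RANK[p.tier], p.connection_deadline))
    for p in queue:
        # A flight is feasible if it has a seat, meets the deadline, and fits the budget.
        options = [
            f for f in flights
            if remaining[f.flight_id] > 0
            and f.arrival <= p.connection_deadline
            and f.cost <= budget
        ]
        if not options:
            assignments[p.name] = None  # passenger stays stranded
            continue
        best = min(options, key=lambda f: (f.cost, f.arrival))
        remaining[best.flight_id] -= 1
        budget -= best.cost
        assignments[p.name] = best.flight_id
    return assignments, budget


passengers = [
    Passenger("Asha", "standard", 300),
    Passenger("Ben", "platinum", 200),
    Passenger("Chen", "gold", 400),
]
flights = [Flight("OWN1", 180, 1, 0.0), Flight("PARTNER9", 190, 1, 250.0)]
assignments, left = greedy_rebook(passengers, flights, budget=300.0)
print(assignments, left)
```

With two seats for three passengers, the lowest-tier passenger is left unassigned. A purely greedy rule like this cannot look ahead, which is the kind of limitation the "Hard" scenarios are meant to expose.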

## 🧠 The Solution: Fine-Tuned Llama 3 8B
We didn't just build a simulator; we trained an agent to master it.
- **Base Model**: Meta Llama-3-8B-Instruct.
- **Training**: Fine-tuned on **800+ expert trajectories** using LoRA (Unsloth).
- **Strategy**: The agent learned to prioritize by tier while simultaneously minimizing cost and connection delays.

## 📊 Evidence of Training (20% Weight)

### 📈 Training Progress
Our agent showed consistent improvement across all metrics. By epoch 3, it had learned to balance passenger satisfaction against operational cost.

![Training Progress](artifacts/training_progress.png)

### 🏆 Performance Comparison
The trained AI Agent matches the rule-based heuristic baseline on Easy tasks and outperforms it on Medium and **"Hard" scenarios**, where inventory is extremely scarce and strategic "triage" decisions are required.

![Performance Comparison](artifacts/performance_comparison.png)

| Task | Heuristic Baseline | **Trained AI Agent** |
|------|--------------------|----------------------|
| Easy | 1.000 | **1.000** |
| Medium | 0.972 | **0.990** (+1.9%) |
| Hard | 0.958 | **0.980** (+2.3%) |
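
The percentage gains in the table are relative to the heuristic baseline. As a quick check, the snippet below recomputes them from the scores shown above; the `relative_gain` helper is only an illustrative name, not part of the repository.

```python
# Scores copied from the table: task -> (heuristic baseline, trained agent).
scores = {
    "Easy": (1.000, 1.000),
    "Medium": (0.972, 0.990),
    "Hard": (0.958, 0.980),
}


def relative_gain(baseline, trained):
    """Improvement of the trained agent over the baseline, in percent."""
    return (trained - baseline) / baseline * 100.0


for task, (baseline, trained) in scores.items():
    print(f"{task}: {trained - baseline:+.3f} absolute, "
          f"{relative_gain(baseline, trained):+.1f}% relative")
```

The Medium gain works out to roughly 2% and the Hard gain to about 2.3%, while Easy is a tie.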

## 🕹️ Interactive Control Tower
Explore the agent's behavior live on our **Hugging Face Space**!
- **Live Observation**: Watch the passenger queue and flight inventory update in real time.
- **AI Auto-Play**: Watch the fine-tuned Llama 3 model solve disruptions autonomously.
- **Manual Control**: Test your own rebooking skills against the AI.

[**Launch the Control Tower UI**](https://huggingface.co/spaces/YOUR_USER/flight-rebooking-agent/ui)

## 🏗️ Technical Foundation
- **Framework**: Built on **OpenEnv** for standard RL/LLM interaction.
- **Backend**: FastAPI server, with the model loaded in 4-bit quantization (bitsandbytes) for efficient inference.
- **Frontend**: Vanilla JS dashboard for real-time state visualization.
- **Deployment**: Fully containerized with Docker for seamless HF Space integration.
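
For readers new to the reset/step interaction pattern that RL-style environments follow, here is a minimal, self-contained sketch. It is a toy stand-in, not the OpenEnv API or this project's environment: `ToyRebookingEnv`, `RebookingState`, the action format, and the reward values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RebookingState:
    queue: list          # passenger names still waiting
    seats: dict          # flight_id -> remaining seats
    done: bool = False


class ToyRebookingEnv:
    """Hypothetical toy environment; the real one exposes richer observations."""

    def __init__(self, passengers, seats):
        self._passengers = list(passengers)
        self._seats = dict(seats)
        self.state = None

    def reset(self):
        """Start a fresh episode and return the initial state."""
        self.state = RebookingState(queue=list(self._passengers), seats=dict(self._seats))
        return self.state

    def step(self, action):
        """Apply action = (passenger, flight_id); return (state, reward, done)."""
        passenger, flight_id = action
        s = self.state
        valid = passenger in s.queue and s.seats.get(flight_id, 0) > 0
        if valid:
            s.queue.remove(passenger)
            s.seats[flight_id] -= 1
        reward = 1.0 if valid else -1.0  # assumed reward shape
        # Episode ends when nobody is waiting or no seats remain.
        s.done = not s.queue or all(n == 0 for n in s.seats.values())
        return s, reward, s.done


env = ToyRebookingEnv(["Ben", "Chen", "Asha"], {"OWN1": 2})
state = env.reset()
total = 0.0
while not state.done:
    action = (state.queue[0], "OWN1")  # trivial policy: first in queue, only flight
    state, reward, done = env.step(action)
    total += reward
print(total, state.queue)
```

In the actual project, the policy choosing each action would be the fine-tuned model or a human using the dashboard rather than the trivial rule used here.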

## 🛠️ Local Setup & Evaluation
```bash
# Install dependencies
pip install -r requirements.txt

# Run the OpenEnv Validator
python pre_submission_validate.py --skip-docker

# Start the Control Tower locally
python app.py
```

---
*Developed for the Meta PyTorch Hackathon (India 2026).*