flight-rebooking / README.md
dhnkhr's picture
Production-ready: Clean code with Groq API integration, LoRA model support, and FastAPI app
9753ee2
metadata
title: Storm Recovery Agent ✈️
emoji: ⛈️
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
tags:
  - openenv
  - simulation
  - logistics
  - llama-3
  - ai-agent

✈️ Storm Recovery Agent: Fine-Tuning LLMs for High-Stakes Logistics

"The storm just cancelled 40 flights. You have 2,000 stranded passengers and only 500 available seats. Who gets home first?"

This is the Flight Rebooking OpenEnv, a professional simulation designed to train AI agents to handle the complex, high-stakes trade-offs of airline irregular operations (IROPS).

🌟 The Challenge (Theme #3.1: Professional Tasks)

When weather strikes, human operation desks must balance:

  • Loyalty SLAs: Ensuring Platinum and Gold members are prioritized.
  • Connection Deadlines: Rebooking passengers before their next vital flight.
  • Budget Limits: Deciding when to use expensive partner airlines or hotels.
  • Inventory Scarcity: Making every seat count in a zero-sum game.

Generic LLMs often struggle with these "constrained optimization" tasks. This environment provides the structured feedback needed to turn a raw LLM into a Disruption Specialist.

🧠 The Solution: Fine-Tuned Llama 3 8B

We didn't just build a simulator; we trained an agent to master it.

  • Base Model: Meta Llama-3-8B-Instruct.
  • Training: Fine-tuned on 800+ expert trajectories using LoRA (Unsloth).
  • Strategy: The agent learned to prioritize by tier while simultaneously minimizing cost and connection delays.

📊 Evidence of Training (20% Weight)

📈 Training Progress

Our agent showed consistent improvement across all metrics. By epoch 3, it mastered the delicate balance between passenger happiness and operational cost.

Training Progress

🏆 Performance Comparison

The trained AI Agent now outperforms standard rule-based heuristics, especially in "Hard" scenarios where inventory is extremely scarce and requires strategic "triage" decisions.

Performance Comparison

Task Heuristic Baseline Trained AI Agent
Easy 1.000 1.000
Medium 0.972 0.990 (+2%)
Hard 0.958 0.980 (+2.3%)

🕹️ Interactive Control Tower

Explore the agent's behavior live on our Hugging Face Space!

  • Live Observation: Watch the passenger queue and flight inventory update in real-time.
  • AI Auto-Play: Watch the fine-tuned Llama 3 model solve disruptions autonomously.
  • Manual Control: Test your own rebooking skills against the AI.

Launch the Control Tower UI

🏗️ Technical Foundation

  • Framework: Built on OpenEnv for standard RL/LLM interaction.
  • Backend: FastAPI with 4-bit quantization (bitsandbytes) for efficient inference.
  • Frontend: Vanilla JS dashboard for real-time state visualization.
  • Deployment: Fully containerized with Docker for seamless HF Space integration.

🛠️ Local Setup & Evaluation

# Install dependencies
pip install -r requirements.txt

# Run the OpenEnv Validator
python pre_submission_validate.py --skip-docker

# Start the Control Tower locally
python app.py

Developed for the Meta PyTorch Hackathon (India 2026).