| title: README | |
| emoji: π | |
| colorFrom: red | |
| colorTo: gray | |
| sdk: static | |
| pinned: false | |
| license: mit | |
| # Airside Labs 🛫 | |
| **Accelerating safe AI adoption in aviation through rigorous evaluation and benchmarking** | |
| ## About Us | |
| Airside Labs is a specialised AI research and development company focused on aviation sector innovation. We help businesses validate AI performance and achieve product-market fit faster through comprehensive testing frameworks and domain-specific benchmarks. | |
| ## Our Mission | |
| To bridge the gap between cutting-edge AI capabilities and safe, reliable deployment for passenger travel and aviation use cases. We believe that proper evaluation is essential before AI systems can be trusted in business environments. | |
| ## Key Projects | |
| ### 🧪 Pre-flight Benchmark | |
| Our flagship aviation AI evaluation framework, accepted into the UK AI Security Institute's collection of evaluations. Pre-flight tests Large Language Models' understanding of aviation operations, safety protocols, and real-world constraints. | |
| - **Open Source**: Available for the entire aviation community | |
| - **Comprehensive**: Covers ICAO standards, airport operations, safety procedures | |
| - **Validated**: Developed with industry experts and regulatory input | |
| - **Evolving**: Continuously updated as AI models advance | |
| ### 🎯 Domain-Specific Evaluations | |
| We create custom benchmarks that go beyond standard metrics to test: | |
| - Real-world operational understanding | |
| - Safety-related understanding and reasoning (not for safety-critical deployment) | |
| - Regulatory compliance | |
| - Edge case handling | |
| ## Why Aviation-Specific AI Evaluation Matters | |
| Aviation AI systems must understand: | |
| - Physical constraints (two aircraft can't occupy the same gate at the same time) | |
| - Regulatory requirements (ICAO, FAA, EASA standards) | |
| - Safety protocols and emergency procedures | |
| - International operational complexity | |
| Generic benchmarks like MMLU miss these critical domain requirements. | |
| ## Resources | |
| - **Website**: [airsidelabs.com](https://airsidelabs.com) | |
| - **Benchmark Details**: [Pre-flight Aviation Benchmark](https://airsidelabs.com/aviation-ai-benchmark/) | |
| - **Working Group**: [Join our AI Aviation Evaluation Community](https://airsidelabs.com/ai-aviation-eval-working-group/) | |
| ## Get Involved | |
| We're building a community of aviation professionals and AI researchers. We'd love to collaborate if you are: | |
| - Developing AI for aviation applications | |
| - Working in airport/airline operations | |
| - Researching AI safety and evaluation | |
| - Building regulatory frameworks | |
| ## Contact | |
| **Alex Brooker** - Founder | |
| Previously VP of R&D at Cirium (RELX), with 15+ years building data and systems for aviation. | |
| Connect with us to discuss AI evaluation, benchmarking needs, or collaborative research opportunities. | |
| --- | |
| *"Better to be on the ground wishing you were in the air than in the air wishing you were on the ground" - This aviation principle guides our approach to AI safety.* |