---
title: LLM Evaluation Framework
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit
---
# LLM Quantitative Evaluation Framework
A comprehensive tool for comparing and evaluating Large Language Models based on multiple quantitative criteria.
## Features
- **Multi-criteria evaluation**: Performance, cost, speed, reliability, compliance, and integration
- **Interactive weights**: Adjust the importance of each factor based on your use case (see the scoring sketch after this list)
- **Usage scenario modeling**: Input your specific requirements for accurate cost analysis
- **Visual comparisons**: Charts and graphs for easy model comparison
- **Transparent methodology**: Clear scoring algorithms and explanations
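
This kind of multi-criteria ranking typically reduces to a normalized weighted sum. The sketch below illustrates the idea; the criterion names, scores, and function are made up for illustration and are not taken from `app.py`:

```python
# Minimal sketch of a weighted-sum score (hypothetical data and names;
# the app's actual algorithm lives in app.py).
CRITERIA = ["performance", "cost", "speed", "reliability", "compliance", "integration"]

def weighted_score(scores, weights):
    """Combine per-criterion scores in [0, 1] into a single ranking value."""
    total = sum(weights[c] for c in CRITERIA)
    return sum(scores[c] * weights[c] for c in CRITERIA) / total

# Example: one hypothetical model under equal weights.
weights = {c: 1.0 for c in CRITERIA}
model_a = {"performance": 0.9, "cost": 0.4, "speed": 0.7,
           "reliability": 0.8, "compliance": 0.9, "integration": 0.6}
print(f"Model A overall: {weighted_score(model_a, weights):.3f}")  # 0.717
```

Normalizing by the total weight keeps overall scores comparable no matter how you set the sliders.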
## How to Use
1. Adjust the evaluation criteria weights in the sidebar based on your priorities
2. Configure your usage scenario (monthly requests, token usage); a cost-model sketch follows this list
3. Review the ranked results and detailed analysis
4. Use the insights to make informed LLM selection decisions
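
For step 2, a cost estimate generally follows from requests × tokens × price. Here is a minimal sketch under assumed per-1K-token prices; the function and all numbers are illustrative, not the app's actual pricing data:

```python
# Hypothetical cost model: monthly spend from request volume, token counts,
# and per-1K-token prices (all values illustrative).
def monthly_cost(requests, input_tokens, output_tokens,
                 price_in_per_1k, price_out_per_1k):
    """Estimate monthly USD cost for one model under a usage scenario."""
    per_request = (input_tokens / 1000 * price_in_per_1k
                   + output_tokens / 1000 * price_out_per_1k)
    return requests * per_request

# Example: 50K requests/month, 500 input + 300 output tokens per request,
# at illustrative prices of $0.01 (input) and $0.03 (output) per 1K tokens.
print(f"${monthly_cost(50_000, 500, 300, 0.01, 0.03):,.2f}")  # $700.00
```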
Built with Streamlit and deployed on Hugging Face Spaces.
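
For a sense of how sidebar controls like these are typically wired in Streamlit, here is a minimal sketch; widget labels, ranges, and defaults are assumptions for illustration, and the deployed app's actual code is in `app.py`:

```python
# Illustrative Streamlit sidebar wiring for weight and scenario inputs
# (labels, ranges, and defaults are assumed, not copied from app.py).
import streamlit as st

weights = {}
with st.sidebar:
    st.header("Evaluation weights")
    for criterion in ["Performance", "Cost", "Speed",
                      "Reliability", "Compliance", "Integration"]:
        weights[criterion] = st.slider(criterion, 0.0, 1.0, 0.5, 0.05)

    st.header("Usage scenario")
    monthly_requests = st.number_input("Monthly requests", min_value=0, value=50_000)
    avg_tokens = st.number_input("Avg tokens per request", min_value=0, value=800)
```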