---
title: LLM Evaluation Framework
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit
---
# LLM Quantitative Evaluation Framework

A comprehensive tool for comparing and evaluating Large Language Models based on multiple quantitative criteria.
## Features

- **Multi-criteria evaluation**: Performance, cost, speed, reliability, compliance, and integration (see the scoring sketch after this list)
- **Interactive weights**: Adjust the importance of each factor based on your use case
- **Usage scenario modeling**: Input your specific requirements for accurate cost analysis
- **Visual comparisons**: Charts and graphs for easy model comparison
- **Transparent methodology**: Clear scoring algorithms and explanations
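The exact scoring logic lives in `app.py`; as a minimal sketch of the general weighted multi-criteria approach, assuming each model has per-criterion scores normalized to the 0–1 range and the sidebar weights sum to 1 (all model names and numbers below are hypothetical placeholders):

```python
# Sketch of weighted multi-criteria scoring. Data here is illustrative;
# the real criterion values and algorithm are defined in app.py.
CRITERIA = ["performance", "cost", "speed", "reliability", "compliance", "integration"]

# Example per-model scores, each normalized to 0-1 (higher is better).
scores = {
    "model_a": {"performance": 0.9, "cost": 0.4, "speed": 0.7,
                "reliability": 0.8, "compliance": 0.9, "integration": 0.6},
    "model_b": {"performance": 0.7, "cost": 0.8, "speed": 0.9,
                "reliability": 0.7, "compliance": 0.8, "integration": 0.8},
}

# Sidebar weights, normalized so they sum to 1.
weights = {"performance": 0.30, "cost": 0.25, "speed": 0.15,
           "reliability": 0.15, "compliance": 0.10, "integration": 0.05}

def overall_score(model_scores: dict, weights: dict) -> float:
    """Weighted sum of normalized criterion scores."""
    return sum(weights[c] * model_scores[c] for c in CRITERIA)

# Rank models by overall score, best first.
ranked = sorted(scores, key=lambda m: overall_score(scores[m], weights), reverse=True)
for model in ranked:
    print(f"{model}: {overall_score(scores[model], weights):.3f}")
```

Shifting weight toward `cost` or `speed` in the sidebar reorders the ranking accordingly, which is the point of the interactive weights.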
## How to Use

1. Adjust the evaluation criteria weights in the sidebar based on your priorities
2. Configure your usage scenario (monthly requests, token usage); a sketch of the cost estimate follows this list
3. Review the ranked results and detailed analysis
4. Use the insights to make informed LLM selection decisions
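The usage scenario in step 2 feeds a per-token cost estimate. As a rough, hypothetical sketch of that calculation (the actual prices and formula are in `app.py`; the figures below are placeholders, not real provider pricing):

```python
# Hypothetical sketch of the usage-scenario cost estimate; real per-token
# prices and the exact formula live in app.py.
def monthly_cost(requests_per_month: int,
                 input_tokens_per_request: int,
                 output_tokens_per_request: int,
                 input_price_per_1k: float,
                 output_price_per_1k: float) -> float:
    """Estimate monthly spend from request volume and per-1k-token prices."""
    input_cost = requests_per_month * input_tokens_per_request / 1000 * input_price_per_1k
    output_cost = requests_per_month * output_tokens_per_request / 1000 * output_price_per_1k
    return input_cost + output_cost

# Example: 100k requests/month, 500 input + 200 output tokens per request,
# at placeholder prices of $0.001 (input) and $0.002 (output) per 1k tokens.
print(f"${monthly_cost(100_000, 500, 200, 0.001, 0.002):,.2f}")  # -> $90.00
```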
| Built with Streamlit and deployed on Hugging Face Spaces. |