“pangjh3” committed
Commit 288fdbf · 1 parent: 4edba36
modified: README.md
README.md
CHANGED
@@ -1,12 +1,30 @@
 ---
-title: ATLAS
-emoji:
-colorFrom:
-colorTo:
+title: ATLAS Benchmark
+emoji: 🧪
+colorFrom: green
+colorTo: indigo
 sdk: gradio
-sdk_version: 5.49.1
 app_file: app.py
-pinned:
+pinned: true
+license: apache-2.0
+short_description: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning
+sdk_version: 5.43.1
+hf_oauth: true
+tags:
+- leaderboard
+- science
+- benchmark
+- evaluation
 ---
 
-
+# ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning
+
+ATLAS is a high-difficulty, multidisciplinary benchmark for frontier scientific reasoning. It is designed to evaluate the capabilities of large language models (LLMs) in scientific reasoning across seven core scientific fields covering the key domains of AI for Science (AI4S):
+
+- Mathematics
+- Physics
+- Chemistry
+- Biology
+- Computer Science
+- Earth Science
+- Materials Science
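The front matter on the new side of this diff is a flat set of `key: value` pairs plus one list (`tags`). As a rough illustration only, the sketch below reads such a block with the Python standard library and reports keys that were left empty, as several were before this commit. `parse_front_matter` and `empty_fields` are hypothetical helper names, not part of the Space or of any Hugging Face library, and the parser assumes this flat shape rather than general YAML.

```python
# Front matter copied from the new side of the diff above.
README = """\
---
title: ATLAS Benchmark
emoji: 🧪
colorFrom: green
colorTo: indigo
sdk: gradio
app_file: app.py
pinned: true
license: apache-2.0
short_description: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning
sdk_version: 5.43.1
hf_oauth: true
tags:
- leaderboard
- science
- benchmark
- evaluation
---

# ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning
"""


def parse_front_matter(text):
    """Parse a flat front-matter block (hypothetical helper, not a YAML parser).

    Scalar values are kept as strings; a key with no value starts a list
    that collects the "- item" lines that follow it.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_list_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the block
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_list_key is not None:
            meta[current_list_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # not a key/value line; ignored in this sketch
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value
            current_list_key = None
        else:
            meta[key] = []  # empty value: treat as a list that may follow
            current_list_key = key
    return meta


def empty_fields(meta, required):
    """Return the required keys that are missing or have no value."""
    return [key for key in required if not meta.get(key)]


meta = parse_front_matter(README)
print(meta["title"], meta["tags"])
print(empty_fields(meta, ("title", "emoji", "colorFrom", "colorTo", "pinned")))
```

Run against the old side of the diff, where `emoji`, `colorFrom`, `colorTo`, and `pinned` have no values, `empty_fields` would list those four keys. A real check would more likely use a full YAML parser; the hand-rolled one here only avoids a third-party dependency.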