organizational card for paper project
README.md
CHANGED
@@ -7,4 +7,23 @@ sdk: static
 pinned: false
 ---
 
-
+# Open Multi-Label ASJC Classification
+
+This Team Space hosts the **open, multi-label implementation of the All Science Journal Classification (ASJC) taxonomy**, designed to classify scientific documents at the individual level.
+
+## Purpose
+Traditional ASJC classification is limited by incomplete sources, journal-level labels, and single-label assignments. This project provides:
+- **Multi-label classification across 307 subjects**
+- Fine-tuned **SciBERT model** trained on Crossref metadata
+- Methods for **collection-level analysis** (researcher portfolios, institutions, datasets)
+
+## Features
+- High accuracy: F1-score 0.892 (307 subjects), 0.934 (26 parent subjects)
+- Works with or without source title metadata
+- Open, reproducible, and ready for research use
+
+## Contents
+- Models, code, and notebooks for reproducing results
+- Example datasets and label-averaging utilities
+
+**Team:** Michael Gusenbauer, Jochen Endermann, Harald Huber, Simon Strasser, Andreas-Nizar Granitzer, Thomas Ströhle
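
The README mentions label-averaging utilities for collection-level analysis but does not show how they work. As a rough illustration only, and assuming each document comes with a dictionary of per-subject scores between 0 and 1, a collection profile could be computed as a plain mean over documents. The function name `average_labels`, the input shape, and the example scores below are hypothetical and are not taken from the project's code.

```python
from collections import defaultdict


def average_labels(doc_scores):
    """Average per-document subject scores over a collection.

    doc_scores: list of dicts mapping a subject name to a score in [0, 1].
    A subject that is absent from a document counts as 0 for that document.
    Returns a dict mapping each subject seen in the collection to its mean score.
    """
    if not doc_scores:
        return {}
    totals = defaultdict(float)
    for scores in doc_scores:
        for subject, score in scores.items():
            totals[subject] += score
    n_docs = len(doc_scores)
    return {subject: total / n_docs for subject, total in totals.items()}


# Hypothetical scores for three papers in one researcher's portfolio.
portfolio = [
    {"Artificial Intelligence": 0.9, "Information Systems": 0.4},
    {"Artificial Intelligence": 0.7},
    {"Information Systems": 0.8, "Library and Information Sciences": 0.6},
]

profile = average_labels(portfolio)

# List subjects from strongest to weakest presence in the collection.
for subject, mean_score in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{subject}: {mean_score:.3f}")
```

Treating a missing subject as zero keeps the profile comparable across collections of different sizes. Whether the project's own utilities use this convention is not stated in the README.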