| title: README | |
| emoji: 🚀 | |
| colorFrom: pink | |
| colorTo: purple | |
| sdk: static | |
| pinned: false | |
| license: mit | |
| # 🧠 Open Multi-Label ASJC Classification | |
| We present the first **multi-label classification model** built on the All Science Journal Classification (ASJC) taxonomy that reliably assigns subject categories to individual documents, including those published in general-science or interdisciplinary journals, using Title, Container Title, and Abstract metadata. | |
| ## 👥 Team | |
| - **Michael Gusenbauer** – Johannes Kepler University Linz | ORCID: [https://orcid.org/0000-0001-7768-2351](https://orcid.org/0000-0001-7768-2351) | |
| - **Jochen Endermann** – University of Applied Sciences Kufstein | |
| - **Harald Huber** – University of Applied Sciences Kufstein | |
| - **Simon Strasser** – University of Applied Sciences Kufstein | |
| - **Andreas-Nizar Granitzer** – Norwegian Geotechnical Institute | ORCID: [https://orcid.org/0000-0002-5839-4300](https://orcid.org/0000-0002-5839-4300) | |
| - **Thomas Ströhle** – Universität Innsbruck | ORCID: [https://orcid.org/0000-0002-1954-6412](https://orcid.org/0000-0002-1954-6412) | |
| ## 🎯 Purpose | |
| Traditional ASJC classification approaches are limited by incomplete sources, journal-level labels, or single-label assignments. This project provides: | |
| - **Multi-label classification across 307 subjects** (see the [Google Sheet](https://docs.google.com/spreadsheets/d/1kqmGk2x0msodbaKDYt2RixyyB3MqOGrWS2azRGNsodw) for the full list of labels) | |
| - Fine-tuned **SciBERT model** trained on Crossref metadata | |
| - Methods for **collection-level analysis** (researcher portfolios, institutions, datasets) | |
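To illustrate what multi-label assignment and collection-level analysis can look like, here is a minimal sketch in plain Python. It is not the authors' method: the sigmoid decision rule, the `threshold` of 0.5, the share-of-documents aggregation, the three-label subset, and the logit values are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, labels, threshold=0.5):
    """Multi-label decision: keep every label whose probability reaches
    the threshold (assumed value), so a document may get 0..n labels."""
    return [lab for lab, z in zip(labels, logits) if sigmoid(z) >= threshold]


def collection_profile(doc_labels):
    """Share of documents in a collection that carry each label
    (one simple, assumed way to summarise a portfolio or dataset)."""
    n = len(doc_labels)
    if n == 0:
        return {}
    counts = Counter(lab for labs in doc_labels for lab in set(labs))
    return {lab: c / n for lab, c in counts.items()}


# Illustrative three-label subset and made-up logits for two documents.
labels = ["Artificial Intelligence", "Library and Information Sciences", "Geology"]
docs_logits = [[2.0, 1.0, -3.0], [-1.0, 0.5, -2.0]]

assigned = [predict_labels(z, labels) for z in docs_logits]
profile = collection_profile(assigned)
print(assigned)
print(profile)
```

Because each label is decided independently, a document from a general-science journal can receive several subjects at once, and the collection profile then shows how strongly each subject is represented across the whole set.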
| ## ✨ Features | |
| - High classification performance (evaluation results are reported in the paper cited below) | |
| - Works with or without source title metadata | |
| - Open, reproducible, and ready for research use | |
| ## 🗂 Content | |
| - Fine-tuned model | |
| - Sample code for model inference | |
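As a rough orientation, inference with a fine-tuned sequence-classification model through the Hugging Face `transformers` library might look like the sketch below. This is not the project's sample code. The `model_id` argument is a placeholder for the actual model repository, and the way the fields are joined in `build_input`, the `max_length` of 512, and the `threshold` of 0.5 are assumptions; consult the model card for the real input format.

```python
def build_input(title, abstract, container_title=None):
    """Join the metadata fields into one string (assumed format).
    The container title is optional, so the same helper covers
    documents with or without source title metadata."""
    parts = [title, container_title, abstract]
    return " ".join(p.strip() for p in parts if p and p.strip())


def classify(text, model_id, threshold=0.5):
    """Return (label, probability) pairs at or above the threshold.
    `model_id` is a placeholder for the fine-tuned model's repository."""
    # Imported lazily so build_input works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]

    # Sigmoid rather than softmax: each label is scored independently.
    probs = torch.sigmoid(logits).tolist()
    return [
        (model.config.id2label[i], p)
        for i, p in enumerate(probs)
        if p >= threshold
    ]


# Example input construction with made-up metadata.
text = build_input(
    title="A study of citation networks",
    abstract="We analyse citation patterns across disciplines.",
    container_title="Scientometrics",
)
print(text)
```

Calling `classify(text, model_id="...")` with the real repository name would then download the model and return the predicted subjects for that document.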
| ## 📖 Citation | |
| If you use this work, please cite: | |
| ```bibtex | |
| @article{Gusenbauer.2025, | |
| author = {Gusenbauer, Michael and Endermann, Jochen and Huber, Harald and Strasser, Simon and Granitzer, Andreas-Nizar and Ströhle, Thomas}, | |
| year = {2025}, | |
| title = {Fine-tuning SciBERT to enable ASJC-based assessments of the disciplinary orientation of research collections}, | |
| keywords = {All Science Journal Classification;Disciplinary coverage;Fine-tuning;multi-label classification;SciBERT;Transformer-based language models}, | |
| issn = {0138-9130}, | |
| journal = {Scientometrics}, | |
| doi = {10.1007/s11192-025-05490-0}, | |
| } |
| ``` | |