Add metadata for license, library, and pipeline tag and add paper/code links
Hi! I'm Niels from the Hugging Face community science team. I've opened this PR to enhance the model card with standardized metadata and improve its documentation.
Specifically, I've:
- Added `library_name: transformers` to enable the "Use in Transformers" button and automated code snippets.
- Added `license: cc-by-nc-4.0` to the metadata for proper indexing.
- Added `pipeline_tag: text-classification` for better discoverability in the Hub's model gallery.
- Included links to the original paper and the official GitHub repository at the top of the card.
- Fixed the label mapping in the "How to Use" Python snippet to align with the model's actual configuration.
These updates help users find, understand, and use your model more effectively!
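For reviewers, here is a minimal sketch of the idea behind the label-mapping fix: derive the label order from the `id2label` mapping instead of hardcoding it. The `CONFIG_JSON` string is a hypothetical excerpt that mirrors the `labels` list in this PR's snippet, not a copy of the real `config.json`, and the helper names and logits are made up for illustration.

```python
import json
import math

# Hypothetical excerpt of the model's config.json. The index order mirrors the
# `labels` list in this PR's snippet; it is an assumption, not a copy of the file.
CONFIG_JSON = '{"id2label": {"0": "neutral", "1": "bearish", "2": "bullish"}}'


def labels_from_config(config_text):
    """Return class names ordered by class index, read from `id2label`."""
    id2label = json.loads(config_text)["id2label"]
    # JSON object keys are strings, so sort them numerically.
    return [id2label[key] for key in sorted(id2label, key=int)]


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels):
    """Map raw logits to a (label, confidence) pair."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = labels_from_config(CONFIG_JSON)
label, confidence = decode([0.1, 2.3, -0.5], labels)  # made-up logits
print(f"Sentiment: {label} (confidence: {confidence:.3f})")
```

With a model loaded through `transformers`, the same mapping should be available as `model.config.id2label`, so the snippet in the card would not need a hardcoded list at all.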
README.md
CHANGED
@@ -1,18 +1,25 @@
 ---
+base_model: ExponentialScience/LedgerBERT
 datasets:
 - ExponentialScience/DLT-Sentiment-News
 language:
 - en
-
-
+library_name: transformers
+license: cc-by-nc-4.0
+pipeline_tag: text-classification
 ---
+
 # LedgerBERT-Market-Sentiment

+This model was introduced in the paper [DLT-Corpus: A Large-Scale Text Collection for the Distributed Ledger Technology Domain](https://huggingface.co/papers/2602.22045).
+
+The official code repository is available [here](https://github.com/dlt-science/DLT-Corpus).
+
 ## Model Description

 ### Model Summary

-LedgerBERT-Market-Sentiment is a fine-tuned version of LedgerBERT
+LedgerBERT-Market-Sentiment is a fine-tuned version of [LedgerBERT](https://huggingface.co/ExponentialScience/LedgerBERT) specialized for sentiment analysis of cryptocurrency and DLT-related content. The model classifies text into three market direction sentiment categories: **bullish** (positive market outlook), **bearish** (negative market outlook), and **neutral** (balanced or unclear market direction).

 This model is particularly effective for analyzing cryptocurrency news headlines, social media posts, and other DLT-related content where understanding market sentiment is important.

@@ -88,7 +95,7 @@ The dataset provides domain expertise through crowdsourced annotations from cryp

 **Note:** News articles are absent from the DLT-Corpus used to pre-train LedgerBERT, making this an out-of-domain generalization test that demonstrates the model's robust language understanding.

-For more details on the dataset used for
+For more details on the dataset used for fine-tuning, see: https://huggingface.co/datasets/ExponentialScience/DLT-Sentiment-News

 ### Training Procedure

@@ -161,13 +168,14 @@ for text in texts:
     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
     predicted_class = predictions.argmax(dim=-1).item()

-    # Map to labels
-    labels = ["
+    # Map to labels based on config.json
+    labels = ["neutral", "bearish", "bullish"]
     sentiment = labels[predicted_class]
     confidence = predictions[0][predicted_class].item()

     print(f"Text: {text}")
-    print(f"Sentiment: {sentiment} (confidence: {confidence:.3f})
+    print(f"Sentiment: {sentiment} (confidence: {confidence:.3f})\n")
 ```

 ### Batch Processing
@@ -193,7 +201,8 @@ results = classifier(texts, truncation=True, max_length=512)

 for text, result in zip(texts, results):
     print(f"Text: {text}")
-    print(f"Sentiment: {result['label']} (score: {result['score']:.3f})
+    print(f"Sentiment: {result['label']} (score: {result['score']:.3f})\n")
 ```

 ### Integration with News Feeds
@@ -218,7 +227,8 @@ for entry in feed.entries[:5]: # Process first 5 entries

     print(f"Headline: {title}")
     print(f"Market Sentiment: {result['label']} ({result['score']:.2%})")
-    print(f"Link: {entry.link}
+    print(f"Link: {entry.link}\n")
 ```

 ## Citation
@@ -245,7 +255,7 @@ If you use LedgerBERT-Market-Sentiment in your research, please cite:

 ### Additional Fine-tuned Models

-LedgerBERT can also be fine-tuned for other sentiment dimensions available in the DLT-Sentiment-News dataset
+LedgerBERT can also be fine-tuned for other sentiment dimensions available in the DLT-Sentiment-News dataset:
 - **Content Characteristics** (liked, disliked, neutral)
 - **Engagement Quality** (important, lol, neutral)
