Update README.md
# Llama-FinSent-S: Financial Sentiment Analysis Model

## Model Overview

Llama-FinSent-S is a fine-tuned version of [oopere/pruned40-llama-3.2-1B](https://huggingface.co/oopere/pruned40-llama-3.2-1B), a pruned model derived from [LLaMA-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B). The pruning process reduces the number of neurons in the MLP layers by 40%, leading to lower power consumption and improved efficiency, while retaining competitive performance in key reasoning and instruction-following tasks.

The pruning has also reduced the expansion ratio of the MLP layers from 300% to 140%, which, as shown in the paper *Exploring GLU expansion ratios: Structured pruning in Llama-3.2 models*, is a sweet spot for Llama-3.2 models.

Llama-FinSent-S is currently one of the smallest models dedicated to financial sentiment detection that can be deployed on modern edge devices, making it highly suitable for low-resource environments.
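The structured-pruning idea behind the base model can be sketched as follows. This is a toy illustration with made-up dimensions and a simple weight-norm importance score; the actual selection criterion and dimensions are those described in the paper, not the ones assumed here:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inter = 16, 64          # toy sizes, not the real model's dimensions
keep = int(inter * 0.6)         # structured pruning: drop 40% of MLP neurons

# GLU-style MLP: gate/up project hidden -> inter, down projects inter -> hidden
gate = rng.normal(size=(inter, hidden))
up   = rng.normal(size=(inter, hidden))
down = rng.normal(size=(hidden, inter))

# Score each intermediate neuron; a weight-norm criterion is used here purely
# for illustration -- the real criterion is described in the paper.
scores = np.linalg.norm(gate, axis=1) + np.linalg.norm(up, axis=1)
kept = np.sort(np.argsort(scores)[-keep:])

# Remove the same neurons from all three projections so shapes stay consistent
gate_p, up_p, down_p = gate[kept], up[kept], down[:, kept]

def silu(z):
    return z / (1.0 + np.exp(-z))

x = rng.normal(size=(hidden,))
y = down_p @ (silu(gate_p @ x) * (up_p @ x))   # pruned forward pass
assert y.shape == (hidden,)
```

Only the intermediate dimension shrinks; the hidden size is unchanged, so the pruned model keeps the same interface as the original.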

The model has been fine-tuned on financial sentiment classification using the FinGPT/fingpt-sentiment-train dataset. It is designed to analyze financial news and reports, classifying them into sentiment categories to aid decision-making in financial contexts.

## How the Model Was Created

The model was developed through a two-step process:

* **Pruning**: The base LLaMA-3.2-1B model was pruned, reducing its MLP neurons by 40%, which decreased computational requirements while preserving key capabilities.
* **Fine-Tuning with LoRA**: The pruned model was then fine-tuned using LoRA (Low-Rank Adaptation) on the FinGPT/fingpt-sentiment-train dataset. After training, the LoRA adapter was merged into the base model, producing a compact and efficient model.

This method significantly reduced the fine-tuning overhead, enabling model training in just 40 minutes on an A100 GPU while maintaining high-quality sentiment classification performance.
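The merge step at the end of fine-tuning can be illustrated numerically: a LoRA adapter adds a low-rank update (alpha/r)·BA to a frozen weight, and merging simply folds that update into the weight, so inference carries no extra parameters. A minimal sketch with toy dimensions, not the actual training code:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 4                  # toy dimensions; r is the LoRA rank

W = rng.normal(size=(d, d))            # frozen base weight
A = rng.normal(size=(r, d)) * 0.1      # low-rank down-projection (trained)
B = rng.normal(size=(d, r)) * 0.1      # low-rank up-projection (trained)

x = rng.normal(size=(d,))

# During fine-tuning, the adapted layer computes W x + (alpha / r) * B (A x)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merging folds the adapter into the base weight: W' = W + (alpha / r) * B A
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x

assert np.allclose(y_adapter, y_merged)   # identical outputs, no extra parameters
```

With the `peft` library, this folding is what `merge_and_unload()` performs for each adapted projection.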

## Why Use This Model?

* **Efficiency**: The pruned architecture reduces computational costs and memory footprint compared to the original LLaMA-3.2-1B model.
* **Performance Gains**: Despite pruning, the model retains or improves performance in key areas, such as instruction following (IFEval), multi-step reasoning (MuSR), and structured information retrieval (Penguins in a Table, Ruin Names).
* **Financial Domain Optimization**: The model is trained specifically on financial sentiment classification, making it more suitable for this task than general-purpose LLMs.
* **Flexible Sentiment Classification**: The model can classify sentiment using both seven-category (fine-grained) and three-category (coarse) labeling schemes.

## How to Use the Model

This model can be used with the Hugging Face `transformers` library. Below is an example of how to load and use the model for sentiment classification.