Commit 0740011 (verified) · oopere committed · Parent: 42ba697

Update README.md

Files changed (1):
  1. README.md +13 -8
README.md CHANGED
```diff
@@ -15,22 +15,27 @@ base_model:
 # Llama-FinSent-S: Financial Sentiment Analysis Model
 
 ## Model Overview
-Llama-FinSent-S is a fine-tuned version of oopere/pruned40-llama-1b, a pruned model derived from LLaMA-3.2-1B. The pruning process reduces the number of neurons in the MLP layers by 40%, leading to lower power consumption and improved efficiency, while retaining competitive performance in key reasoning and instruction-following tasks.
+Llama-FinSent-S is a fine-tuned version of [oopere/pruned40-llama-1b](https://huggingface.co/oopere/pruned40-llama-3.2-1B), a pruned model derived from [LLaMA-3.2-1B](meta-llama/Llama-3.2-1B). The pruning process reduces the number of neurons in the MLP layers by 40%, leading to lower power consumption and improved efficiency, while retaining competitive performance in key reasoning and instruction-following tasks.
 
 The pruning has also reduced the expansion in the MLP layers from 300% to 140%, which, as seen in the paper Exploring GLU expansion ratios: Structured pruning in Llama-3.2 models, is a sweet spot for Llama-3.2 models.
 
-Llama-FinSent-S is currently the smallest model dedicated to financial sentiment detection that can be deployed on modern edge devices, making it highly suitable for low-resource environments.
+Llama-FinSent-S is currently one of the smallest models dedicated to financial sentiment detection that can be deployed on modern edge devices, making it highly suitable for low-resource environments.
 
 The model has been fine-tuned on financial sentiment classification using the FinGPT/fingpt-sentiment-train dataset. It is designed to analyze financial news and reports, classifying them into sentiment categories to aid decision-making in financial contexts.
 
-## Why Use This Model?
-**Efficiency**: The pruned architecture reduces computational costs and memory footprint compared to the original LLaMA-3.2-1B model.
-
-**Performance Gains**: Despite pruning, the model retains or improves performance in key areas, such as instruction-following (IFEVAL), multi-step reasoning (MUSR), and structured information retrieval (Penguins in a Table, Ruin Names).
+## How the Model Was Created
+The model was developed through a two-step process:
+* Pruning: The base LLaMA-3.2-1B model was pruned, reducing its MLP neurons by 40%, which helped decrease computational requirements while preserving key capabilities.
+* Fine-Tuning with LoRA: The pruned model was then fine-tuned using LoRA (Low-Rank Adaptation) on the FinGPT/fingpt-sentiment-train dataset. After training, the LoRA adapter was merged into the base model, creating a compact and efficient model.
 
-**Financial Domain Optimization**: The model is trained specifically on financial sentiment classification, making it more suitable for this task than general-purpose LLMs.
+This method significantly reduced the fine-tuning overhead, enabling model training in just 40 minutes on an A100 GPU while maintaining high-quality sentiment classification performance.
+The model has been fine-tuned on financial sentiment classification using the FinGPT/fingpt-sentiment-train dataset. It is designed to analyze financial news and reports, classifying them into sentiment categories to aid decision-making in financial contexts.
 
-**Flexible Sentiment Classification**: The model can classify sentiment using both seven-category (fine-grained) and three-category (coarse) labeling schemes.
+## Why Use This Model?
+* **Efficiency**: The pruned architecture reduces computational costs and memory footprint compared to the original LLaMA-3.2-1B model.
+* **Performance Gains**: Despite pruning, the model retains or improves performance in key areas, such as instruction-following (IFEVAL), multi-step reasoning (MUSR), and structured information retrieval (Penguins in a Table, Ruin Names).
+* **Financial Domain Optimization**: The model is trained specifically on financial sentiment classification, making it more suitable for this task than general-purpose LLMs.
+* **Flexible Sentiment Classification**: The model can classify sentiment using both seven-category (fine-grained) and three-category (coarse) labeling schemes.
 
 ## How to Use the Model
 This model can be used with the transformers library from Hugging Face. Below is an example of how to load and use the model for sentiment classification.
```
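
The usage example that the README promises is cut off in this hunk. As a placeholder, here is a minimal sketch of loading the merged checkpoint with `transformers` and mapping its free-text answer to a coarse label. The repo id `oopere/Llama-FinSent-S`, the FinGPT-style prompt template, and the label names are assumptions, not taken from the model card, and the model card's own example should take precedence:

```python
# Hypothetical usage sketch -- repo id, prompt template, and label set
# are assumptions; defer to the model card's own example if it differs.

COARSE_LABELS = ("negative", "neutral", "positive")

def build_prompt(text: str) -> str:
    """Wrap a headline in a FinGPT-style sentiment instruction (assumed format)."""
    return (
        "Instruction: What is the sentiment of this news? "
        "Please choose an answer from {negative/neutral/positive}.\n"
        f"Input: {text}\nAnswer: "
    )

def parse_label(generated: str) -> str:
    """Return the first known label found in the generated continuation."""
    lowered = generated.lower()
    for label in COARSE_LABELS:
        if label in lowered:
            return label
    return "neutral"  # conservative fallback when no label is recognized

def classify(text: str, model_id: str = "oopere/Llama-FinSent-S") -> str:
    """Greedy-decode a short answer and map it to a coarse label."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return parse_label(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

`classify("Company X beats quarterly earnings expectations.")` would then return one of the three coarse labels; the seven-category scheme would need the prompt and label list adjusted to whatever fine-grained labels the model was trained on.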
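
The two-step recipe in the added "How the Model Was Created" section (LoRA fine-tuning of the pruned base, then merging the adapter) could be sketched with `peft` roughly as below. The LoRA rank, target modules, and training details are illustrative assumptions; only the base model id and dataset name come from the diff:

```python
# Hypothetical sketch of the prune-then-LoRA recipe from the README diff.
# Rank, target modules, and the omitted training loop are assumptions;
# call finetune_and_merge() on a GPU machine to actually run it.

def finetune_and_merge(
    base_id: str = "oopere/pruned40-llama-3.2-1B",
    output_dir: str = "Llama-FinSent-S",
) -> None:
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(base_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)

    # Attach low-rank adapters to the attention projections; the pruned
    # base weights stay frozen during fine-tuning.
    lora_cfg = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_cfg)

    dataset = load_dataset("FinGPT/fingpt-sentiment-train", split="train")
    # ... supervised fine-tuning loop on `dataset` omitted for brevity ...

    # Merge the trained adapter back into the base weights so the
    # released checkpoint is a plain transformers model with no peft
    # dependency at inference time.
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
```

Merging (`merge_and_unload`) is what makes the published model a single compact checkpoint, which matches the README's claim that the adapter was folded into the base model after training.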