This repository showcases the fine-tuning of the Llama-3.2-1B model on the SST-2 dataset.
- **Model Used:** meta-llama/Llama-3.2-1B
- **Pre-trained Parameters:** The model comprises approximately 1.03 billion parameters, confirmed through code inspection and consistent with the official documentation.
- **Fine-tuned Parameters:** The parameter count remains unchanged during fine-tuning, as the task updates existing model weights without adding new layers or parameters.
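The "code inspection" mentioned above amounts to summing the element counts of the model's parameter tensors. A minimal sketch follows; `count_parameters` is an illustrative name rather than a function from this repository, and loading the gated `meta-llama/Llama-3.2-1B` checkpoint itself requires the `transformers` library and access approval, so that step is shown only as a comment:

```python
def count_parameters(model) -> int:
    """Sum the element counts of every parameter tensor in the model."""
    return sum(p.numel() for p in model.parameters())

# In practice the count is taken on the loaded checkpoint, e.g.:
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
#   print(count_parameters(model))  # ~1.03e9 per the inspection above
```

The helper works with any object exposing a `.parameters()` iterable of tensors, which is why it applies unchanged before and after fine-tuning.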
## 3. Dataset and Task Details

**Dataset: SST-2**

The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification tasks. It consists of sentences labeled as either positive or negative sentiment.

**Task Objective**

Train the model to classify sentences into the appropriate sentiment category based on contextual cues.
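To make the label scheme concrete: the GLUE version of SST-2 encodes sentiment as an integer, 0 for negative and 1 for positive. The helper below is hypothetical (not part of this repository); the comment shows the typical way the dataset is fetched via the `datasets` library:

```python
# SST-2 (GLUE) encodes sentiment as integers: 0 = negative, 1 = positive.
ID2LABEL = {0: "negative", 1: "positive"}

def describe_example(sentence: str, label: int) -> str:
    """Render one (sentence, label) pair in a human-readable form."""
    return f"{sentence!r} -> {ID2LABEL[label]}"

# Typical usage with the hosted dataset (requires the `datasets` library):
#   from datasets import load_dataset
#   sst2 = load_dataset("glue", "sst2")
#   ex = sst2["train"][0]
#   print(describe_example(ex["sentence"], ex["label"]))
```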
## 4. Fine-Tuning Approach

- **Train-Test Split:** The dataset was split into an 80:20 ratio using stratified sampling to ensure balanced representation of sentiment classes.
- **Tokenization:** Input text was tokenized with padding and truncation to a maximum length of 128 tokens.
- **Model Training:** Fine-tuning involved updating task-specific weights over three epochs with a learning rate of 2e-5.
- **Hardware:** Training was performed on GPU-enabled hardware for accelerated computations.
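The stratified 80:20 split described above can be sketched in plain Python. The function name and seed here are illustrative, not taken from this repository; in practice the same split is commonly produced with `sklearn.model_selection.train_test_split(..., stratify=labels)`:

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) preserving per-class proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_label.values():
        rng.shuffle(idxs)
        cut = int(len(idxs) * test_frac)  # hold out test_frac of each class
        test_idx.extend(idxs[:cut])
        train_idx.extend(idxs[cut:])
    return sorted(train_idx), sorted(test_idx)
```

Because the split is taken per class, both partitions keep the same positive/negative ratio as the full dataset, which is the point of stratification.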
## 5. Results and Observations

**Zero-shot vs. Fine-tuned Performance:** In its zero-shot state, the pre-trained Llama model exhibited only moderate performance on SST-2. After fine-tuning, the model classified sentiment significantly more accurately.
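The README does not report exact scores, but the zero-shot vs. fine-tuned comparison reduces to accuracy on the held-out 20% split. A minimal metric (hypothetical helper, not from this repository) looks like:

```python
def accuracy(predictions, references):
    """Fraction of predicted labels that match the gold labels."""
    if len(predictions) != len(references):
        raise ValueError("prediction/reference length mismatch")
    return sum(p == r for p, r in zip(predictions, references)) / len(references)
```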
```python
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)
sentiment = "Positive" if outputs.logits.argmax() == 1 else "Negative"
print(f"Predicted Sentiment: {sentiment}")
```
## 7. Key Takeaways

- Fine-tuning the Llama model for SST-2 significantly enhances its performance on binary sentiment classification tasks.
- The parameter count of the model remains constant during fine-tuning, demonstrating that improvements are achieved by optimizing existing weights.
- This work highlights the adaptability of Llama for downstream NLP tasks when fine-tuned on task-specific datasets.
## 8. Acknowledgments

- Hugging Face Transformers library for facilitating model fine-tuning.
- Stanford Sentiment Treebank for providing a robust dataset for sentiment classification.