This repository showcases the fine-tuning of the Llama-3.2-1B model on the SST-2 dataset.
- **Model Used:** meta-llama/Llama-3.2-1B
- **Pre-trained Parameters:** The model comprises approximately 1.03 billion parameters, confirmed through code inspection and consistent with the official documentation.
- **Fine-tuned Parameters:** The parameter count remains unchanged during fine-tuning, as the task updates existing model weights without adding new layers or parameters.
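The "code inspection" mentioned above amounts to summing the element counts of the model's parameter tensors. A minimal sketch follows; `count_parameters` is an illustrative name rather than a function from this repository, and loading the gated `meta-llama/Llama-3.2-1B` checkpoint itself requires the `transformers` library and access approval, so that step is shown only as a comment:

```python
def count_parameters(model) -> int:
    """Sum the element counts of every parameter tensor in the model."""
    return sum(p.numel() for p in model.parameters())

# In practice the count is taken on the loaded checkpoint, e.g.:
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
#   print(count_parameters(model))  # ~1.03e9 per the inspection above
```

The helper works with any object exposing a `.parameters()` iterable of tensors, which is why it applies unchanged before and after fine-tuning.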
## 3. Dataset and Task Details

**Dataset: SST-2**

The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification tasks. It consists of sentences labeled as either positive or negative sentiment.

**Task Objective**

Train the model to classify sentences into the appropriate sentiment category based on contextual cues.
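To make the label scheme concrete: the GLUE version of SST-2 encodes sentiment as an integer, 0 for negative and 1 for positive. The helper below is hypothetical (not part of this repository); the comment shows the typical way the dataset is fetched via the `datasets` library:

```python
# SST-2 (GLUE) encodes sentiment as integers: 0 = negative, 1 = positive.
ID2LABEL = {0: "negative", 1: "positive"}

def describe_example(sentence: str, label: int) -> str:
    """Render one (sentence, label) pair in a human-readable form."""
    return f"{sentence!r} -> {ID2LABEL[label]}"

# Typical usage with the hosted dataset (requires the `datasets` library):
#   from datasets import load_dataset
#   sst2 = load_dataset("glue", "sst2")
#   ex = sst2["train"][0]
#   print(describe_example(ex["sentence"], ex["label"]))
```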
## 4. Fine-Tuning Approach

- **Train-Test Split:** The dataset was split into an 80:20 ratio using stratified sampling to ensure balanced representation of sentiment classes.
- **Tokenization:** Input text was tokenized with padding and truncation to a maximum length of 128 tokens.
- **Model Training:** Fine-tuning involved updating task-specific weights over three epochs with a learning rate of 2e-5.
- **Hardware:** Training was performed on GPU-enabled hardware for accelerated computations.
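The stratified 80:20 split described above can be sketched in plain Python. The function name and seed here are illustrative, not taken from this repository; in practice the same split is commonly produced with `sklearn.model_selection.train_test_split(..., stratify=labels)`:

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) preserving per-class proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_label.values():
        rng.shuffle(idxs)
        cut = int(len(idxs) * test_frac)  # hold out test_frac of each class
        test_idx.extend(idxs[:cut])
        train_idx.extend(idxs[cut:])
    return sorted(train_idx), sorted(test_idx)
```

Because the split is taken per class, both partitions keep the same positive/negative ratio as the full dataset, which is the point of stratification.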
## 5. Results and Observations

**Zero-shot vs. Fine-tuned Performance:** In its zero-shot state, the pre-trained Llama model exhibited only moderate performance on SST-2. After fine-tuning, the model classified sentiment significantly more accurately.
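The README does not report exact scores, but the zero-shot vs. fine-tuned comparison reduces to accuracy on the held-out 20% split. A minimal metric (hypothetical helper, not from this repository) looks like:

```python
def accuracy(predictions, references):
    """Fraction of predicted labels that match the gold labels."""
    if len(predictions) != len(references):
        raise ValueError("prediction/reference length mismatch")
    return sum(p == r for p, r in zip(predictions, references)) / len(references)
```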
```python
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)
sentiment = "Positive" if outputs.logits.argmax() == 1 else "Negative"
print(f"Predicted Sentiment: {sentiment}")
```
## 7. Key Takeaways

- Fine-tuning the Llama model for SST-2 significantly enhances its performance on binary sentiment classification tasks.
- The parameter count of the model remains constant during fine-tuning, demonstrating that improvements are achieved by optimizing existing weights.
- This work highlights the adaptability of Llama for downstream NLP tasks when fine-tuned on task-specific datasets.
## 8. Acknowledgments

- Hugging Face Transformers library for facilitating model fine-tuning.
- Stanford Sentiment Treebank for providing a robust dataset for sentiment classification.