---
library_name: transformers
tags: [causal-lm, bloom, lora, peft, finetuning, english]
---
# Model Card for Jay24-AI/bloom-3b-lora-tagger
This model is a **LoRA fine-tuned version of BigScience’s BLOOM-3B** model, trained on a dataset of English quotes. The goal was to adapt BLOOM using the [PEFT](https://github.com/huggingface/peft) (Parameter-Efficient Fine-Tuning) approach with [LoRA](https://arxiv.org/abs/2106.09685), making it lightweight to train and efficient for deployment.
---
## Model Details
### Model Description
- **Developed by:** Jay24-AI
- **Funded by:** N/A
- **Shared by:** Jay24-AI
- **Model type:** Causal Language Model with LoRA adapters
- **Language(s):** English
- **License:** Not yet specified. BLOOM-3B is released under the BigScience RAIL license, so any license applied to these adapters should remain compatible with its use restrictions.
- **Finetuned from model:** [bigscience/bloom-3b](https://huggingface.co/bigscience/bloom-3b)
### Model Sources
- **Repository:** https://huggingface.co/Jay24-AI/bloom-3b-lora-tagger
---
## Uses
### Direct Use
The model can be used for **text generation and tagging** based on quote-like prompts.
For example, you can input a quote followed by the `->:` separator used during training, and the model will complete it with descriptive tags.
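An illustrative exchange (the completion below is a made-up example, not a logged output):
```
"Life is what happens to us while we are making other plans." ->: ['life', 'plans']
```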
### Downstream Use
- Can be further fine-tuned on custom tagging or classification datasets.
- Could be integrated into applications that require lightweight **quote classification**, **text annotation**, or **prompt-based generation**.
### Out-of-Scope Use
- Not suitable for factual question answering.
- Not designed for sensitive or high-stakes decision-making (e.g., medical, legal, or financial advice).
---
## Bias, Risks, and Limitations
- Inherits limitations and biases from **BLOOM-3B** (which was trained on large-scale internet data).
- The fine-tuned dataset (English quotes) is small (~1k samples), so the model may overfit and generalize poorly outside similar data.
- Risk of generating irrelevant or biased tags if prompted outside the intended scope.
### Recommendations
Users should:
- Validate outputs before production use.
- Avoid relying on the model for critical applications.
---
## How to Get Started with the Model
```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_model_id = "Jay24-AI/bloom-3b-lora-tagger"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the frozen base model in 8-bit (requires bitsandbytes and a CUDA GPU)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path, return_dict=True, load_in_8bit=True, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapters on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()

# The prompt mirrors the training format: "<quote>" ->: <tags>
batch = tokenizer("“The only way to do great work is to love what you do.” ->:", return_tensors="pt").to(model.device)
with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```
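Note: on recent Transformers releases, passing `load_in_8bit=True` directly is deprecated in favor of `quantization_config=transformers.BitsAndBytesConfig(load_in_8bit=True)`, and `torch.amp.autocast("cuda")` replaces the older `torch.cuda.amp.autocast()`.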
## Training Details
### Training Data
- **Dataset used:** [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes)
- **Subset:** First 1,000 samples (`train[:1000]`).
- **Structure:** Each entry includes a `quote` and its corresponding `tags`.
- **Preprocessing:**
- Combined the `quote` and `tags` into a single text string:
```
"<quote>" ->: <tags>
```
- Tokenized using the `AutoTokenizer` from **bigscience/bloom-3b**.
- Applied batching via Hugging Face `datasets.map` with `batched=True`. A sketch of the full pipeline follows this list.
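A minimal sketch of this preprocessing, assuming the dataset's `quote` and `tags` columns; the merged column name `prediction` is illustrative:
```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-3b")
data = load_dataset("Abirate/english_quotes", split="train[:1000]")

def merge_columns(example):
    # Build the '"<quote>" ->: <tags>' training string from the two columns.
    example["prediction"] = example["quote"] + " ->: " + str(example["tags"])
    return example

data = data.map(merge_columns)
data = data.map(lambda samples: tokenizer(samples["prediction"]), batched=True)
```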
---
### Training Procedure
#### Preprocessing
- Converted text examples into the `"quote ->: tags"` format.
- Tokenized using Bloom’s tokenizer with default settings.
- Applied `DataCollatorForLanguageModeling` with `mlm=False` (causal LM objective).
#### Training Hyperparameters
- **Base model:** bigscience/bloom-3b
- **Adapter method:** LoRA via PEFT
- **LoRA configuration:**
- r = 16
- lora_alpha = 32
- lora_dropout = 0.05
- bias = "none"
- task_type = "CAUSAL_LM"
- **TrainingArguments:**
- per_device_train_batch_size = 4
- gradient_accumulation_steps = 4
- warmup_steps = 100
- max_steps = 200
- learning_rate = 2e-4
- fp16 = True
- logging_steps = 1
- output_dir = `outputs/`
- **Precision regime:** Mixed precision (fp16).
- **Caching:** `model.config.use_cache = False` during training; this setting conflicts with gradient checkpointing and should be re-enabled for inference. A configuration sketch follows.
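A minimal sketch of this configuration, assuming `model` is the 8-bit base loaded as in the quickstart above and `data` is the tokenized dataset from the preprocessing sketch:
```python
import transformers
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.config.use_cache = False  # conflicts with gradient checkpointing; re-enable for inference

trainer = transformers.Trainer(
    model=model,
    train_dataset=data,
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        warmup_steps=100,
        max_steps=200,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```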
#### Hyperparameter Summary
| Hyperparameter | Value |
|-----------------------------|------------------------|
| Base model | bigscience/bloom-3b |
| Adapter method | LoRA (via PEFT) |
| LoRA r | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Bias | none |
| Task type | Causal LM |
| Batch size (per device) | 4 |
| Gradient accumulation steps | 4 |
| Effective batch size | 16 |
| Warmup steps | 100 |
| Max steps | 200 |
| Learning rate | 2e-4 |
| Precision | fp16 (mixed precision) |
| Logging steps | 1 |
| Output directory | outputs/ |
| Gradient checkpointing | Enabled |
| Use cache | False (during training)|
---
### Speeds, Sizes, Times
- **Trainable parameters:** LoRA adapters only, a fraction of a percent of the 3B base parameters.
- **Checkpoint size:** Only the adapter weights are stored, so the artifact is far smaller than the full 3B checkpoint.
- **Max steps:** 200 optimizer updates; with an effective batch of 16, that is roughly 3,200 examples seen (about three passes over the 1,000-sample subset).
- **Training runtime:** Depends on the GPU; not logged by the training script.
- **Effective batch size:** 16 (per-device batch of 4 × 4 gradient-accumulation steps).
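PEFT can report the exact adapter footprint; a one-line check (the printed values are placeholders, not logged numbers):
```python
# After get_peft_model(...): report trainable (adapter) vs. total parameters.
model.print_trainable_parameters()
# -> trainable params: ... || all params: ... || trainable%: ...
```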
---
### Compute Infrastructure
- **Hardware:** Single NVIDIA T4 GPU, pinned with `os.environ["CUDA_VISIBLE_DEVICES"] = "0"`.
- **Software:**
- PyTorch (torch)
- Hugging Face Transformers (main branch from GitHub)
- Hugging Face PEFT (main branch from GitHub)
- Hugging Face Datasets
- Accelerate
- Bitsandbytes (for 8-bit loading)
- **Gradient checkpointing:** Enabled to save memory (see the preparation sketch below).
- **Mixed precision:** Enabled with fp16.
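A minimal sketch of that preparation, assuming a recent PEFT release (its helper bundles the fp32 casting and input-gradient steps; run this before wrapping the model with LoRA):
```python
from peft import prepare_model_for_kbit_training

# Recompute activations during backprop instead of storing them all.
model.gradient_checkpointing_enable()
# Cast layer norms / LM head to fp32 and enable input grads on the frozen 8-bit base.
model = prepare_model_for_kbit_training(model)
```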
---
## Evaluation
### Testing Data
- Same dataset (Abirate/english_quotes).
- No held-out test set reported in training script.
### Metrics
- No formal metrics logged; evaluation was qualitative (checking generated tags).
### Results
- The model successfully learns to generate tags for English quotes after training.
---
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).
- **Hardware Type:** Single NVIDIA T4 GPU
- **Cloud Provider:** Google Colab
---
## Technical Specifications
### Model Architecture and Objective
- Base model: **BLOOM-3B**, causal language modeling objective.
- Fine-tuned with **LoRA adapters** using PEFT.
### Compute Infrastructure
Identical to the setup described under Training Details above: a single NVIDIA T4 GPU with PyTorch, Transformers, PEFT, Datasets, Accelerate, and bitsandbytes.
---
## Citation
If you use this model, please cite:
**BibTeX:**
```bibtex
@misc{jay24ai2025bloomlora,
  title        = {LoRA Fine-Tuned BLOOM-3B for Quote Tagging},
  author       = {Jay24-AI},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/Jay24-AI/bloom-3b-lora-tagger}}
}
```
---
## Model Card Contact
For questions or issues, contact the maintainer via Hugging Face discussions: https://huggingface.co/Jay24-AI/bloom-3b-lora-tagger/discussions