---
library_name: transformers
tags: [causal-lm, bloom, lora, peft, finetuning, english]
---

# Model Card for Jay24-AI/bloom-7b1-lora-tagger

This model is a **LoRA fine-tuned version of BigScience’s BLOOM-7B1**, trained on a dataset of English quotes. The goal was to adapt BLOOM with the [PEFT](https://github.com/huggingface/peft) (Parameter-Efficient Fine-Tuning) library and [LoRA](https://arxiv.org/abs/2106.09685), making it lightweight to train and efficient to deploy.

## Model Details

### Model Description

- **Developed by:** Jay24-AI
- **Funded by:** N/A
- **Shared by:** Jay24-AI
- **Model type:** Causal language model with LoRA adapters
- **Language(s):** English
- **License:** bigscience-bloom-rail-1.0 (inherited from `bigscience/bloom-7b1`, which is released under the BigScience RAIL license rather than Apache-2.0)
- **Finetuned from model:** [bigscience/bloom-7b1](https://huggingface.co/bigscience/bloom-7b1)

### Model Sources

- **Repository:** https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger

## Uses

### Direct Use

The model can be used for **text generation and tagging** of quote-like prompts: given a quote formatted as `<quote> ->: ` (the training format), the model completes the prompt with descriptive tags.

### Downstream Use

- Can be further fine-tuned on custom tagging or classification datasets; a sketch of resuming training from the adapters follows this list.
- Can be integrated into applications that need lightweight **quote classification**, **text annotation**, or **prompt-based generation**.
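
As a minimal sketch of the first option (assuming PEFT's `is_trainable` flag for loading adapters in training mode, and 8-bit loading as in the inference example below), further fine-tuning could start like this:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the 8-bit base model, then attach the adapters in trainable mode
# so they can be fine-tuned further on a custom dataset.
base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1", load_in_8bit=True, device_map="auto"
)
model = PeftModel.from_pretrained(
    base, "Jay24-AI/bloom-7b1-lora-tagger", is_trainable=True
)
```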

### Out-of-Scope Use

- Not suitable for factual question answering.
- Not designed for sensitive or high-stakes decision-making (e.g., medical, legal, or financial advice).

## Bias, Risks, and Limitations

- Inherits the limitations and biases of **BLOOM-7B1**, which was trained on large-scale internet data.
- The fine-tuning dataset ([Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes)) is relatively small, so the model may overfit and generalize poorly outside similar data.
- May generate irrelevant or biased tags when prompted outside the intended scope.
- The short training run (50 optimizer steps) may leave performance suboptimal.

### Recommendations

Users should:

- Validate outputs before production use.
- Avoid relying on the model for critical applications.

## How to Get Started with the Model

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "Jay24-AI/bloom-7b1-lora-tagger"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the 8-bit quantized base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapters
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()

# The prompt mirrors the training format: <quote> ->: <tags>
batch = tokenizer(
    "“The only way to do great work is to love what you do.” ->: ",
    return_tensors="pt",
).to(model.device)

with torch.no_grad(), torch.autocast("cuda"):
    output_tokens = model.generate(**batch, max_new_tokens=50)

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- **Dataset used:** [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes)
- **Subset:** Entire training split (exact size not specified in the training script).
- **Structure:** Each entry includes a `quote` and its corresponding `tags`.
- **Preprocessing** (see the sketch below):
  - Combined the `quote` and `tags` into a single text string: `<quote> ->: <tags>`
  - Tokenized with the `AutoTokenizer` of **bigscience/bloom-7b1**.
  - Batched via Hugging Face `datasets.map` with `batched=True`.
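
A hedged sketch of this preprocessing; the `merge_columns` helper and the `prediction` column name are illustrative assumptions, not necessarily the names used in the original script:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
dataset = load_dataset("Abirate/english_quotes")

def merge_columns(example):
    # Build the "<quote> ->: <tags>" training string from each record.
    example["prediction"] = example["quote"] + " ->: " + str(example["tags"])
    return example

dataset["train"] = dataset["train"].map(merge_columns)

# Tokenize the merged strings in batches.
data = dataset.map(lambda samples: tokenizer(samples["prediction"]), batched=True)
```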

### Training Procedure

#### Preprocessing

- Converted text examples into the `<quote> ->: <tags>` format.
- Tokenized with BLOOM’s tokenizer using default settings.
- Used `DataCollatorForLanguageModeling` with `mlm=False` (causal LM objective); the collator appears in the training sketch below.

#### Training Hyperparameters

- **Base model:** bigscience/bloom-7b1
- **Adapter method:** LoRA via PEFT
- **LoRA configuration:**
  - `r`: 8
  - `lora_alpha`: 16
  - `lora_dropout`: 0.05
  - `bias`: "none"
  - `task_type`: "CAUSAL_LM"
- **TrainingArguments:**
  - `per_device_train_batch_size`: 2
  - `gradient_accumulation_steps`: 2
  - `warmup_steps`: 100 (note: this exceeds `max_steps`, so the learning rate is still warming up when training stops)
  - `max_steps`: 50
  - `learning_rate`: 2e-4
  - `fp16`: True
  - `logging_steps`: 1
  - `output_dir`: `outputs/`
- **Precision regime:** Mixed precision (fp16) with 8-bit quantization via `bitsandbytes`.
- **Caching:** `model.config.use_cache = False` during training to suppress warnings (re-enable for inference).
- **Additional settings** (reflected in the sketch after this list):
  - Base model weights frozen; small parameters (e.g., layer norms) cast to FP32 for stability.
  - Gradient checkpointing enabled to reduce memory usage.
  - `lm_head` modified to output FP32 for stability.
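
A hedged reconstruction of the training setup from the values above. It assumes the quantized `model` and tokenized `data` from the earlier sketches; `prepare_model_for_kbit_training` is one way to get the freezing, FP32 casting, and gradient checkpointing described under "Additional settings" (the original script may have performed these steps manually):

```python
import transformers
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Freeze base weights, cast norms/lm_head to FP32, enable gradient checkpointing.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=2,
        warmup_steps=100,
        max_steps=50,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # suppress warnings; re-enable for inference
trainer.train()
```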

#### Hyperparameter Summary

| Hyperparameter              | Value                   |
|-----------------------------|-------------------------|
| Base model                  | bigscience/bloom-7b1    |
| Adapter method              | LoRA (via PEFT)         |
| LoRA r                      | 8                       |
| LoRA alpha                  | 16                      |
| LoRA dropout                | 0.05                    |
| Bias                        | none                    |
| Task type                   | Causal LM               |
| Batch size (per device)     | 2                       |
| Gradient accumulation steps | 2                       |
| Effective batch size        | 4                       |
| Warmup steps                | 100                     |
| Max steps                   | 50                      |
| Learning rate               | 2e-4                    |
| Precision                   | fp16 (mixed precision)  |
| Quantization                | 8-bit (bitsandbytes)    |
| Logging steps               | 1                       |
| Output directory            | outputs/                |
| Gradient checkpointing      | Enabled                 |
| Use cache                   | False (during training) |

### Speeds, Sizes, Times

- **Trainable parameters:** LoRA adapters only (~0.1% of BLOOM-7B1’s ~7.1 billion parameters; the exact count is printed by `print_trainable_parameters`, as shown below).
- **Approximate size:** Much smaller than the full 7B checkpoint, since only the adapters are stored.
- **Max steps:** 50 optimizer updates (~100 forward/backward micro-batches with gradient accumulation of 2).
- **Training runtime:** Not logged in the script; depends on the GPU.
- **Effective batch size:** 4 (per-device batch of 2 × accumulation of 2).
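
A quick way to check the exact count (a sketch assuming the PEFT-wrapped `model` from the training sketch above):

```python
# Prints a line of the form:
# trainable params: <n> || all params: <N> || trainable%: <pct>
model.print_trainable_parameters()
```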

### Compute Infrastructure

- **Hardware:** Single CUDA GPU (selected with `os.environ["CUDA_VISIBLE_DEVICES"] = "0"`; per the Environmental Impact section, an NVIDIA T4 on Google Colab).
- **Software:**
  - PyTorch
  - Hugging Face Transformers (main branch from GitHub)
  - Hugging Face PEFT (main branch from GitHub)
  - Hugging Face Datasets
  - Accelerate
  - bitsandbytes (for 8-bit quantization)
- **Gradient checkpointing:** Enabled to save memory.
- **Mixed precision:** fp16.
- **Quantization:** 8-bit via `bitsandbytes`, with `torch.float16` as the compute dtype (note: double quantization and the `nf4` quant type are 4-bit-only `BitsAndBytesConfig` options and do not apply to 8-bit loading).
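
A minimal sketch of the 8-bit loading configuration implied above (the exact `BitsAndBytesConfig` used by the training script is an assumption):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 8-bit quantized loading with fp16 compute, as described above.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",
    quantization_config=bnb_config,
    torch_dtype=torch.float16,
    device_map="auto",
)
```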

## Evaluation

### Testing Data

- Same dataset (`Abirate/english_quotes`).
- No held-out test set is reported in the training script.

### Metrics

- No formal metrics were logged; evaluation was qualitative (inspecting generated tags).

### Results

- After training, the model generates plausible tags for English quotes, as demonstrated by the inference example above.

## Environmental Impact

Carbon emissions can be estimated with the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).

- **Hardware Type:** Single NVIDIA T4 GPU
- **Cloud Provider:** Google Colab

## Technical Specifications

### Model Architecture and Objective

- Base model: **BLOOM-7B1**, causal language modeling objective.
- Fine-tuned with **LoRA adapters** via PEFT.

### Compute Infrastructure

- **Hardware:** Single GPU (CUDA device 0).
- **Software:**
  - PyTorch
  - Hugging Face Transformers
  - Hugging Face PEFT
  - Hugging Face Datasets
  - Accelerate
  - bitsandbytes

## Citation

If you use this model, please cite:

```bibtex
@misc{jay24ai2025bloomlora,
  title        = {LoRA Fine-Tuned BLOOM-7B1 for Quote Tagging},
  author       = {Jay24-AI},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger}}
}
```

## Model Card Contact

For questions or issues, contact the maintainer via Hugging Face discussions: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger/discussions