---
library_name: transformers
tags: [causal-lm, bloom, lora, peft, finetuning, english]
---

# Model Card for Jay24-AI/bloom-7b1-lora-tagger

This model is a **LoRA fine-tuned version of BigScience’s BLOOM-7B1**, trained on a dataset of English quotes. The goal was to adapt BLOOM using the [PEFT](https://github.com/huggingface/peft) (Parameter-Efficient Fine-Tuning) library with [LoRA](https://arxiv.org/abs/2106.09685), making it lightweight to train and efficient to deploy.

## Model Details

### Model Description

- **Developed by:** Jay24-AI
- **Funded by [optional]:** N/A
- **Shared by [optional]:** Jay24-AI
- **Model type:** Causal language model with LoRA adapters
- **Language(s):** English
- **License:** bigscience-bloom-rail-1.0 (inherited from `bigscience/bloom-7b1`)
- **Finetuned from model:** [bigscience/bloom-7b1](https://huggingface.co/bigscience/bloom-7b1)

### Model Sources

- **Repository:** https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger

## Uses

### Direct Use

The model can be used for **text generation and tagging** of quote-like prompts: given a quote as input, it generates descriptive tags.

### Downstream Use

- Can be further fine-tuned on custom tagging or classification datasets.
- Can be integrated into applications that require lightweight **quote classification**, **text annotation**, or **prompt-based generation**.

### Out-of-Scope Use

- Not suitable for factual question answering.
- Not designed for sensitive or high-stakes decision-making (e.g., medical, legal, or financial advice).

## Bias, Risks, and Limitations

- Inherits the limitations and biases of **BLOOM-7B1**, which was trained on large-scale internet data.
- The fine-tuning dataset (`Abirate/english_quotes`) is relatively small, so the model may overfit and generalize poorly outside similar data.
- May generate irrelevant or biased tags when prompted outside the intended scope.
- The short training run (50 optimizer steps) may leave performance suboptimal.

### Recommendations

Users should:

- Validate outputs before production use.
- Avoid relying on the model for critical applications.

## How to Get Started with the Model

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

peft_model_id = "Jay24-AI/bloom-7b1-lora-tagger"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the 8-bit quantized base model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapters
model = PeftModel.from_pretrained(model, peft_model_id)

batch = tokenizer(
    "“The only way to do great work is to love what you do.” ->: ",
    return_tensors="pt",
).to(model.device)

with torch.autocast("cuda"):
    output_tokens = model.generate(**batch, max_new_tokens=50)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- **Dataset used:** [Abirate/english_quotes](https://huggingface.co/datasets/Abirate/english_quotes)
- **Subset:** Entire training split (exact size not specified in the script).
- **Structure:** Each entry includes a `quote` and its corresponding `tags`.
- **Preprocessing** (see the sketch below):
  - Combined the `quote` and `tags` into a single text string of the form `"<quote> ->: <tags>"`.
  - Tokenized with the `AutoTokenizer` from **bigscience/bloom-7b1**.
  - Batched via Hugging Face `datasets.map` with `batched=True`.
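A minimal sketch of this preprocessing step, assuming the merge is applied per example before batched tokenization (the `prediction` column name and `merge_columns` helper are illustrative, not taken verbatim from the script):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
data = load_dataset("Abirate/english_quotes")

def merge_columns(example):
    # Build the "<quote> ->: <tags>" training string from the two columns
    example["prediction"] = example["quote"] + " ->: " + str(example["tags"])
    return example

data = data.map(merge_columns)

# Tokenize the merged strings in batches
data = data.map(lambda samples: tokenizer(samples["prediction"]), batched=True)
```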
### Training Procedure

#### Preprocessing

- Converted text examples into the `"quote ->: tags"` format.
- Tokenized with BLOOM’s tokenizer using default settings.
- Used `DataCollatorForLanguageModeling` with `mlm=False` (causal LM objective).

#### Training Hyperparameters

- **Base model:** bigscience/bloom-7b1
- **Adapter method:** LoRA via PEFT
- **LoRA configuration:**
  - `r`: 8
  - `lora_alpha`: 16
  - `lora_dropout`: 0.05
  - `bias`: "none"
  - `task_type`: "CAUSAL_LM"
- **TrainingArguments:**
  - `per_device_train_batch_size`: 2
  - `gradient_accumulation_steps`: 2
  - `warmup_steps`: 100 (this exceeds `max_steps`, so the learning rate ramps up for the whole run and never reaches the nominal 2e-4 peak)
  - `max_steps`: 50
  - `learning_rate`: 2e-4
  - `fp16`: True
  - `logging_steps`: 1
  - `output_dir`: `outputs/`
- **Precision regime:** Mixed precision (fp16) with 8-bit quantization via `bitsandbytes`.
- **Caching:** `model.config.use_cache = False` during training to suppress warnings.
- **Additional settings:**
  - Original model weights frozen; small parameters (e.g., layer norms) cast to FP32 for numerical stability.
  - Gradient checkpointing enabled to reduce memory usage.
  - `lm_head` modified to output FP32 for stability.

#### Hyperparameter Summary

| Hyperparameter              | Value                   |
|-----------------------------|-------------------------|
| Base model                  | bigscience/bloom-7b1    |
| Adapter method              | LoRA (via PEFT)         |
| LoRA r                      | 8                       |
| LoRA alpha                  | 16                      |
| LoRA dropout                | 0.05                    |
| Bias                        | none                    |
| Task type                   | Causal LM               |
| Batch size (per device)     | 2                       |
| Gradient accumulation steps | 2                       |
| Effective batch size        | 4                       |
| Warmup steps                | 100                     |
| Max steps                   | 50                      |
| Learning rate               | 2e-4                    |
| Precision                   | fp16 (mixed precision)  |
| Quantization                | 8-bit (bitsandbytes)    |
| Logging steps               | 1                       |
| Output directory            | outputs/                |
| Gradient checkpointing      | Enabled                 |
| Use cache                   | False (during training) |

### Speeds, Sizes, Times

- **Trainable parameters:** LoRA adapters only (~0.1% of BLOOM-7B1’s ~7.1 billion parameters; the exact count is printed by `print_trainable_parameters`).
- **Checkpoint size:** Far smaller than the full 7B checkpoint, since only the adapter weights are stored.
- **Max steps:** 50 optimizer updates (~100 forward/backward passes with gradient accumulation of 2).
- **Training runtime:** Not logged in the script; depends on the GPU.
- **Effective batch size:** 4 (per-device batch size 2 × 2 accumulation steps).

## Evaluation

### Testing Data

- Same dataset (`Abirate/english_quotes`).
- No held-out test set was reported in the training script.

### Metrics

- No formal metrics were logged; evaluation was qualitative (inspecting generated tags).

### Results

- After training, the model generates plausible tags for English quotes, as demonstrated by the inference example above.

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).

- **Hardware Type:** Single NVIDIA T4 GPU
- **Cloud Provider:** Google Colab

## Technical Specifications

### Model Architecture and Objective

- Base model: **BLOOM-7B1**, trained with a causal language modeling objective.
- Fine-tuned with **LoRA adapters** using PEFT (see the sketch below).
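A minimal sketch of the training setup, reconstructed from the hyperparameters above. It is not the exact script: `prepare_model_for_kbit_training` is assumed as a stand-in for the manual freezing and FP32 casting described earlier, and `data` refers to the tokenized dataset from the preprocessing sketch.

```python
import transformers
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the base model in 8-bit, as in the training script
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")

# Freezes base weights, casts norms to FP32, enables gradient checkpointing
model = prepare_model_for_kbit_training(model)

# LoRA configuration matching the hyperparameters above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the adapter parameter count

trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],  # tokenized dataset from the preprocessing sketch
    args=transformers.TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=2,
        warmup_steps=100,
        max_steps=50,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # silence warnings; re-enable for inference
trainer.train()
```

Because only the LoRA matrices receive gradients, the saved checkpoint contains just the adapter weights, which is why the published artifact is far smaller than the full 7B model.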
### Compute Infrastructure

- **Hardware:** Single CUDA GPU (device 0, set via `os.environ["CUDA_VISIBLE_DEVICES"] = "0"`; an NVIDIA T4 on Google Colab, per the Environmental Impact section).
- **Software:**
  - PyTorch
  - Hugging Face Transformers (main branch from GitHub)
  - Hugging Face PEFT (main branch from GitHub)
  - Hugging Face Datasets
  - Accelerate
  - bitsandbytes (for 8-bit quantization)
- **Gradient checkpointing:** Enabled to save memory.
- **Mixed precision:** fp16.
- **Quantization:** 8-bit via `bitsandbytes` (`load_in_8bit=True`) with a `torch.float16` compute dtype.

## Citation

If you use this model, please cite:

```bibtex
@misc{jay24ai2025bloomlora,
  title={LoRA Fine-Tuned BLOOM-7B1 for Quote Tagging},
  author={Jay24-AI},
  year={2025},
  howpublished={\url{https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger}}
}
```

## Model Card Contact

For questions or issues, contact the maintainer via the Hugging Face discussions page: https://huggingface.co/Jay24-AI/bloom-7b1-lora-tagger/discussions