---
library_name: transformers
tags:
- text-generation
- ad-generation
- marketing
- transformers
- pytorch
- beam-search
---

# Model Card for Falcon-RW-1B Fine-Tuned Model

This model is a fine-tuned version of `tiiuae/falcon-rw-1b`, trained on an advertising-related dataset to generate ad text based on prompts.

## Model Details

### Model Description

This model is a fine-tuned version of the Falcon-RW-1B model, specifically adapted for generating advertising content. The fine-tuning process used a dataset of ad-related text formatted as structured prompt-response pairs.

- **Developed by:** Adnane Touiyate
- **Funded by:** [Adnane10](https://huggingface.co/Adnane10)
- **Shared by:** [Adnane10](https://huggingface.co/Adnane10)
- **Model type:** Falcon-RW-1B (causal language model)
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** `tiiuae/falcon-rw-1b`

## Uses

### Direct Use

This model can be used to generate advertising content from structured prompts. It is useful for marketers and advertisers who need AI-generated ad copy.

### Downstream Use

The model can be further fine-tuned for specific ad categories or integrated into larger marketing automation workflows.

### Out-of-Scope Use

This model is not intended for generating non-advertising content, and its performance may be suboptimal on general text generation tasks beyond its training scope.

## Bias, Risks, and Limitations

Since the model has been fine-tuned on advertising content, it may inherit biases present in the dataset. Users should be cautious when generating ads to ensure they meet ethical and regulatory standards.

### Recommendations

Users should validate the generated content for appropriateness, compliance, and factual accuracy before using it in real-world applications.

## How to Get Started with the Model

Use the code below to load the model and generate ad text:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Run on GPU when available; inputs and model must live on the same device.
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-rw-1b")
# Replace with the actual path or Hub ID of the fine-tuned checkpoint.
model = AutoModelForCausalLM.from_pretrained("path_to_finetuned_model").to(device)

def generate_ad(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_length=100)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generate_ad("Introducing our latest product: "))
```

## Training Details

### Training Data

The model was trained on `fixed_ads_list.json`, a dataset containing structured ad-related prompts and responses.
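
As a minimal sketch of how such a file can be loaded and formatted with the `datasets` library (the `prompt` and `response` field names are assumptions, not documented):

```python
from datasets import load_dataset

# Load the local JSON file as a single training split.
dataset = load_dataset("json", data_files="fixed_ads_list.json", split="train")

def to_training_text(example):
    # Collapse each pair into the single-string format used during fine-tuning.
    # The "prompt" and "response" field names are assumptions.
    return {"text": f"### Prompt: {example['prompt']} ### Response: {example['response']}"}

dataset = dataset.map(to_training_text)
```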

### Training Procedure

- **Preprocessing:** Tokenized text in the format `### Prompt: [User Input] ### Response: [Ad Text]`
- **Quantization:** 4-bit quantization (NF4) with `bitsandbytes` for memory efficiency (see the sketch after this list).
- **Fine-tuning method:** LoRA (Low-Rank Adaptation) for parameter-efficient adaptation.
- **Hardware:** GPU-accelerated training.
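
The exact quantization and LoRA settings are not published with this card; the sketch below shows one plausible setup with `transformers`, `bitsandbytes`, and `peft`. The LoRA rank, alpha, dropout, and target modules are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization, as described above; bf16 compute matches the
# training precision listed below.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-rw-1b",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter; r/alpha/dropout and the target module name are assumptions
# ("query_key_value" is the fused attention projection in Falcon models).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```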

#### Training Hyperparameters

- **Learning Rate:** 1e-4
- **Batch Size:** 2 (per device)
- **Gradient Accumulation:** 8 steps
- **Epochs:** 6
- **Precision:** BF16
- **Evaluation Strategy:** Epoch-based
- **Early Stopping:** Enabled after 2 epochs without improvement (see the configuration sketch after this list)
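
A minimal `TrainingArguments`/`Trainer` configuration mirroring these values; `output_dir`, `train_dataset`, and `eval_dataset` are placeholders, and `model` is assumed to be the LoRA-wrapped model from the previous sketch.

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="falcon-rw-1b-ads",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=6,
    bf16=True,
    eval_strategy="epoch",          # "evaluation_strategy" on older transformers
    save_strategy="epoch",
    load_best_model_at_end=True,    # required for early stopping
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,    # placeholder dataset splits
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```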

## Evaluation

### Testing Data, Factors & Metrics

- **Metrics:** BLEU and ROUGE scores
- **Results:** Evaluated on sample generations; exact scores are not reported in this card. A computation sketch follows.
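
As a minimal sketch of how BLEU and ROUGE can be computed with the `evaluate` library (the prediction and reference strings are purely illustrative):

```python
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

# Illustrative generated ad and reference ad; a real evaluation would use
# held-out prompts and their reference responses from the dataset.
predictions = ["Introducing our new eco-friendly running shoes, built for comfort."]
references = ["Meet our new eco-friendly running shoes, designed for all-day comfort."]

print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
print(rouge.compute(predictions=predictions, references=references))
```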

## Environmental Impact

- **Hardware Type:** NVIDIA P100 GPU
- **Hours used:** ~0.9 (about 54 minutes)
- **Cloud Provider:** Kaggle

### Model Architecture and Objective

Falcon-RW-1B is a causal language model optimized for text generation.

### Compute Infrastructure

#### Hardware

- GPU (NVIDIA P100)
- `bitsandbytes` for memory-efficient training

#### Software

- `transformers`
- `datasets`
- `peft`
- `torch`
- `accelerate`
- `bitsandbytes`

## Model Card Authors

**Adnane Touiyate** ([@Adnane10](https://huggingface.co/Adnane10))

## Contact

For questions or collaborations, reach out via [LinkedIn](https://www.linkedin.com/in/adnanetouiyate/) or email [adnanetouiayte11@gmail.com](mailto:adnanetouiayte11@gmail.com).