|
|
--- |
|
|
license: other |
|
|
base_model: microsoft/phi-1_5 |
|
|
tags: |
|
|
- bees |
|
|
- honey |
|
|
- bzz |
|
|
metrics: |
|
|
- accuracy |
|
|
datasets: |
|
|
- BEE-spoke-data/bees-internal |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# phi-1bee5 🐝 |
|
|
|
|
|
> Where Code Meets Beekeeping: An Unbeelievable Synergy! |
|
|
|
|
|
<a href="https://colab.research.google.com/gist/pszemraj/7ea68b3b71ee4e6c0729d2318f3f4158/we-bee-testing.ipynb"> |
|
|
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> |
|
|
</a> |
|
|
|
|
|
Have you ever found yourself in the depths of a debugging session and thought, "I wish I could be basking in the glory of a blooming beehive right now"? Or maybe you've been donning your beekeeping suit, puffing on your smoker, and longing for the sweet aroma of freshly written code?
|
|
|
|
|
Well, brace yourselves, hive-minded humans and syntax-loving sapiens, for `phi-1bee5`, a groundbreaking transformer model that's here to disrupt your apiary and your IDE! |
|
|
|
|
|
|
|
|
## Details |
|
|
|
|
|
This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the `BEE-spoke-data/bees-internal` dataset. |
|
|
|
|
|
It achieves the following results on the evaluation set: |
|
|
- Loss: 2.6982 |
|
|
- Accuracy: 0.4597 |
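For intuition, the evaluation loss is a per-token cross-entropy, so it corresponds to a perplexity of roughly exp(2.6982) ≈ 14.9:

```python
import math

eval_loss = 2.6982  # reported evaluation loss (per-token cross-entropy)
print(f"perplexity ≈ {math.exp(eval_loss):.1f}")  # ~14.9
```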
|
|
|
|
|
## Usage |
|
|
|
|
|
Load the model:
|
|
|
|
|
```python |
|
|
# !pip install -U -q transformers accelerate einops
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
|
|
|
|
|
checkpoint = "BEE-spoke-data/phi-1bee5" |
|
|
tokenizer = AutoTokenizer.from_pretrained(checkpoint) |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
checkpoint, |
|
|
device_map="auto", |
|
|
torch_dtype=torch.float16, |
|
|
trust_remote_code=True |
|
|
) |
|
|
``` |
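Note: `device_map="auto"` relies on `accelerate` to place the weights, and `torch_dtype=torch.float16` halves the memory footprint; on a CPU-only machine you may want to drop both and load in full precision.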
|
|
Run inference: |
|
|
|
|
|
```python |
|
|
prompt = "Today was an amazing day because" |
|
|
inputs = tokenizer(prompt, return_tensors="pt", return_attention_mask=False).to( |
|
|
model.device |
|
|
) |
|
|
|
|
|
outputs = model.generate( |
|
|
**inputs, do_sample=True, max_new_tokens=128, epsilon_cutoff=7e-4 |
|
|
) |
|
|
result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0] |
|
|
print(result) |
|
|
# output will probably contain a story/info about bees |
|
|
``` |
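Here `epsilon_cutoff` enables epsilon sampling, which drops candidate tokens whose probability falls below the cutoff before sampling; it is optional. If you prefer the high-level API, a minimal sketch using `pipeline` (reusing the objects loaded above; generation settings are just one reasonable choice):

```python
from transformers import pipeline

# reuse the model/tokenizer loaded above so the weights aren't downloaded twice
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

result = generator(
    "Today was an amazing day because",
    do_sample=True,
    max_new_tokens=128,
)
print(result[0]["generated_text"])
```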
|
|
|
|
|
|
|
|
### Intended Uses: |
|
|
|
|
|
1. **Educational Edification**: Are you a coding novice with a budding interest in beekeeping? Or perhaps a seasoned developer whose curiosity has been piqued by the buzzing in your backyard? phi-1bee5 aims to serve as a fun, informative bridge between these two worlds. |
|
|
2. **Casual Queries**: This model can generate code examples and beekeeping tips. It's perfect for those late-night coding sessions when you feel like taking a virtual stroll through an apiary. |
|
|
3. **Academic & Research Insights**: Interested in interdisciplinary studies that explore the intersection of technology and ecology? phi-1bee5 might offer some amusing, if not entirely accurate, insights. |
|
|
|
|
|
### Limitations: |
|
|
|
|
|
1. **Not a beekeeping expert**: For the love of all things hexagonal, please do not use phi-1bee5 to make serious beekeeping decisions. While our model is well-read in the beekeeping literature, it lacks the practical experience and nuanced understanding that professional beekeepers possess.
|
|
2. **Licensing**: This model is derived from a base model under the Microsoft Research License. Any use must comply with the terms of that license. |
|
|
3. **Fallibility**: Like any machine learning model, phi-1bee5 can make mistakes. Always double-check the code and bee facts before using them in production or in your hive.
|
|
4. **Ethical Constraints**: This model may not be used for illegal or unethical activities, including but not limited to terrorism, harassment, or spreading disinformation. |
|
|
|
|
|
## Training procedure |
|
|
|
|
|
While the full dataset is not yet complete and therefore not yet released for "safety reasons", you can check out a preliminary sample at: [bees-v0](https://huggingface.co/datasets/BEE-spoke-data/bees-v0) |
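If you want to inspect the sample yourself, it should load with the standard 🤗 `datasets` API (a minimal sketch; the split and column names are assumptions, check the dataset page):

```python
from datasets import load_dataset

# pull the public preview of the bees dataset
ds = load_dataset("BEE-spoke-data/bees-v0")
print(ds)              # show available splits and columns
print(ds["train"][0])  # peek at one example (assumes a "train" split)
```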
|
|
|
|
|
### Training hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training (a sketch mapping them to `TrainingArguments` follows the list):
|
|
- learning_rate: 0.0001 |
|
|
- train_batch_size: 1 |
|
|
- eval_batch_size: 2 |
|
|
- gradient_accumulation_steps: 32 |
|
|
- total_train_batch_size: 32 |
|
|
- optimizer: Adam with betas=(0.9,0.995) and epsilon=1e-08 |
|
|
- lr_scheduler_type: cosine |
|
|
- lr_scheduler_warmup_ratio: 0.03 |
|
|
- num_epochs: 2.0 |
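As a reference point, here is a minimal sketch of how these values map onto 🤗 `TrainingArguments`, assuming the standard `Trainer` was used (the actual training script is not published, so treat this as a reconstruction, not the exact config):

```python
from transformers import TrainingArguments

# hypothetical reconstruction of the hyperparameters listed above
args = TrainingArguments(
    output_dir="phi-1bee5",
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=32,  # effective train batch size: 32
    adam_beta1=0.9,
    adam_beta2=0.995,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=2.0,
)
```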