|
|
--- |
|
|
license: other |
|
|
base_model: microsoft/phi-1_5 |
|
|
tags: |
|
|
- bees |
|
|
- honey |
|
|
- bzz |
|
|
metrics: |
|
|
- accuracy |
|
|
datasets: |
|
|
- BEE-spoke-data/bees-internal |
|
|
language: |
|
|
- en |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# phi-1bee5 🐝 |
|
|
|
|
|
> Where Code Meets Beekeeping: An Unbeelievable Synergy! |
|
|
|
|
|
<a href="https://colab.research.google.com/gist/pszemraj/7ea68b3b71ee4e6c0729d2318f3f4158/we-bee-testing.ipynb"> |
|
|
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> |
|
|
</a> |
|
|
|
|
|
Have you ever found yourself in the depths of a debugging session and thought, "I wish I could be basking in the glory of a blooming beehive right now"? Or maybe you've been donning your beekeeping suit, puffing on your smoker, and longing for the sweet aroma of freshly written code?
|
|
|
|
|
Well, brace yourselves, hive-minded humans and syntax-loving sapiens, for `phi-1bee5`, a groundbreaking transformer model that's here to disrupt your apiary and your IDE! |
|
|
|
|
|
|
|
|
## Details |
|
|
|
|
|
This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-1_5) on the `BEE-spoke-data/bees-internal` dataset. |
|
|
|
|
|
It achieves the following results on the evaluation set: |
|
|
- Loss: 2.6982 |
|
|
- Accuracy: 0.4597 |
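For intuition, the evaluation loss is a per-token cross-entropy, so it corresponds to a perplexity of roughly exp(2.6982) ≈ 14.9:

```python
import math

eval_loss = 2.6982  # reported evaluation loss (per-token cross-entropy)
print(f"perplexity ≈ {math.exp(eval_loss):.1f}")  # ~14.9
```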
|
|
|
|
|
## Usage |
|
|
|
|
|
Load the model:
|
|
|
|
|
```python |
|
|
# !pip install -U -q transformers accelerate einops
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
|
|
|
|
|
checkpoint = "BEE-spoke-data/phi-1bee5" |
|
|
tokenizer = AutoTokenizer.from_pretrained(checkpoint) |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
checkpoint, |
|
|
device_map="auto", |
|
|
torch_dtype=torch.float16, |
|
|
trust_remote_code=True |
|
|
) |
|
|
``` |
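Note: `device_map="auto"` relies on `accelerate` to place the weights, and `torch_dtype=torch.float16` halves the memory footprint; on a CPU-only machine you may want to drop both and load in full precision.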
|
|
Run inference: |
|
|
|
|
|
```python |
|
|
prompt = "Today was an amazing day because" |
|
|
inputs = tokenizer(prompt, return_tensors="pt", return_attention_mask=False).to( |
|
|
model.device |
|
|
) |
|
|
|
|
|
outputs = model.generate( |
|
|
**inputs, do_sample=True, max_new_tokens=128, epsilon_cutoff=7e-4 |
|
|
) |
|
|
result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0] |
|
|
print(result) |
|
|
# output will probably contain a story/info about bees |
|
|
``` |
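Here `epsilon_cutoff` enables epsilon sampling, which drops candidate tokens whose probability falls below the cutoff before sampling; it is optional. If you prefer the high-level API, a minimal sketch using `pipeline` (reusing the objects loaded above; generation settings are just one reasonable choice):

```python
from transformers import pipeline

# reuse the model/tokenizer loaded above so the weights aren't downloaded twice
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

result = generator(
    "Today was an amazing day because",
    do_sample=True,
    max_new_tokens=128,
)
print(result[0]["generated_text"])
```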
|
|
|
|
|
|
|
|
### Intended Uses: |
|
|
|
|
|
1. **Educational Edification**: Are you a coding novice with a budding interest in beekeeping? Or perhaps a seasoned developer whose curiosity has been piqued by the buzzing in your backyard? phi-1bee5 aims to serve as a fun, informative bridge between these two worlds. |
|
|
2. **Casual Queries**: This model can generate code examples and beekeeping tips. It's perfect for those late-night coding sessions when you feel like taking a virtual stroll through an apiary. |
|
|
3. **Academic & Research Insights**: Interested in interdisciplinary studies that explore the intersection of technology and ecology? phi-1bee5 might offer some amusing, if not entirely accurate, insights. |
|
|
|
|
|
### Limitations: |
|
|
|
|
|
1. **Not a beekeeping expert**: For the love of all things hexagonal, please do not use phi-1bee5 to make serious beekeeping decisions. While our model is well-read in the beekeeping literature, it lacks the practical experience and nuanced understanding that professional beekeepers possess.
|
|
2. **Licensing**: This model is derived from a base model under the Microsoft Research License. Any use must comply with the terms of that license. |
|
|
3. **Fallibility**: Like any machine learning model, phi-1bee5 can make mistakes. Always double-check the code and bee facts before using them in production or in your hive.
|
|
4. **Ethical Constraints**: This model may not be used for illegal or unethical activities, including but not limited to terrorism, harassment, or spreading disinformation. |
|
|
|
|
|
## Training procedure |
|
|
|
|
|
While the full dataset is not yet complete and therefore not yet released for "safety reasons", you can check out a preliminary sample at: [bees-v0](https://huggingface.co/datasets/BEE-spoke-data/bees-v0) |
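If you want to inspect the sample yourself, it should load with the standard 🤗 `datasets` API (a minimal sketch; the split and column names are assumptions, check the dataset page):

```python
from datasets import load_dataset

# pull the public preview of the bees dataset
ds = load_dataset("BEE-spoke-data/bees-v0")
print(ds)              # show available splits and columns
print(ds["train"][0])  # peek at one example (assumes a "train" split)
```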
|
|
|
|
|
### Training hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training (a sketch mapping them to `TrainingArguments` follows the list):
|
|
- learning_rate: 0.0001 |
|
|
- train_batch_size: 1 |
|
|
- eval_batch_size: 2 |
|
|
- gradient_accumulation_steps: 32 |
|
|
- total_train_batch_size: 32 |
|
|
- optimizer: Adam with betas=(0.9,0.995) and epsilon=1e-08 |
|
|
- lr_scheduler_type: cosine |
|
|
- lr_scheduler_warmup_ratio: 0.03 |
|
|
- num_epochs: 2.0 |
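As a reference point, here is a minimal sketch of how these values map onto 🤗 `TrainingArguments`, assuming the standard `Trainer` was used (the actual training script is not published, so treat this as a reconstruction, not the exact config):

```python
from transformers import TrainingArguments

# hypothetical reconstruction of the hyperparameters listed above
args = TrainingArguments(
    output_dir="phi-1bee5",
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=32,  # effective train batch size: 32
    adam_beta1=0.9,
    adam_beta2=0.995,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=2.0,
)
```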