---
license: other
license_name: cc-by-nc-sa-4
license_link: https://creativecommons.org/licenses/by-nc-sa/4.0/
language:
- en
base_model:
- Qwen/Qwen3-1.7B-Base
tags:
- WFPB
- vegan
- nutrition
- biology
- medical
---

World's first Whole Foods Plant-Based (WFPB) LLM: a Qwen3 1.7B model fine-tuned with a LoRA adapter trained on 1,255 synthetically generated training pairs from blog posts and video transcripts from Nutritionfacts.org.
Unless properly prompted, it may produce long responses or Chinese characters (Qwen3 is a multilingual model from Alibaba). Created by Toby Miller.
LinkedIn: https://www.linkedin.com/in/robertmilleree/

**Note**: The chat template needs adjustment, but the model works with the following Python script:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

repo = "tobymiller2/Plant-Based-LLM"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, dtype="auto")

prompt = "What is the best diet for human health and longevity? what about meat?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=False,
    no_repeat_ngram_size=3,
    repetition_penalty=1.3,
    pad_token_id=tokenizer.eos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

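The decoding arguments above (`no_repeat_ngram_size=3`, `repetition_penalty=1.3`) are what keep the long, repetitive responses mentioned earlier in check. As an illustration of what `no_repeat_ngram_size` means, here is a pure-Python sketch of the idea (not the actual `transformers` internals): before each generation step, any token that would complete an already-seen 3-gram is banned.

```python
def banned_next_tokens(generated, n):
    """Tokens that would complete an n-gram already present in `generated`.

    Mirrors the idea behind generate(..., no_repeat_ngram_size=n): before
    sampling each new token, any candidate that would repeat an existing
    n-gram gets its logit set to -inf. Pure-Python sketch for illustration.
    """
    if len(generated) < n - 1:
        return set()
    prefix = tuple(generated[len(generated) - (n - 1):])  # last n-1 tokens
    banned = set()
    # Scan every historical n-gram; if its first n-1 tokens match the
    # current suffix, its final token may not be generated next.
    for i in range(len(generated) - n + 1):
        ngram = tuple(generated[i:i + n])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


# [5, 6, 7] already occurred, and the sequence currently ends in 5, 6,
# so 7 is banned as the next token:
print(banned_next_tokens([5, 6, 7, 5, 6], 3))  # {7}
```

`repetition_penalty` works differently: it down-weights the logits of all previously generated tokens rather than hard-banning n-gram completions.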
**Example output:**

Q: "What is the best diet for human health and longevity?"

A: "Plant-based diets rich in vegetables, fruits, legumes, grains, and seeds support long life with minimal chronic disease. Animal products increase inflammation and risk of cancer, while plants reduce it; this illustrates how whole food plant foods can prevent age-related diseases naturally. 植物性饮食是维持健康和延长寿命的最佳选择,它通过减少慢性病风险提供最大保护。"

Note: The Chinese output translates to: "A plant-based diet is the best choice for maintaining health and extending lifespan, as it provides maximum protection by reducing the risk of chronic diseases."
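Since outputs can drift into Chinese as shown above, a lightweight post-processing check can flag affected responses. A minimal sketch based on Unicode CJK ranges (my own helper, not part of this repository):

```python
def contains_cjk(text):
    """Return True if `text` contains CJK ideographs (e.g. Chinese output).

    Checks the CJK Unified Ideographs block (U+4E00-U+9FFF) plus
    Extension A (U+3400-U+4DBF) -- a quick heuristic for flagging
    responses that drifted out of English.
    """
    return any(
        0x4E00 <= ord(ch) <= 0x9FFF or 0x3400 <= ord(ch) <= 0x4DBF
        for ch in text
    )


print(contains_cjk("植物性饮食"))        # True
print(contains_cjk("plant-based diet"))  # False
```

A caller could retry generation with a stricter prompt (e.g. "Answer in English only.") whenever this returns True.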
For comparison, the base model (Qwen3-1.7B-Base) previously spoke favorably about meat:

Q: "Does that include meat? Is meat consumption ideal for optimal health?"

A: [long response; conclusion only] "**Conclusion** Meat can be ideal for optimal health when consumed in moderation and as part of a balanced diet. The key is to prioritize lean sources, limit processed options, and ensure variety in protein intake. For those with specific dietary needs (e.g., vegetarianism or low-sodium diets), adjustments are possible while maintaining overall health."

**Training data**

1,255 synthetically generated training pairs from blog posts and video transcripts from Nutritionfacts.org.

**Fine-tuning settings for LLaMA-Factory:**

```yaml
create_new_adapter: true
cutoff_len: 512
dataset: -
dataset_dir: -
ddp_timeout: 180000000
do_train: true
double_quantization: true
enable_thinking: false
eval_steps: 100
eval_strategy: steps
finetuning_type: lora
flash_attn: auto
fp16: true
gradient_accumulation_steps: 4
include_num_input_tokens_seen: true
learning_rate: 0.0002
logging_steps: 5
lora_alpha: 128
lora_dropout: 0.1
lora_rank: 128
lora_target: all
lr_scheduler_type: cosine
max_grad_norm: 1.0
max_samples: 100000
model_name_or_path: -
num_train_epochs: 4.0
optim: adamw_torch
output_dir: -
packing: true
per_device_eval_batch_size: 1
per_device_train_batch_size: 1
plot_loss: true
preprocessing_num_workers: 16
quantization_bit: 4
quantization_method: bnb
report_to: none
save_steps: 100
stage: sft
template: default
trust_remote_code: true
use_rslora: true
val_size: 0.15
warmup_steps: 0
```
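For a rough sense of adapter size: with `lora_rank: 128` and `lora_target: all`, every targeted weight matrix gains `r * (d_in + d_out)` trainable parameters (the low-rank factors A and B). A back-of-the-envelope sketch, using an illustrative 2048x2048 projection rather than Qwen3-1.7B's actual layer dimensions:

```python
def lora_params_per_matrix(d_in, d_out, r):
    """Trainable parameters LoRA adds to one weight matrix.

    LoRA factorizes the weight update as B @ A with A of shape (r, d_in)
    and B of shape (d_out, r), so the adapter holds r*d_in + d_out*r
    parameters per targeted matrix.
    """
    return r * d_in + d_out * r


# Hypothetical 2048x2048 projection (illustrative, not Qwen3's real dims)
# at the rank of 128 used in the config above:
print(lora_params_per_matrix(2048, 2048, 128))  # 524288

# Effective train batch size implied by the config:
# per_device_train_batch_size (1) x gradient_accumulation_steps (4) = 4
```

At rank 128 the adapter is comparatively heavy for a 1.7B base, which matches the card's goal of strongly steering the model's dietary stance.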