	---
	language:
	- en
	license: llama2
	library_name: peft
	tags:
	- law
	- startups
	- finance
	- tax
	- Algerian
	datasets:
	- TuningAI/Startups_V2
	pipeline_tag: conversational
	base_model: meta-llama/Llama-2-13b-chat-hf
	---

	## Model Name: Llama2_13B_startup_Assistant

	## Description:

	Llama2_13B_startup_Assistant is a specialized language model fine-tuned from Meta's Llama 2 13B chat model.
	It is tailored to answer questions about Algerian startups, offering insights and guidance on the startup ecosystem and related tax law.

	## Base Model:
	This model is based on Meta's meta-llama/Llama-2-13b-chat-hf checkpoint,
	a chat-tuned foundation for generating human-like text responses.

	## Dataset:
	This model was fine-tuned on a custom, manually curated dataset of more than 200 unique examples.
	The dataset combines manual entries with examples generated by the GPT-3.5, GPT-4, and Falcon 180B models.

	## Fine-tuning Techniques:
	Fine-tuning was performed using QLoRA (Quantized LoRA), an extension of LoRA that trains low-rank adapters on top of a frozen, quantized base model to reduce memory usage.
	The base model was loaded with 4-bit NormalFloat (NF4) quantization. Note that Double Quantization, although part of the QLoRA method, was disabled in the training config listed below (`bnb_4bit_use_double_quant: False`).
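
To give a rough sense of why 4-bit quantization matters for a model of this size, the sketch below estimates weight storage at 16 bits versus 4 bits per parameter. It is back-of-the-envelope arithmetic only: the parameter count is assumed from the "13B" in the model name, the helper `weight_memory_gb` is a hypothetical name introduced here, and the estimate ignores quantization constants, activations, the KV cache, and the LoRA adapter weights.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: nominal parameter count taken from the "13B" in the model name.
NUM_PARAMS = 13e9

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # float16 weights
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit (e.g. NF4) weights

print(f"float16 weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:   ~{nf4_gb:.1f} GB")
```

Under these assumptions the weights shrink to about a quarter of their float16 size, which is what makes loading the base model on a single GPU plausible.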

	## Performance:
	Llama2_13B_startup_Assistant exhibits improved performance and efficiency in addressing queries related to Algerian tax law and startups,
	making it a valuable resource for individuals and businesses navigating these areas.

	## Limitations:

	* While highly specialized, this model may not cover every nuanced aspect of Algerian tax law or the startup ecosystem.
	* Accuracy may vary depending on the complexity and specificity of questions.
	* It does not provide legal advice; users should seek professional consultation for critical legal matters.

	## Training procedure
	The following `bitsandbytes` quantization config was used during training:
	- load_in_8bit: False
	- load_in_4bit: True
	- llm_int8_threshold: 6.0
	- llm_int8_skip_modules: None
	- llm_int8_enable_fp32_cpu_offload: False
	- llm_int8_has_fp16_weight: False
	- bnb_4bit_quant_type: nf4
	- bnb_4bit_use_double_quant: False
	- bnb_4bit_compute_dtype: float16
	### Framework versions
	- PEFT 0.4.0
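
	## Usage:
	The Llama 2 base weights are gated, so log in to the Hugging Face Hub first (the leading `!` is notebook syntax):
	```
	! huggingface-cli login
	```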
	```python
	import torch
	from peft import PeftModel
	from transformers import (
	    AutoModelForCausalLM,
	    AutoTokenizer,
	    BitsAndBytesConfig,
	    logging,
	    pipeline,
	)

	# Same 4-bit NF4 settings as the training config above.
	bnb_config = BitsAndBytesConfig(
	    load_in_4bit=True,
	    bnb_4bit_quant_type="nf4",
	    bnb_4bit_compute_dtype=torch.float16,
	    bnb_4bit_use_double_quant=False,
	)

	# Load the quantized base model onto GPU 0, then attach the fine-tuned adapter.
	model = AutoModelForCausalLM.from_pretrained(
	    "meta-llama/Llama-2-13b-chat-hf",
	    quantization_config=bnb_config,
	    device_map={"": 0},
	)
	model.config.use_cache = False  # disables the KV cache; generation is slower
	model.config.pretraining_tp = 1
	model = PeftModel.from_pretrained(model, "TuningAI/Llama2_13B_startup_Assistant")

	tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-13b-chat-hf", trust_remote_code=True)
	tokenizer.pad_token = tokenizer.eos_token
	tokenizer.padding_side = "right"

	system_message = "Given a user's startup-related question in English, you will generate a thoughtful answer in English."

	# Silence library warnings and build the pipeline once, outside the loop.
	logging.set_verbosity(logging.CRITICAL)
	pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=512)

	# Simple interactive loop; stop with Ctrl+C.
	while True:
	    input_text = input(">>>")
	    prompt = f"[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n {input_text}. [/INST]"
	    result = pipe(prompt)
	    # The pipeline echoes the prompt, so strip it from the output.
	    print(result[0]['generated_text'].replace(prompt, ''))
	```
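
The prompt formatting and prompt stripping in the loop above can also be factored into two small helpers, which makes them easy to check without loading the model. This is only a sketch: `build_llama2_prompt` and `extract_answer` are hypothetical names introduced here, the example question is made up, and the generated text is replaced by a placeholder string rather than real model output.

```python
def build_llama2_prompt(system_message: str, user_message: str) -> str:
    """Wrap a system and a user message in the chat format used in this card."""
    return f"[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n {user_message}. [/INST]"


def extract_answer(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt from a text-generation result, if present."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()


system_message = (
    "Given a user's startup-related question in English, "
    "you will generate a thoughtful answer in English."
)
prompt = build_llama2_prompt(system_message, "How do I register a startup in Algeria")

# Placeholder standing in for result[0]["generated_text"]; no model is called here.
fake_output = prompt + " (model answer would appear here)"
print(extract_answer(fake_output, prompt))
```

With the real pipeline, `extract_answer(result[0]["generated_text"], prompt)` would take the place of the `.replace(prompt, '')` call.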