---
library_name: peft
---

## Training procedure

The following `bitsandbytes` quantization config was used during training:

- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: float16
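
For reference, the listed values map onto a `BitsAndBytesConfig` roughly as follows. This is a sketch for illustration; the exact config object from the training run is not part of this repo:

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization with nested (double) quantization and fp16 compute,
# mirroring the values listed above; the llm_int8_* fields keep their defaults.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)
```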

### Framework versions

- PEFT 0.5.0

## Inference Code

### Install required libraries

```python
!pip install transformers peft torch
```

### Login

```python
from huggingface_hub import login

token = "Your Key"  # replace with your Hugging Face access token
login(token)
```

### Import necessary modules

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel, PeftConfig
```

### Load PEFT model and configuration

```python
# Load the adapter configuration, which records the base model it was fine-tuned from
config = PeftConfig.from_pretrained("Shreyas45/Llama2_Text-to-SQL_Fintuned")

# Load the base model, then attach the fine-tuned adapter weights on top of it
peft_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
peft_model = PeftModel.from_pretrained(peft_model, "Shreyas45/Llama2_Text-to-SQL_Fintuned")
```
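
Loading the base model at full precision needs a large amount of memory. To stay close to the 4-bit setup used during training, you can instead pass a `BitsAndBytesConfig` mirroring the values listed above when loading the base model. A sketch, assuming `bitsandbytes` and `accelerate` are also installed:

```python
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",  # requires accelerate; places the weights on the available GPU(s)
)
peft_model = PeftModel.from_pretrained(base_model, "Shreyas45/Llama2_Text-to-SQL_Fintuned")
```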

### Load the tokenizer

```python
trained_model_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path, trust_remote_code=True)
trained_model_tokenizer.pad_token = trained_model_tokenizer.eos_token  # Llama ships without a pad token
```

### Define the schema and question

```python
query = '''In the table named management with columns (department_id VARCHAR, temporary_acting VARCHAR);
CREATE TABLE department (name VARCHAR, num_employees VARCHAR, department_id VARCHAR),
Show the name and number of employees for the departments managed by heads whose temporary acting value is 'Yes'?'''
```

### Construct prompt

```python
prompt = f'''### Instruction: Below is an instruction that describes a task and the schema of the table in the database.
Write a response that generates a request in the form of a SQL query.
Here the schema of the table is mentioned first followed by the question for which the query needs to be generated.
And the question is: {query}
###Output: '''
```

### Tokenize the prompt

```python
encodings = trained_model_tokenizer(prompt, return_tensors='pt')
```
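
If the model was loaded onto a GPU (for example with `device_map="auto"` above), the input tensors have to be on the same device. `BatchEncoding.to` handles this in one call:

```python
# Move the input tensors to the device the model weights live on
encodings = encodings.to(peft_model.device)
```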

### Configure generation parameters

```python
generation_config = peft_model.generation_config
generation_config.max_new_tokens = 1024
generation_config.do_sample = True  # sampling must be on for temperature/top_p to take effect
generation_config.temperature = 0.7
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = trained_model_tokenizer.pad_token_id
generation_config.eos_token_id = trained_model_tokenizer.eos_token_id
```

### Generate SQL query using the model

```python
with torch.inference_mode():
    outputs = peft_model.generate(
        input_ids=encodings.input_ids,
        attention_mask=encodings.attention_mask,
        generation_config=generation_config,
        max_new_tokens=100,  # overrides the 1024 set on generation_config above
    )
```

### Decode and print the generated SQL query

```python
generated_query = trained_model_tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated SQL Query:")
print(generated_query)
```
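
The decoded string contains the prompt followed by the model's completion. Since the prompt ends with the `###Output:` marker, one simple way to isolate just the generated SQL (assuming the model does not repeat the marker) is:

```python
# Keep only the text generated after the ###Output: marker
sql_only = generated_query.split("###Output:")[-1].strip()
print(sql_only)
```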