ThinkTim21 committed on
Commit
0ccd3cb
·
verified ·
1 Parent(s): 5b2eee0

Update README.md

Files changed (1)
  1. README.md +82 -12
README.md CHANGED
@@ -103,18 +103,88 @@ for the budget task while both Fino-1 8B and FinPlan-1 performed well on the goa
103
  were benchmarked on GSM8K (grade school mathematics reasoning) as well as MMLU (general reasoning). While the domain-specific LoRA tuning certainly led to a degradation in FinPlan-1's
104
  benchmark scores with respect to its underlying model Fino-1 8B, the drop in performance is rather small for MMLU, and GSM8K performance remains above Llama 3.2-3B Instruct.
105
 
106
-
107
-
108
-
109
- ## Uses
110
-
111
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
112
-
113
- ### Direct Use
114
-
115
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
116
-
117
- [More Information Needed]
118
 
119
  ### Downstream Use [optional]
120
 
 
103
  were benchmarked on GSM8K (grade school mathematics reasoning) as well as MMLU (general reasoning). While the domain-specific LoRA tuning certainly led to a degradation in FinPlan-1's
104
  benchmark scores with respect to its underlying model Fino-1 8B, the drop in performance is rather small for MMLU, and GSM8K performance remains above Llama 3.2-3B Instruct.
105
 
106
+ ## Intended Usage
107
+
108
+ As described above, this model is intended to assist with the creation of simple financial plans for individuals, specifically the creation of a budget
109
+ spreadsheet for tracking expenses as well as planning for short-, medium-, and long-term savings goals. While this model can be prompted on a wide range of other tasks, it is
110
+ not recommended to use this model for those purposes, as it has been specifically fine-tuned for these two tasks and performance on tasks outside that scope could be diminished.
111
+
112
+ See below for the basic code required to load the model from Hugging Face using torch. Note that the tokenizer is pulled from the Fino-1 8B repository, as it was not changed
113
+ from the base Fino-1 8B model.
114
+
115
+ ```python
116
+ import os
117
+ os.environ['HF_HOME'] = "your/directory/here"
118
+ import torch
119
+ from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
120
+ from datasets import load_dataset #datasets is huggingface's dataset package
121
+ from peft import get_peft_model, LoraConfig, TaskType
122
+ import matplotlib.pyplot as plt
123
+ import numpy as np
124
+ import pandas as pd
125
+ import PIL
126
+
127
+ import lm_eval
128
+
129
+ tokenizer = AutoTokenizer.from_pretrained("TheFinAI/Fino1-8B")
130
+ model = AutoModelForCausalLM.from_pretrained("ThinkTim21/FinPlan-1")
131
+
132
+ # Prepare the model and tokenizer
133
+ tokenizer.pad_token = tokenizer.eos_token # set padding token to EOS token
134
+ model.config.pad_token_id = tokenizer.pad_token_id # set the padding token for model
135
+
136
+
137
+ budget = pd.read_csv("budget_dataset.csv") # use the dataset attached to this repo
138
+ goals = pd.read_csv("goals_dataset.csv") # use the dataset attached to this repo
139
+
140
+ budget['instruct_lora'] = budget.apply(
141
+ lambda row: f"Q: {row['question']}\n\nA: ",
142
+ axis=1
143
+ )
144
+
145
+ goals['instruct_lora'] = goals.apply(
146
+ lambda row: f"Q: {row['question']}\n\nA: ",
147
+ axis=1
148
+ )
149
+
150
+ from datasets import load_dataset, Dataset #datasets is huggingface's dataset package
151
+ budget = budget.sample(frac = 1, random_state = 42) # randomly shuffle DF
152
+ train_budget = budget[:2500]
153
+ val_budget = budget[2500:]
154
+ train_budget = Dataset.from_pandas(train_budget)
155
+ val_budget = Dataset.from_pandas(val_budget)
156
+ train_budget = train_budget.map(lambda samples: tokenizer(samples['instruct_lora']), batched = True)
157
+ val_budget = val_budget.map(lambda samples: tokenizer(samples['instruct_lora']), batched = True)
158
+
159
+ goals = goals.sample(frac = 1, random_state = 42) # randomly shuffle DF
160
+ train_goals = goals[:2500]
161
+ val_goals = goals[2500:]
162
+ train_goals = Dataset.from_pandas(train_goals)
163
+ val_goals = Dataset.from_pandas(val_goals)
164
+ train_goals = train_goals.map(lambda samples: tokenizer(samples['instruct_lora']), batched = True)
165
+ val_goals = val_goals.map(lambda samples: tokenizer(samples['instruct_lora']), batched = True)
166
+
167
+ formatted_prompt = f"Q: {val_goals[0]['question']}\n\nA: "
168
+ inputs = tokenizer.encode(formatted_prompt, return_tensors = "pt").to(model.device)
169
+ output = model.generate(inputs, max_new_tokens = 800, pad_token_id = tokenizer.pad_token_id, do_sample = False)
170
+ generated_text = tokenizer.decode(output[0], skip_special_tokens = True)
171
+ print(generated_text)
172
+ ```
173
+
174
+ ### Prompt Format
175
+
176
+ The prompt format varies between the budget task and the goals task.
177
+
178
+ For the budget task, the following prompt method is recommended.
179
+
180
+ ```Q: I have an income of about 53255 a year and my monthly expenses include 2208 a month in rent and utilities, a 700 car payment, $300 in food, and about 205 a month in other expenses. Using python, can you create for me a budget spreadsheet and export it to excel?
181
+ ```
182
+
183
+ For the goals task, I recommend few-shot prompting: use the goals_dataset.csv file as the source of examples, then append your preferred prompt in the following format
184
+ to the few-shot examples derived from the goals dataset.
185
+
186
+ ```Q: My short term goal is to save for a $3357 vacation in the next year, my medium term goal is to save for down payment for a new car, around 6867 in the next 2 or 3 years, and my long term goal is to save for a down payment for a house around 115061 in the next ten years, can you help me integrate these goals into my budget as well as where I should store these savings?
187
+ ```
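The few-shot approach recommended for the goals task can be sketched as follows. This is a minimal illustration, not the author's exact method: the helper name `build_few_shot_prompt`, the `answer` key, and the placeholder examples are all assumptions. In practice the examples would come from goals_dataset.csv (for instance via `goals.to_dict("records")`), using whatever column actually holds the reference answers.

```python
def build_few_shot_prompt(examples, new_question, k=3,
                          question_key="question", answer_key="answer"):
    """Join k solved Q/A pairs and one open question into a single prompt.

    `examples` is a list of dicts; the key names are assumptions and should
    be matched to the columns of the dataset in use.
    """
    parts = [
        f"Q: {ex[question_key]}\n\nA: {ex[answer_key]}"
        for ex in examples[:k]
    ]
    # The last block leaves the answer empty so the model completes it.
    parts.append(f"Q: {new_question}\n\nA: ")
    return "\n\n".join(parts)


# Placeholder examples standing in for rows of goals_dataset.csv (made up).
demo_examples = [
    {"question": "Example goals question one.", "answer": "Example plan one."},
    {"question": "Example goals question two.", "answer": "Example plan two."},
]

prompt = build_few_shot_prompt(
    demo_examples,
    "My short term goal is to save for a $3357 vacation in the next year.",
    k=2,
)
print(prompt)
```

The resulting string keeps the same `Q: ...\n\nA: ` layout used in the loading code above, so it can be passed to `tokenizer.encode` in place of `formatted_prompt`.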
188
 
189
  ### Downstream Use [optional]
190