JamesConley/glados_starcoder

JamesConley committed on May 20, 2023

Commit c87ed0b · 1 Parent(s): ff3e270

Create README.md

Files changed (1)

README.md +74 -0

README.md ADDED

	@@ -0,0 +1,74 @@

+GLaDOS speaks Markdown!
+Usage
+To use this model, first go to the bigcode starcoder model page and accept its license, then create an access token for your account and insert it into the code below.
+```
+import torch
+from peft import PeftModel, PeftConfig
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Setup Model
+path = "JamesConley/glados_starcoder"
+token = "PUT YOUR TOKEN HERE"  # replace with your own access token
+config = PeftConfig.from_pretrained(path)
+base_model_path = config.base_model_name_or_path
+model = AutoModelForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16, device_map="auto", use_auth_token=token)
+model = PeftModel.from_pretrained(model, path, device_map="auto")
+# Setup Tokenizer
+tokenizer = AutoTokenizer.from_pretrained(base_model_path, truncation_side="left", use_auth_token=token)
+# Encode Tokens
+text = """User :\nWrite a python function that trains a classifier on data loaded from a csv.\nGLaDOS :\n"""
+input_ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048).input_ids
+# Move tokens to GPU (uncomment when the model runs on a CUDA device)
+#input_ids = input_ids.to("cuda")
+# Perform Inference
+with torch.no_grad():
+    with torch.cuda.amp.autocast():
+        gen_tokens = model.generate(
+            input_ids=input_ids, max_new_tokens=256
+        )
+# Decode Tokens
+gen_text = tokenizer.batch_decode(gen_tokens)
+```
+Result
+Note: The backslash (\) characters are not actually in the model output; I added them to prevent the model card from rendering the code block incorrectly.
+```
+User :
+Write a python function that trains a classifier on data loaded from a csv.
+GLaDOS :
+Here is an example of a function that trains a classifier on data loaded from a CSV file:
+\```
+python
+import pandas as pd
+from sklearn.model_selection import train_test_split
+from sklearn.ensemble import RandomForestClassifier
+def train_classifier(data_file, target_column, test_size=0.2):
+    # Load the data from the CSV file
+    data = pd.read_csv(data_file)
+    # Split the data into training and testing sets
+    X_train, X_test, y_train, y_test = train_test_split(data.drop(target_column, axis=1),
+                                                        data[target_column],
+                                                        test_size=test_size)
+    # Train the classifier
+    clf = RandomForestClassifier()
+    clf.fit(X_train, y_train)
+    # Return the trained classifier and the test set predictions
+    return clf, clf.predict(X_test)
+\```
+This function takes in the following arguments:
+* `data_file`: the path to the CSV file containing the data
+* `target_column`: the name of the column in the CSV file that contains the target variable
+```
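
The usage code in this README wraps the request in a `User :` / `GLaDOS :` turn format and decodes the full sequence, prompt included. A minimal sketch of two helpers for building that prompt and pulling out only the reply is shown here. The helper names `build_prompt` and `extract_reply` are hypothetical and not part of the model or any library, and cutting the reply at a following `User :` tag is an assumption about how the model may continue, not documented behavior.

```python
# Turn markers copied from the prompt string in the README usage example.
USER_TAG = "User :\n"
GLADOS_TAG = "GLaDOS :\n"


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the User/GLaDOS turn format (hypothetical helper)."""
    return f"{USER_TAG}{instruction.strip()}\n{GLADOS_TAG}"


def extract_reply(generated_text: str) -> str:
    """Return the text after the first GLaDOS tag (hypothetical helper).

    Assumption: if the model starts a new "User :" turn, everything from
    that tag onward is dropped.
    """
    _, found, reply = generated_text.partition(GLADOS_TAG)
    if not found:
        # No GLaDOS tag present: return the text unchanged apart from trimming.
        return generated_text.strip()
    reply, _, _ = reply.partition(USER_TAG)
    return reply.strip()
```

With the README code, this could be used as `text = build_prompt("...")` before tokenizing and `extract_reply(gen_text[0])` after `batch_decode`.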