avinashm committed · Commit 04ccbd6 · verified · 1 Parent(s): 3f8f80d

Create README.md

Files changed (1): README.md (+51, -0)

---
datasets:
- neo4j/text2cypher-2024v1
base_model:
- google/gemma-2-9b-it
---

# Model Card for text2cypher-gemma-2-9b-it-finetuned-2024v1 (GGUF)

## Model Details

This is the GGUF-format version of neo4j/text2cypher-gemma-2-9b-it-finetuned-2024v1.
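
Because the weights ship in GGUF format, they can be run with llama.cpp-compatible runtimes. Below is a minimal sketch using llama-cpp-python; the repo id, GGUF file name, and quantization level are assumptions, so check this repository's file listing for the actual values:

```python
# Minimal sketch: running the GGUF weights with llama-cpp-python.
# repo_id and filename are assumptions -- substitute the actual
# repository id and the GGUF file (quantization) you want to use.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="avinashm/text2cypher-gemma-2-9b-it-finetuned-2024v1-gguf",  # hypothetical
    filename="*Q4_K_M.gguf",  # assumed quantization level
    n_ctx=1600,               # matches the max_seq_length used in training
)

schema = "(:Person {name: STRING})-[:ACTED_IN]->(:Movie {title: STRING})"
question = "How many movies did Tom Hanks act in?"
response = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": f"Generate a Cypher statement to answer the question, "
                   f"using only the given schema.\nSchema: {schema}\n"
                   f"Question: {question}",
    }],
    temperature=0.0,
)
print(response["choices"][0]["message"]["content"])
```

The prompt wording here is illustrative, not the exact template used during fine-tuning.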

### Model Description

This model demonstrates how fine-tuning foundational models on the [Neo4j-Text2Cypher(2024) Dataset](https://huggingface.co/datasets/neo4j/text2cypher-2024v1) can enhance performance on the Text2Cypher task.
Please note that this is part of ongoing research and exploration, aimed at highlighting the dataset's potential rather than providing a production-ready solution.

- Base model: google/gemma-2-9b-it
- Dataset: neo4j/text2cypher-2024v1
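
If you want to inspect the training data, the dataset can be loaded directly from the Hub. The column names below are assumptions based on the dataset card, so verify them before relying on this sketch:

```python
# Hedged sketch: inspecting the Neo4j Text2Cypher training data.
# The column names ("question", "schema", "cypher") are assumptions --
# check https://huggingface.co/datasets/neo4j/text2cypher-2024v1 for
# the exact field names.
from datasets import load_dataset

ds = load_dataset("neo4j/text2cypher-2024v1", split="train")
row = ds[0]
print(row["question"])  # natural-language question
print(row["schema"])    # graph schema supplied to the model
print(row["cypher"])    # reference Cypher query
```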

An overview of the finetuned models and benchmarking results is shared at https://medium.com/p/d77be96ab65a and https://medium.com/p/b2203d1173b0.

## Bias, Risks, and Limitations

We need to be cautious about a few risks:

- In our evaluation setup, the training and test sets come from the same data distribution (sampled from a larger dataset). If the data distribution changes, the results may not follow the same pattern.
- The datasets used were gathered from publicly available sources. Over time, foundational models may access both the training and test sets, potentially achieving similar or even better results.

## Training Details

### Training Procedure

Used RunPod with the following setup:

- 1 x A100 PCIe
- 31 vCPU, 117 GB RAM
- runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04
- On-Demand - Secure Cloud
- 60 GB Disk
- 60 GB Pod Volume

### Training Hyperparameters

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig
from trl import SFTConfig

# LoRA adapter configuration (target_modules is defined elsewhere in the script)
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=target_modules,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Supervised fine-tuning configuration (dataset_text_field is defined elsewhere)
sft_config = SFTConfig(
    dataset_text_field=dataset_text_field,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    dataset_num_proc=16,
    max_seq_length=1600,
    logging_dir="./logs",
    num_train_epochs=1,
    learning_rate=2e-5,
    save_steps=5,
    save_total_limit=1,
    logging_steps=5,
    output_dir="outputs",
    optim="paged_adamw_8bit",
    save_strategy="steps",
)

# 4-bit NF4 quantization for QLoRA-style training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
```
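
For context, here is a hedged sketch of how these three configs typically fit together in a TRL fine-tuning run. The model/tokenizer loading and dataset wiring are assumptions, not the authors' exact training script:

```python
# Hedged sketch: wiring the configs above into an SFT run with TRL.
# Everything here beyond the three configs is an assumption about the
# training script, not the authors' exact code.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

train_dataset = load_dataset("neo4j/text2cypher-2024v1", split="train")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",
    quantization_config=bnb_config,  # 4-bit NF4 quantization from above
    device_map="auto",
)

trainer = SFTTrainer(
    model=model,
    args=sft_config,          # SFTConfig from above
    peft_config=lora_config,  # LoRA adapter config from above
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```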

### NOTE on creating your own schemas

In the dataset we used, the schemas are already provided. They were created either by:

- directly using the schema provided by the input data source, OR
- creating the schema with the neo4j-graphrag package (see the `SchemaReader.get_schema(...)` function).

For your own Neo4j database, you can use the neo4j-graphrag package's `SchemaReader` functions to build the schema.
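
As a rough illustration, the sketch below pulls a schema string from a live database. The import path is an assumption on our part (the note above references a `SchemaReader.get_schema(...)` helper), so consult the neo4j-graphrag documentation for the current API:

```python
# Hedged sketch: extracting a schema description from your own Neo4j database.
# The import path and function name are assumptions -- the note above refers
# to SchemaReader.get_schema(...); check the neo4j-graphrag docs for the
# current API.
import neo4j
from neo4j_graphrag.schema import get_schema  # assumed import path

driver = neo4j.GraphDatabase.driver(
    "neo4j://localhost:7687",        # hypothetical connection details
    auth=("neo4j", "your-password"),
)
schema = get_schema(driver)  # node labels, relationship types, properties
print(schema)
driver.close()
```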