---
datasets:
- neo4j/text2cypher-2024v1
base_model:
- google/gemma-2-9b-it
---

# Model Card for text2cypher-gemma-2-9b-it-finetuned-2024v1 (GGUF)

## Model Details

This is the GGUF-format version of neo4j/text2cypher-gemma-2-9b-it-finetuned-2024v1.
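
As a quick start, the GGUF file can be run with any llama.cpp-compatible runtime. Below is a minimal sketch using llama-cpp-python; the `.gguf` filename is a placeholder for the file actually published in this repository, and the prompt wording is an assumption rather than the exact template used during fine-tuning.

```python
# Minimal sketch: load the GGUF file and ask for a Cypher statement.
# The filename is a placeholder; the prompt wording is an assumption, not the
# exact fine-tuning template.
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="text2cypher-gemma-2-9b-it-finetuned-2024v1.Q4_K_M.gguf",  # placeholder name
    n_ctx=4096,  # leave room for the question plus the graph schema
)

schema = "Node properties: Person {name: STRING}, Movie {title: STRING} ..."  # your schema text
question = "Which movies did Tom Hanks act in?"

out = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": f"Generate a Cypher statement to answer the question.\nQuestion: {question}\nSchema: {schema}",
    }],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```

The same file should also work with other GGUF runtimes such as the llama.cpp CLI or Ollama.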
### Model Description

This model serves as a demonstration of how fine-tuning foundational models using the Neo4j-Text2Cypher (2024) dataset (https://huggingface.co/datasets/neo4j/text2cypher-2024v1) can enhance performance on the Text2Cypher task.
Please note that this is part of ongoing research and exploration, aimed at highlighting the dataset's potential rather than providing a production-ready solution.

- Base model: google/gemma-2-9b-it
- Dataset: neo4j/text2cypher-2024v1

An overview of the fine-tuned models and the benchmarking results is shared at https://medium.com/p/d77be96ab65a and https://medium.com/p/b2203d1173b0.
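
If you want to look at the underlying training data, the dataset can be loaded directly from the Hugging Face Hub. A minimal sketch follows; the split and column names (`question`, `schema`, `cypher`) are assumptions, so verify them against the dataset card before relying on them.

```python
# Minimal sketch for inspecting the fine-tuning data.
# Split and column names are assumptions; check the dataset card if they differ.
from datasets import load_dataset

ds = load_dataset("neo4j/text2cypher-2024v1", split="train")
row = ds[0]
print(row["question"])  # natural-language question
print(row["schema"])    # graph schema provided with the question
print(row["cypher"])    # target Cypher query
```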
## Bias, Risks, and Limitations

We need to be cautious about a few risks:

- In our evaluation setup, the training and test sets come from the same data distribution (sampled from a larger dataset). If the data distribution changes, the results may not follow the same pattern.
- The datasets used were gathered from publicly available sources. Over time, foundational models may access both the training and test sets, potentially achieving similar or even better results.

## Training Details

### Training Procedure

Used RunPod with the following setup:

- 1 x A100 PCIe
- 31 vCPU, 117 GB RAM
- runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04
- On-Demand - Secure Cloud
- 60 GB Disk
- 60 GB Pod Volume

### Training Hyperparameters

    lora_config = LoraConfig(
        r=64,
        lora_alpha=64,
        target_modules=target_modules,
        lora_dropout=0.05,
        bias="none",
        task_type="CAUSAL_LM",
    )

    sft_config = SFTConfig(
        dataset_text_field=dataset_text_field,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        dataset_num_proc=16,
        max_seq_length=1600,
        logging_dir="./logs",
        num_train_epochs=1,
        learning_rate=2e-5,
        save_steps=5,
        save_total_limit=1,
        logging_steps=5,
        output_dir="outputs",
        optim="paged_adamw_8bit",
        save_strategy="steps",
    )

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
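
For context, the following is a rough, hypothetical sketch of how configs like these can be wired into a TRL `SFTTrainer` run. It is not the authors' training script: it assumes the `lora_config`, `sft_config`, and `bnb_config` objects defined above are in scope, and the LoRA target modules, prompt rendering, and text-column name below are illustrative assumptions.

```python
# Hypothetical wiring of the configs above into a QLoRA-style SFT run; NOT the
# authors' exact script. Values marked as assumed are illustrative placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

target_modules = ["q_proj", "k_proj", "v_proj", "o_proj"]  # assumed; define before lora_config
dataset_text_field = "text"                                # assumed; define before sft_config

# Render each row into one training string; the real prompt template may differ.
raw = load_dataset("neo4j/text2cypher-2024v1", split="train")
train_dataset = raw.map(
    lambda ex: {"text": f"Question: {ex['question']}\nSchema: {ex['schema']}\nCypher: {ex['cypher']}"}
)

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",
    quantization_config=bnb_config,  # 4-bit NF4 loading from above
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")

trainer = SFTTrainer(
    model=model,
    args=sft_config,                 # SFTConfig from above
    peft_config=lora_config,         # LoRA settings from above
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```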

### Note on creating your own schemas

In the dataset we used, the schemas are already provided. They were created either by:

- directly using the schema provided by the input data source, or
- creating the schema with the neo4j-graphrag package (see the SchemaReader.get_schema(...) function).

For your own Neo4j database, you can use the SchemaReader functions from the neo4j-graphrag package, as sketched below.
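
A minimal sketch of pulling a schema string from a live database follows. It assumes the `get_schema` helper exposed by the neo4j-graphrag package; the `SchemaReader` wrapper referenced above may organize this differently, so check the package documentation. Connection details are placeholders.

```python
# Sketch only: extract a text schema from your own Neo4j database to place in the prompt.
# Assumes neo4j_graphrag.schema.get_schema exists in your installed version; the
# SchemaReader wrapper mentioned above may expose this under a different name.
# Connection URI and credentials are placeholders.
import neo4j
from neo4j_graphrag.schema import get_schema

driver = neo4j.GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))
try:
    schema_text = get_schema(driver)  # schema string to paste into the model prompt
    print(schema_text)
finally:
    driver.close()
```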