The model was fine-tuned on a specialized corpus consisting of:
2. Retrieved documents: For each synthetic query, relevant documents were retrieved using the BM25 ranking algorithm.
3. Generated answers: Responses to the synthetic queries were created based on the retrieved documents.
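The retrieval step above can be sketched in plain Python. This is a minimal, self-contained BM25 scorer (the standard Okapi formula with the common smoothed IDF), not the exact implementation used to build the corpus; the example documents and query are illustrative only.

```python
import math

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each document in `corpus` against `query` with Okapi BM25.

    `query` is a list of tokens; `corpus` is a list of token lists.
    Returns one score per document.
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N  # average document length
    # Document frequency for each distinct query term.
    df = {t: sum(1 for d in corpus if t in d) for t in set(query)}
    scores = []
    for doc in corpus:
        s = 0.0
        for t in query:
            f = doc.count(t)  # term frequency in this document
            if f == 0:
                continue
            # Smoothed IDF, always non-negative.
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # Length-normalized term-frequency saturation.
            s += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(s)
    return scores

docs = [
    "the cat sat on the mat".split(),
    "dogs chase cats in the park".split(),
    "quantum mechanics of small systems".split(),
]
query = "cat mat".split()
print(bm25_scores(query, docs))
```

For each synthetic query, the top-scoring documents under this kind of ranking would be passed to the answer-generation step.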
```yaml
Training Hyperparameters:
  Max Steps: 3000
  Learning Rate: 3e-4
  Batch Size: 2 per device
  Gradient Accumulation Steps: 4
  Max Sequence Length: 8192
  Weight Decay: 0.001
  Warmup Ratio: 0.03
  LR Scheduler: Linear
  Optimizer: paged_adamw_32bit

LoRA Configuration:
  LoRA Alpha: 16
  LoRA Dropout: 0.1
  LoRA R: 64
  Target Modules:
    - gate_proj
    - down_proj
    - up_proj
    - q_proj
    - v_proj
    - k_proj
    - o_proj

Quantization:
  Quantization: 4-bit
  Quantization Type: nf4
  Compute Dtype: float16
```
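A few quantities follow directly from these hyperparameters: the effective batch size per device, the number of linear-warmup steps, and the trainable-parameter count a rank-64 LoRA adapter adds to each targeted projection. The sketch below works these out in plain Python; the 4096×4096 projection size is a hypothetical example, since the README does not state the base model's hidden size, and the schedule function is a simplified stand-in for the trainer's linear scheduler.

```python
# Effective batch size: samples per optimizer step on one device.
per_device_batch = 2
grad_accum_steps = 4
effective_batch = per_device_batch * grad_accum_steps
print("effective batch per device:", effective_batch)

# Warmup length under a warmup ratio of 0.03 over 3000 steps.
max_steps = 3000
warmup_ratio = 0.03
warmup_steps = int(max_steps * warmup_ratio)
print("linear warmup steps:", warmup_steps)

def linear_schedule_lr(step, base_lr=3e-4, max_steps=3000, warmup_steps=90):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (max_steps - step) / (max_steps - warmup_steps)

def lora_params(d_in, d_out, r=64):
    """Trainable parameters a rank-r LoRA adapter adds to a d_in x d_out
    weight: matrix A is d_in x r and matrix B is r x d_out."""
    return r * (d_in + d_out)

# Hypothetical 4096-dim q_proj, for illustration only.
print("LoRA params for one 4096x4096 projection:", lora_params(4096, 4096))
```

With seven target modules per transformer layer, the adapter size scales linearly with the number of layers while the 4-bit base weights stay frozen.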
## Usage