Commit 5d5ce50 (verified) by HardikJha · 1 Parent(s): ccc9e32

Update README.md

  1. README.md +47 -17
README.md CHANGED
@@ -1,7 +1,10 @@
1
  ---
2
  base_model: unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit
3
  library_name: peft
4
- model_name: sft_warmup
 
 
 
5
  tags:
6
  - base_model:adapter:unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit
7
  - lora
@@ -9,30 +12,59 @@ tags:
9
  - transformers
10
  - trl
11
  - unsloth
12
- licence: license
13
- pipeline_tag: text-generation
 
 
14
  ---
15
 
16
- # Model Card for sft_warmup
 
 
17
 
18
- This model is a fine-tuned version of [unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit](https://huggingface.co/unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit).
19
- It has been trained using [TRL](https://github.com/huggingface/trl).
20
 
21
- ## Quick start
 
22
 
23
  ```python
24
- from transformers import pipeline
 
 
25
 
26
- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
27
- generator = pipeline("text-generation", model="None", device="cuda")
28
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
29
- print(output["generated_text"])
30
- ```
31
 
32
- ## Training procedure
33
 
34
-
35
 
 
 
 
36
 
37
  This model was trained with SFT.
38
 
@@ -47,8 +79,6 @@ This model was trained with SFT.
47
 
48
  ## Citations
49
 
50
-
51
-
52
  Cite TRL as:
53
 
54
  ```bibtex
 
1
  ---
2
  base_model: unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit
3
  library_name: peft
4
+ language:
5
+ - en
6
+ license: apache-2.0
7
+ pipeline_tag: text-generation
8
  tags:
9
  - base_model:adapter:unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit
10
  - lora
 
12
  - transformers
13
  - trl
14
  - unsloth
15
+ - openenv
16
+ - adversarial-robustness
17
+ - structured-extraction
18
+ - json-schema
19
  ---
20
 
21
+ # Extractor (SFT warmup) — Adversarial Structured-Extraction Arena
22
+
23
+ This model repo hosts the **SFT warmup LoRA adapter** trained for the OpenEnv project **Adversarial Structured-Extraction Arena**: an adversary perturbs messy documents/schemas (under a budget) and the extractor must output **valid JSON** matching a target schema.
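As a rough illustration of the "valid JSON matching a target schema" requirement, the sketch below checks one extractor output against a small schema using only the standard library. The schema shape (field name mapped to a Python type), the invoice fields, and the `check_output` helper are hypothetical and are not taken from the repo; the project's real schemas and scoring may differ.

```python
import json

# Hypothetical target schema: field name -> expected Python type.
# The project's real schemas are not shown in this README.
TARGET_SCHEMA = {"invoice_id": str, "total": float, "paid": bool}


def _matches(value, expected):
    """Type check that does not let JSON booleans pass as numbers."""
    if expected in (int, float):
        if isinstance(value, bool):
            return False
        return isinstance(value, (int, float) if expected is float else int)
    return isinstance(value, expected)


def check_output(raw_text, schema):
    """Return (ok, errors) for one extractor output string."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return False, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return False, ["top-level value is not an object"]

    errors = []
    for key, expected in schema.items():
        if key not in data:
            errors.append(f"missing field: {key}")
        elif not _matches(data[key], expected):
            errors.append(f"wrong type for {key}")
    for key in data:
        if key not in schema:
            errors.append(f"unexpected field: {key}")
    return not errors, errors


ok, errors = check_output('{"invoice_id": "A-17", "total": "99.5"}', TARGET_SCHEMA)
print(ok, errors)
```

A check like this only covers flat objects; nested schemas would need a recursive walk or a dedicated JSON Schema validator.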
24
 
25
+ ## Links (submission)
 
26
 
27
+ - **GitHub repo**: https://github.com/Hardikjha09/openenv-adversarial-extraction-arena
28
+ - **Runnable Space**: https://huggingface.co/spaces/HardikJha/extraction-arena
29
+ - **Colab (re-run training)**: https://colab.research.google.com/github/Hardikjha09/openenv-adversarial-extraction-arena/blob/main/notebooks/Train_Extractor_Colab.ipynb
30
+
31
+ ## Evidence (plots + logs)
32
+
33
+ - **Training loss**: https://huggingface.co/HardikJha/extractor-aea/blob/main/plots/sft_loss.png
34
+ - **Eval reward (moving average)**: https://huggingface.co/HardikJha/extractor-aea/blob/main/plots/rewards.png
35
+ - **Eval Elo**: https://huggingface.co/HardikJha/extractor-aea/blob/main/plots/elo_ratings.png
36
+ - **Eval metrics JSON**: https://huggingface.co/HardikJha/extractor-aea/blob/main/eval_metrics.json
37
+ - **SFT trainer log (raw JSON)**: https://huggingface.co/HardikJha/extractor-aea/blob/main/trainer_log_history.json
38
+
39
+ ## What this checkpoint is
40
+
41
+ - **Base model**: `unsloth/Qwen2.5-1.5B-Instruct` (4-bit Unsloth bundle: `unsloth/qwen2.5-1.5b-instruct-unsloth-bnb-4bit`)
42
+ - **Adapter**: LoRA (`peft`), saved from `training/sft_warmup.py`
43
+
44
+ ## Quick start (load base + adapter)
45
 
46
  ```python
47
+ import torch
48
+ from transformers import AutoModelForCausalLM, AutoTokenizer
49
+ from peft import PeftModel
50
 
51
+ base_model_id = "unsloth/Qwen2.5-1.5B-Instruct"
52
+ adapter_id = "HardikJha/extractor-aea"
 
 
 
53
 
54
+ tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
55
 
56
+ base = AutoModelForCausalLM.from_pretrained(
57
+     base_model_id,
58
+     torch_dtype=torch.float16,
59
+     device_map="auto",
60
+     trust_remote_code=True,
61
+ )
62
+ model = PeftModel.from_pretrained(base, adapter_id)
63
+ ```
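Once the adapter is loaded, the generated text still has to be turned into a JSON object. Instruction-tuned models sometimes wrap their answer in a code fence or add a sentence around it, so a tolerant parser can help. The helper below is a minimal sketch under that assumption; `extract_json_object` is a hypothetical name and is not part of the repo.

```python
import json


def extract_json_object(text):
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


sample = 'Here is the result:\n```json\n{"name": "Ada", "age": 36}\n```'
print(extract_json_object(sample))
```

In practice the input would be the decoded output of `model.generate`; returning `None` rather than raising makes it easy to count an unparseable answer as a failed extraction.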
64
 
65
+ ## Training procedure
66
+ - **Objective**: supervised JSON extraction formatting aligned to the repo’s extractor prompt (`training/prompts.py`)
67
+ - **Framework**: TRL SFTTrainer + Unsloth FastLanguageModel (see `training/sft_warmup.py`)
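To make the objective concrete, here is one way a single supervised example could be laid out as a chat record whose assistant turn is the target JSON. This is a sketch only: the system prompt text, the `build_sft_example` helper, and the sample fields are placeholders, and the actual prompt in `training/prompts.py` is not reproduced here.

```python
import json

# Placeholder prompt; the repo's real extractor prompt lives in
# training/prompts.py and may look quite different.
SYSTEM_PROMPT = "Extract the requested fields and reply with JSON only."


def build_sft_example(document, schema, target):
    """Build one conversational training record (hypothetical layout)."""
    user = (
        f"Schema:\n{json.dumps(schema, sort_keys=True)}\n\n"
        f"Document:\n{document}"
    )
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user},
            # The supervised target is the serialized JSON itself.
            {"role": "assistant", "content": json.dumps(target, sort_keys=True)},
        ]
    }


example = build_sft_example(
    "Invoice A-17, total due $99.50",
    {"invoice_id": "string", "total": "number"},
    {"invoice_id": "A-17", "total": 99.5},
)
print(example["messages"][-1]["content"])
```

Serializing the target with `json.dumps` keeps every assistant turn valid JSON by construction, which is the formatting behavior the warmup stage is meant to teach.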
68
 
69
  This model was trained with SFT.
70
 
 
79
 
80
  ## Citations
81
 
 
 
82
  Cite TRL as:
83
 
84
  ```bibtex