a6188466 committed · Commit c454f10 · verified · 1 Parent(s): f2d0fa7

Upload folder using huggingface_hub

Files changed (4)
  1. LICENSE +21 -0
  2. README.md +113 -0
  3. adapter_config.json +36 -0
  4. adapter_model.safetensors +3 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Florent Gastoud
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -1,3 +1,116 @@
  ---
+ base_model: microsoft/Phi-4-mini-instruct
+ library_name: peft
+ tags:
+ - phi-4-mini
+ - LoRA
+ - peft
+ - adapter
+ - intent-sequencing
+ - DSL
+ - robotics
+ - robotic-arm
+ - interactive-agents
  license: mit
+ datasets:
+ - a6188466/dia-intent-sequencer-robot-arm-dataset
+ language:
+ - en
  ---
+
+ # 🤖 Model Card for `dia-intent-sequencer-robot-arm-adapter`
+
+ ## 🧠 Model Details
+
+ This is a parameter-efficient fine-tuning (PEFT) LoRA adapter trained on real-world robotic control data using the [**DIA (DSL for Interactive Agents)**](https://github.com/gh9869827/fifo-dev-dsl/tree/main/fifo_dev_dsl/dia) framework. It enables `microsoft/Phi-4-mini-instruct` to function as an **intent sequencer** for controlling a robotic arm tasked with managing an inventory of screws.
+
+ This adapter is designed as a **demonstration and testbed** for evaluating the model's ability to:
+
+ - 🔄 Dynamically resolve missing or ambiguous parameters using runtime-accessible context
+ - 🧠 Clarify intent through multi-turn interaction
+ - 🛠 Map natural language instructions to executable tool calls for physical actions
+
+ - **Model type:** LoRA adapter (PEFT)
+ - **Language(s):** English
+ - **License:** MIT
+ - **Finetuned from model:** `microsoft/Phi-4-mini-instruct`
+
+ ## 🚀 Uses
+
+ ### Direct Use
+
+ This adapter augments `Phi-4-mini-instruct` to convert user commands into structured tool calls using the DIA DSL. It supports:
+
+ - `QUERY_FILL`, `QUERY_GATHER`, and `QUERY_USER` for dynamic slot resolution and clarification when information is missing or ambiguous
+ - All [system prompts defined by DIA](https://github.com/gh9869827/fifo-dev-dsl/tree/main/fifo_dev_dsl/dia#-llm-invocation-strategy), including:
+   - `system_prompt_intent_sequencer`
+   - `system_prompt_slot_resolver`
+   - `system_prompt_error_resolver`
+
+ For a complete example of how this model can be used, see [`robot_arm.py`](https://github.com/gh9869827/fifo-dev-dsl/blob/main/fifo_dev_dsl/dia/demo/robot_arm.py).
+
+ ### Downstream Use
+
+ This adapter can also serve as a starting point for further fine-tuning a DIA intent sequencer on a related domain.
+
+
+ ## 🏗️ Training Details
+
+ Trained on [a6188466/dia-intent-sequencer-robot-arm-dataset](https://huggingface.co/datasets/a6188466/dia-intent-sequencer-robot-arm-dataset) using the `dsl` adapter from [`fifo-tool-datasets`](https://github.com/gh9869827/fifo-tool-datasets) and the `fine_tune.py` script from [`fifo-tool-airlock-model-env`](https://github.com/gh9869827/fifo-tool-airlock-model-env).
+
+ - **Dataset:** 210 curated examples designed for a robotic arm handling an inventory of screws
+ - **Epochs:** 42
+ - **Batch size:** 1
+ - **Precision:** bf16
+ - **Framework:** `transformers`, `peft`, `trl` (SFTTrainer)
+
+ ### ⚙️ Training Hyperparameters
+
+ ```json
+ {
+   "num_train_epochs": 42,
+   "train_batch_size": 1,
+   "learning_rate": 5e-06,
+   "lr_scheduler_type": "cosine",
+   "warmup_ratio": 0.2,
+   "bf16": true,
+   "seed": 0
+ }
+ ```
+
+ ### 📈 Training Results
+
+ ```json
+ {
+   "mean_token_accuracy": 0.9903121322393418,
+   "total_flos": 5.500569475886285e+16,
+   "train_loss": 0.2587171609439547,
+   "train_runtime": 2130.5227,
+   "train_samples_per_second": 4.14,
+   "train_steps_per_second": 4.14,
+   "final_learning_rate": 2.4779503893235247e-13
+ }
+ ```
+
+ ## ✅ Evaluation
+
+ - **Two types of evaluation were performed**:
+   1. **Held-out test set**:
+      Since evaluating the generated DSL would normally require simulation or execution on the robot, the tool `robot_arm_eval_performance.py` was used to compare DSL strings. Manual review was conducted for mismatches to assess **functional equivalence**.
+   2. **Live evaluation on the robot**:
+      Intent-to-action translation was tested directly on the robot. The goal was to verify that the generated DSL led to the correct behavior by comparing the robot's actual actions with the expected outcomes.
+
+ - **Results**:
+   - On the held-out test set (26 tests covering robotic arm functions and inventory interactions):
+     - ✅ 20 were exactly as expected
+     - 🔄 3 were functionally equivalent and resulted in the same execution path
+     - ⚠️ 3 were functionally equivalent but used a **suboptimal execution path**
+   - Live evaluation confirmed the above results
+
+ ## 🪪 License
+
+ MIT License. See [LICENSE](LICENSE) for details.
+
+ ## 📬 Contact
+
+ For questions, suggestions, or bug reports, please open an issue on GitHub or start a discussion on the Hugging Face Hub.
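The training figures reported in the README hunk can be cross-checked with a little arithmetic. The sketch below is not taken from the commit or from `fine_tune.py`. It assumes one optimizer step per example (batch size 1, no gradient accumulation), a warmup covering exactly 20% of the steps, the common linear-warmup-then-half-cosine schedule, and that `final_learning_rate` was logged on the last step before the schedule reaches zero. The helper name `cosine_lr` is hypothetical.

```python
import math

# Values copied from the README in this commit.
NUM_EXAMPLES = 210
NUM_EPOCHS = 42
BATCH_SIZE = 1
PEAK_LR = 5e-06
WARMUP_RATIO = 0.2
TRAIN_RUNTIME_S = 2130.5227

# Assumption: no gradient accumulation, so one optimizer step per batch.
total_steps = NUM_EXAMPLES * NUM_EPOCHS // BATCH_SIZE
# Assumption: warmup covers exactly 20% of all steps.
warmup_steps = round(total_steps * WARMUP_RATIO)


def cosine_lr(step: int) -> float:
    """Linear warmup followed by half-cosine decay to zero (hypothetical helper)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


steps_per_second = total_steps / TRAIN_RUNTIME_S
# Assumption: the reported final rate was logged at the last step before zero.
last_logged_lr = cosine_lr(total_steps - 1)

print(f"total optimizer steps: {total_steps}")
print(f"steps per second:      {steps_per_second:.2f}")
print(f"lr at last step:       {last_logged_lr:.4e}")
```

Under these assumptions the run comes to 8,820 optimizer steps, the throughput rounds to the reported 4.14 steps per second, and the last-step learning rate lands at roughly 2.48e-13, in line with the reported `final_learning_rate`. Identical samples-per-second and steps-per-second figures are what a batch size of 1 would produce.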
adapter_config.json ADDED
@@ -0,0 +1,36 @@
+ {
+   "alpha_pattern": {},
+   "auto_mapping": null,
+   "base_model_name_or_path": "microsoft/Phi-4-mini-instruct",
+   "bias": "none",
+   "corda_config": null,
+   "eva_config": null,
+   "exclude_modules": null,
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layer_replication": null,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "loftq_config": {},
+   "lora_alpha": 32,
+   "lora_bias": false,
+   "lora_dropout": 0.05,
+   "megatron_config": null,
+   "megatron_core": "megatron.core",
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 16,
+   "rank_pattern": {},
+   "revision": null,
+   "target_modules": [
+     "down_proj",
+     "o_proj",
+     "gate_up_proj",
+     "qkv_proj"
+   ],
+   "task_type": "CAUSAL_LM",
+   "trainable_token_indices": null,
+   "use_dora": false,
+   "use_rslora": false
+ }
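Of the fields in `adapter_config.json`, `r` and `lora_alpha` most directly set the strength of the adapter. In standard LoRA the low-rank update is multiplied by `lora_alpha / r`, and rank-stabilized LoRA (`use_rslora`) divides by `sqrt(r)` instead. The sketch below reads a trimmed copy of the config with the standard library only. The helper `lora_scaling` is hypothetical and is not a `peft` function.

```python
import json
import math

# Trimmed copy of the fields from adapter_config.json that matter here.
CONFIG_JSON = """
{
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "r": 16,
  "target_modules": ["down_proj", "o_proj", "gate_up_proj", "qkv_proj"],
  "use_rslora": false
}
"""


def lora_scaling(config: dict) -> float:
    """Return the multiplier applied to the low-rank update (hypothetical helper)."""
    rank = config["r"]
    # Rank-stabilized LoRA divides by sqrt(r); standard LoRA divides by r.
    divisor = math.sqrt(rank) if config.get("use_rslora") else rank
    return config["lora_alpha"] / divisor


config = json.loads(CONFIG_JSON)
scaling = lora_scaling(config)
print(f"scaling = {scaling}")
print(f"adapted module types: {len(config['target_modules'])}")
```

With `r` = 16, `lora_alpha` = 32 and `use_rslora` disabled, the multiplier is 2.0. The four `target_modules` cover both the attention projections and the MLP projections of the base model.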
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:531f7f2fb18bbad33d2ae1dc0c41c2bd867a7ece954839f670fc16f865a18a63
+ size 92309112
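The three added lines are a Git LFS pointer, not the weights themselves. The actual file is stored separately and identified by its SHA-256 `oid`. As a rough plausibility check on the 92,309,112-byte size, the sketch below parses the pointer and compares it with a back-of-the-envelope LoRA parameter count. The layer dimensions are assumptions about `microsoft/Phi-4-mini-instruct` (32 layers, hidden size 3072, MLP size 8192, fused `qkv_proj` output 5120, fused `gate_up_proj` output 16384) and appear nowhere in this commit. The float32 storage is also a guess. `parse_lfs_pointer` is a hypothetical helper.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:531f7f2fb18bbad33d2ae1dc0c41c2bd867a7ece954839f670fc16f865a18a63
size 92309112
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    fields["size"] = int(fields["size"])
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)

# ASSUMED model dimensions (not stated in the commit): (in_features, out_features).
ASSUMED_SHAPES = {
    "qkv_proj": (3072, 5120),
    "o_proj": (3072, 3072),
    "gate_up_proj": (3072, 16384),
    "down_proj": (8192, 3072),
}
ASSUMED_LAYERS = 32
RANK = 16            # "r" in adapter_config.json
BYTES_PER_PARAM = 4  # ASSUMED float32 storage

# Each adapted module adds A (r x in) and B (out x r): r * (in + out) parameters.
params_per_layer = sum(RANK * (i + o) for i, o in ASSUMED_SHAPES.values())
estimated_params = params_per_layer * ASSUMED_LAYERS
estimated_bytes = estimated_params * BYTES_PER_PARAM

print(f"pointer size:     {pointer['size']:,} bytes")
print(f"estimated params: {estimated_params:,}")
print(f"estimated bytes:  {estimated_bytes:,}")
print(f"unaccounted:      {pointer['size'] - estimated_bytes:,} bytes")
```

If the assumed dimensions and dtype are right, the estimate falls within about 34 KB of the pointer size, a gap that would be consistent with the JSON header a safetensors file carries. If they are wrong, the comparison says nothing about the file.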