TymofiiNasobko committed on
Commit 4e2ee15 · verified · 1 Parent(s): 59b7434

TymofiiNasobko/LapaLM-function-calling-assistant
README.md CHANGED
````diff
@@ -7,12 +7,6 @@ tags:
 - trl
 - sft
 licence: license
-license: gemma
-datasets:
-- lmassaron/hermes-function-calling-v1
-language:
-- uk
-- en
 ---
 
 # Model Card for Lapa-function-calling
@@ -20,38 +14,15 @@ language:
 This model is a fine-tuned version of [lapa-llm/lapa-v0.1.2-instruct](https://huggingface.co/lapa-llm/lapa-v0.1.2-instruct).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
-# Evaluation
-This is the first iteration of fine-tuning Lapa for function calling. In the future, we plan to add metrics and improve training. <br>
-During this phase new tokens (including tool_call) were introduced to the model and we evaluated how well it uses and understands the purpose of tool_call.<br>
+## Quick start
 
-# Metrics
-Accuracy in function calling (if response contains tool_call token) - find_longest_common_sequence_length(ground_truth_tokens, generated_tokens) / len(ground_truth_tokens)<br>
-Match in helpful exchange (if response does not contain tool_call token) - Computes the percentage of matching elements between generated tokens and ground truth tokens<br>
-## Performance before fine-tuning:
-Accuracy in function calling: 0.48022<br>
-Match in helpful exchange: 0.09064<br>
+```python
+from transformers import pipeline
 
-## Performance after fine-tuning:
-Accuracy in function calling: 0.94833<br>
-Match in helpful exchange: 0.09829
-
-# Quick start
-```
-from peft import PeftModel, PeftConfig
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-peft_model_id = "TymofiiNasobko/Lapa-function-calling"
-peftconfig = PeftConfig.from_pretrained(peft_model_id)
-model = AutoModelForCausalLM.from_pretrained(
-    peftconfig.base_model_name_or_path,
-    attn_implementation="eager",
-    device_map=device,
-)
-tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
-model.resize_token_embeddings(len(tokenizer))
-model = PeftModel.from_pretrained(model, peft_model_id)
-model = model.to(compute_dtype)
-model = model.eval()
+question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+generator = pipeline("text-generation", model="TymofiiNasobko/Lapa-function-calling", device="cuda")
+output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+print(output["generated_text"])
 ```
 
 ## Training procedure
@@ -64,8 +35,8 @@ This model was trained with SFT.
 ### Framework versions
 
 - TRL: 0.25.1
-- Transformers: 4.57.1
-- Pytorch: 2.9.1
+- Transformers: 4.57.3
+- Pytorch: 2.8.0+cu128
 - Datasets: 4.4.1
 - Tokenizers: 0.22.1
````
 
adapter_config.json CHANGED
```diff
@@ -16,7 +16,7 @@
 "layers_pattern": null,
 "layers_to_transform": null,
 "loftq_config": {},
-"lora_alpha": 256,
+"lora_alpha": 16,
 "lora_bias": false,
 "lora_dropout": 0.05,
 "megatron_config": null,
@@ -25,19 +25,12 @@
 "peft_type": "LORA",
 "peft_version": "0.18.0",
 "qalora_group_size": 16,
-"r": 128,
+"r": 256,
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
-"embed_tokens",
-"k_proj",
-"q_proj",
-"o_proj",
-"gate_proj",
-"down_proj",
-"lm_head",
-"up_proj",
-"v_proj"
+"all-linear",
+"embed_tokens"
 ],
 "target_parameters": null,
 "task_type": "CAUSAL_LM",
```
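Besides retargeting the adapter to `"all-linear"` plus `embed_tokens`, this change swaps `lora_alpha` 256 → 16 and `r` 128 → 256, which also changes the effective scaling `alpha / r` that LoRA applies to each low-rank update. A quick check of that arithmetic (standard LoRA scaling, not something stated in the diff itself):

```python
# LoRA adds a scaled low-rank update to each frozen weight:
#   W' = W + (alpha / r) * B @ A
old_alpha, old_r = 256, 128
new_alpha, new_r = 16, 256

old_scale = old_alpha / old_r   # scaling before this commit
new_scale = new_alpha / new_r   # scaling after this commit

print(old_scale, new_scale)  # 2.0 0.0625
```

So the new adapter has much higher rank (more capacity, applied to every linear layer) but a far smaller per-update scale.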
adapter_model.safetensors CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:af00212be9a7dcf50cf91855d57bed865d94c4a151252e80042aa013da256c77
-size 6489752072
+oid sha256:d6db50ce8797893a87b0bff0a640b7bdef6d557a5eddaa1341d3a435a4fa031a
+size 2285694936
```
chat_template.jinja CHANGED
```diff
@@ -2,11 +2,9 @@
 {%- if messages[0]['role'] == 'system' -%}
 {%- if messages[0]['content'] is string -%}
 {%- set first_user_prefix = messages[0]['content'] + '
-
 ' -%}
 {%- else -%}
 {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
-
 ' -%}
 {%- endif -%}
 {%- set loop_messages = messages[1:] -%}
@@ -15,21 +13,30 @@
 {%- set loop_messages = messages -%}
 {%- endif -%}
 {%- for message in loop_messages -%}
-{%- if (message['role'] == 'assistant') -%}
+{%- if message['role'] == 'assistant' or message['role'] == 'model' -%}
 {%- set role = "model" -%}
 {%- else -%}
 {%- set role = message['role'] -%}
 {%- endif -%}
+
 {{ '<start_of_turn>' + role + '
 ' + (first_user_prefix if loop.first else "") }}
 {%- if message['content'] is string -%}
-{{ message['content'] | trim }}
+{%- if role == "model" -%}
+{% generation %} {{ message['content'] | trim }} {% endgeneration %}
+{%- else -%}
+{{ message['content'] | trim }}
+{%- endif -%}
 {%- elif message['content'] is iterable -%}
 {%- for item in message['content'] -%}
 {%- if item['type'] == 'image' -%}
 {{ '<start_of_image>' }}
 {%- elif item['type'] == 'text' -%}
-{{ item['text'] | trim }}
+{%- if role == "model" -%}
+{% generation %} {{ item['text'] | trim }} {% endgeneration %}
+{%- else -%}
+{{ item['text'] | trim }}
+{%- endif -%}
 {%- endif -%}
 {%- endfor -%}
 {%- else -%}
```
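The template change does two things: it maps both `'assistant'` and `'model'` roles to Gemma's `model` turn (previously only `'assistant'`), and it wraps model-turn text in `{% generation %} ... {% endgeneration %}` tags, which transformers uses to build an assistant-token mask during training. The role-mapping branch can be emulated in plain Python (a sketch of the template's branching only, not the template engine):

```python
def map_role(role: str) -> str:
    # Mirrors the updated Jinja branch: both 'assistant' and 'model'
    # render as Gemma's 'model' turn; every other role passes through.
    return "model" if role in ("assistant", "model") else role

for r in ("assistant", "model", "user", "system"):
    print(r, "->", map_role(r))
```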
special_tokens_map.json CHANGED
```diff
@@ -16,13 +16,7 @@
 "single_word": false
 },
 "eoi_token": "<end_of_image>",
-"eos_token": {
-"content": "<eos>",
-"lstrip": false,
-"normalized": false,
-"rstrip": false,
-"single_word": false
-},
+"eos_token": "<eos>",
 "image_token": "<image_soft_token>",
 "pad_token": {
 "content": "<pad>",
```
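This change collapses `eos_token` from a full AddedToken-style object to the bare string `"<eos>"`; tokenizer loaders accept both forms. A small helper showing how such a spec can be normalized when reading the JSON by hand (illustrative only; transformers handles this internally):

```python
import json

def token_string(spec):
    """Return the literal token for either spec form: '<eos>' or {'content': '<eos>', ...}."""
    return spec["content"] if isinstance(spec, dict) else spec

old = json.loads('{"eos_token": {"content": "<eos>", "lstrip": false, '
                 '"normalized": false, "rstrip": false, "single_word": false}}')
new = json.loads('{"eos_token": "<eos>"}')
print(token_string(old["eos_token"]), token_string(new["eos_token"]))  # <eos> <eos>
```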
training_args.bin CHANGED
```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c6ab89697ec2b7f984bc8f0b5136321c8dd877057cf0372cd31ec1aa32e9b5f6
+oid sha256:0ff25ad6383f18b7fe891c349f04b4502488f62babf7cd9551544ff0e6b3f6da
 size 6225
```