---
base_model: Qwen2.5-32B-Instruct
library_name: transformers
license: other
tags:
- llama-factory
- full
- generated_from_trainer
- analog-circuit-design
pipeline_tag: text-generation
model-index:
- name: "to be named"
  results: []
---

# A 32B Fine-tuned Model for Analog Circuit Knowledge Learning
This model is a fine-tuned version of `Qwen2.5-32B-Instruct` trained on a textual dataset for analog circuit knowledge learning.
*   **Model Page**: [https://huggingface.co/analogllm/analog_model](https://huggingface.co/analogllm/analog_model)
*   **Dataset Page**: [https://huggingface.co/datasets/analogllm/analog_data](https://huggingface.co/datasets/analogllm/analog_data)


## Model description
This model is fine-tuned on a textual dataset for analog circuit knowledge learning. The training dataset was constructed from high-quality textbooks using a knowledge-distillation approach that extracts structured question-answer pairs.
The model achieves **85.04% accuracy** on the AMSBench-TQA benchmark, a **15.67% improvement** over the base Qwen2.5-32B-Instruct model.
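For illustration, a distilled training sample in a chat-style question-answer format might look like the sketch below. The field names and the example content are assumptions for illustration only; see the dataset page above for the actual schema.

```python
# Hypothetical distilled QA sample in a chat-style instruction-tuning format.
# Field names and content are illustrative; the real dataset schema may differ.
sample = {
    "messages": [
        {
            "role": "user",
            "content": (
                "Why does adding source degeneration to a common-source "
                "amplifier improve linearity?"
            ),
        },
        {
            "role": "assistant",
            "content": (
                "The degeneration resistor provides local negative feedback: "
                "the output current develops a voltage across the resistor that "
                "subtracts from the gate drive, so the stage's gain depends less "
                "on the transistor's nonlinear transconductance."
            ),
        },
    ]
}

print(sample["messages"][0]["content"])
```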

## Limitations
While this model performs well on the AMSBench-TQA benchmark, it is specialized for the analog-circuit domain, and its performance on unrelated tasks may be limited. Like all language models, it may occasionally generate incorrect or nonsensical information, especially for concepts that are novel or underrepresented in its training data.

## Sample Usage
You can use this model with the Hugging Face `transformers` library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_id = "analogllm/analog_model"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Example chat interaction (Qwen2.5 Instruct format)
messages = [
    {"role": "user", "content": "What is the primary function of a common-emitter amplifier in analog circuits?"}
]

# Apply the chat template and prepare inputs
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors='pt').to(model.device)

# Configure generation parameters
generation_config = GenerationConfig(
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    repetition_penalty=1.05,
    eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|im_end|>")] # Ensure it stops correctly
)

# Generate response
outputs = model.generate(
    inputs=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    generation_config=generation_config
)

# Decode and print the response
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
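Note that in bfloat16 the 32B weights alone occupy roughly 64 GB of GPU memory. If that does not fit on your hardware, one option is a 4-bit quantized load via `bitsandbytes`; the snippet below is a sketch assuming `bitsandbytes` is installed, and quantization may slightly reduce answer quality.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "analogllm/analog_model"

# 4-bit NF4 quantization (requires the bitsandbytes package).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
# Generation then proceeds exactly as in the example above.
```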

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an approximate `TrainingArguments` equivalent is sketched after the list):
- learning_rate: 2e-06
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
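
For reference, these settings map onto Hugging Face `TrainingArguments` roughly as in the sketch below. The `output_dir` and `bf16=True` values are assumptions, not taken from the original LLaMA-Factory configuration.

```python
from transformers import TrainingArguments

# Approximate TrainingArguments equivalent of the hyperparameters listed above.
# Effective batch size: 1 sample/GPU x 8 GPUs x 8 accumulation steps = 64.
training_args = TrainingArguments(
    output_dir="analog_model_sft",   # hypothetical output path
    learning_rate=2e-6,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    bf16=True,                       # assumed; matches the bfloat16 inference setup
)
```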

### Training results

```json
{
  "epoch": 1.0,
  "num_input_tokens_seen": 113180672,
  "total_flos": 759612479373312.0,
  "train_loss": 1.1406613362056237,
  "train_runtime": 17617.7573,
  "train_samples_per_second": 0.784,
  "train_steps_per_second": 0.012
}
```

### Framework versions

- Transformers 4.52.4
- Pytorch 2.5.1+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1