---
base_model: Qwen2.5-32B-Instruct
library_name: transformers
license: other
tags:
- llama-factory
- full
- generated_from_trainer
- analog-circuit-design
pipeline_tag: text-generation
model-index:
- name: "to be named"
  results: []
---

# A 32B Fine-tuned Model for Analog Circuit Knowledge Learning

This model is a fine-tuned version of `Qwen2.5-32B-Instruct` trained on a textual dataset for analog circuit knowledge learning.

* **Model Page**: [https://huggingface.co/analogllm/analog_model](https://huggingface.co/analogllm/analog_model)
* **Dataset Page**: [https://huggingface.co/datasets/analogllm/analog_data](https://huggingface.co/datasets/analogllm/analog_data)

## Model description

The training data is a textual corpus of analog circuit knowledge, built from high-quality textbooks using a knowledge distillation approach that extracts structured question-answer pairs.

The model achieves **85.04% accuracy** on the AMSBench-TQA benchmark, a **15.67% improvement** over the base Qwen2.5-32B-Instruct model.
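
The card does not say whether the 15.67% figure is an absolute gain in percentage points or a relative gain over the base model's score. As a quick sanity check, the sketch below works out the base-model accuracy implied by each reading. Both values are derived arithmetic under the stated assumption, not reported results.

```python
# Accuracy and improvement as reported above, in percent.
finetuned_acc = 85.04
improvement = 15.67

# Reading 1 (assumption): the improvement is absolute, in percentage points.
baseline_if_absolute = finetuned_acc - improvement

# Reading 2 (assumption): the improvement is relative to the base score.
baseline_if_relative = finetuned_acc / (1 + improvement / 100)

print(f"absolute reading: implied base accuracy = {baseline_if_absolute:.2f}%")
print(f"relative reading: implied base accuracy = {baseline_if_relative:.2f}%")
```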

## Limitations

While this model performs well on the AMSBench-TQA benchmark, it is specialized for analog circuit knowledge, and its performance in unrelated domains may be limited. Like all language models, it may occasionally generate incorrect or nonsensical information, especially for concepts that are novel or underrepresented in its training data.

## Sample Usage

You can use this model with the Hugging Face `transformers` library:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

model_id = "analogllm/analog_model"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Example chat interaction (Qwen2.5 Instruct format)
messages = [
    {"role": "user", "content": "What is the primary function of a common-emitter amplifier in analog circuits?"}
]

# Apply the chat template and prepare inputs
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors='pt').to(model.device)

# Configure generation parameters
generation_config = GenerationConfig(
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    repetition_penalty=1.05,
    eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|im_end|>")]  # Ensure it stops correctly
)

# Generate response
outputs = model.generate(
    inputs=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    generation_config=generation_config
)

# Decode and print the response
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
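
For repeated questions it can be convenient to wrap the steps above in a helper. The sketch below is one way to do that. `build_messages` and `ask` are hypothetical names introduced here, the optional system prompt is an assumption (the card does not specify one), and greedy decoding is swapped in for sampling so that answers are repeatable.

```python
def build_messages(question, system_prompt=None):
    """Return a chat message list in the role/content format used above."""
    messages = []
    if system_prompt:
        # Assumption: a system prompt is optional; the card does not define one.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages


def ask(model, tokenizer, question, system_prompt=None, max_new_tokens=512):
    """Generate one answer with greedy decoding (chosen here for repeatability)."""
    text = tokenizer.apply_chat_template(
        build_messages(question, system_prompt),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )
    # Drop the prompt tokens so only the newly generated answer is decoded.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as in the sample above, `ask(model, tokenizer, "What sets the voltage gain of a common-source amplifier?")` returns the answer as a string.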

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-06
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1.0
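
The batch-size entries above are related by `total_train_batch_size = train_batch_size × num_devices × gradient_accumulation_steps`. The sketch below checks that arithmetic and illustrates the shape of a cosine schedule with a 10% linear warmup at the listed peak learning rate. `lr_at_step` is a hypothetical helper that follows the usual warmup-then-half-cosine form, and `total_steps = 216` is an assumed value for illustration only, since the card does not report the step count.

```python
import math

# Values taken from the hyperparameter list above.
train_batch_size = 1
num_devices = 8
gradient_accumulation_steps = 8
peak_lr = 2e-6
warmup_ratio = 0.1

# Samples contributing to each optimizer update.
effective_batch = train_batch_size * num_devices * gradient_accumulation_steps
print("effective batch size:", effective_batch)

# Assumption: illustrative step count, not reported in this card.
total_steps = 216
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at_step(step):
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
    print(f"step {step:>3}: lr = {lr_at_step(step):.3e}")
```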

### Training results

```json
{
    "epoch": 1.0,
    "num_input_tokens_seen": 113180672,
    "total_flos": 759612479373312.0,
    "train_loss": 1.1406613362056237,
    "train_runtime": 17617.7573,
    "train_samples_per_second": 0.784,
    "train_steps_per_second": 0.012
}
```
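
A few derived quantities make these numbers easier to read. The sketch below computes them from the JSON above. It assumes `train_runtime` is in seconds and reuses `num_devices: 8` from the hyperparameter list. Because the per-second rates are rounded in the log, the sample and step counts it prints are approximations only.

```python
import json

results = json.loads("""
{
    "epoch": 1.0,
    "num_input_tokens_seen": 113180672,
    "total_flos": 759612479373312.0,
    "train_loss": 1.1406613362056237,
    "train_runtime": 17617.7573,
    "train_samples_per_second": 0.784,
    "train_steps_per_second": 0.012
}
""")

num_devices = 8  # from the hyperparameter list
runtime_s = results["train_runtime"]  # assumption: reported in seconds

runtime_hours = runtime_s / 3600
tokens_per_second = results["num_input_tokens_seen"] / runtime_s
tokens_per_second_per_device = tokens_per_second / num_devices

# Rates are rounded in the log, so these counts are rough estimates.
approx_samples = results["train_samples_per_second"] * runtime_s
approx_steps = results["train_steps_per_second"] * runtime_s

print(f"runtime: {runtime_hours:.2f} h")
print(f"throughput: {tokens_per_second:.0f} tokens/s "
      f"({tokens_per_second_per_device:.0f} per device)")
print(f"approx. samples seen: {approx_samples:.0f}")
print(f"approx. optimizer steps: {approx_steps:.0f}")
```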

### Framework versions

- Transformers 4.52.4
- Pytorch 2.5.1+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1
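
To compare a local environment against these versions, a small check like the following may help. It is a sketch using the standard-library `importlib.metadata`. `compare_versions` is a hypothetical helper, and the pip distribution names (`torch` for PyTorch, and so on) are assumptions about how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed pip distribution name.
EXPECTED = {
    "transformers": "4.52.4",
    "torch": "2.5.1+cu124",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (version in this card, installed version or None)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions().items():
    status = "matches" if installed == wanted else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the listed versions are simply the ones used for training.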