Improve model card: Add `library_name`, abstract, and GitHub details

#1
by nielsr (HF Staff) - opened
Files changed (1)
  1. README.md +41 -7
README.md CHANGED
@@ -1,25 +1,47 @@
  ---
- license: apache-2.0
  datasets:
  - Ashenone3/LM-Searcher-Trajectory-228K
  language:
  - en
  metrics:
  - accuracy
- base_model:
- - meta-llama/Llama-3.1-8B
  pipeline_tag: text-generation
  tags:
  - nas
  - optimization
  - agent
  ---
  # LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding

- Repo: [https://github.com/Ashone3/LM-Searcher](https://github.com/Ashone3/LM-Searcher)

- ## Introduction
- We introduce LM-Searcher, a task-agnostic neural architecture search framework powered by LLMs.

  ## Usage

@@ -31,7 +53,7 @@ vllm serve path-to-the-checkpoint --dtype auto --api-key token-abc123 --chat-tem
  ```

  ### Inference
- An example is provided to show how LM-Searcher can be used to search for the optimal solution to a given problem
  ```python
  import os
  import re
@@ -99,4 +121,16 @@ for iteration in range(num_iters, args.trial_num):
  # Save all historical results to file
  with open('{}/historical_results.json'.format(args.output_dir), 'w') as f:
      json.dump(trial_dict, f)
  ```
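
The inference script's final step above persists the search history to `historical_results.json`. A minimal, self-contained sketch of that save-and-reload round trip (the `trial_dict` schema below is invented for illustration; `search.py` defines the real one):

```python
import json
import tempfile

# Hypothetical search history: iteration index -> candidate NCode string and score.
# The actual schema used by search.py may differ; only the save/load step matters here.
trial_dict = {
    "0": {"ncode": "3 1 4 1 5", "accuracy": 0.912},
    "1": {"ncode": "2 7 1 8 2", "accuracy": 0.931},
}

output_dir = tempfile.mkdtemp()
path = '{}/historical_results.json'.format(output_dir)

# Save all trials, mirroring the script's closing lines.
with open(path, 'w') as f:
    json.dump(trial_dict, f)

# Reload to resume a search or inspect past trials.
with open(path) as f:
    restored = json.load(f)

best = max(restored.values(), key=lambda t: t["accuracy"])
print(best["ncode"])  # the highest-scoring candidate seen so far
```

Keeping the full history on disk also makes each iteration's prompt reproducible, since the candidate pool shown to the LLM can be rebuilt from the file.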
 
  ---
+ base_model:
+ - meta-llama/Llama-3.1-8B
  datasets:
  - Ashenone3/LM-Searcher-Trajectory-228K
  language:
  - en
+ license: apache-2.0
  metrics:
  - accuracy
  pipeline_tag: text-generation
  tags:
  - nas
  - optimization
  - agent
+ library_name: transformers
  ---
+
  # LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding

+ Paper: [LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding](https://huggingface.co/papers/2509.05657)
+
+ ## Abstract
+ Recent progress in Large Language Models (LLMs) has opened new avenues for solving complex optimization problems, including Neural Architecture Search (NAS). However, existing LLM-driven NAS approaches rely heavily on prompt engineering and domain-specific tuning, limiting their practicality and scalability across diverse tasks. In this work, we propose LM-Searcher, a novel framework that leverages LLMs for cross-domain neural architecture optimization without the need for extensive domain-specific adaptation. Central to our approach is NCode, a universal numerical string representation for neural architectures, which enables cross-domain architecture encoding and search. We also reformulate the NAS problem as a ranking task, training LLMs to select high-performing architectures from candidate pools using instruction-tuning samples derived from a novel pruning-based subspace sampling strategy. Our curated dataset, encompassing a wide range of architecture-performance pairs, encourages robust and transferable learning. Comprehensive experiments demonstrate that LM-Searcher achieves competitive performance in both in-domain (e.g., CNNs for image classification) and out-of-domain (e.g., LoRA configurations for segmentation and generation) tasks, establishing a new paradigm for flexible and generalizable LLM-based architecture search. The datasets and models will be released at this https URL.
+
+ ## GitHub Repository
+ Code: [https://github.com/Ashone3/LM-Searcher](https://github.com/Ashone3/LM-Searcher)

+ <br>
+ <div align="center">
+ <img src="https://github.com/Ashone3/LM-Searcher/raw/main/figures/lm_searcher_fig2.png" width="100%" title="Figure2">
+ </div>
+
+ ## Datasets and Models
+ 🤗 [LM-Searcher-Trajectory-228k Dataset](https://huggingface.co/datasets/Ashenone3/LM-Searcher-Trajectory-228K)
+
+ 🤗 [LM-Searcher Checkpoint](https://huggingface.co/Ashenone3/LM-Searcher/tree/main)
+
+ ## Training
+
+ We leverage [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) to train LM-Searcher. Below is the script we use for full fine-tuning on the LLaMA-3.1 model:
+ ```shell
+ FORCE_TORCHRUN=1 llamafactory-cli train configs/llama3_full_sft_ds2.yaml
+ ```
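
The PR does not include `configs/llama3_full_sft_ds2.yaml` itself; for orientation, a LLaMA-Factory full-SFT training config typically has roughly this shape. Every value below is illustrative, not the repo's actual settings:

```yaml
# Illustrative sketch only -- the repo's configs/llama3_full_sft_ds2.yaml is authoritative.
model_name_or_path: meta-llama/Llama-3.1-8B
stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z2_config.json  # ZeRO-2, as the config name suggests
dataset: lm_searcher_trajectory   # hypothetical entry name registered in dataset_info.json
template: llama3
cutoff_len: 4096
output_dir: saves/lm-searcher-full-sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-5
num_train_epochs: 3.0
bf16: true
```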
 
  ## Usage

  ```

  ### Inference
+ A minimal example, `search.py`, is provided to show how LM-Searcher can be used to search for the optimal solution to a given problem:
  ```python
  import os
  import re

  # Save all historical results to file
  with open('{}/historical_results.json'.format(args.output_dir), 'w') as f:
      json.dump(trial_dict, f)
+ ```
+
+ ## Citation
+ If our work has been helpful to you, please consider citing it. Your citation serves as encouragement for our research.
+
+ ```bibtex
+ @article{luo2025lmsearcher,
+   title={LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding},
+   author={Luo, Junyu and Luo, Xiao and Chen, Xiusi and Xiao, Zhiping and Ju, Wei and Zhang, Ming},
+   journal={arXiv preprint arXiv:2509.05657},
+   year={2025}
+ }
  ```
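
One closing note on the inference example: the script imports `re`, presumably to parse the searcher's reply back into an NCode. A sketch of that kind of extraction, with an invented reply format (LM-Searcher's actual prompt and response templates are defined in `search.py`, not here):

```python
import re

# Invented model reply; the real response format is set by LM-Searcher's prompts.
reply = "After comparing the candidates, I select architecture [2 7 1 8 2]."

# Pull the bracketed NCode out of the free-form text and split it into integers.
match = re.search(r"\[([\d\s]+)\]", reply)
ncode = [int(tok) for tok in match.group(1).split()] if match else None
print(ncode)  # [2, 7, 1, 8, 2]
```

Guarding for a failed match matters in practice: a sampled LLM reply is not guaranteed to contain a well-formed encoding, and a `None` result lets the search loop retry rather than crash.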