Improve model card: Add `library_name`, abstract, and GitHub details
#1
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,25 +1,47 @@
 ---
-
+base_model:
+- meta-llama/Llama-3.1-8B
 datasets:
 - Ashenone3/LM-Searcher-Trajectory-228K
 language:
 - en
+license: apache-2.0
 metrics:
 - accuracy
-base_model:
-- meta-llama/Llama-3.1-8B
 pipeline_tag: text-generation
 tags:
 - nas
 - optimization
 - agent
+library_name: transformers
 ---
+
 # LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
 
-
+Paper: [LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding](https://huggingface.co/papers/2509.05657)
+
+## Abstract
+Recent progress in Large Language Models (LLMs) has opened new avenues for solving complex optimization problems, including Neural Architecture Search (NAS). However, existing LLM-driven NAS approaches rely heavily on prompt engineering and domain-specific tuning, limiting their practicality and scalability across diverse tasks. In this work, we propose LM-Searcher, a novel framework that leverages LLMs for cross-domain neural architecture optimization without the need for extensive domain-specific adaptation. Central to our approach is NCode, a universal numerical string representation for neural architectures, which enables cross-domain architecture encoding and search. We also reformulate the NAS problem as a ranking task, training LLMs to select high-performing architectures from candidate pools using instruction-tuning samples derived from a novel pruning-based subspace sampling strategy. Our curated dataset, encompassing a wide range of architecture-performance pairs, encourages robust and transferable learning. Comprehensive experiments demonstrate that LM-Searcher achieves competitive performance in both in-domain (e.g., CNNs for image classification) and out-of-domain (e.g., LoRA configurations for segmentation and generation) tasks, establishing a new paradigm for flexible and generalizable LLM-based architecture search. The datasets and models will be released at this https URL.
+
+## GitHub Repository
+Code: [https://github.com/Ashone3/LM-Searcher](https://github.com/Ashone3/LM-Searcher)
 
-
-
+<br>
+<div align="center">
+<img src="https://github.com/Ashone3/LM-Searcher/raw/main/figures/lm_searcher_fig2.png" width="100%" title="Figure2">
+</div>
+
+## Datasets and Models
+🤗 [LM-Searcher-Trajectory-228k Dataset](https://huggingface.co/datasets/Ashenone3/LM-Searcher-Trajectory-228K)
+
+🤗 [LM-Searcher Checkpoint](https://huggingface.co/Ashenone3/LM-Searcher/tree/main)
+
+## Training
+
+We leverage [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) to train LM-Searcher. Below is the script we use for full fine-tuning on the LLaMA-3.1 model:
+```shell
+FORCE_TORCHRUN=1 llamafactory-cli train configs/llama3_full_sft_ds2.yaml
+```
 
 ## Usage
 
@@ -31,7 +53,7 @@ vllm serve path-to-the-checkpoint --dtype auto --api-key token-abc123 --chat-tem
 ```
 
 ### Inference
-
+A minimal example, `search.py`, is provided to show how LM-Searcher can be used to search for the optimal solution to a given problem:
 ```python
 import os
 import re
@@ -99,4 +121,16 @@ for iteration in range(num_iters, args.trial_num):
 # Save all historical results to file
 with open('{}/historical_results.json'.format(args.output_dir), 'w') as f:
     json.dump(trial_dict, f)
+```
+
+## Citation
+If our work has been helpful to you, please consider citing it. Your citation serves as encouragement for our research.
+
+```bibtex
+@article{luo2024lmsearcher,
+  title={LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding},
+  author={Luo, Junyu and Luo, Xiao and Chen, Xiusi and Xiao, Zhiping and Ju, Wei and Zhang, Ming},
+  journal={arXiv preprint arXiv:2509.05657},
+  year={2024}
+}
 ```
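The inference loop the card describes (encode candidate architectures as numerical strings, ask the served model to pick the best one, parse its reply, and log the trial) can be sketched as below. This is a minimal illustration only: the `encode_candidate` helper, the prompt layout, and the mocked model reply are assumptions for demonstration, not the repository's actual `search.py` or the official NCode scheme, and a real run would obtain `reply` from the vLLM chat endpoint instead of a literal string.

```python
import json
import re

def encode_candidate(choices):
    # Hypothetical NCode-style encoding: join per-dimension option
    # indices into one numerical string (illustrative, not official).
    return "".join(str(c) for c in choices)

# A small candidate pool of architecture configurations.
candidates = [[0, 2, 1, 3], [1, 1, 2, 0], [2, 0, 3, 1]]
pool = {i: encode_candidate(c) for i, c in enumerate(candidates)}

# Build a ranking-style prompt listing the encoded candidates.
prompt = "Candidates:\n" + "\n".join(f"[{i}] {s}" for i, s in pool.items())

# Mocked model reply; in practice this would come from the served checkpoint.
reply = "After comparing the candidates, the best architecture is [1]."

# Parse the chosen index out of the reply, as the script's `re` import suggests.
best = int(re.search(r"\[(\d+)\]", reply).group(1))

# Record the trial, mirroring the historical_results.json bookkeeping.
trial_dict = {"iteration_0": {"chosen": best, "ncode": pool[best]}}
print(json.dumps(trial_dict))
```

In a real search loop the prompt would also include previously evaluated architecture-performance pairs from `trial_dict`, so each iteration conditions on the search history.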
|