---
license: apache-2.0
language: en
datasets:
- lmms-lab/Math10K
base_model:
- Qwen/Qwen3-1.7B-Base
---

# NdLinear-LoRA Fine-Tuned Models

This repository contains a collection of language models fine-tuned with a custom NdLinear-LoRA architecture. NdLinear-LoRA is a variant of Low-Rank Adaptation (LoRA) that reshapes weight matrices into N-dimensional tensors and applies a factorized linear transformation to them for parameter-efficient fine-tuning.

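To give a rough sense of why reshaping and factorizing can save parameters, the sketch below compares trainable parameter counts for a dense update, a standard LoRA update, and a mode-wise factorization. It is an illustration under stated assumptions, not the implementation shipped with these models: the hidden size, the `(32, 64)` reshape, the rank, and the "one small matrix per tensor mode" layout are all hypothetical choices made for the example.

```python
from math import prod


def dense_params(d_in: int, d_out: int) -> int:
    """Parameters in a full d_in x d_out weight update."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Standard LoRA: A is (d_in x rank) and B is (rank x d_out)."""
    return rank * (d_in + d_out)


def modewise_params(in_shape: tuple, out_shape: tuple) -> int:
    """Assumed factorization: one small (in_k x out_k) matrix per tensor mode."""
    if len(in_shape) != len(out_shape):
        raise ValueError("in_shape and out_shape need the same number of modes")
    return sum(i * o for i, o in zip(in_shape, out_shape))


# Illustrative sizes only; they are not read from any model config.
d_in = d_out = 2048
in_shape = out_shape = (32, 64)  # 32 * 64 == 2048
assert prod(in_shape) == d_in and prod(out_shape) == d_out

print("dense update:   ", dense_params(d_in, d_out))
print("LoRA (rank 8):  ", lora_params(d_in, d_out, rank=8))
print("mode-wise (2-D):", modewise_params(in_shape, out_shape))
```

The counts depend heavily on the chosen reshape, so treat them as a way to reason about the trade-off rather than as figures for these checkpoints.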
## Available Models

The table below lists the fine-tuned models. For best results, host each model in its own repository on the Hugging Face Hub.

| Fine-Tuned Model Name                  | Base Model                   | Fine-Tuning Dataset |
| -------------------------------------- | ---------------------------- | ------------------- |
| `Meta-Llama-3-8B-CSQA-NdLinearLoRA`    | `meta-llama/Meta-Llama-3-8B` | `commonsense_qa`    |
| `Meta-Llama-3-8B-Math10K-NdLinearLoRA` | `meta-llama/Meta-Llama-3-8B` | `lmms-lab/Math10K`  |
| `Qwen3-1.7B-CSQA-NdLinearLoRA`         | `Qwen/Qwen3-1.7B-Base`       | `commonsense_qa`    |
| `Qwen3-1.7B-Math10K-NdLinearLoRA`      | `Qwen/Qwen3-1.7B-Base`       | `lmms-lab/Math10K`  |

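If you script against several of these models, the table can be mirrored as a small lookup. The sketch below is one way to do that; the model names and datasets are copied from the table, while the `YourUsername` namespace and the helper names `repo_id` and `models_for_dataset` are placeholders introduced here for illustration.

```python
# Model name -> fine-tuning dataset, copied from the table above.
MODELS = {
    "Meta-Llama-3-8B-CSQA-NdLinearLoRA": "commonsense_qa",
    "Meta-Llama-3-8B-Math10K-NdLinearLoRA": "lmms-lab/Math10K",
    "Qwen3-1.7B-CSQA-NdLinearLoRA": "commonsense_qa",
    "Qwen3-1.7B-Math10K-NdLinearLoRA": "lmms-lab/Math10K",
}


def repo_id(model_name: str, namespace: str = "YourUsername") -> str:
    """Build a Hub repo ID; `namespace` is a placeholder for your user or org."""
    if model_name not in MODELS:
        raise KeyError(f"Unknown model name: {model_name}")
    return f"{namespace}/{model_name}"


def models_for_dataset(dataset: str) -> list:
    """Return the model names fine-tuned on the given dataset, sorted by name."""
    return sorted(name for name, ds in MODELS.items() if ds == dataset)


print(repo_id("Qwen3-1.7B-Math10K-NdLinearLoRA"))
print(models_for_dataset("commonsense_qa"))
```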
## How to Use

Because these models use a custom architecture, you must pass `trust_remote_code=True` when loading them. This lets the `transformers` library download and run the `modeling_ndlinear.py` file that should be included in each model's repository.

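Since loading depends on `modeling_ndlinear.py` being present, it can help to check for it before calling `from_pretrained`. The sketch below assumes the file sits at the top level of the repository; `has_custom_code` and `check_repo` are hypothetical helper names, and `check_repo` needs network access plus the `huggingface_hub` package.

```python
CUSTOM_MODELING_FILE = "modeling_ndlinear.py"


def has_custom_code(repo_files, filename=CUSTOM_MODELING_FILE):
    """Return True if `filename` appears at the top level of a repo file listing."""
    return filename in repo_files


def check_repo(repo_id: str) -> bool:
    """Fetch the file listing from the Hub and look for the custom modeling file."""
    # Imported lazily so has_custom_code() works without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return has_custom_code(list_repo_files(repo_id))


# Offline example with a hand-written listing:
print(has_custom_code(["config.json", "model.safetensors", "modeling_ndlinear.py"]))
```

With a real repository you would call `check_repo("YourUsername/Qwen3-1.7B-Math10K-NdLinearLoRA")`, replacing the placeholder namespace with your own.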
**Dependencies:** Before you start, install the required libraries:
```bash
pip install torch transformers safetensors huggingface_hub accelerate
pip install ndlinear
```

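To confirm the installation before loading a model, a quick importability check can be run. This is a minimal sketch that assumes each package's import name matches its pip name, which is an assumption for `ndlinear` in particular; `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Assumed import names, taken from the pip commands above.
REQUIRED = [
    "torch",
    "transformers",
    "safetensors",
    "huggingface_hub",
    "accelerate",
    "ndlinear",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```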
### Example Loading Script
This script works for any of the models listed above. Just change `REPO_ID`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# --- Example Usage ---

# 1. Choose the model you want to use from the table above.
#    Replace "YourUsername" with your Hugging Face username or organization.
REPO_ID = "YourUsername/Qwen3-1.7B-Math10K-NdLinearLoRA"

# 2. Load the model and tokenizer.
#    `trust_remote_code=True` is required to load the custom architecture.
print(f"Loading model: {REPO_ID}")
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
print("Model and tokenizer loaded successfully.")

# 3. Generate text.
#    This prompt is geared for a math model. Adjust it for a QA model if needed.
prompt = (
    "### Instruction:\n"
    "Solve the following math problem: If a train travels at 60 miles per hour, "
    "how long does it take to travel 180 miles?\n\n"
    "### Solution:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Inference only, so gradient tracking is disabled.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        eos_token_id=tokenizer.eos_token_id,
    )

print("\n--- Generated Output ---")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
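
The script above notes that the prompt should be adjusted for a QA model. The helpers below sketch one way to build prompts in the same `### Instruction:` layout and to strip the echoed prompt from the decoded text, since `generate` returns the prompt tokens followed by the new tokens. The `Answer` header and the QA wording are assumptions made for illustration; the prompt format used during fine-tuning is not documented here, so adapt it to whatever your checkpoint was trained on.

```python
def build_prompt(instruction: str, response_header: str = "Solution") -> str:
    """Format an instruction in the layout used by the loading script."""
    return f"### Instruction:\n{instruction.strip()}\n\n### {response_header}:\n"


def extract_completion(decoded: str, prompt: str) -> str:
    """Drop the echoed prompt from decoded text when it is an exact prefix."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Tokenization can alter whitespace, so fall back to the full text.
    return decoded.strip()


math_prompt = build_prompt(
    "Solve the following math problem: If a train travels at 60 miles per hour, "
    "how long does it take to travel 180 miles?"
)

# Hypothetical QA phrasing; the real training template may differ.
qa_prompt = build_prompt(
    "Answer the following question: Where would you store a spare blanket?\n"
    "Choices: (A) closet (B) oven (C) garden",
    response_header="Answer",
)

print(math_prompt)
print(qa_prompt)
```

In the loading script, `extract_completion(tokenizer.decode(outputs[0], skip_special_tokens=True), prompt)` would then return only the generated part.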