| # LoRA for Neuron | |
| LoRA (Low-Rank Adaptation) implementation optimized for distributed training on AWS Trainium devices. This module provides parameter-efficient fine-tuning with support for tensor parallelism and sequence parallelism. | |
| ## PEFT Model Classes | |
| ### NeuronPeftModel[[optimum.neuron.peft.NeuronPeftModel]] | |
| #### optimum.neuron.peft.NeuronPeftModel[[optimum.neuron.peft.NeuronPeftModel]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/peft_model.py#L82) | |
| ### NeuronPeftModelForCausalLM[[optimum.neuron.peft.NeuronPeftModelForCausalLM]] | |
| #### optimum.neuron.peft.NeuronPeftModelForCausalLM[[optimum.neuron.peft.NeuronPeftModelForCausalLM]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/peft_model.py#L463) | |
| ## LoRA Layer Implementations | |
| ### Base LoRA Layer[[optimum.neuron.peft.tuners.lora.layer.NeuronLoraLayer]] | |
| #### optimum.neuron.peft.tuners.lora.layer.NeuronLoraLayer[[optimum.neuron.peft.tuners.lora.layer.NeuronLoraLayer]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L81) | |
| ### Parallel Linear LoRA[[optimum.neuron.peft.tuners.lora.layer.ParallelLinear]] | |
| #### optimum.neuron.peft.tuners.lora.layer.ParallelLinear[[optimum.neuron.peft.tuners.lora.layer.ParallelLinear]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L232) | |
| #### merge[[optimum.neuron.peft.tuners.lora.layer.ParallelLinear.merge]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L307) | |
| `merge(safe_merge: bool = False, adapter_names: list[str] | None = None)` | |
| Merge the active adapter weights into the base weights. | |
| This works with distributed parallel linear layers (RowParallelLinear, ColumnParallelLinear). | |
| The merge happens on the sharded weights - each rank merges its own shard. | |
| **Parameters:** | |
| safe_merge : If True, perform merge in a copy and check for NaNs before merging. | |
| adapter_names : List of adapter names to merge. If None, all active adapters will be merged. | |
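The per-shard merge described above can be illustrated with a small pure-Python sketch. It assumes a column-parallel layout in which the base weight and `lora_B` are split along the output dimension while `lora_A` is replicated on every rank; that layout, the helper names, and the toy values are assumptions for illustration, not the library's code. Under that assumption, merging each shard locally gives the same result as merging the full weight and then sharding it.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_scaled(weight, delta, scale):
    """Return weight + scale * delta, element by element."""
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy base weight of shape (out_features=4, in_features=3).
W = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [3.0, 1.0, 1.0], [2.0, 2.0, 0.0]]
A = [[1.0, 2.0, 0.0], [0.0, 1.0, 1.0]]                 # lora_A: (r=2, in=3)
B = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 1.0]]   # lora_B: (out=4, r=2)
scaling = 0.5
tp_size = 2                      # hypothetical tensor parallel size
rows_per_rank = len(W) // tp_size

# Reference: merge on the full, unsharded weight.
full_merged = add_scaled(W, matmul(B, A), scaling)

# Per-rank merge: each rank only sees its own rows of W and lora_B.
shard_merged = []
for rank in range(tp_size):
    start, stop = rank * rows_per_rank, (rank + 1) * rows_per_rank
    w_shard, b_shard = W[start:stop], B[start:stop]
    shard_merged.extend(add_scaled(w_shard, matmul(b_shard, A), scaling))

# Concatenating the merged shards reproduces the full merge.
assert shard_merged == full_merged
```

Because each rank needs only its own slice of the weight and of `lora_B`, no cross-rank communication is required for the merge in this layout.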
| #### unmerge[[optimum.neuron.peft.tuners.lora.layer.ParallelLinear.unmerge]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L361) | |
| Unmerge all merged adapter layers from the base weights. | |
| This works with distributed parallel linear layers (RowParallelLinear, ColumnParallelLinear). | |
| The unmerge happens on the sharded weights - each rank unmerges its own shard. | |
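As a minimal sketch of the merge and unmerge relationship on a single shard: merging adds the scaled low-rank product to the local weight, and unmerging subtracts the same delta. The functions `delta_weight`, `merge_`, and `unmerge_` below are hypothetical stand-ins written in plain Python, not the methods of `ParallelLinear`.

```python
import copy


def delta_weight(lora_b, lora_a, scaling):
    """Compute scaling * (lora_b @ lora_a) for matrices stored as lists of rows."""
    return [[scaling * sum(b * a for b, a in zip(b_row, a_col))
             for a_col in zip(*lora_a)]
            for b_row in lora_b]


def merge_(weight, delta):
    """Add delta to weight in place."""
    for w_row, d_row in zip(weight, delta):
        for j, d in enumerate(d_row):
            w_row[j] += d


def unmerge_(weight, delta):
    """Subtract delta from weight in place."""
    for w_row, d_row in zip(weight, delta):
        for j, d in enumerate(d_row):
            w_row[j] -= d


weight_shard = [[1.0, 2.0], [3.0, 4.0]]   # this rank's slice of the base weight
lora_a = [[1.0, -1.0]]                    # (r=1, in=2)
lora_b = [[2.0], [0.5]]                   # (out_shard=2, r=1)

original = copy.deepcopy(weight_shard)
delta = delta_weight(lora_b, lora_a, scaling=0.5)

merge_(weight_shard, delta)
merged = copy.deepcopy(weight_shard)

unmerge_(weight_shard, delta)
restored = weight_shard
```

The toy values are chosen to be exactly representable in floating point, so the round trip is exact here. With real weights, the restored values can differ from the originals by rounding error.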
| ### GQA QKV Column Parallel LoRA[[optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear]] | |
| #### optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear[[optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L433) | |
| #### get_delta_weight[[optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear.get_delta_weight]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L578) | |
| `get_delta_weight(adapter: str)` | |
| Compute the delta weights for Q, K, V for the given adapter. | |
| Returns a dict with keys "q", "k", "v" (or "qkv" if fused) containing the delta tensors. | |
| **Parameters:** | |
| adapter : The name of the adapter for which the delta weight should be computed. | |
| **Returns:** | |
| Dict mapping "q"/"k"/"v" (or "qkv") to their delta weight tensors (sharded). | |
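The shape of the returned dictionary can be sketched as follows. This example assumes a single shared `lora_A` and one `lora_B` block per projection, which is an assumption made for illustration and may not match the actual implementation; `qkv_delta_weights` is a hypothetical helper. It does reflect one property of grouped query attention: the K and V projections have fewer output rows than Q because there are fewer key/value heads.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def qkv_delta_weights(lora_a, lora_b_by_proj, scaling):
    """Return {"q": ..., "k": ..., "v": ...}, each scaling * (B_proj @ A)."""
    return {
        name: [[scaling * v for v in row] for row in matmul(b, lora_a)]
        for name, b in lora_b_by_proj.items()
    }


# Toy sizes: hidden=4, rank r=2, Q has 4 output rows, K and V have 2 each.
lora_a = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0]]                       # (r=2, hidden=4)
lora_b_by_proj = {
    "q": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]],  # (4, r)
    "k": [[1.0, 2.0], [0.0, 1.0]],                            # (2, r)
    "v": [[0.5, 0.0], [0.0, 0.5]],                            # (2, r)
}

deltas = qkv_delta_weights(lora_a, lora_b_by_proj, scaling=2.0)
shapes = {name: (len(d), len(d[0])) for name, d in deltas.items()}
```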
| #### merge[[optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear.merge]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L625) | |
| Merge the active adapter weights into the base Q, K, V weights. | |
| This works with GQAQKVColumnParallelLinear layers. | |
| The merge happens on the sharded weights - each rank merges its own shard. | |
| **Parameters:** | |
| safe_merge : If True, perform merge in a copy and check for NaNs before merging. | |
| adapter_names : List of adapter names to merge. If None, all active adapters will be merged. | |
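The `safe_merge` behavior described above (merge into a copy, check for NaNs, then commit) can be sketched in plain Python. `merge_shard` is a hypothetical function written for this illustration, and raising `ValueError` is an assumption about how a failed check would be reported.

```python
import math


def merge_shard(weight, delta, safe_merge=False):
    """Add delta to weight in place; with safe_merge, validate a copy first."""
    if safe_merge:
        # Build the merged result without touching the original weight.
        candidate = [[w + d for w, d in zip(w_row, d_row)]
                     for w_row, d_row in zip(weight, delta)]
        if any(math.isnan(v) for row in candidate for v in row):
            raise ValueError("NaN detected in merged weights; merge aborted")
        weight[:] = candidate  # commit only after the check passes
    else:
        for w_row, d_row in zip(weight, delta):
            for j, d in enumerate(d_row):
                w_row[j] += d


good = [[1.0, 2.0]]
merge_shard(good, [[0.5, 0.5]], safe_merge=True)

bad = [[1.0, 2.0]]
try:
    merge_shard(bad, [[float("nan"), 0.0]], safe_merge=True)
    raised = False
except ValueError:
    raised = True
```

The trade-off is memory: the copy temporarily doubles the footprint of the weight being merged, in exchange for leaving the base weight untouched when the adapter is broken.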
| #### unmerge[[optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear.unmerge]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L688) | |
| Unmerge all merged adapter layers from the base Q, K, V weights. | |
| This works with GQAQKVColumnParallelLinear layers. | |
| The unmerge happens on the sharded weights - each rank unmerges its own shard. | |
| ### Parallel Embedding LoRA[[optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding]] | |
| #### optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding[[optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L758) | |
| #### merge[[optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding.merge]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L847) | |
| `merge(safe_merge: bool = False, adapter_names: list[str] | None = None)` | |
| Merge the active adapter weights into the base embedding weights. | |
| This works with ParallelEmbedding layers. | |
| The merge happens on the sharded weights - each rank merges its own shard. | |
| **Parameters:** | |
| safe_merge : If True, perform merge in a copy and check for NaNs before merging. | |
| adapter_names : List of adapter names to merge. If None, all active adapters will be merged. | |
| #### unmerge[[optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding.unmerge]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/layer.py#L885) | |
| Unmerge all merged adapter layers from the base embedding weights. | |
| This works with ParallelEmbedding layers. | |
| The unmerge happens on the sharded weights - each rank unmerges its own shard. | |
| ## LoRA Model | |
| ### NeuronLoraModel[[optimum.neuron.peft.tuners.NeuronLoraModel]] | |
| #### optimum.neuron.peft.tuners.NeuronLoraModel[[optimum.neuron.peft.tuners.NeuronLoraModel]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/tuners/lora/model.py#L29) | |
| ## Utility Functions | |
| ### get_peft_model[[optimum.neuron.peft.get_peft_model]] | |
| #### optimum.neuron.peft.get_peft_model[[optimum.neuron.peft.get_peft_model]] | |
| [Source](https://github.com/huggingface/optimum-neuron/blob/vr_1097/optimum/neuron/peft/mapping_func.py#L45) | |
| ## Architecture Support | |
| The Neuron LoRA implementation supports the following parallel layer types: | |
| - **ColumnParallelLinear**: For layers that split weights along the output dimension | |
| - **RowParallelLinear**: For layers that split weights along the input dimension | |
| - **ParallelEmbedding**: For embedding layers distributed across ranks | |
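| - **GQAQKVColumnParallelLinear**: For grouped query attention (GQA) projections, including tensor parallel configurations that are hard to shard evenly, such as a tensor parallel size larger than the number of key/value heads | |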
| Each layer type has a corresponding LoRA implementation that maintains the parallelization strategy while adding low-rank adaptation capabilities. | |
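The difference between the column-parallel and row-parallel splits listed above can be shown with a toy matrix-vector product. This is a plain-Python sketch with made-up values and a hypothetical two-rank setup; it does not use the Neuron parallel layers themselves.

```python
def matvec(w, x):
    """Compute w @ x for a matrix stored as a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


W = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0],
     [2.0, 0.0, 2.0, 0.0],
     [1.0, 1.0, 1.0, 1.0]]       # (out_features=4, in_features=4)
x = [1.0, 2.0, 3.0, 4.0]
tp_size = 2                      # hypothetical tensor parallel size
chunk = len(W) // tp_size

full = matvec(W, x)

# Column parallel: split W along the output dimension (rows here).
# Each rank produces a slice of the output; slices are concatenated.
column_out = []
for rank in range(tp_size):
    w_shard = W[rank * chunk:(rank + 1) * chunk]
    column_out.extend(matvec(w_shard, x))

# Row parallel: split W along the input dimension (columns here).
# Each rank sees a slice of x and produces a partial sum; partials are added.
row_out = [0.0] * len(W)
for rank in range(tp_size):
    cols = slice(rank * chunk, (rank + 1) * chunk)
    w_shard = [row[cols] for row in W]
    partial = matvec(w_shard, x[cols])
    row_out = [acc + p for acc, p in zip(row_out, partial)]
```

In a real distributed run, the concatenation corresponds to a gather across ranks and the summation to a reduction across ranks.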
| ## Key Features | |
| - **Distributed Training**: Full support for tensor parallelism and sequence parallelism | |
| - **Checkpoint Consolidation**: Automatic conversion between sharded and consolidated checkpoints | |
| - **Weight Transformation**: Seamless integration with model weight transformation specs | |
| - **Compatibility**: Works with all supported custom modeling architectures in Optimum Neuron |