hf-doc-build/doc / optimum-neuron /v0.4.2 /en /training_api /lora.md

LoRA for Neuron

This module provides a LoRA (Low-Rank Adaptation) implementation optimized for distributed training on AWS Trainium devices. It enables parameter-efficient fine-tuning with support for tensor parallelism and sequence parallelism.
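As a reminder of what LoRA computes, the following is a minimal pure-Python sketch, not the Optimum Neuron implementation. It assumes a weight of shape `(in_features, out_features)`, a frozen base projection `x @ W`, and a trainable low-rank update `(alpha / r) * x @ A @ B`. The helper names `matmul`, `lora_forward`, and `lora_param_count` are hypothetical and exist only for this illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w, a, b, alpha):
    """Return x @ w + (alpha / r) * x @ a @ b, where r is the LoRA rank.

    Assumed shapes: x (batch, in), w (in, out), a (in, r), b (r, out).
    """
    rank = len(b)
    base = matmul(x, w)               # frozen base projection
    update = matmul(matmul(x, a), b)  # trainable low-rank path
    s = alpha / rank
    return [[p + s * q for p, q in zip(rb, ru)] for rb, ru in zip(base, update)]


def lora_param_count(in_features, out_features, rank):
    """Number of trainable parameters added by one LoRA adapter."""
    return rank * (in_features + out_features)


rng = random.Random(0)
x = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(2)]  # (batch=2, in=6)
w = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]  # (in=6, out=4)
a = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(6)]  # (in=6, r=2)
zero_b = [[0.0] * 4 for _ in range(2)]                          # (r=2, out=4)

# With B initialized to zero, the adapter leaves the base output unchanged.
print(lora_forward(x, w, a, zero_b, alpha=16) == matmul(x, w))

# Trainable parameters: full 4096 x 4096 weight vs. a rank-8 adapter.
print(4096 * 4096, lora_param_count(4096, 4096, 8))
```

Only `a` and `b` are trained, which is why the adapter adds a small fraction of the parameters of the layer it wraps.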

PEFT Model Classes

NeuronPeftModel[[optimum.neuron.peft.NeuronPeftModel]]

optimum.neuron.peft.NeuronPeftModel[[optimum.neuron.peft.NeuronPeftModel]]


NeuronPeftModelForCausalLM[[optimum.neuron.peft.NeuronPeftModelForCausalLM]]

optimum.neuron.peft.NeuronPeftModelForCausalLM[[optimum.neuron.peft.NeuronPeftModelForCausalLM]]


LoRA Layer Implementations

Base LoRA Layer[[optimum.neuron.peft.tuners.lora.layer.NeuronLoraLayer]]

optimum.neuron.peft.tuners.lora.layer.NeuronLoraLayer[[optimum.neuron.peft.tuners.lora.layer.NeuronLoraLayer]]


Parallel Linear LoRA[[optimum.neuron.peft.tuners.lora.layer.ParallelLinear]]

optimum.neuron.peft.tuners.lora.layer.ParallelLinear[[optimum.neuron.peft.tuners.lora.layer.ParallelLinear]]


GQA QKV Column Parallel LoRA[[optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear]]

optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear[[optimum.neuron.peft.tuners.lora.layer.GQAQKVColumnParallelLinear]]


Parallel Embedding LoRA[[optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding]]

optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding[[optimum.neuron.peft.tuners.lora.layer.ParallelEmbedding]]


LoRA Model

NeuronLoraModel[[optimum.neuron.peft.tuners.NeuronLoraModel]]

optimum.neuron.peft.tuners.NeuronLoraModel[[optimum.neuron.peft.tuners.NeuronLoraModel]]


Utility Functions

get_peft_model[[optimum.neuron.peft.get_peft_model]]

optimum.neuron.peft.get_peft_model[[optimum.neuron.peft.get_peft_model]]


Architecture Support

The Neuron LoRA implementation supports the following parallel layer types:

ColumnParallelLinear: For layers that split weights along the output dimension
RowParallelLinear: For layers that split weights along the input dimension
ParallelEmbedding: For embedding layers distributed across ranks
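GQAQKVColumnParallelLinear: For grouped query attention projections whose tensor parallel configuration is hard to shard directly (for example, when the tensor parallel size is larger than the number of key/value heads)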

Each layer type has a corresponding LoRA implementation that maintains the parallelization strategy while adding low-rank adaptation capabilities.
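To make "maintains the parallelization strategy" concrete, here is a small pure-Python simulation. It is a sketch under stated assumptions, not the code used by Optimum Neuron: weights have shape `(in_features, out_features)`; for a column-parallel layer the LoRA `B` matrix is split along the output dimension and `A` is replicated; for a row-parallel layer `A` is split along the input dimension and `B` is replicated. All helper names are hypothetical, and the "ranks" are simulated with a plain Python loop.

```python
import random


def matmul(a, b):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(m, s):
    return [[s * v for v in row] for row in m]


def split_cols(m, parts):
    """Split a matrix into equal column blocks (output-dimension shards)."""
    width = len(m[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in m] for i in range(parts)]


def split_rows(m, parts):
    """Split a matrix into equal row blocks (input-dimension shards)."""
    height = len(m) // parts
    return [m[i * height:(i + 1) * height] for i in range(parts)]


def concat_cols(blocks):
    """Stand-in for an all-gather along the output dimension."""
    return [sum(rows, []) for rows in zip(*blocks)]


def lora_dense(x, w, a, b, s):
    """Unsharded reference: x @ w + s * x @ a @ b."""
    return add(matmul(x, w), scale(matmul(matmul(x, a), b), s))


def lora_column_parallel(x, w, a, b, s, tp):
    # Each simulated rank holds one column block of w and b, plus a full copy of a.
    outs = [lora_dense(x, w_k, a, b_k, s)
            for w_k, b_k in zip(split_cols(w, tp), split_cols(b, tp))]
    return concat_cols(outs)


def lora_row_parallel(x, w, a, b, s, tp):
    # Each simulated rank holds one row block of w and a, a full copy of b,
    # and the matching slice of the input features.
    partials = [lora_dense(x_k, w_k, a_k, b, s)
                for x_k, w_k, a_k in zip(split_cols(x, tp), split_rows(w, tp), split_rows(a, tp))]
    total = partials[0]
    for p in partials[1:]:
        total = add(total, p)  # stand-in for an all-reduce (sum) across ranks
    return total


def allclose(a, b, tol=1e-9):
    return all(abs(p - q) <= tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


rng = random.Random(0)


def rand(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


batch, d_in, d_out, rank, tp = 2, 8, 4, 2, 2
x, w = rand(batch, d_in), rand(d_in, d_out)
a, b = rand(d_in, rank), rand(rank, d_out)

reference = lora_dense(x, w, a, b, 2.0)
col_out = lora_column_parallel(x, w, a, b, 2.0, tp)
row_out = lora_row_parallel(x, w, a, b, 2.0, tp)
print(allclose(reference, col_out), allclose(reference, row_out))
```

In both cases the sharded computation reproduces the unsharded result up to floating-point rounding, so the adapter can follow the same partitioning as the layer it wraps.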

Key Features

Distributed Training: Full support for tensor parallelism and sequence parallelism
Checkpoint Consolidation: Automatic conversion between sharded and consolidated checkpoints
Weight Transformation: Seamless integration with model weight transformation specs
Compatibility: Works with all supported custom modeling architectures in Optimum Neuron
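The checkpoint consolidation feature can be pictured with the following sketch. It is an illustration only and assumes that each parameter is either replicated on every rank or split evenly along one axis. The function names, the `partition_axes` mapping, and the parameter names in the example are all hypothetical; the actual consolidation logic in Optimum Neuron is not shown in this page.

```python
def shard_tensor(t, axis, tp):
    """Split a matrix (list of rows) into `tp` pieces along `axis`.

    axis=None means the tensor is replicated on every rank.
    """
    if axis is None:
        return [t for _ in range(tp)]
    if axis == 0:
        height = len(t) // tp
        return [t[i * height:(i + 1) * height] for i in range(tp)]
    width = len(t[0]) // tp
    return [[row[i * width:(i + 1) * width] for row in t] for i in range(tp)]


def consolidate_tensor(shards, axis):
    """Inverse of shard_tensor: rebuild the full matrix from per-rank pieces."""
    if axis is None:
        return shards[0]  # every rank holds the same copy
    if axis == 0:
        return [row for shard in shards for row in shard]
    return [sum(rows, []) for rows in zip(*shards)]


def shard_state_dict(state, partition_axes, tp):
    """Produce one state dict per rank from a consolidated state dict."""
    return [
        {name: shard_tensor(t, partition_axes[name], tp)[rank] for name, t in state.items()}
        for rank in range(tp)
    ]


def consolidate_state_dict(rank_states, partition_axes):
    """Merge per-rank state dicts back into a single consolidated one."""
    return {
        name: consolidate_tensor([rs[name] for rs in rank_states], partition_axes[name])
        for name in rank_states[0]
    }


# Hypothetical adapter weights for one column-parallel projection.
state = {
    "proj.lora_A": [[1, 2], [3, 4], [5, 6], [7, 8]],  # (in=4, r=2), replicated
    "proj.lora_B": [[1, 2, 3, 4], [5, 6, 7, 8]],      # (r=2, out=4), split on axis 1
}
partition_axes = {"proj.lora_A": None, "proj.lora_B": 1}

sharded = shard_state_dict(state, partition_axes, tp=2)
restored = consolidate_state_dict(sharded, partition_axes)
print(sharded[0]["proj.lora_B"], sharded[1]["proj.lora_B"])
print(restored == state)
```

A round trip through sharding and consolidation returns the original weights, which is the property a consolidated checkpoint needs in order to be loaded outside the distributed setup.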
