
Edit Flows base model

Training

Trained on a single 8xH100 node:

```shell
accelerate launch \
  --config_file scripts/accelerate_configs/zero2.yaml \
  examples/editflow/llada/adapt.py \
  --model_name_or_path "GSAI-ML/LLaDA-8B-Instruct" \
  --lm_head_key "model.transformer.ff_out" \
  --init_editflow_from_src True \
  --per_device_train_batch_size 1 \
  --per_device_eval_batch_size 1 \
  --gradient_accumulation_steps 4 \
  --dataset_args "allenai/tulu-3-sft-mixture[train:500000]|lamm-mit/bio-silk-mech-mix-q-a-35K-messages-only|lamm-mit/graph_reasoning_v3_messages" \
  --output_dir "models/LlaDA-8B-EditFlow-instruct-v500" \
  --x0_sampler "masks[length:128]" \
  --max_length 1500 \
  --num_train_epochs 4 \
  --learning_rate 1e-5 \
  --push_to_hub True \
  --save_strategy "steps" \
  --save_steps 1000 \
  --hub_model_id lamm-mit/LlaDA-8B-EditFlow-instruct-v500 \
  --hub_private_repo True \
  --eval_strategy "no" \
  --warmup_steps 50
```
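With these settings, the effective global batch size is the product of the per-device batch size, the number of GPUs, and the gradient-accumulation steps. A quick check, assuming all 8 GPUs of the node participate in training:

```python
# Effective global batch size for the run above.
per_device_train_batch_size = 1
num_gpus = 8                      # single 8xH100 node (assumption: all GPUs used)
gradient_accumulation_steps = 4

global_batch_size = (
    per_device_train_batch_size * num_gpus * gradient_accumulation_steps
)
print(global_batch_size)  # 32
```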

Sampling

```shell
python examples/editflow/sample.py \
  --model_name_or_path "models/LlaDA-8B-EditFlow-instruct-v500" \
  --mask_length 128 --seed 7070 \
  --prompt "Define materiomics."
```
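A hypothetical sketch of the command-line interface that `sample.py` appears to expose, reconstructed only from the flags used in the invocation above (the defaults shown are assumptions, not confirmed against the actual script):

```python
import argparse

# Hypothetical reconstruction of sample.py's CLI, based solely on the flags
# used in the example invocation above.
parser = argparse.ArgumentParser(description="Sample from an EditFlow model")
parser.add_argument("--model_name_or_path", type=str, required=True)
parser.add_argument("--mask_length", type=int, default=128)  # default is an assumption
parser.add_argument("--seed", type=int, default=0)           # default is an assumption
parser.add_argument("--prompt", type=str, required=True)

args = parser.parse_args([
    "--model_name_or_path", "models/LlaDA-8B-EditFlow-instruct-v500",
    "--mask_length", "128",
    "--seed", "7070",
    "--prompt", "Define materiomics.",
])
print(args.mask_length, args.seed)  # 128 7070
```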
Model size: 9B parameters (BF16, Safetensors)