Input format: FP8
Quant format: MXFP4
Output format: ct
Shards: 26
Workers: 8 × 3 threads
Scale percentile: 99.5
Include patterns: ['moe.gate_proj', 'moe.up_proj', 'moe.down_proj'] (--exclude_layers ignored)
MSE scale select: enabled (3 candidates per block)
Loading input_layernorm.weight tensors for γ-weighted MSE...
γ found for 48 layers (layers 0-47)
Zero-copy: disabled (FP8 models
/usr/lib/python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 5 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
ith γ)
model-00003.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00015.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00023.safetensors: 4716MB → 2592MB (3 quantized, 2 with γ)
model-00002.safetensors: 5279MB → 3155MB (3 quantized, 2 with γ)
model-00010.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-00018.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-vit-00002.safetensors: 2348MB → 2348MB (0 quantized, 0 with γ)
model-00006.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-00011.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00019.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00004.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-00012.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-00020.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-00007.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00016.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-00024.safetensors: 6968MB → 6968MB (0 quantized, 0 with γ)
model-00005.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00013.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00021.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00001.safetensors: 924MB → 924MB (0 quantized, 0 with γ)
model-00009.safetensors: 9500MB → 5251MB (6 quantized, 4 with γ)
model-00014.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
model-00022.safetensors: 9567MB → 5319MB (6 quantized, 4 with γ)
[1/26] model-00001.safetensors done (0% | elapsed 2s | ETA 8m46s)
[2/26] model-00002.safetensors done (3% | elapsed 20s | ETA 11m27s)
[3/26] model-00006.safetensors done (7% | elapsed 31s | ETA 6m29s)
[4/26] model-00004.safetensors done (12% | elapsed 33s | ETA 4m09s)
[5/26] model-00005.safetensors done (16% | elapsed 36s | ETA 3m04s)
[6/26] model-00009.safetensors done (21% | elapsed 39s | ETA 2m31s)
[7/26] model-00003.safetensors done (25% | elapsed 41s | ETA 2m01s)
[8/26] model-00007.safetensors done (30% | elapsed 43s | ETA 1m41s)
[9/26] model-00008.safetensors done (34% | elapsed 44s | ETA 1m25s)
[10/26] model-00010.safetensors done (39% | elapsed 47s | ETA 1m14s)
[11/26] model-00011.safetensors done (43% | elapsed 49s | ETA 1m04s)
[12/26] model-00012.safetensors done (48% | elapsed 52s | ETA 0m57s)
[13/26] model-00013.safetensors done (52% | elapsed 55s | ETA 0m50s)
[14/26] model-00014.safetensors done (57% | elapsed 58s | ETA 0m44s)
[15/26] model-00015.safetensors done (61% | elapsed 60s | ETA 0m38s)
[16/26] model-00016.safetensors done (66% | elapsed 63s | ETA 0m33s)
[17/26] model-00017.safetensors done (70% | elapsed 67s | ETA 0m28s)
[18/26] model-00018.safetensors done (75% | elapsed 70s | ETA 0m23s)
[19/26] model-vit-00001.safetensors done (75% | elapsed 72s | ETA 0m23s)
[20/26] model-00023.safetensors done (78% | elapsed 72s | ETA 0m20s)
[21/26] model-vit-00002.safetensors done (79% | elapsed 74s | ETA 0m20s)
[22/26] model-00019.safetensors done (83% | elapsed 75s | ETA 0m15s)
[23/26] model-00020.safetensors done (88% | elapsed 76s | ETA 0m10s)
[24/26] model-00021.safetensors done (92% | elapsed 78s | ETA 0m06s)
[25/26] model-00022.safetensors done (97% | elapsed 82s | ETA 0m02s)
[26/26] model-00024.safetensors done (100% | elapsed 83s | ETA 0m00s)
Copied special_tokens_map.json
Copied .gitattributes
Copied tokenizer.json
Copied vision_encoder.py
Copied tokenizer_config.json
Copied configuration_step3p7.py
Copied README.md
Copied model.safetensors.index.json
Copied chat_template.jinja
Copied config.json
Copied download.log
Copied modeling_step3p7.py
Copied processing_step3.py
Index: 73921 tensors across 26 shards
Done! 212.5GB → 123.3GB (58.0%)
Output: /mnt/storage/stepfun-mxfp4