Synchronizing local compiler cache.
neuronxcc-2.21.33363.0+82129205/MODULE_8508fced3b142be2a2ee+fb4cc044/compile_flags.json
ADDED
@@ -0,0 +1 @@
+["--target=trn1", "--auto-cast=none", "--model-type=transformer", "--tensorizer-options=--enable-ccop-compute-overlap --cc-pipeline-tiling-factor=2 --vectorize-strided-dma ", "-O2", "--lnc=1", "--logfile=/tmp/nxd_model/encoding/_tp0_bk0/log-neuron-cc.txt"]
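The flags file above is a flat JSON list in which one entry (`--tensorizer-options`) carries its own space-separated sub-options. As a minimal sketch of how such a list could be inspected, the snippet below parses it and splits the nested options. The helper name `parse_flags` is hypothetical and not part of any Neuron tooling; the flag string is copied from this commit.

```python
import json
import shlex

# Flag list copied verbatim from compile_flags.json in this commit.
FLAGS_JSON = (
    '["--target=trn1", "--auto-cast=none", "--model-type=transformer", '
    '"--tensorizer-options=--enable-ccop-compute-overlap '
    '--cc-pipeline-tiling-factor=2 --vectorize-strided-dma ", '
    '"-O2", "--lnc=1", '
    '"--logfile=/tmp/nxd_model/encoding/_tp0_bk0/log-neuron-cc.txt"]'
)


def parse_flags(flags):
    """Hypothetical helper: map each flag name to its value.

    Flags without '=' (such as '-O2') are stored with the value True.
    Only the first '=' splits name from value, so nested option
    strings stay intact.
    """
    parsed = {}
    for flag in flags:
        name, sep, value = flag.partition("=")
        parsed[name] = value if sep else True
    return parsed


flags = json.loads(FLAGS_JSON)
parsed = parse_flags(flags)

# The tensorizer options are themselves a small command line.
tensorizer_options = shlex.split(parsed["--tensorizer-options"])

print(parsed["--target"])
print(tensorizer_options)
```

Splitting with `shlex` rather than `str.split(" ")` avoids an empty trailing token from the space at the end of the tensorizer option string.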
neuronxcc-2.21.33363.0+82129205/MODULE_8508fced3b142be2a2ee+fb4cc044/model.hlo_module.pb
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2af5485875eb0c496ff7c3ecea5e35812074c14b25db1f04586a6e66b8d50cd7
+size 810056
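The three lines added for `model.hlo_module.pb` are a Git LFS pointer, not the HLO module itself: each line is a key followed by a single space and a value. A minimal sketch of reading such a pointer follows; the function name `parse_lfs_pointer` is hypothetical, and the pointer text is copied from this commit.

```python
# Pointer text copied verbatim from this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2af5485875eb0c496ff7c3ecea5e35812074c14b25db1f04586a6e66b8d50cd7
size 810056
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split a Git LFS pointer into typed fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    # The oid value has the form "<hash algorithm>:<hex digest>".
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], pointer["size"])
```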
neuronxcc-2.21.33363.0+82129205/MODULE_8508fced3b142be2a2ee+fb4cc044/model.log
ADDED
@@ -0,0 +1,53 @@
+Failed compilation with ['neuronx-cc', 'compile', '--framework=XLA', '/tmp/nxd_model/encoding/_tp0_bk0/model.MODULE_8508fced3b142be2a2ee+fb4cc044.hlo_module.pb', '--output', '/tmp/nxd_model/encoding/_tp0_bk0/model.MODULE_8508fced3b142be2a2ee+fb4cc044.neff', '--target=trn1', '--auto-cast=none', '--model-type=transformer', '--tensorizer-options=--enable-ccop-compute-overlap --cc-pipeline-tiling-factor=2 --vectorize-strided-dma ', '-O2', '--lnc=1', '--logfile=/tmp/nxd_model/encoding/_tp0_bk0/log-neuron-cc.txt', '--verbose=35']: 2026-02-09T10:25:27Z
+Pre-Partition Pre-Opt Histogram:
+total HLO instructions: 5171
+convert            910 17.60% ################################################################
+reshape            802 15.51% ########################################################
+transpose          723 13.98% ##################################################
+broadcast          548 10.60% ######################################
+slice              543 10.50% ######################################
+multiply           362  7.00% #########################
+parameter          328  6.34% #######################
+constant           221  4.27% ###############
+call               217  4.20% ###############
+dot                181  3.50% ############
+add                144  2.78% ##########
+concatenate         74  1.43% #####
+negate              72  1.39% #####
+get-tuple-element   37  0.72% ##
+iota                 3  0.06%
+gather               2  0.04%
+tuple                1  0.02%
+cosine               1  0.02%
+reduce               1  0.02%
+sine                 1  0.02%
+
+
+Pre-Partition Post-Op Histogram:
+total HLO instructions: 4140
+convert            909 21.96% ################################################################
+reshape            650 15.70% #############################################
+transpose          540 13.04% ######################################
+parameter          328  7.92% #######################
+constant           256  6.18% ##################
+broadcast          255  6.16% #################
+slice              252  6.09% #################
+custom-call        217  5.24% ###############
+multiply           217  5.24% ###############
+dot                180  4.35% ############
+add                144  3.48% ##########
+concatenate         74  1.79% #####
+negate              72  1.74% #####
+get-tuple-element   37  0.89% ##
+iota                 3  0.07%
+gather               2  0.05%
+cosine               1  0.02%
+tuple                1  0.02%
+reduce               1  0.02%
+sine                 1  0.02%
+
+Potential split-points stats: #CC 0 #AR 0 #AG 0 #BN 0 nClamp 0
+WARNING: Insufficient number of potential split points found. Entire model will be compiled as a single module.
+No partitions found. Compiling as flat model
+2026-02-09 10:25:27.146344: F hilo/hlo_passes/NeuronHloVerifier.cc:504] [ERROR] [NCC_VRF007] Tiled instruction count 3784872448 exceeds 5000000. TIP: Input HLO might be too big, please consider using smaller batches, applying model parallelism or compile under --optlevel=1 to create smaller subgraphs
+
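The log ends with an `NCC_VRF007` verifier failure whose TIP suggests, among other options, compiling under `--optlevel=1`. The sketch below pulls the count and limit out of that line and builds a retry flag list with `-O2` swapped for `--optlevel=1`. Both helper names are hypothetical, and the regular expression is an assumption fitted to this one log line rather than a documented message format.

```python
import re

# Final error line of model.log, copied from this commit (TIP text shortened).
ERROR_LINE = (
    "2026-02-09 10:25:27.146344: F hilo/hlo_passes/NeuronHloVerifier.cc:504] "
    "[ERROR] [NCC_VRF007] Tiled instruction count 3784872448 exceeds 5000000. "
    "TIP: Input HLO might be too big"
)

# Assumed pattern, derived only from the line above.
TILED_COUNT_RE = re.compile(
    r"\[(NCC_[A-Z]+\d+)\] Tiled instruction count (\d+) exceeds (\d+)"
)


def parse_tiled_count_error(line):
    """Hypothetical helper: return error code, count, limit, and their ratio."""
    match = TILED_COUNT_RE.search(line)
    if match is None:
        return None
    count, limit = int(match.group(2)), int(match.group(3))
    return {
        "code": match.group(1),
        "count": count,
        "limit": limit,
        "ratio": count / limit,
    }


def lower_optlevel(flags):
    """Hypothetical helper: copy flags, replacing '-O2' with '--optlevel=1'."""
    return ["--optlevel=1" if flag == "-O2" else flag for flag in flags]


info = parse_tiled_count_error(ERROR_LINE)
print(f"{info['code']}: {info['ratio']:.0f}x over the limit")

retry_flags = lower_optlevel(["--target=trn1", "--auto-cast=none", "-O2", "--lnc=1"])
print(retry_flags)
```

Here the count exceeds the limit by a factor of roughly 757, so a lower optimization level alone may not be enough; the same TIP also lists smaller batches and model parallelism.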