{ "cells": [ { "cell_type": "markdown", "id": "b32d3842", "metadata": {}, "source": [ "# SDXL Int8 Quantization Solution by ModelOpt\n", "\n", "### Note:\n", "This notebook requires nvidia-modelopt > 0.9.x, which comes with NeMo framework container > 23.05. An example command to launch the container:\n", "\n", "```\n", "docker run --gpus all -it --rm -v :/opt/NeMo --shm-size=8g \\\n", " -p 8888:8888 --ulimit memlock=-1 --ulimit \\\n", " stack=67108864 \n", "```\n", "\n", "This tutorial shows how to use ModelOpt to calibrate and quantize the UNet part of the SDXL within NeMo framework. \n", "\n", "Please note that NeMo provides users with an end-to-end training framework for SDXL, and this quantization pipeline is supposed to work with a `.nemo` checkpoint trained from their own text-image dataset. In this tutorial, a open-source checkpoint is converted to `.nemo` format for illustration purpose." ] }, { "cell_type": "markdown", "id": "2f8320ca", "metadata": {}, "source": [ "### Download SDXL checkpoint" ] }, { "cell_type": "code", "execution_count": null, "id": "bd436eab", "metadata": {}, "outputs": [], "source": [ "## Download Unet checkpoint\n", "! mkdir -p /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet && wget -P /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/unet/diffusion_pytorch_model.safetensors\n", "## Download Vae checkpoint \n", "! mkdir -p /sdxl_ckpts/stable-diffusion-xl-base-1.0/vae && wget -P /sdxl_ckpts/stable-diffusion-xl-base-1.0/vae https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/vae/diffusion_pytorch_model.safetensors" ] }, { "cell_type": "markdown", "id": "70164e82", "metadata": {}, "source": [ "### Convert downloaded checkpoint into `.nemo` format" ] }, { "cell_type": "code", "execution_count": 1, "id": "c9649553", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FlashAttention Installed\n", "[NeMo I 2024-04-24 22:13:11 distributed:42] Initializing torch.distributed with local_rank: 0, rank: 0, world_size: 1\n", "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1183] hidden_size not found in {'precision': 'bf16-mixed', 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': True, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/diffusion_pytorch_model.safetensors'}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/vae/diffusion_pytorch_model.safetensors', 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0'}. Set this in model_parallel_config if using pipeline parallelism.\n", "[rank0]:[W init.cpp:767] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_enabled is deprecated and a no-op (function operator())\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:253] Rank 0 has data parallel group : [0]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:259] Rank 0 has combined group of data parallel and context parallel : [0]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:264] All data parallel group ranks with context parallel combined: [[0]]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:267] Ranks 0 has data parallel rank: 0\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:284] Rank 0 has context parallel group: [0]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:287] All context parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:288] Ranks 0 has context parallel rank: 0\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:299] Rank 0 has model parallel group: [0]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:300] All model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:310] Rank 0 has tensor model parallel group: [0]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:314] All tensor model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:315] Rank 0 has tensor model parallel rank: 0\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:344] Rank 0 has pipeline model parallel group: [0]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:356] Rank 0 has embedding group: [0]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:362] All pipeline model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:363] Rank 0 has pipeline model parallel rank 0\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:364] All embedding group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:13:12 megatron_init:365] Rank 0 has embedding rank: 0\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:13:12 megatron_base_model:1183] hidden_size not found in {'precision': 'bf16-mixed', 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': True, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/diffusion_pytorch_model.safetensors'}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': '/sdxl_ckpts/stable-diffusion-xl-base-1.0/vae/diffusion_pytorch_model.safetensors', 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0'}. Set this in model_parallel_config if using pipeline parallelism.\n", "[NeMo I 2024-04-24 22:13:12 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:13:12 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:12 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:13:12 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:12 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:13:12 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:14 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:13:14 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:16 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:13:16 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:18 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:13:18 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:20 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:13:20 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:21 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:13:21 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:23 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:13:23 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:24 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:13:24 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:24 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:13:24 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:13:25 utils:108] Getting module=, cls=\n", "open_clip_pytorch_model.bin: 100%|██████████| 10.2G/10.2G [01:36<00:00, 106MB/s]\n", "Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False\n", "Initialized embedder #1: FrozenOpenCLIPEmbedder2 with 694659841 params. Trainable: False\n", "Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "Initialized embedder #3: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "making attention of type 'vanilla' with 512 in_channels\n", "Working with z of shape (1, 4, 32, 32) = 4096 dimensions.\n", "making attention of type 'vanilla' with 512 in_channels\n", "AutoencoderKLInferenceWrapper: Following keys are missing during loading VAE weights, which may lead to compromised image quality for a resumed training. Please check the checkpoint you provided.\n", "Missing:['encoder.mid.attn_1.proj_out.bias', 'decoder.mid.attn_1.v.weight', 'encoder.mid.attn_1.proj_out.weight', 'decoder.mid.attn_1.proj_out.bias', 'decoder.mid.attn_1.q.weight', 'decoder.mid.attn_1.q.bias', 'encoder.mid.attn_1.q.weight', 'encoder.mid.attn_1.k.weight', 'encoder.mid.attn_1.v.bias', 'decoder.mid.attn_1.k.weight', 'decoder.mid.attn_1.v.bias', 'decoder.mid.attn_1.proj_out.weight', 'encoder.mid.attn_1.q.bias', 'encoder.mid.attn_1.v.weight', 'encoder.mid.attn_1.k.bias', 'decoder.mid.attn_1.k.bias']\n", "Unexpected:['encoder.mid.attentions.0.to_k.weight', 'decoder.mid.attentions.0.to_out.0.weight', 'encoder.mid.attentions.0.to_v.bias', 'decoder.mid.attentions.0.to_q.bias', 'encoder.mid.attentions.0.to_q.weight', 'encoder.mid.attentions.0.to_v.weight', 'decoder.mid.attentions.0.to_k.weight', 'decoder.mid.attentions.0.to_v.bias', 'encoder.mid.attentions.0.to_k.bias', 'encoder.mid.attentions.0.to_out.0.bias', 'decoder.mid.attentions.0.to_out.0.bias', 'encoder.mid.attentions.0.to_out.0.weight', 'decoder.mid.attentions.0.to_k.bias', 'decoder.mid.attentions.0.to_v.weight', 'decoder.mid.attentions.0.to_q.weight', 'encoder.mid.attentions.0.to_q.bias']\n", "[NeMo I 2024-04-24 22:15:42 convert_hf_ckpt_to_nemo:226] NeMo model saved to: /quantization/sdxl_base.nemo\n" ] } ], "source": [ "WORKDIR = '/quantization'\n", "! torchrun /opt/NeMo/examples/multimodal/text_to_image/convert_hf_ckpt_to_nemo.py \\\n", " --model_type sdxl \\\n", " --ckpt_path /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/diffusion_pytorch_model.safetensors \\\n", " --hparams_file /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/conf/sd_xl_base_train.yaml \\\n", " --nemo_file_path $WORKDIR/sdxl_base.nemo" ] }, { "cell_type": "markdown", "id": "25817b85", "metadata": {}, "source": [ "### Run quantization script with default config, and finally the script will export the quantized unet to onnx file.\n", "\n", "##### Quantization config\n", "\n", "```yaml\n", "quantize\n", " exp_name: nemo_test\n", " n_steps: 20 # number of inference steps\n", " format: 'int8' # only int8 quantization is supported now\n", " percentile: 1.0 # Control quantization scaling factors (amax) collecting range, meaning that we will collect the minimum amax in the range of `(n_steps * percentile)` steps. Recommendation: 1.0\n", " batch_size: 1 # batch size calling sdxl inference pipeline during calibration\n", " calib_size: 32 # For SDXL, we recommend 32, 64 or 128\n", " quant_level: 2.5 #Which layers to be quantized, 1: `CNNs`, 2: `CNN + FFN`, 2.5: `CNN + FFN + QKV`, 3: `CNN + Linear`. Recommendation: 2, 2.5 and 3, depending on the requirements for image quality & speedup.\n", " alpha: 0.8 # A parameter in SmoothQuant, used for linear layers only. Recommendation: 0.8 for SDXL\n", "```\n", "\n", "##### Onnx export config\n", "\n", "```yaml\n", "onnx_export:\n", " onnx_dir: nemo_onnx # Path to save onnx files\n", " pretrained_base: ${model.restore_from_path} # Path to nemo checkpoint for sdxl\n", " quantized_ckpt: nemo.unet.state_dict.${quantize.exp_name}.pt # Path to save quantized unet checkpoint\n", " format: int8\n", "```\n", "##### Onnx export config\n", "\n", "```yaml\n", "trt_export:\n", " static_batch: False # static batch engines have better latency\n", " min_batch_size: 1 # minimum batch size when using dynamic batch, has to be the same with max_batch_size and infer.num_samples when using static batch\n", " max_batch_size: 1 # maximum batch size when using dynamic batch, has to be the same with min_batch_size and infer.num_samples when using static batch\n", " int8: True # Allow engine builder recognize int8 precision\n", " builder_optimization_level: 4 # set to 1-5, higher optimization level means better latency but longer compiling time\n", " trt_engine: int8_unet_xl.plan # path to save trt engine\n", "```\n", "\n", "The following command restores a pre-trained sdxl model from `$WORKDIR/sdxl_base.nemo` derived from the above step.\n", "The quantized U-Net checkpoint is saved to `quantize.quantized_ckpt`, converted onnx file is saved to `onnx_export.onnx_dir` and trt engine is saved to `trt_export.trt_engine`." ] }, { "cell_type": "code", "execution_count": 6, "id": "d955f6c3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FlashAttention Installed\n", "[NeMo W 2024-04-24 19:42:59 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.\n", " See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.\n", " ret = run_job(\n", " \n", "[NeMo W 2024-04-24 19:42:59 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/lightning_fabric/connector.py:563: `precision=16` is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!\n", " \n", "Using 16bit Automatic Mixed Precision (AMP)\n", "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n", "[W init.cpp:767] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_enabled is deprecated and a no-op (function operator())\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:253] Rank 0 has data parallel group : [0]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:259] Rank 0 has combined group of data parallel and context parallel : [0]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:264] All data parallel group ranks with context parallel combined: [[0]]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:267] Ranks 0 has data parallel rank: 0\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:284] Rank 0 has context parallel group: [0]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:287] All context parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:288] Ranks 0 has context parallel rank: 0\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:299] Rank 0 has model parallel group: [0]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:300] All model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:310] Rank 0 has tensor model parallel group: [0]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:314] All tensor model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:315] Rank 0 has tensor model parallel rank: 0\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:344] Rank 0 has pipeline model parallel group: [0]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:356] Rank 0 has embedding group: [0]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:362] All pipeline model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:363] Rank 0 has pipeline model parallel rank 0\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:364] All embedding group ranks: [[0]]\n", "[NeMo I 2024-04-24 19:43:09 megatron_init:365] Rank 0 has embedding rank: 0\n", "24-04-24 19:43:09 - PID:1361 - rank:(0, 0, 0, 0) - microbatches.py:39 - INFO - setting number of micro-batches to constant 1\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 19:43:09 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n", "[NeMo I 2024-04-24 19:43:09 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 19:43:09 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:09 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 19:43:09 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:10 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 19:43:10 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:11 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 19:43:11 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:13 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 19:43:13 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:15 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 19:43:15 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:17 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 19:43:17 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:19 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 19:43:19 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:20 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 19:43:20 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:21 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 19:43:21 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:21 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 19:43:21 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 19:43:21 utils:108] Getting module=, cls=\n", "Loaded ViT-bigG-14 model config.\n", "Loading pretrained ViT-bigG-14 weights (laion2b_s39b_b160k).\n", "Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False\n", "Initialized embedder #1: FrozenOpenCLIPEmbedder2 with 694659841 params. Trainable: False\n", "Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "Initialized embedder #3: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "making attention of type 'vanilla' with 512 in_channels\n", "Working with z of shape (1, 4, 32, 32) = 4096 dimensions.\n", "making attention of type 'vanilla' with 512 in_channels\n", "[NeMo I 2024-04-24 19:43:53 nlp_overrides:1155] Model MegatronDiffusionEngine was successfully restored from /quantization/sdxl_base.nemo.\n", "Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1\n", "----------------------------------------------------------------------------------------------------\n", "distributed_backend=nccl\n", "All distributed processes registered. Starting with 1 processes\n", "----------------------------------------------------------------------------------------------------\n", "\n", "Building TensorRT engine for /quantization/nemo_onnx/unet.onnx: /quantization/int8_unet_xl.plan\n", "[I] Configuring with profiles:[\n", " Profile 0:\n", " {x [min=(1, 4, 128, 128), opt=(4, 4, 128, 128), max=(8, 4, 128, 128)],\n", " y [min=(1, 2816), opt=(4, 2816), max=(8, 2816)],\n", " timesteps [min=(1,), opt=(4,), max=(8,)],\n", " context [min=(1, 80, 2048), opt=(4, 80, 2048), max=(8, 80, 2048)]}\n", " ]\n", "\u001b[38;5;11m[W] It looks like some layers in the network have compute precision set, but precision constraints were not enabled. \n", " Precision constraints must be set to 'prefer' or 'obey' for layer compute precision to take effect. \n", " Note: Layers and their requested precisions were: {'/input_blocks.0/input_blocks.0.0/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.0/input_blocks.0.0/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.0/input_blocks.0.0/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.0/input_blocks.0.0/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.1/input_blocks.1.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.2/input_blocks.2.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.3/input_blocks.3.0/op/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.3/input_blocks.3.0/op/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.3/input_blocks.3.0/op/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.3/input_blocks.3.0/op/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.4/input_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.5/input_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.6/input_blocks.6.0/op/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.6/input_blocks.6.0/op/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.6/input_blocks.6.0/op/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.6/input_blocks.6.0/op/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.7/input_blocks.7.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/input_blocks.8/input_blocks.8.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/middle_block/middle_block.2/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/middle_block/middle_block.2/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.0/output_blocks.0.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.1/output_blocks.1.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.2/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.3/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.4/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.5/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.6/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.7/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.8/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.1/transformer_blocks.9/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.2/conv/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.2/conv/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.2/output_blocks.2.2/conv/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.2/output_blocks.2.2/conv/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.3/output_blocks.3.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.4/output_blocks.4.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.0/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn1/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_q/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_k/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/attn2/to_v/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.0/proj/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.1/transformer_blocks.1/ff/net/net.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.2/conv/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.2/conv/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.5/output_blocks.5.2/conv/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.5/output_blocks.5.2/conv/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.6/output_blocks.6.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.6/output_blocks.6.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.7/output_blocks.7.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.7/output_blocks.7.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/in_layers/in_layers.1/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/out_layers/out_layers.2/weight_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/skip_connection/input_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/skip_connection/input_quantizer/DequantizeLinear': 'INT8', '/output_blocks.8/output_blocks.8.0/skip_connection/weight_quantizer/QuantizeLinear': 'FLOAT', '/output_blocks.8/output_blocks.8.0/skip_connection/weight_quantizer/DequantizeLinear': 'INT8', '/out/out.1/input_quantizer/QuantizeLinear': 'FLOAT', '/out/out.1/input_quantizer/DequantizeLinear': 'INT8', '/out/out.1/weight_quantizer/QuantizeLinear': 'FLOAT', '/out/out.1/weight_quantizer/DequantizeLinear': 'INT8'}\u001b[0m\n", "\u001b[38;5;14m[I] Building engine with configuration:\n", " Flags | [FP16, INT8]\n", " Engine Capability | EngineCapability.DEFAULT\n", " Memory Pools | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n", " Tactic Sources | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n", " Profiling Verbosity | ProfilingVerbosity.DETAILED\n", " Preview Features | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n", "\u001b[38;5;10m[I] Finished engine building in 881.973 seconds\u001b[0m\n", "[I] Saving engine to /quantization/int8_unet_xl.plan\n" ] } ], "source": [ "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_quantize.py model.restore_from_path=$WORKDIR/sdxl_base.nemo onnx_export.onnx_dir=$WORKDIR/nemo_onnx quantize.quantized_ckpt=$WORKDIR/nemo.unet.state_dict.nemo.pt trt_export.trt_engine=$WORKDIR/int8_unet_xl.plan\n" ] }, { "cell_type": "markdown", "id": "f97d6bfa", "metadata": {}, "source": [ "### Build end to end TRT inference pipeline\n", "In order to run an end to end inference with quantized U-Net engine, we need to export and build engines for the other compenents in SDXL, which includes the VAE and two CLIP encoder. The following script restores SDXL from the `nemo` checkpoint and saves the corresponding engine files to `infer.out_path`." ] }, { "cell_type": "code", "execution_count": 2, "id": "2e8b7742", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FlashAttention Installed\n", "[NeMo W 2024-04-24 22:17:42 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.\n", " See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.\n", " ret = run_job(\n", " \n", "[NeMo W 2024-04-24 22:17:42 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/lightning_fabric/connector.py:563: `precision=16` is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!\n", " \n", "Using 16bit Automatic Mixed Precision (AMP)\n", "GPU available: True (cuda), used: True\n", "TPU available: False, using: 0 TPU cores\n", "IPU available: False, using: 0 IPUs\n", "HPU available: False, using: 0 HPUs\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n", "[W init.cpp:767] Warning: nvfuser is no longer supported in torch script, use _jit_set_nvfuser_enabled is deprecated and a no-op (function operator())\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:253] Rank 0 has data parallel group : [0]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:259] Rank 0 has combined group of data parallel and context parallel : [0]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:264] All data parallel group ranks with context parallel combined: [[0]]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:267] Ranks 0 has data parallel rank: 0\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:284] Rank 0 has context parallel group: [0]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:287] All context parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:288] Ranks 0 has context parallel rank: 0\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:299] Rank 0 has model parallel group: [0]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:300] All model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:310] Rank 0 has tensor model parallel group: [0]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:314] All tensor model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:315] Rank 0 has tensor model parallel rank: 0\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:344] Rank 0 has pipeline model parallel group: [0]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:356] Rank 0 has embedding group: [0]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:362] All pipeline model parallel group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:363] Rank 0 has pipeline model parallel rank 0\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:364] All embedding group ranks: [[0]]\n", "[NeMo I 2024-04-24 22:17:50 megatron_init:365] Rank 0 has embedding rank: 0\n", "24-04-24 22:17:50 - PID:703 - rank:(0, 0, 0, 0) - microbatches.py:39 - INFO - setting number of micro-batches to constant 1\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tensor_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: virtual_pipeline_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: sequence_parallel in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: use_cpu_initialization in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: gradient_accumulation_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_atomic_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: overlap_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: batch_p2p_comm in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: pipeline_model_parallel_split_rank in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_num_layers in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: _cpu_offloading_context in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_activations in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: cpu_offloading_weights in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1172] The model: MegatronDiffusionEngine() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.\n", "[NeMo W 2024-04-24 22:17:50 megatron_base_model:1183] hidden_size not found in {'precision': 16, 'micro_batch_size': 1, 'global_batch_size': 1, 'scale_factor': 0.13025, 'disable_first_stage_autocast': True, 'is_legacy': False, 'inductor': False, 'capture_cudagraph_iters': -1, 'scale_by_std': False, 'channels_last': False, 'fsdp': False, 'fsdp_set_buffer_dtype': None, 'precache_mode': None, 'loss_fn_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.loss.StandardDiffusionLoss', 'sigma_sampler': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.sigma_sampling.DiscreteSampling', 'num_idx': 1000, 'discretization': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}}, 'denoiser_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser.DiscreteDenoiser', 'num_idx': 1000, 'weighting_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_weighting.EpsWeighting'}, 'scaling_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser_scaling.EpsScaling'}, 'discretization_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.discretizer.LegacyDDPMDiscretization'}}, 'unet_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.openaimodel.UNetModel', 'from_NeMo': False, 'adm_in_channels': 2816, 'num_classes': 'sequential', 'use_checkpoint': False, 'in_channels': 4, 'out_channels': 4, 'model_channels': 320, 'attention_resolutions': [4, 2], 'num_res_blocks': 2, 'channel_mult': [1, 2, 4], 'num_head_channels': 64, 'use_spatial_transformer': True, 'use_linear_in_transformer': True, 'transformer_depth': [1, 2, 10], 'context_dim': 2048, 'image_size': 64, 'legacy': False, 'use_flash_attention': True, 'from_pretrained': None}, 'first_stage_config': {'_target_': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.ldm.autoencoder.AutoencoderKLInferenceWrapper', 'from_pretrained': None, 'embed_dim': 4, 'monitor': 'val/rec_loss', 'ddconfig': {'attn_type': 'vanilla', 'double_z': True, 'z_channels': 4, 'resolution': 256, 'in_channels': 3, 'out_ch': 3, 'ch': 128, 'ch_mult': [1, 2, 4, 4], 'num_res_blocks': 2, 'attn_resolutions': [], 'dropout': 0.0}, 'lossconfig': {'target': 'torch.nn.Identity'}}, 'conditioner_config': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.GeneralConditioner', 'emb_models': [{'is_trainable': False, 'input_key': 'captions', 'ucg_rate': 0.1, 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenCLIPEmbedder', 'layer': 'hidden', 'layer_idx': 11}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'captions', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.FrozenOpenCLIPEmbedder2', 'arch': 'ViT-bigG-14', 'version': 'laion2b_s39b_b160k', 'freeze': True, 'layer': 'penultimate', 'always_return_pooled': True, 'legacy': False}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'original_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'crop_coords_top_left', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}, {'is_trainable': False, 'ucg_rate': 0.1, 'input_key': 'target_size_as_tuple', 'emb_model': {'_target_': 'nemo.collections.multimodal.modules.stable_diffusion.encoders.modules.ConcatTimestepEmbedderND', 'outdim': 256}}]}, 'data': {'num_workers': 16, 'train': {'dataset_path': ['/datasets/coyo/test.pkl'], 'augmentations': {'resize_smallest_side': 256, 'horizontal_flip': False}, 'filterings': None}, 'webdataset': {'infinite_sampler': False, 'local_root_path': '/datasets/coyo'}}, 'seed': 1234, 'resume_from_checkpoint': None, 'apex_transformer_log_level': 30, 'gradient_as_bucket_view': True, 'optim': {'name': 'fused_adam', 'lr': 0.0001, 'weight_decay': 0.0, 'betas': [0.9, 0.999], 'sched': {'name': 'WarmupHoldPolicy', 'warmup_steps': 10000, 'hold_steps': 10000000000000}}, 'nsys_profile': {'enabled': False, 'start_step': 10, 'end_step': 10, 'ranks': [0], 'gen_shape': False}, 'target': 'nemo.collections.multimodal.models.text_to_image.stable_diffusion.diffusion_engine.MegatronDiffusionEngine', 'nemo_version': '1.23.0rc0', 'ckpt_path': None}. Set this in model_parallel_config if using pipeline parallelism.\n", "[NeMo I 2024-04-24 22:17:50 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:17:50 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:17:50 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:17:50 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:17:51 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:17:51 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:17:53 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:17:53 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:17:54 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:17:54 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:17:56 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:17:56 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:17:58 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:17:58 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:18:00 attention:436] constructing SpatialTransformer of depth 10 w/ 1280 channels and 20 heads\n", "[NeMo I 2024-04-24 22:18:00 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 10. Setting context_dim to [2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048, 2048] now.\n", "[NeMo I 2024-04-24 22:18:02 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:18:02 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:18:02 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:18:02 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:18:02 attention:436] constructing SpatialTransformer of depth 2 w/ 640 channels and 10 heads\n", "[NeMo I 2024-04-24 22:18:02 attention:445] WARNING: SpatialTransformer: Found context dims [2048] of depth 1, which does not match the specified 'depth' of 2. Setting context_dim to [2048, 2048] now.\n", "[NeMo I 2024-04-24 22:18:02 utils:108] Getting module=, cls=\n", "Loaded ViT-bigG-14 model config.\n", "Loading pretrained ViT-bigG-14 weights (laion2b_s39b_b160k).\n", "Initialized embedder #0: FrozenCLIPEmbedder with 123060480 params. Trainable: False\n", "Initialized embedder #1: FrozenOpenCLIPEmbedder2 with 694659841 params. Trainable: False\n", "Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "Initialized embedder #3: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False\n", "making attention of type 'vanilla' with 512 in_channels\n", "Working with z of shape (1, 4, 32, 32) = 4096 dimensions.\n", "making attention of type 'vanilla' with 512 in_channels\n", "[NeMo I 2024-04-24 22:18:35 nlp_overrides:1155] Model MegatronDiffusionEngine was successfully restored from /quantization/sdxl_base.nemo.\n", "Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1\n", "----------------------------------------------------------------------------------------------------\n", "distributed_backend=nccl\n", "All distributed processes registered. Starting with 1 processes\n", "----------------------------------------------------------------------------------------------------\n", "\n", "[NeMo W 2024-04-24 22:18:36 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py:1184: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " assert y.shape[0] == x.shape[0]\n", " \n", "[NeMo W 2024-04-24 22:18:36 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py:209: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " assert x.shape[1] == self.channels\n", " \n", "[NeMo W 2024-04-24 22:18:37 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/openaimodel.py:145: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " assert x.shape[1] == self.channels\n", " \n", "[NeMo W 2024-04-24 22:22:17 nemo_logging:349] /opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/model.py:172: TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " w_ = w_ * (int(c) ** (-0.5))\n", " \n", "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/torch/onnx/utils.py:2095: UserWarning: Provided key z_pooled for dynamic axes is not a valid input/output name\n", " warnings.warn(\n", " \n", "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/modeling_attn_mask_utils.py:86: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if input_shape[-1] > 1 or self.sliding_window is not None:\n", " \n", "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/modeling_attn_mask_utils.py:162: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if past_key_values_length > 0:\n", " \n", "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py:281: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if attn_weights.size() != (bsz * self.num_heads, tgt_len, src_len):\n", " \n", "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py:289: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if causal_attention_mask.size() != (bsz, 1, tgt_len, src_len):\n", " \n", "[NeMo W 2024-04-24 22:22:19 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformers/models/clip/modeling_clip.py:321: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!\n", " if attn_output.size() != (bsz * self.num_heads, tgt_len, self.head_dim):\n", " \n", "[NeMo W 2024-04-24 22:22:27 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/torch/onnx/symbolic_opset9.py:5859: UserWarning: Exporting aten::index operator of advanced indexing in opset 14 is achieved by combination of multiple ONNX operators, including Reshape, Transpose, Concat, and Gather. If indices include negative values, the exported graph will produce incorrect results.\n", " warnings.warn(\n", " \n", "Building TensorRT engine for /quantization/onnx/unet_xl/unet_xl.onnx: /quantization/plan/unet_xl.plan\n", "[I] Configuring with profiles:[\n", " Profile 0:\n", " {x [min=(1, 4, 128, 128), opt=(2, 4, 128, 128), max=(8, 4, 128, 128)],\n", " y [min=(1, 2816), opt=(2, 2816), max=(8, 2816)],\n", " timesteps [min=(1,), opt=(2,), max=(8,)],\n", " context [min=(1, 80, 2048), opt=(2, 80, 2048), max=(8, 80, 2048)]}\n", " ]\n", "\u001b[38;5;14m[I] Building engine with configuration:\n", " Flags | [FP16]\n", " Engine Capability | EngineCapability.DEFAULT\n", " Memory Pools | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n", " Tactic Sources | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n", " Profiling Verbosity | ProfilingVerbosity.DETAILED\n", " Preview Features | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n", "\u001b[38;5;11m[W] Detected layernorm nodes in FP16.\u001b[0m\n", "\u001b[38;5;11m[W] Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.\u001b[0m\n", "\u001b[38;5;10m[I] Finished engine building in 553.937 seconds\u001b[0m\n", "[I] Saving engine to /quantization/plan/unet_xl.plan\n", "Building TensorRT engine for /quantization/onnx/vae/vae.onnx: /quantization/plan/vae.plan\n", "[I] Configuring with profiles:[\n", " Profile 0:\n", " {z [min=(1, 4, 128, 128), opt=(2, 4, 128, 128), max=(8, 4, 128, 128)]}\n", " ]\n", "\u001b[38;5;14m[I] Building engine with configuration:\n", " Flags | []\n", " Engine Capability | EngineCapability.DEFAULT\n", " Memory Pools | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n", " Tactic Sources | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n", " Profiling Verbosity | ProfilingVerbosity.DETAILED\n", " Preview Features | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n", "\u001b[38;5;10m[I] Finished engine building in 266.743 seconds\u001b[0m\n", "[I] Saving engine to /quantization/plan/vae.plan\n", "Building TensorRT engine for /quantization/onnx/clip1/clip1.onnx: /quantization/plan/clip1.plan\n", "\u001b[38;5;11m[W] ModelImporter.cpp:409: Make sure input input_ids has Int64 binding.\u001b[0m\n", "[I] Configuring with profiles:[\n", " Profile 0:\n", " {input_ids [min=(1, 77), opt=(2, 77), max=(8, 77)]}\n", " ]\n", "\u001b[38;5;14m[I] Building engine with configuration:\n", " Flags | [FP16]\n", " Engine Capability | EngineCapability.DEFAULT\n", " Memory Pools | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n", " Tactic Sources | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n", " Profiling Verbosity | ProfilingVerbosity.DETAILED\n", " Preview Features | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n", "\u001b[38;5;10m[I] Finished engine building in 16.988 seconds\u001b[0m\n", "[I] Saving engine to /quantization/plan/clip1.plan\n", "Building TensorRT engine for /quantization/onnx/clip2/clip2.onnx: /quantization/plan/clip2.plan\n", "[I] Configuring with profiles:[\n", " Profile 0:\n", " {input_ids [min=(1, 77), opt=(2, 77), max=(8, 77)]}\n", " ]\n", "\u001b[38;5;14m[I] Building engine with configuration:\n", " Flags | [FP16]\n", " Engine Capability | EngineCapability.DEFAULT\n", " Memory Pools | [WORKSPACE: 48685.38 MiB, TACTIC_DRAM: 48685.38 MiB]\n", " Tactic Sources | [CUBLAS, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]\n", " Profiling Verbosity | ProfilingVerbosity.DETAILED\n", " Preview Features | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]\u001b[0m\n", "\u001b[38;5;10m[I] Finished engine building in 72.535 seconds\u001b[0m\n", "[I] Saving engine to /quantization/plan/clip2.plan\n" ] } ], "source": [ "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_export.py model.restore_from_path=$WORKDIR/sdxl_base.nemo infer.out_path=$WORKDIR" ] }, { "cell_type": "markdown", "id": "e7eb2d03", "metadata": {}, "source": [ "### Run TRT inference pipeline with original engines" ] }, { "cell_type": "code", "execution_count": null, "id": "25737be2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "FlashAttention Installed\n", "[NeMo W 2024-04-24 22:46:11 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.\n", " See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.\n", " ret = run_job(\n", " \n", "Loading TensorRT engine: /quantization/plan/unet_xl.plan\n", "[I] Loading bytes from /quantization/plan/unet_xl.plan\n", "unet_xl trt engine loaded successfully\n", "Loading TensorRT engine: /quantization/plan/vae.plan\n", "[I] Loading bytes from /quantization/plan/vae.plan\n", "vae trt engine loaded successfully\n", "Loading TensorRT engine: /quantization/plan/clip1.plan\n", "[I] Loading bytes from /quantization/plan/clip1.plan\n", "clip1 trt engine loaded successfully\n", "Loading TensorRT engine: /quantization/plan/clip2.plan\n", "[I] Loading bytes from /quantization/plan/clip2.plan\n", "clip2 trt engine loaded successfully\n", "[NeMo I 2024-04-24 22:46:17 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:46:17 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:46:17 utils:108] Getting module=, cls=\n", "############################## Sampling setting ##############################\n", "Sampler: EulerEDMSampler\n", "Discretization: LegacyDDPMDiscretization\n", "Guider: VanillaCFG\n", "Sampling with EulerEDMSampler for 41 steps: 98%|▉| 40/41 [00:24<00:00, 1.60it/\n", "This batch takes 27.204587490297854s\n", "[NeMo I 2024-04-24 22:46:45 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:46:45 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:46:45 utils:108] Getting module=, cls=\n", "############################## Sampling setting ##############################\n", "Sampler: EulerEDMSampler\n", "Discretization: LegacyDDPMDiscretization\n", "Guider: VanillaCFG\n", "Sampling with EulerEDMSampler for 41 steps: 98%|▉| 40/41 [00:25<00:00, 1.57it/\n", "This batch takes 25.58329666685313s\n", "[NeMo I 2024-04-24 22:47:14 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:47:14 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:47:14 utils:108] Getting module=, cls=\n", "############################## Sampling setting ##############################\n", "Sampler: EulerEDMSampler\n", "Discretization: LegacyDDPMDiscretization\n", "Guider: VanillaCFG\n", "Sampling with EulerEDMSampler for 41 steps: 98%|▉| 40/41 [00:25<00:00, 1.55it/\n", "This batch takes 25.87396944500506s\n", "[NeMo I 2024-04-24 22:47:44 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:47:44 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:47:44 utils:108] Getting module=, cls=\n", "############################## Sampling setting ##############################\n", "Sampler: EulerEDMSampler\n", "Discretization: LegacyDDPMDiscretization\n", "Guider: VanillaCFG\n", "Sampling with EulerEDMSampler for 41 steps: 98%|▉| 40/41 [00:25<00:00, 1.54it/\n", "This batch takes 26.03419069480151s\n", "[NeMo I 2024-04-24 22:48:13 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:48:13 utils:108] Getting module=, cls=\n", "[NeMo I 2024-04-24 22:48:13 utils:108] Getting module=, cls=\n", "############################## Sampling setting ##############################\n", "Sampler: EulerEDMSampler\n", "Discretization: LegacyDDPMDiscretization\n", "Guider: VanillaCFG\n", "Sampling with EulerEDMSampler for 41 steps: 71%|▋| 29/41 [00:18<00:07, 1.52it/" ] } ], "source": [ "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_trt_inference.py \\\n", " out_path=$WORKDIR/trt_output_fp16 \\\n", " unet_xl=$WORKDIR/plan/unet_xl.plan \\\n", " vae=$WORKDIR/plan/vae.plan \\\n", " clip1=$WORKDIR/plan/clip1.plan \\\n", " clip2=$WORKDIR/plan/clip2.plan\n", " " ] }, { "cell_type": "markdown", "id": "d11bbe7d", "metadata": {}, "source": [ "### Run TRT inference pipeline with quantized U-Net engine" ] }, { "cell_type": "code", "execution_count": 5, "id": "3f2263b4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "^C\n", "Traceback (most recent call last):\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3109, in _dep_map\n", "[2024-04-24 19:42:46,104] torch.distributed.elastic.agent.server.api: [WARNING] Received Signals.SIGINT death signal, shutting down workers\n", "[2024-04-24 19:42:46,104] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 1300 closing signal SIGINT\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 2902, in __getattr__\n", " raise AttributeError(attr)\n", "AttributeError: _DistInfoDistribution__dep_map\n", "\n", "During handling of the above exception, another exception occurred:\n", "\n", "Traceback (most recent call last):\n", " File \"/opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_trt_inference.py\", line 25, in \n", " from nemo.collections.multimodal.modules.stable_diffusion.diffusionmodules.denoiser import DiscreteDenoiser\n", " File \"/opt/NeMo/nemo/collections/multimodal/modules/stable_diffusion/diffusionmodules/denoiser.py\", line 17, in \n", " from nemo.collections.multimodal.parts.stable_diffusion.utils import append_dims, instantiate_from_config\n", " File \"/opt/NeMo/nemo/collections/multimodal/parts/stable_diffusion/utils.py\", line 25, in \n", " from nemo.utils import logging\n", " File \"/opt/NeMo/nemo/utils/__init__.py\", line 31, in \n", " from nemo.utils.lightning_logger_patch import add_memory_handlers_to_pl_logger\n", " File \"/opt/NeMo/nemo/utils/lightning_logger_patch.py\", line 18, in \n", " import pytorch_lightning as pl\n", " File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/__init__.py\", line 27, in \n", " from pytorch_lightning.callbacks import Callback # noqa: E402\n", " File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/__init__.py\", line 14, in \n", " from pytorch_lightning.callbacks.batch_size_finder import BatchSizeFinder\n", " File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/batch_size_finder.py\", line 26, in \n", " from pytorch_lightning.callbacks.callback import Callback\n", " File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/callback.py\", line 22, in \n", " from pytorch_lightning.utilities.types import STEP_OUTPUT\n", " File \"/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/types.py\", line 41, in \n", " from torchmetrics import Metric\n", " File \"/usr/local/lib/python3.10/dist-packages/torchmetrics/__init__.py\", line 22, in \n", " from torchmetrics import functional # noqa: E402\n", " File \"/usr/local/lib/python3.10/dist-packages/torchmetrics/functional/__init__.py\", line 121, in \n", " from torchmetrics.functional.text._deprecated import _bleu_score as bleu_score\n", " File \"/usr/local/lib/python3.10/dist-packages/torchmetrics/functional/text/__init__.py\", line 49, in \n", " if _TRANSFORMERS_GREATER_EQUAL_4_4:\n", " File \"/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py\", line 164, in __bool__\n", " self._check_available()\n", " File \"/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py\", line 158, in _check_available\n", " self._check_requirement()\n", " File \"/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py\", line 132, in _check_requirement\n", " pkg_resources.require(self.requirement)\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 968, in require\n", " needed = self.resolve(parse_requirements(requirements))\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 834, in resolve\n", " new_requirements = dist.requires(req.extras)[::-1]\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 2822, in requires\n", " dm = self._dep_map\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3111, in _dep_map\n", " self.__dep_map = self._compute_dependencies()\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3133, in _compute_dependencies\n", " dm[s_extra] = [r for r in reqs_for_extra(extra) if r not in common]\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3133, in \n", " dm[s_extra] = [r for r in reqs_for_extra(extra) if r not in common]\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py\", line 3125, in reqs_for_extra\n", " if not req.marker or req.marker.evaluate({'extra': extra}):\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/_vendor/packaging/markers.py\", line 252, in evaluate\n", " return _evaluate_markers(self._markers, current_environment)\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/_vendor/packaging/markers.py\", line 164, in _evaluate_markers\n", " return any(all(item) for item in groups)\n", " File \"/usr/local/lib/python3.10/dist-packages/pkg_resources/_vendor/packaging/markers.py\", line 164, in \n", " return any(all(item) for item in groups)\n", "KeyboardInterrupt\n" ] } ], "source": [ "! torchrun /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_trt_inference.py \\\n", " out_path=$WORKDIR/trt_output_int8 \\\n", " unet_xl=$WORKDIR/int8_unet_xl.plan \\\n", " vae=$WORKDIR/plan/vae.plan \\\n", " clip1=$WORKDIR/plan/clip1.plan \\\n", " clip2=$WORKDIR/plan/clip2.plan" ] }, { "cell_type": "code", "execution_count": null, "id": "c48c21dd", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 5 }