I am trying to convert the Vision Transformer ViT-B/32 model from the UNICOM repository on a Jetson Orin Nano. The model's Vision Transformer class and source code are here.
I use the following code to convert the model to ONNX:
import torch
import onnx
import onnxruntime
from unicom.vision_transformer import build_model

if __name__ == '__main__':
    model_name = "ViT-B/32"
    model_name_fp16 = "FP16-ViT-B-32"
    onnx_model_path = f"{model_name_fp16}.onnx"

    # Build the model, switch it to inference mode, and move it to the GPU.
    model = build_model(model_name)
    model.eval()
    model = model.to('cuda')

    # Export with the TorchDynamo-based ONNX exporter and save to disk.
    torch_input = torch.randn(1, 3, 224, 224).to('cuda')
    onnx_program = torch.onnx.dynamo_export(model, torch_input)
    onnx_program.save(onnx_model_path)

    # Reload the saved file and validate it.
    onnx_model = onnx.load(onnx_model_path)
    onnx.checker.check_model(onnx_model_path)
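Before handing the file to TensorRT, it may help to check whether the exported graph contains nodes outside the standard ONNX operator domain. The sketch below is only an illustration: `find_custom_domain_nodes` and `STANDARD_DOMAINS` are hypothetical names, and the demo uses stand-in objects that mimic the `graph.node`, `name`, `op_type`, and `domain` attributes of an `onnx.ModelProto` so that it runs without the exported file.

```python
from types import SimpleNamespace

# Both spellings refer to the default ONNX operator set.
STANDARD_DOMAINS = {"", "ai.onnx"}


def find_custom_domain_nodes(model):
    """Return (name, op_type, domain) for each top-level node outside the standard domain."""
    return [
        (node.name, node.op_type, node.domain)
        for node in model.graph.node
        if node.domain not in STANDARD_DOMAINS
    ]


# Stand-in objects with the same attribute names as onnx.ModelProto.
# The first node copies the values that trtexec prints in this post;
# the second is a made-up node in the standard domain.
demo_model = SimpleNamespace(graph=SimpleNamespace(node=[
    SimpleNamespace(
        name="unicom_vision_transformer_PatchEmbedding_patch_embed_1_1",
        op_type="unicom_vision_transformer_PatchEmbedding_patch_embed_1",
        domain="pkg.unicom",
    ),
    SimpleNamespace(name="add_0", op_type="Add", domain=""),
]))

custom_nodes = find_custom_domain_nodes(demo_model)
for name, op_type, domain in custom_nodes:
    print(f"{domain}::{op_type} ({name})")
```

For the real file, the same function can be called on the result of `onnx.load(onnx_model_path)`. A non-empty result would point at nodes that a parser limited to standard ONNX operators may not recognise.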
Then I use the following command line to convert the ONNX model into a TensorRT engine:
/usr/src/tensorrt/bin/trtexec --onnx=FP16-ViT-B-32.onnx --saveEngine=FP16-ViT-B-32.trt --workspace=1024 --fp16
This results in the following error:
[W] --workspace flag has been deprecated by --memPoolSize flag.
[I] === Model Options ===
[I] Format: ONNX
[I] Model: /home/jetson/HPS/Models/FeatureExtractor/UNICOM/ONNX/FP16-ViT-B-32.onnx
[I] Output:
[I] === Build Options ===
[I] Max batch: explicit batch
[I] Memory Pools: workspace: 1024 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[I] minTiming: 1
[I] avgTiming: 8
[I] Precision: FP32+FP16
[I] LayerPrecisions:
[I] Layer Device Types:
[I] Calibration:
[I] Refit: Disabled
[I] Version Compatible: Disabled
[I] ONNX Native InstanceNorm: Disabled
[I] TensorRT runtime: full
[I] Lean DLL Path:
[I] Tempfile Controls: { in_memory: allow, temporary: allow }
[I] Exclude Lean Runtime: Disabled
[I] Sparsity: Disabled
[I] Safe mode: Disabled
[I] Build DLA standalone loadable: Disabled
[I] Allow GPU fallback for DLA: Disabled
[I] DirectIO mode: Disabled
[I] Restricted mode: Disabled
[I] Skip inference: Disabled
[I] Save engine: /home/jetson/HPS/Models/FeatureExtractor/UNICOM/ONNX/FP16-ViT-B-32.trt
[I] Load engine:
[I] Profiling verbosity: 0
[I] Tactic sources: Using default tactic sources
[I] timingCacheMode: local
[I] timingCacheFile:
[I] Heuristic: Disabled
[I] Preview Features: Use default preview flags.
[I] MaxAuxStreams: -1
[I] BuilderOptimizationLevel: -1
[I] Input(s)s format: fp32:CHW
[I] Output(s)s format: fp32:CHW
[I] Input build shapes: model
[I] Input calibration shapes: model
[I] === System Options ===
[I] Device: 0
[I] DLACore:
[I] Plugins:
[I] setPluginsToSerialize:
[I] dynamicPlugins:
[I] ignoreParsedPluginLibs: 0
[I]
[I] === Inference Options ===
[I] Batch: Explicit
[I] Input inference shapes: model
[I] Iterations: 10
[I] Duration: 3s (+ 200ms warm up)
[I] Sleep time: 0ms
[I] Idle time: 0ms
[I] Inference Streams: 1
[I] ExposeDMA: Disabled
[I] Data transfers: Enabled
[I] Spin-wait: Disabled
[I] Multithreading: Disabled
[I] CUDA Graph: Disabled
[I] Separate profiling: Disabled
[I] Time Deserialize: Disabled
[I] Time Refit: Disabled
[I] NVTX verbosity: 0
[I] Persistent Cache Ratio: 0
[I] Inputs:
[I] === Reporting Options ===
[I] Verbose: Disabled
[I] Averages: 10 inferences
[I] Percentiles: 90,95,99
[I] Dump refittable layers:Disabled
[I] Dump output: Disabled
[I] Profile: Disabled
[I] Export timing to JSON file:
[I] Export output to JSON file:
[I] Export profile to JSON file:
[I]
[I] === Device Information ===
[I] Selected Device: Orin
[I] Compute Capability: 8.7
[I] SMs: 8
[I] Device Global Memory: 7620 MiB
[I] Shared Memory per SM: 164 KiB
[I] Memory Bus Width: 128 bits (ECC disabled)
[I] Application Compute Clock Rate: 0.624 GHz
[I] Application Memory Clock Rate: 0.624 GHz
[I]
[I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[I]
[I] TensorRT version: 8.6.2
[I] Loading standard plugins
[I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 33, GPU 4508 (MiB)
[I] [TRT] [MemUsageChange] Init builder kernel library: CPU +1154, GPU +1351, now: CPU 1223, GPU 5866 (MiB)
[I] Start parsing network model.
[I] [TRT] ----------------------------------------------------------------
[I] [TRT] Input filename: /home/jetson/HPS/Models/FeatureExtractor/UNICOM/ONNX/FP16-ViT-B-32.onnx
[I] [TRT] ONNX IR version: 0.0.8
[I] [TRT] Opset version: 1
[I] [TRT] Producer name: pytorch
[I] [TRT] Producer version: 2.3.0
[I] [TRT] Domain:
[I] [TRT] Model version: 0
[I] [TRT] Doc string:
[I] [TRT] ----------------------------------------------------------------
[I] [TRT] No importer registered for op: unicom_vision_transformer_PatchEmbedding_patch_embed_1. Attempting to import as plugin.
[I] [TRT] Searching for plugin: unicom_vision_transformer_PatchEmbedding_patch_embed_1, plugin_version: 1, plugin_namespace:
[E] [TRT] 3: getPluginCreator could not find plugin: unicom_vision_transformer_PatchEmbedding_patch_embed_1 version: 1
[E] [TRT] ModelImporter.cpp:768: While parsing node number 0 [unicom_vision_transformer_PatchEmbedding_patch_embed_1 -> "patch_embed_1"]:
[E] [TRT] ModelImporter.cpp:769: --- Begin node ---
[E] [TRT] ModelImporter.cpp:770: input: "l_x_"
[W] --workspace flag has been deprecated by --memPoolSize flag.
[I] === Model Options ===
[I] Format: ONNX
[I] Model: /home/jetson/HPS/Models/FeatureExtractor/UNICOM/ONNX/FP16-ViT-B-32.onnx
[I] Output:
[I] === Build Options ===
[I] Max batch: explicit batch
[I] Memory Pools: workspace: 1024 MiB, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[I] minTiming: 1
[I] avgTiming: 8
[I] Precision: FP32+FP16
[I] LayerPrecisions:
[I] Layer Device Types:
[I] Calibration:
[I] Refit: Disabled
[I] Version Compatible: Disabled
[I] ONNX Native InstanceNorm: Disabled
[I] TensorRT runtime: full
[I] Lean DLL Path:
[I] Tempfile Controls: { in_memory: allow, temporary: allow }
[I] Exclude Lean Runtime: Disabled
[I] Sparsity: Disabled
[I] Safe mode: Disabled
[I] Build DLA standalone loadable: Disabled
[I] Allow GPU fallback for DLA: Disabled
[I] DirectIO mode: Disabled
[I] Restricted mode: Disabled
[I] Skip inference: Disabled
[I] Save engine: /home/jetson/HPS/Models/FeatureExtractor/UNICOM/ONNX/FP16-ViT-B-32.trt
[I] Load engine:
[I] Profiling verbosity: 0
[I] Tactic sources: Using default tactic sources
[I] timingCacheMode: local
[I] timingCacheFile:
[I] Heuristic: Disabled
[I] Preview Features: Use default preview flags.
[I] MaxAuxStreams: -1
[I] BuilderOptimizationLevel: -1
[I] Input(s)s format: fp32:CHW
[I] Output(s)s format: fp32:CHW
[I] Input build shapes: model
[I] Input calibration shapes: model
[I] === System Options ===
[I] Device: 0
[I] DLACore:
[I] Plugins:
[I] setPluginsToSerialize:
[I] dynamicPlugins:
[I] ignoreParsedPluginLibs: 0
[I]
[I] === Inference Options ===
[I] Batch: Explicit
[I] Input inference shapes: model
[I] Iterations: 10
[I] Duration: 3s (+ 200ms warm up)
[I] Sleep time: 0ms
[I] Idle time: 0ms
[I] Inference Streams: 1
[I] ExposeDMA: Disabled
[I] Data transfers: Enabled
[I] Spin-wait: Disabled
[I] Multithreading: Disabled
[I] CUDA Graph: Disabled
[I] Separate profiling: Disabled
[I] Time Deserialize: Disabled
[I] Time Refit: Disabled
[I] NVTX verbosity: 0
[I] Persistent Cache Ratio: 0
[I] Inputs:
[I] === Reporting Options ===
[I] Verbose: Enabled
[I] Averages: 10 inferences
[I] Percentiles: 90,95,99
[I] Dump refittable layers:Disabled
[I] Dump output: Disabled
[I] Profile: Disabled
[I] Export timing to JSON file:
[I] Export output to JSON file:
[I] Export profile to JSON file:
[I]
[I] === Device Information ===
[I] Selected Device: Orin
[I] Compute Capability: 8.7
[I] SMs: 8
[I] Device Global Memory: 7620 MiB
[I] Shared Memory per SM: 164 KiB
[I] Memory Bus Width: 128 bits (ECC disabled)
[I] Application Compute Clock Rate: 0.624 GHz
[I] Application Memory Clock Rate: 0.624 GHz
[I]
[I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[I]
[I] TensorRT version: 8.6.2
[I] Loading standard plugins
[V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[V] [TRT] Registered plugin creator - ::BatchTilePlugin_TRT version 1
[V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[V] [TRT] Registered plugin creator - ::CoordConvAC version 1
[V] [TRT] Registered plugin creator - ::CropAndResizeDynamic version 1
[V] [TRT] Registered plugin creator - ::CropAndResize version 1
[V] [TRT] Registered plugin creator - ::DecodeBbox3DPlugin version 1
[V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_Explicit_TF_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_Implicit_TF_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[V] [TRT] Registered plugin creator - ::GenerateDetection_TRT version 1
[V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[V] [TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 2
[V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[V] [TRT] Registered plugin creator - ::ModulatedDeformConv2d version 1
[V] [TRT] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
[V] [TRT] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
[V] [TRT] Registered plugin creator - ::MultiscaleDeformableAttnPlugin_TRT version 1
[V] [TRT] Registered plugin creator - ::NMSDynamic_TRT version 1
[V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[V] [TRT] Registered plugin creator - ::PillarScatterPlugin version 1
[V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[V] [TRT] Registered plugin creator - ::ProposalDynamic version 1
[V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[V] [TRT] Registered plugin creator - ::Proposal version 1
[V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[V] [TRT] Registered plugin creator - ::Region_TRT version 1
[V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[V] [TRT] Registered plugin creator - ::ROIAlign_TRT version 1
[V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[V] [TRT] Registered plugin creator - ::ScatterND version 1
[V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[V] [TRT] Registered plugin creator - ::Split version 1
[V] [TRT] Registered plugin creator - ::VoxelGeneratorPlugin version 1
[I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 33, GPU 5167 (MiB)
[V] [TRT] Trying to load shared library libnvinfer_builder_resource.so.8.6.2
[V] [TRT] Loaded shared library libnvinfer_builder_resource.so.8.6.2
[I] [TRT] [MemUsageChange] Init builder kernel library: CPU +1154, GPU +995, now: CPU 1223, GPU 6203 (MiB)
[V] [TRT] CUDA lazy loading is enabled.
[I] Start parsing network model.
[I] [TRT] ----------------------------------------------------------------
[I] [TRT] Input filename: /home/jetson/HPS/Models/FeatureExtractor/UNICOM/ONNX/FP16-ViT-B-32.onnx
[I] [TRT] ONNX IR version: 0.0.8
[I] [TRT] Opset version: 1
[I] [TRT] Producer name: pytorch
[I] [TRT] Producer version: 2.3.0
[I] [TRT] Domain:
[I] [TRT] Model version: 0
[I] [TRT] Doc string:
[I] [TRT] ----------------------------------------------------------------
[V] [TRT] Plugin creator already registered - ::BatchedNMSDynamic_TRT version 1
[V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[V] [TRT] Plugin creator already registered - ::BatchTilePlugin_TRT version 1
[V] [TRT] Plugin creator already registered - ::Clip_TRT version 1
[V] [TRT] Plugin creator already registered - ::CoordConvAC version 1
[V] [TRT] Plugin creator already registered - ::CropAndResizeDynamic version 1
[V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[V] [TRT] Plugin creator already registered - ::DecodeBbox3DPlugin version 1
[V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_Explicit_TF_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_Implicit_TF_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_ONNX_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_TRT version 1
[V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[V] [TRT] Plugin creator already registered - ::GenerateDetection_TRT version 1
[V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[V] [TRT] Plugin creator already registered - ::GridAnchorRect_TRT version 1
[V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 2
[V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[V] [TRT] Plugin creator already registered - ::ModulatedDeformConv2d version 1
[V] [TRT] Plugin creator already registered - ::MultilevelCropAndResize_TRT version 1
[V] [TRT] Plugin creator already registered - ::MultilevelProposeROI_TRT version 1
[V] [TRT] Plugin creator already registered - ::MultiscaleDeformableAttnPlugin_TRT version 1
[V] [TRT] Plugin creator already registered - ::NMSDynamic_TRT version 1
[V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[V] [TRT] Plugin creator already registered - ::PillarScatterPlugin version 1
[V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[V] [TRT] Plugin creator already registered - ::ProposalDynamic version 1
[V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[V] [TRT] Plugin creator already registered - ::Proposal version 1
[V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[V] [TRT] Plugin creator already registered - ::ROIAlign_TRT version 1
[V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[V] [TRT] Plugin creator already registered - ::ScatterND version 1
[V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[V] [TRT] Plugin creator already registered - ::Split version 1
[V] [TRT] Plugin creator already registered - ::VoxelGeneratorPlugin version 1
[V] [TRT] Adding network input: l_x_ with dtype: float32, dimensions: (1, 3, 224, 224)
[V] [TRT] Registering tensor: l_x_ for ONNX tensor: l_x_
[V] [TRT] Importing initializer: patch_embed.proj.weight
[V] [TRT] Importing initializer: patch_embed.proj.bias
[V] [TRT] Importing initializer: pos_embed
[V] [TRT] Importing initializer: blocks.0.norm1.weight
[V] [TRT] Importing initializer: blocks.0.norm1.bias
[V] [TRT] Importing initializer: blocks.0.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.0.attn.proj.weight
[V] [TRT] Importing initializer: blocks.0.attn.proj.bias
[V] [TRT] Importing initializer: blocks.0.norm2.weight
[V] [TRT] Importing initializer: blocks.0.norm2.bias
[V] [TRT] Importing initializer: blocks.0.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.0.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.0.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.0.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.1.norm1.weight
[V] [TRT] Importing initializer: blocks.1.norm1.bias
[V] [TRT] Importing initializer: blocks.1.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.1.attn.proj.weight
[V] [TRT] Importing initializer: blocks.1.attn.proj.bias
[V] [TRT] Importing initializer: blocks.1.norm2.weight
[V] [TRT] Importing initializer: blocks.1.norm2.bias
[V] [TRT] Importing initializer: blocks.1.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.1.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.1.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.1.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.2.norm1.weight
[V] [TRT] Importing initializer: blocks.2.norm1.bias
[V] [TRT] Importing initializer: blocks.2.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.2.attn.proj.weight
[V] [TRT] Importing initializer: blocks.2.attn.proj.bias
[V] [TRT] Importing initializer: blocks.2.norm2.weight
[V] [TRT] Importing initializer: blocks.2.norm2.bias
[V] [TRT] Importing initializer: blocks.2.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.2.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.2.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.2.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.3.norm1.weight
[V] [TRT] Importing initializer: blocks.3.norm1.bias
[V] [TRT] Importing initializer: blocks.3.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.3.attn.proj.weight
[V] [TRT] Importing initializer: blocks.3.attn.proj.bias
[V] [TRT] Importing initializer: blocks.3.norm2.weight
[V] [TRT] Importing initializer: blocks.3.norm2.bias
[V] [TRT] Importing initializer: blocks.3.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.3.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.3.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.3.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.4.norm1.weight
[V] [TRT] Importing initializer: blocks.4.norm1.bias
[V] [TRT] Importing initializer: blocks.4.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.4.attn.proj.weight
[V] [TRT] Importing initializer: blocks.4.attn.proj.bias
[V] [TRT] Importing initializer: blocks.4.norm2.weight
[V] [TRT] Importing initializer: blocks.4.norm2.bias
[V] [TRT] Importing initializer: blocks.4.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.4.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.4.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.4.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.5.norm1.weight
[V] [TRT] Importing initializer: blocks.5.norm1.bias
[V] [TRT] Importing initializer: blocks.5.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.5.attn.proj.weight
[V] [TRT] Importing initializer: blocks.5.attn.proj.bias
[V] [TRT] Importing initializer: blocks.5.norm2.weight
[V] [TRT] Importing initializer: blocks.5.norm2.bias
[V] [TRT] Importing initializer: blocks.5.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.5.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.5.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.5.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.6.norm1.weight
[V] [TRT] Importing initializer: blocks.6.norm1.bias
[V] [TRT] Importing initializer: blocks.6.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.6.attn.proj.weight
[V] [TRT] Importing initializer: blocks.6.attn.proj.bias
[V] [TRT] Importing initializer: blocks.6.norm2.weight
[V] [TRT] Importing initializer: blocks.6.norm2.bias
[V] [TRT] Importing initializer: blocks.6.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.6.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.6.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.6.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.7.norm1.weight
[V] [TRT] Importing initializer: blocks.7.norm1.bias
[V] [TRT] Importing initializer: blocks.7.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.7.attn.proj.weight
[V] [TRT] Importing initializer: blocks.7.attn.proj.bias
[V] [TRT] Importing initializer: blocks.7.norm2.weight
[V] [TRT] Importing initializer: blocks.7.norm2.bias
[V] [TRT] Importing initializer: blocks.7.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.7.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.7.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.7.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.8.norm1.weight
[V] [TRT] Importing initializer: blocks.8.norm1.bias
[V] [TRT] Importing initializer: blocks.8.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.8.attn.proj.weight
[V] [TRT] Importing initializer: blocks.8.attn.proj.bias
[V] [TRT] Importing initializer: blocks.8.norm2.weight
[V] [TRT] Importing initializer: blocks.8.norm2.bias
[V] [TRT] Importing initializer: blocks.8.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.8.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.8.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.8.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.9.norm1.weight
[V] [TRT] Importing initializer: blocks.9.norm1.bias
[V] [TRT] Importing initializer: blocks.9.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.9.attn.proj.weight
[V] [TRT] Importing initializer: blocks.9.attn.proj.bias
[V] [TRT] Importing initializer: blocks.9.norm2.weight
[V] [TRT] Importing initializer: blocks.9.norm2.bias
[V] [TRT] Importing initializer: blocks.9.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.9.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.9.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.9.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.10.norm1.weight
[V] [TRT] Importing initializer: blocks.10.norm1.bias
[V] [TRT] Importing initializer: blocks.10.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.10.attn.proj.weight
[V] [TRT] Importing initializer: blocks.10.attn.proj.bias
[V] [TRT] Importing initializer: blocks.10.norm2.weight
[V] [TRT] Importing initializer: blocks.10.norm2.bias
[V] [TRT] Importing initializer: blocks.10.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.10.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.10.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.10.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.11.norm1.weight
[V] [TRT] Importing initializer: blocks.11.norm1.bias
[V] [TRT] Importing initializer: blocks.11.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.11.attn.proj.weight
[V] [TRT] Importing initializer: blocks.11.attn.proj.bias
[V] [TRT] Importing initializer: blocks.11.norm2.weight
[V] [TRT] Importing initializer: blocks.11.norm2.bias
[V] [TRT] Importing initializer: blocks.11.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.11.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.11.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.11.mlp.fc2.bias
[V] [TRT] Importing initializer: norm.weight
[V] [TRT] Importing initializer: norm.bias
[V] [TRT] Importing initializer: feature.0.weight
[V] [TRT] Importing initializer: feature.1.weight
[V] [TRT] Importing initializer: feature.1.bias
[V] [TRT] Importing initializer: feature.1.running_mean
[V] [TRT] Importing initializer: feature.1.running_var
[V] [TRT] Importing initializer: feature.2.weight
[V] [TRT] Importing initializer: feature.3.weight
[V] [TRT] Importing initializer: feature.3.bias
[V] [TRT] Importing initializer: feature.3.running_mean
[V] [TRT] Importing initializer: feature.3.running_var
[V] [TRT] Parsing node: unicom_vision_transformer_PatchEmbedding_patch_embed_1_1 [unicom_vision_transformer_PatchEmbedding_patch_embed_1]
[V] [TRT] Searching for input: l_x_
[V] [TRT] Searching for input: patch_embed.proj.weight
[V] [TRT] Searching for input: patch_embed.proj.bias
[V] [TRT] unicom_vision_transformer_PatchEmbedding_patch_embed_1_1 [unicom_vision_transformer_PatchEmbedding_patch_embed_1] inputs: [l_x_ -> (1, 3, 224, 224)[FLOAT]], [patch_embed.proj.weight -> (768, 3, 32, 32)[FLOAT]], [patch_embed.proj.bias -> (768)[FLOAT]],
[I] [TRT] No importer registered for op: unicom_vision_transformer_PatchEmbedding_patch_embed_1. Attempting to import as plugin.
[I] [TRT] Searching for plugin: unicom_vision_transformer_PatchEmbedding_patch_embed_1, plugin_version: 1, plugin_namespace:
[V] [TRT] Local registry did not find unicom_vision_transformer_PatchEmbedding_patch_embed_1 creator. Will try parent registry if enabled.
[V] [TRT] Global registry did not find unicom_vision_transformer_PatchEmbedding_patch_embed_1 creator. Will try parent registry if enabled.
[E] [TRT] 3: getPluginCreator could not find plugin: unicom_vision_transformer_PatchEmbedding_patch_embed_1 version: 1
[E] [TRT] ModelImporter.cpp:768: While parsing node number 0 [unicom_vision_transformer_PatchEmbedding_patch_embed_1 -> "patch_embed_1"]:
[E] [TRT] ModelImporter.cpp:769: --- Begin node ---
[E] [TRT] ModelImporter.cpp:770: input: "l_x_"
input: "patch_embed.proj.weight"
input: "patch_embed.proj.bias"
output: "patch_embed_1"
name: "unicom_vision_transformer_PatchEmbedding_patch_embed_1_1"
op_type: "unicom_vision_transformer_PatchEmbedding_patch_embed_1"
doc_string: ""
domain: "pkg.unicom"
[E] [TRT] ModelImporter.cpp:771: --- End node ---
[E] [TRT] ModelImporter.cpp:773: ERROR: builtin_op_importers.cpp:5403 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[E] Failed to parse onnx file
[I] Finished parsing network model. Parse time: 4.99544
[E] Parsing model failed
[E] Failed to create engine from model or file.
[E] Engine set up failed
[E] Failed to parse onnx file
[I] Finished parsing network model. Parse time: 13.1481
[E] Parsing model failed
[E] Failed to create engine from model or file.
[E] Engine set up failed
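The first line of the log warns that `--workspace` has been deprecated in favour of `--memPoolSize`. That warning is separate from the parse failure, but the command can be assembled without the deprecated flag. The sketch below is a minimal illustration: `build_trtexec_command` is a hypothetical helper, the paths are the ones from this post, and the pool size is assumed to be given in MiB as in the original command.

```python
import shlex


def build_trtexec_command(onnx_path, engine_path, workspace_mib=1024, fp16=True,
                          trtexec="/usr/src/tensorrt/bin/trtexec"):
    """Assemble a trtexec argument list that uses --memPoolSize for the workspace."""
    args = [
        trtexec,
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        # Replacement for the deprecated --workspace flag (size assumed in MiB).
        f"--memPoolSize=workspace:{workspace_mib}",
    ]
    if fp16:
        args.append("--fp16")
    return args


command = build_trtexec_command("FP16-ViT-B-32.onnx", "FP16-ViT-B-32.trt")
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run` as is. Building it as a list rather than a single string avoids shell quoting problems if the model path contains spaces.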
The problem seems to come from the PatchEmbedding class, yet the model does not appear to use any unusual methods or layers that TensorRT cannot convert. Here is the source code of the class:
class PatchEmbedding(nn.Module):
    def __init__(self, input_size=224, patch_size=32, in_channels: int = 3, dim: int = 768):
        super().__init__()
        if isinstance(input_size, int):
            input_size = (input_size, input_size)
        if isinstance(patch_size, int):
            patch_size = (patch_size, patch_size)
        H = input_size[0] // patch_size[0]
        W = input_size[1] // patch_size[1]
        self.num_patches = H * W
        self.proj = nn.Conv2d(
            in_channels, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x).flatten(2).transpose(1, 2)
        return x
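To make the shapes in this class concrete, the arithmetic can be worked through in plain Python. This is only a sketch of the shape bookkeeping for a square input with the default arguments; `patch_embedding_output_shape` is a hypothetical helper and no PyTorch is involved.

```python
def patch_embedding_output_shape(batch, input_size=224, patch_size=32, dim=768):
    """Shape produced by proj(x).flatten(2).transpose(1, 2) for a square image."""
    h = input_size // patch_size  # patches along the height
    w = input_size // patch_size  # patches along the width
    # nn.Conv2d with kernel_size == stride gives (batch, dim, h, w).
    conv_shape = (batch, dim, h, w)
    # .flatten(2) merges the two spatial axes: (batch, dim, h * w).
    flat_shape = (conv_shape[0], conv_shape[1], conv_shape[2] * conv_shape[3])
    # .transpose(1, 2) swaps channels and patches: (batch, h * w, dim).
    return (flat_shape[0], flat_shape[2], flat_shape[1])


shape = patch_embedding_output_shape(1)
print(shape)  # (1, 49, 768)
```

For the 1 x 3 x 224 x 224 input used in the export script, this gives 7 x 7 = 49 patches, each embedded into 768 dimensions.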
What should I do so that the model can be converted to TensorRT?
Thanks
## Environment
**TensorRT Version**: tensorrt_version_8_6_2_3
**GPU Type**: Jetson Orin Nano
**Nvidia Driver Version**:
**CUDA Version**: 12.2
**CUDNN Version**: 8.9.4.25-1+cuda12.2
**Operating System + Version**: Jetpack 6.0
**Python Version (if applicable)**: 3.10
**PyTorch Version (if applicable)**: 2.3.0
**ONNX Version (if applicable)**: 1.16.1
**onnxruntime-gpu Version (if applicable)**: 1.17.0
**onnxscript Version (if applicable)**: 0.1.0.dev20240721
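With PyTorch 2.3.0 as listed above, the TorchScript-based `torch.onnx.export` is available alongside `torch.onnx.dynamo_export`. The following is a minimal sketch of calling it. It rests on the assumption, not confirmed in this post, that this exporter emits standard ONNX operators for submodules such as PatchEmbedding rather than a node in the `pkg.unicom` domain. The helper names, the tensor names, and the choice of opset 17 are all illustrative assumptions, and whether TensorRT 8.6.2 then builds the engine would still need to be tested.

```python
def torchscript_export_kwargs(opset_version=17):
    """Keyword arguments for torch.onnx.export (opset 17 is an assumption)."""
    return {
        "opset_version": opset_version,
        "input_names": ["input"],    # hypothetical tensor name
        "output_names": ["output"],  # hypothetical tensor name
        "do_constant_folding": True,
    }


def export_with_torchscript_exporter(model, example_input, onnx_path, opset_version=17):
    """Export an nn.Module with the TorchScript-based ONNX exporter."""
    import torch  # imported lazily so the helper above works without PyTorch

    model.eval()
    with torch.no_grad():
        torch.onnx.export(model, (example_input,), onnx_path,
                          **torchscript_export_kwargs(opset_version))


export_kwargs = torchscript_export_kwargs()
print(export_kwargs)
```

In the export script from this post, the call would take the place of the `dynamo_export` and `save` lines, for example `export_with_torchscript_exporter(model, torch_input, onnx_model_path)`.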
More details here: https://stackoverflow.com/questions/787 ... -orin-nano
Converting a PyTorch ONNX model to a TensorRT engine - Jetson Orin Nano ⇐ Python
Python programs
1721869739
Anonymous
[I] Iterations: 10
[I] Duration: 3s (+ 200ms warm up)
[I] Sleep time: 0ms
[I] Idle time: 0ms
[I] Inference Streams: 1
[I] ExposeDMA: Disabled
[I] Data transfers: Enabled
[I] Spin-wait: Disabled
[I] Multithreading: Disabled
[I] CUDA Graph: Disabled
[I] Separate profiling: Disabled
[I] Time Deserialize: Disabled
[I] Time Refit: Disabled
[I] NVTX verbosity: 0
[I] Persistent Cache Ratio: 0
[I] Inputs:
[I] === Reporting Options ===
[I] Verbose: Enabled
[I] Averages: 10 inferences
[I] Percentiles: 90,95,99
[I] Dump refittable layers:Disabled
[I] Dump output: Disabled
[I] Profile: Disabled
[I] Export timing to JSON file:
[I] Export output to JSON file:
[I] Export profile to JSON file:
[I]
[I] === Device Information ===
[I] Selected Device: Orin
[I] Compute Capability: 8.7
[I] SMs: 8
[I] Device Global Memory: 7620 MiB
[I] Shared Memory per SM: 164 KiB
[I] Memory Bus Width: 128 bits (ECC disabled)
[I] Application Compute Clock Rate: 0.624 GHz
[I] Application Memory Clock Rate: 0.624 GHz
[I]
[I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[I]
[I] TensorRT version: 8.6.2
[I] Loading standard plugins
[V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[V] [TRT] Registered plugin creator - ::BatchTilePlugin_TRT version 1
[V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[V] [TRT] Registered plugin creator - ::CoordConvAC version 1
[V] [TRT] Registered plugin creator - ::CropAndResizeDynamic version 1
[V] [TRT] Registered plugin creator - ::CropAndResize version 1
[V] [TRT] Registered plugin creator - ::DecodeBbox3DPlugin version 1
[V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_Explicit_TF_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_Implicit_TF_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[V] [TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[V] [TRT] Registered plugin creator - ::GenerateDetection_TRT version 1
[V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[V] [TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 2
[V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[V] [TRT] Registered plugin creator - ::ModulatedDeformConv2d version 1
[V] [TRT] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
[V] [TRT] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
[V] [TRT] Registered plugin creator - ::MultiscaleDeformableAttnPlugin_TRT version 1
[V] [TRT] Registered plugin creator - ::NMSDynamic_TRT version 1
[V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[V] [TRT] Registered plugin creator - ::PillarScatterPlugin version 1
[V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[V] [TRT] Registered plugin creator - ::ProposalDynamic version 1
[V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[V] [TRT] Registered plugin creator - ::Proposal version 1
[V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[V] [TRT] Registered plugin creator - ::Region_TRT version 1
[V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[V] [TRT] Registered plugin creator - ::ROIAlign_TRT version 1
[V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[V] [TRT] Registered plugin creator - ::ScatterND version 1
[V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[V] [TRT] Registered plugin creator - ::Split version 1
[V] [TRT] Registered plugin creator - ::VoxelGeneratorPlugin version 1
[I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 33, GPU 5167 (MiB)
[V] [TRT] Trying to load shared library libnvinfer_builder_resource.so.8.6.2
[V] [TRT] Loaded shared library libnvinfer_builder_resource.so.8.6.2
[I] [TRT] [MemUsageChange] Init builder kernel library: CPU +1154, GPU +995, now: CPU 1223, GPU 6203 (MiB)
[V] [TRT] CUDA lazy loading is enabled.
[I] Start parsing network model.
[I] [TRT] ----------------------------------------------------------------
[I] [TRT] Input filename: /home/jetson/HPS/Models/FeatureExtractor/UNICOM/ONNX/FP16-ViT-B-32.onnx
[I] [TRT] ONNX IR version: 0.0.8
[I] [TRT] Opset version: 1
[I] [TRT] Producer name: pytorch
[I] [TRT] Producer version: 2.3.0
[I] [TRT] Domain:
[I] [TRT] Model version: 0
[I] [TRT] Doc string:
[I] [TRT] ----------------------------------------------------------------
[V] [TRT] Plugin creator already registered - ::BatchedNMSDynamic_TRT version 1
[V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[V] [TRT] Plugin creator already registered - ::BatchTilePlugin_TRT version 1
[V] [TRT] Plugin creator already registered - ::Clip_TRT version 1
[V] [TRT] Plugin creator already registered - ::CoordConvAC version 1
[V] [TRT] Plugin creator already registered - ::CropAndResizeDynamic version 1
[V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[V] [TRT] Plugin creator already registered - ::DecodeBbox3DPlugin version 1
[V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_Explicit_TF_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_Implicit_TF_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_ONNX_TRT version 1
[V] [TRT] Plugin creator already registered - ::EfficientNMS_TRT version 1
[V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[V] [TRT] Plugin creator already registered - ::GenerateDetection_TRT version 1
[V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[V] [TRT] Plugin creator already registered - ::GridAnchorRect_TRT version 1
[V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 2
[V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[V] [TRT] Plugin creator already registered - ::ModulatedDeformConv2d version 1
[V] [TRT] Plugin creator already registered - ::MultilevelCropAndResize_TRT version 1
[V] [TRT] Plugin creator already registered - ::MultilevelProposeROI_TRT version 1
[V] [TRT] Plugin creator already registered - ::MultiscaleDeformableAttnPlugin_TRT version 1
[V] [TRT] Plugin creator already registered - ::NMSDynamic_TRT version 1
[V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[V] [TRT] Plugin creator already registered - ::PillarScatterPlugin version 1
[V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[V] [TRT] Plugin creator already registered - ::ProposalDynamic version 1
[V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[V] [TRT] Plugin creator already registered - ::Proposal version 1
[V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[V] [TRT] Plugin creator already registered - ::ROIAlign_TRT version 1
[V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[V] [TRT] Plugin creator already registered - ::ScatterND version 1
[V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[V] [TRT] Plugin creator already registered - ::Split version 1
[V] [TRT] Plugin creator already registered - ::VoxelGeneratorPlugin version 1
[V] [TRT] Adding network input: l_x_ with dtype: float32, dimensions: (1, 3, 224, 224)
[V] [TRT] Registering tensor: l_x_ for ONNX tensor: l_x_
[V] [TRT] Importing initializer: patch_embed.proj.weight
[V] [TRT] Importing initializer: patch_embed.proj.bias
[V] [TRT] Importing initializer: pos_embed
[V] [TRT] Importing initializer: blocks.0.norm1.weight
[V] [TRT] Importing initializer: blocks.0.norm1.bias
[V] [TRT] Importing initializer: blocks.0.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.0.attn.proj.weight
[V] [TRT] Importing initializer: blocks.0.attn.proj.bias
[V] [TRT] Importing initializer: blocks.0.norm2.weight
[V] [TRT] Importing initializer: blocks.0.norm2.bias
[V] [TRT] Importing initializer: blocks.0.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.0.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.0.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.0.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.1.norm1.weight
[V] [TRT] Importing initializer: blocks.1.norm1.bias
[V] [TRT] Importing initializer: blocks.1.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.1.attn.proj.weight
[V] [TRT] Importing initializer: blocks.1.attn.proj.bias
[V] [TRT] Importing initializer: blocks.1.norm2.weight
[V] [TRT] Importing initializer: blocks.1.norm2.bias
[V] [TRT] Importing initializer: blocks.1.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.1.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.1.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.1.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.2.norm1.weight
[V] [TRT] Importing initializer: blocks.2.norm1.bias
[V] [TRT] Importing initializer: blocks.2.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.2.attn.proj.weight
[V] [TRT] Importing initializer: blocks.2.attn.proj.bias
[V] [TRT] Importing initializer: blocks.2.norm2.weight
[V] [TRT] Importing initializer: blocks.2.norm2.bias
[V] [TRT] Importing initializer: blocks.2.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.2.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.2.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.2.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.3.norm1.weight
[V] [TRT] Importing initializer: blocks.3.norm1.bias
[V] [TRT] Importing initializer: blocks.3.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.3.attn.proj.weight
[V] [TRT] Importing initializer: blocks.3.attn.proj.bias
[V] [TRT] Importing initializer: blocks.3.norm2.weight
[V] [TRT] Importing initializer: blocks.3.norm2.bias
[V] [TRT] Importing initializer: blocks.3.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.3.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.3.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.3.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.4.norm1.weight
[V] [TRT] Importing initializer: blocks.4.norm1.bias
[V] [TRT] Importing initializer: blocks.4.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.4.attn.proj.weight
[V] [TRT] Importing initializer: blocks.4.attn.proj.bias
[V] [TRT] Importing initializer: blocks.4.norm2.weight
[V] [TRT] Importing initializer: blocks.4.norm2.bias
[V] [TRT] Importing initializer: blocks.4.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.4.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.4.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.4.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.5.norm1.weight
[V] [TRT] Importing initializer: blocks.5.norm1.bias
[V] [TRT] Importing initializer: blocks.5.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.5.attn.proj.weight
[V] [TRT] Importing initializer: blocks.5.attn.proj.bias
[V] [TRT] Importing initializer: blocks.5.norm2.weight
[V] [TRT] Importing initializer: blocks.5.norm2.bias
[V] [TRT] Importing initializer: blocks.5.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.5.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.5.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.5.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.6.norm1.weight
[V] [TRT] Importing initializer: blocks.6.norm1.bias
[V] [TRT] Importing initializer: blocks.6.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.6.attn.proj.weight
[V] [TRT] Importing initializer: blocks.6.attn.proj.bias
[V] [TRT] Importing initializer: blocks.6.norm2.weight
[V] [TRT] Importing initializer: blocks.6.norm2.bias
[V] [TRT] Importing initializer: blocks.6.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.6.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.6.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.6.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.7.norm1.weight
[V] [TRT] Importing initializer: blocks.7.norm1.bias
[V] [TRT] Importing initializer: blocks.7.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.7.attn.proj.weight
[V] [TRT] Importing initializer: blocks.7.attn.proj.bias
[V] [TRT] Importing initializer: blocks.7.norm2.weight
[V] [TRT] Importing initializer: blocks.7.norm2.bias
[V] [TRT] Importing initializer: blocks.7.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.7.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.7.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.7.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.8.norm1.weight
[V] [TRT] Importing initializer: blocks.8.norm1.bias
[V] [TRT] Importing initializer: blocks.8.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.8.attn.proj.weight
[V] [TRT] Importing initializer: blocks.8.attn.proj.bias
[V] [TRT] Importing initializer: blocks.8.norm2.weight
[V] [TRT] Importing initializer: blocks.8.norm2.bias
[V] [TRT] Importing initializer: blocks.8.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.8.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.8.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.8.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.9.norm1.weight
[V] [TRT] Importing initializer: blocks.9.norm1.bias
[V] [TRT] Importing initializer: blocks.9.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.9.attn.proj.weight
[V] [TRT] Importing initializer: blocks.9.attn.proj.bias
[V] [TRT] Importing initializer: blocks.9.norm2.weight
[V] [TRT] Importing initializer: blocks.9.norm2.bias
[V] [TRT] Importing initializer: blocks.9.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.9.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.9.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.9.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.10.norm1.weight
[V] [TRT] Importing initializer: blocks.10.norm1.bias
[V] [TRT] Importing initializer: blocks.10.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.10.attn.proj.weight
[V] [TRT] Importing initializer: blocks.10.attn.proj.bias
[V] [TRT] Importing initializer: blocks.10.norm2.weight
[V] [TRT] Importing initializer: blocks.10.norm2.bias
[V] [TRT] Importing initializer: blocks.10.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.10.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.10.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.10.mlp.fc2.bias
[V] [TRT] Importing initializer: blocks.11.norm1.weight
[V] [TRT] Importing initializer: blocks.11.norm1.bias
[V] [TRT] Importing initializer: blocks.11.attn.qkv.weight
[V] [TRT] Importing initializer: blocks.11.attn.proj.weight
[V] [TRT] Importing initializer: blocks.11.attn.proj.bias
[V] [TRT] Importing initializer: blocks.11.norm2.weight
[V] [TRT] Importing initializer: blocks.11.norm2.bias
[V] [TRT] Importing initializer: blocks.11.mlp.fc1.weight
[V] [TRT] Importing initializer: blocks.11.mlp.fc1.bias
[V] [TRT] Importing initializer: blocks.11.mlp.fc2.weight
[V] [TRT] Importing initializer: blocks.11.mlp.fc2.bias
[V] [TRT] Importing initializer: norm.weight
[V] [TRT] Importing initializer: norm.bias
[V] [TRT] Importing initializer: feature.0.weight
[V] [TRT] Importing initializer: feature.1.weight
[V] [TRT] Importing initializer: feature.1.bias
[V] [TRT] Importing initializer: feature.1.running_mean
[V] [TRT] Importing initializer: feature.1.running_var
[V] [TRT] Importing initializer: feature.2.weight
[V] [TRT] Importing initializer: feature.3.weight
[V] [TRT] Importing initializer: feature.3.bias
[V] [TRT] Importing initializer: feature.3.running_mean
[V] [TRT] Importing initializer: feature.3.running_var
[V] [TRT] Parsing node: unicom_vision_transformer_PatchEmbedding_patch_embed_1_1 [unicom_vision_transformer_PatchEmbedding_patch_embed_1]
[V] [TRT] Searching for input: l_x_
[V] [TRT] Searching for input: patch_embed.proj.weight
[V] [TRT] Searching for input: patch_embed.proj.bias
[V] [TRT] unicom_vision_transformer_PatchEmbedding_patch_embed_1_1 [unicom_vision_transformer_PatchEmbedding_patch_embed_1] inputs: [l_x_ -> (1, 3, 224, 224)[FLOAT]], [patch_embed.proj.weight -> (768, 3, 32, 32)[FLOAT]], [patch_embed.proj.bias -> (768)[FLOAT]],
[I] [TRT] No importer registered for op: unicom_vision_transformer_PatchEmbedding_patch_embed_1. Attempting to import as plugin.
[I] [TRT] Searching for plugin: unicom_vision_transformer_PatchEmbedding_patch_embed_1, plugin_version: 1, plugin_namespace:
[V] [TRT] Local registry did not find unicom_vision_transformer_PatchEmbedding_patch_embed_1 creator. Will try parent registry if enabled.
[V] [TRT] Global registry did not find unicom_vision_transformer_PatchEmbedding_patch_embed_1 creator. Will try parent registry if enabled.
[E] [TRT] 3: getPluginCreator could not find plugin: unicom_vision_transformer_PatchEmbedding_patch_embed_1 version: 1
[E] [TRT] ModelImporter.cpp:768: While parsing node number 0 [unicom_vision_transformer_PatchEmbedding_patch_embed_1 -> "patch_embed_1"]:
[E] [TRT] ModelImporter.cpp:769: --- Begin node ---
[E] [TRT] ModelImporter.cpp:770: input: "l_x_"
input: "patch_embed.proj.weight"
input: "patch_embed.proj.bias"
output: "patch_embed_1"
name: "unicom_vision_transformer_PatchEmbedding_patch_embed_1_1"
op_type: "unicom_vision_transformer_PatchEmbedding_patch_embed_1"
doc_string: ""
domain: "pkg.unicom"
[E] [TRT] ModelImporter.cpp:771: --- End node ---
[E] [TRT] ModelImporter.cpp:773: ERROR: builtin_op_importers.cpp:5403 In function importFallbackPluginImporter:
[8] Assertion failed: creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[E] Failed to parse onnx file
[I] Finished parsing network model. Parse time: 4.99544
[E] Parsing model failed
[E] Failed to create engine from model or file.
[E] Engine set up failed
[E] Failed to parse onnx file
[I] Finished parsing network model. Parse time: 13.1481
[E] Parsing model failed
[E] Failed to create engine from model or file.
[E] Engine set up failed
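The log is long, but the useful part is short: the parser met a node whose op_type it has no importer for, the node carries the domain "pkg.unicom", and no TensorRT plugin of that name is registered. Below is a minimal sketch for pulling those facts out of a saved trtexec log. The function `summarize_parse_failure` and both regular expressions are hypothetical helpers written for illustration, not part of TensorRT or trtexec.

```python
import re

# Matches the parser error line seen in the log; the wording is taken from the log itself.
MISSING_PLUGIN = re.compile(
    r"getPluginCreator could not find plugin: (\S+) version: (\d+)"
)
# Matches the `domain: "..."` field printed inside the node dump.
NODE_DOMAIN = re.compile(r'^\s*domain: "([^"]*)"')


def summarize_parse_failure(log_text):
    """Collect missing plugin (name, version) pairs and node domains from a log."""
    plugins = set()
    domains = set()
    for line in log_text.splitlines():
        plugin_match = MISSING_PLUGIN.search(line)
        if plugin_match:
            plugins.add((plugin_match.group(1), int(plugin_match.group(2))))
        domain_match = NODE_DOMAIN.match(line)
        if domain_match:
            domains.add(domain_match.group(1))
    return sorted(plugins), sorted(domains)


# A three-line excerpt copied from the log.
sample = (
    "[E] [TRT] 3: getPluginCreator could not find plugin: "
    "unicom_vision_transformer_PatchEmbedding_patch_embed_1 version: 1\n"
    'op_type: "unicom_vision_transformer_PatchEmbedding_patch_embed_1"\n'
    'domain: "pkg.unicom"\n'
)
plugins, domains = summarize_parse_failure(sample)
print(plugins)
print(domains)
```

Running it over the full log should yield one plugin name and one domain, which narrows the failure to the first node of the exported graph.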
It looks like the problem comes from the PatchEmbedding class, yet the model does not seem to use any unusual methods or layers that TensorRT cannot convert. Here is the source code of the class:
class PatchEmbedding(nn.Module):
    def __init__(self, input_size=224, patch_size=32, in_channels: int = 3, dim: int = 768):
        super().__init__()
        if isinstance(input_size, int):
            input_size = (input_size, input_size)
        if isinstance(patch_size, int):
            patch_size = (patch_size, patch_size)
        H = input_size[0] // patch_size[0]
        W = input_size[1] // patch_size[1]
        self.num_patches = H * W
        self.proj = nn.Conv2d(
            in_channels, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        x = self.proj(x).flatten(2).transpose(1, 2)
        return x
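For the (1, 3, 224, 224) input used in the export script, this class reduces to a strided convolution followed by a flatten and a transpose. The sketch below works through the shape arithmetic in plain Python, without PyTorch. The helper `patch_embedding_shape` is hypothetical and only mirrors the constructor defaults shown in the class.

```python
def patch_embedding_shape(batch=1, input_size=224, patch_size=32, dim=768):
    """Return the shape PatchEmbedding.forward produces: (batch, num_patches, dim)."""
    if isinstance(input_size, int):
        input_size = (input_size, input_size)
    if isinstance(patch_size, int):
        patch_size = (patch_size, patch_size)
    # Conv2d with kernel_size == stride and no padding yields one output
    # position per whole patch along each axis.
    grid_h = input_size[0] // patch_size[0]
    grid_w = input_size[1] // patch_size[1]
    # Conv output is (batch, dim, grid_h, grid_w); flatten(2) merges the grid
    # into grid_h * grid_w, and transpose(1, 2) moves dim to the last axis.
    return (batch, grid_h * grid_w, dim)


print(patch_embedding_shape())  # (1, 49, 768)
```

So the node that fails to parse wraps nothing more exotic than a convolution and two shape operations.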
What should I do so that the model can be converted to TensorRT?
Thanks
## Environment
**TensorRT Version**: tensorrt_version_8_6_2_3
**GPU Type**: Jetson Orin Nano
**Nvidia Driver Version**:
**CUDA Version**: 12.2
**CUDNN Version**: 8.9.4.25-1+cuda12.2
**Operating System + Version**: Jetpack 6.0
**Python Version (if applicable)**: 3.10
**PyTorch Version (if applicable)**: 2.3.0
**ONNX Version (if applicable)**: 1.16.1
**onnxruntime-gpu Version (if applicable)**: 1.17.0
**onnxscript Version (if applicable)**: 0.1.0.dev20240721
More details here: https://stackoverflow.com/questions/78787534/converting-a-pytorch-onnx-model-to-tensorrt-engine-jetson-orin-nano