./build/examples/multimodal/action_build --onnxDir $WORKSPACE_DIR/$MODEL_NAME/onnx/action --engineDir $WORKSPACE_DIR/$MODEL_NAME/engines --maxBatchSize 6
[15:02:57.185] [INFO] [trtUtils.h:62:loadEdgellmPluginLib] EDGELLM_PLUGIN_PATH: /home/agx/TensorRT-Edge-LLM/build/libNvInfer_edgellm_plugin.so
[15:02:57.188] [WARNING] [version.cpp:79:checkVersion] Model does not have edgellm_version. Current runtime version: 0.7.1
[15:02:57.475] [INFO] [TensorRT] [MemUsageChange] Init CUDA: CPU +21, GPU +0, now: CPU 29, GPU 101073 (MiB)
[15:02:58.736] [INFO] [TensorRT] [MemUsageChange] Init builder kernel library: CPU +1228, GPU +1234, now: CPU 1459, GPU 102512 (MiB)
[15:02:58.744] [INFO] [TensorRT] ----------------------------------------------------------------
[15:02:58.744] [INFO] [TensorRT] Input filename: /home/agx/tensorrt-edgellm-workspace/Alpamayo-R1-10B/onnx/action/model.onnx
[15:02:58.744] [INFO] [TensorRT] ONNX IR version: 0.0.10
[15:02:58.744] [INFO] [TensorRT] Opset version: 24
[15:02:58.744] [INFO] [TensorRT] Producer name: pytorch
[15:02:58.744] [INFO] [TensorRT] Producer version: 2.10.0+cu128
[15:02:58.744] [INFO] [TensorRT] Domain:
[15:02:58.744] [INFO] [TensorRT] Model version: 0
[15:02:58.744] [INFO] [TensorRT] Doc string:
[15:02:58.744] [INFO] [TensorRT] ----------------------------------------------------------------
[15:02:58.751] [INFO] [TensorRT] Searching for plugin wth node domain namespace:
[15:02:58.751] [INFO] [TensorRT] Searching for plugin wth node domain namespace:
...
[15:02:58.754] [ERROR] [TensorRT] ModelImporter.cpp:135: While parsing node number 126 [RotaryEmbedding -> "rope_onnx"]:
[15:02:58.754] [ERROR] [TensorRT] ModelImporter.cpp:138: --- Begin node ---
input: "mul_198"
input: "_to_copy_5"
input: "_to_copy_6"
input: "attention_pos_id"
output: "rope_onnx"
name: "n0"
op_type: "RotaryEmbedding"
domain: "trt"
metadata_props {
key: "namespace"
value: ": llm_loader.models.alpamayo.modeling_alpamayo_action.AlpamayoAction/expert: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionExpert/expert.layers.0: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionDecoderLayer/expert.layers.0.self_attn: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionAttention/rope_onnx: trt.rope_onnx.default"
}
metadata_props {
key: "pkg.torch.onnx.class_hierarchy"
value: "[\'llm_loader.models.alpamayo.modeling_alpamayo_action.AlpamayoAction\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionExpert\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionDecoderLayer\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionAttention\', \'trt.rope_onnx.default\']"
}
metadata_props {
key: "pkg.torch.onnx.fx_node"
value: "%rope_onnx : [num_users=1] = call_function[target=torch.ops.trt.rope_onnx.default](args = (%mul_198, %_to_copy_5, %_to_copy_6, %attention_pos_id), kwargs = {})"
}
metadata_props {
key: "pkg.torch.onnx.name_scopes"
value: "[\'\', \'expert\', \'expert.layers.0\', \'expert.layers.0.self_attn\', \'rope_onnx\']"
}
metadata_props {
key: "pkg.torch.onnx.stack_trace"
value: "File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 401, in forward\n hidden, present_ks, present_vs = self.expert(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 318, in forward\n hidden_states, pk, pv = layer(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 271, in forward\n hidden_states, present_k, present_v = self.self_attn(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 208, in forward\n q = rope_onnx(q.to(compute_type), rope_cos, rope_sin, position_ids)"
}
...
[15:02:58.754] [ERROR] [TensorRT] ModelImporter.cpp:139: --- End node ---
[15:02:58.754] [ERROR] [TensorRT] ModelImporter.cpp:141: ERROR: onnxOpCheckers.cpp:837 In function checkFallbackPluginImporter:
[6] creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[15:02:58.754] [ERROR] [TensorRT] ModelImporter.cpp:135: While parsing node number 129 [TensorScatter -> "present_k_cache_0"]:
[15:02:58.754] [ERROR] [TensorRT] ModelImporter.cpp:138: --- Begin node ---
input: "k_cache_0"
input: "rope_onnx_1"
input: "kvcache_start_index"
output: "present_k_cache_0"
name: "n0_3"
op_type: "TensorScatter"
domain: "trt"
metadata_props {
key: "namespace"
value: ": llm_loader.models.alpamayo.modeling_alpamayo_action.AlpamayoAction/expert: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionExpert/expert.layers.0: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionDecoderLayer/expert.layers.0.self_attn: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionAttention/kv_cache_update_onnx: trt.kv_cache_update_onnx.default"
}
metadata_props {
key: "pkg.torch.onnx.class_hierarchy"
value: "[\'llm_loader.models.alpamayo.modeling_alpamayo_action.AlpamayoAction\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionExpert\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionDecoderLayer\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionAttention\', \'trt.kv_cache_update_onnx.default\']"
}
metadata_props {
key: "pkg.torch.onnx.fx_node"
value: "%kv_cache_update_onnx : [num_users=2] = call_function[target=torch.ops.trt.kv_cache_update_onnx.default](args = (%cache_tensors_0, %rope_onnx_1, %kvcache_start_index), kwargs = {})"
}
metadata_props {
key: "pkg.torch.onnx.name_scopes"
value: "[\'\', \'expert\', \'expert.layers.0\', \'expert.layers.0.self_attn\', \'kv_cache_update_onnx\']"
}
metadata_props {
key: "pkg.torch.onnx.stack_trace"
value: "File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 401, in forward\n hidden, present_ks, present_vs = self.expert(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 318, in forward\n hidden_states, pk, pv = layer(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 271, in forward\n hidden_states, present_k, present_v = self.self_attn(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 219, in forward\n present_k = kv_cache_update_onnx(k_cache, k, kvcache_start_index)"
}
...
[15:02:58.755] [ERROR] [TensorRT] ModelImporter.cpp:139: --- End node ---
[15:02:58.755] [ERROR] [TensorRT] ModelImporter.cpp:141: ERROR: onnxOpCheckers.cpp:837 In function checkFallbackPluginImporter:
[6] creator && "Plugin not found, are the plugin name, version, and namespace correct?"
[15:02:58.755] [ERROR] [TensorRT] ModelImporter.cpp:135: While parsing node number 131 [Attention -> "attention_onnx"]:
[15:02:58.755] [ERROR] [TensorRT] ModelImporter.cpp:138: --- Begin node ---
input: "mul_223"
input: "present_k_cache_0"
input: "present_v_cache_0"
input: "masked_fill"
output: "attention_onnx"
name: "n0_5"
op_type: "Attention"
attribute {
name: "is_causal"
i: 0
type: INT
}
attribute {
name: "TRT_decomposable"
i: 1
type: INT
}
attribute {
name: "scale"
f: 1
type: FLOAT
}
domain: "trt"
metadata_props {
key: "namespace"
value: ": llm_loader.models.alpamayo.modeling_alpamayo_action.AlpamayoAction/expert: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionExpert/expert.layers.0: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionDecoderLayer/expert.layers.0.self_attn: llm_loader.models.alpamayo.modeling_alpamayo_action.ActionAttention/attention_onnx: trt.attention_onnx.default"
}
metadata_props {
key: "pkg.torch.onnx.class_hierarchy"
value: "[\'llm_loader.models.alpamayo.modeling_alpamayo_action.AlpamayoAction\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionExpert\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionDecoderLayer\', \'llm_loader.models.alpamayo.modeling_alpamayo_action.ActionAttention\', \'trt.attention_onnx.default\']"
}
metadata_props {
key: "pkg.torch.onnx.fx_node"
value: "%attention_onnx : [num_users=1] = call_function[target=torch.ops.trt.attention_onnx.default](args = (%mul_223, %kv_cache_update_onnx, %kv_cache_update_onnx_1, %masked_fill, False, 1.0), kwargs = {})"
}
metadata_props {
key: "pkg.torch.onnx.name_scopes"
value: "[\'\', \'expert\', \'expert.layers.0\', \'expert.layers.0.self_attn\', \'attention_onnx\']"
}
metadata_props {
key: "pkg.torch.onnx.stack_trace"
value: "File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 401, in forward\n hidden, present_ks, present_vs = self.expert(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 318, in forward\n hidden_states, pk, pv = layer(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 271, in forward\n hidden_states, present_k, present_v = self.self_attn(\n File \"/home/user/TensorRT-Edge-LLM/experimental/llm_loader/models/alpamayo/modeling_alpamayo_action.py\", line 223, in forward\n attn_output = attention_onnx("
}
...
It seems that the op_types "Attention", "TensorScatter", and "RotaryEmbedding" are not registered correctly as plugins?
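One way to test this suspicion is to ask the TensorRT plugin registry which creators it knows after loading the plugin library named in EDGELLM_PLUGIN_PATH. The following is only a rough sketch, not something taken from the TensorRT-Edge-LLM code base: the helper names `missing_ops` and `registered_plugin_names` are made up for illustration, and it compares creators by name only, whereas the parser error above says version and namespace matter as well.

```python
import os

# Op types the ONNX parser reported as missing in the log above.
REQUIRED_OPS = ["RotaryEmbedding", "TensorScatter", "Attention"]


def missing_ops(required, registered_names):
    """Return the required op types that have no creator of the same name.

    Assumption: matching on the name alone is enough for a first check;
    plugin version and namespace are ignored here.
    """
    registered = set(registered_names)
    return [op for op in required if op not in registered]


def registered_plugin_names(plugin_lib=None):
    """List plugin creator names known to the TensorRT plugin registry.

    Needs the tensorrt Python package; it is imported lazily so that
    missing_ops() stays usable on machines without TensorRT.
    """
    import tensorrt as trt

    registry = trt.get_plugin_registry()
    if plugin_lib:
        # Load the shared library so its creators register themselves.
        registry.load_library(plugin_lib)
    # all_creators is available in TensorRT 10+; fall back to the older list.
    creators = getattr(registry, "all_creators", None) or registry.plugin_creator_list
    return [creator.name for creator in creators]


# Example, to be run on the target device:
#   names = registered_plugin_names(os.environ.get("EDGELLM_PLUGIN_PATH"))
#   print(missing_ops(REQUIRED_OPS, names))
```

If the printed list is empty, the names are registered and the mismatch is more likely in the version or namespace; if it lists the three ops, the loaded library does not provide creators under those names.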
trtexec -h
...
&&&& PASSED TensorRT.trtexec [TensorRT v101303] [b9] # trtexec
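For readers who want the dotted version rather than the packed tag, here is a small sketch that decodes it. It assumes the TensorRT 10 packing of major * 10000 + minor * 100 + patch; the function name `decode_trt_version` is made up for illustration.

```python
def decode_trt_version(tag):
    """Split a tag such as 'v101303' into (major, minor, patch).

    Assumption: the number is packed as major * 10000 + minor * 100 + patch.
    """
    number = int(tag.lstrip("v"))
    major, rest = divmod(number, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


print(decode_trt_version("v101303"))  # (10, 13, 3)
```

Under that assumption the tag printed by trtexec corresponds to TensorRT 10.13.3.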
Building the engine of Alpamayo-R1-10B's action head leads to the error above.
TRT version is: v101303 (see the trtexec output above).