Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4 (LLM part) failed to convert to TensorRT Engine

## Describe the bug
Converting the LLM part of Nemotron Omni from ONNX to a TensorRT engine on the target platform fails with the following error:
```text
...
[07:48:03.497] [INFO] [TensorRT] Successfully created plugin: Nvfp4MoePlugin
[07:48:03.516] [INFO] [llmBuilder.cpp:118:build] Created directory Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4/omni_engine_nvfp4.trtedgellm070/llm for saving LLM engine.
[07:48:03.546] [ERROR] [TensorRT] IBuilder::buildSerializedNetwork: Error Code 9: API Usage Error (INT8 and FP8 mixed precision is allowed only when building network with kSTRONGLY_TYPED mode on Blackwell+ platforms.)
[07:48:03.546] [ERROR] [builderUtils.cpp:313:buildAndSerializeEngine] Failed to build serialized engine
[07:48:04.430] [ERROR] [llm_build.cpp:242:main] Failed to build LLM engine.
```

### Steps/Code to reproduce bug
1. Download the model from https://huggingface.co/nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4
2. Change `model_type` in `config.json` from `NemotronH_Nano_Omni_Reasoning_V3` to `NemotronH_Nano_VL_V2`
3. Convert the model to ONNX using the experimental `llm_loader`
4. Copy the model and the cross-compiled TensorRT Edge-LLM binaries to the NVIDIA Drive Thor U (DriveOS 7.0.3)
5. Set `/proc/sys/vm/nr_hugepages` to `16384`
6. Run `./build/examples/llm/llm_build --onnxDir Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4/omni_onnx_nvfp4.trtedgellm070/llm/ --engineDir Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4/omni_engine_nvfp4.trtedgellm070/llm`
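
For anyone reproducing this, step 2 can be scripted instead of edited by hand. The snippet below is only a minimal sketch: `patch_model_type` is a hypothetical helper name, and it assumes `model_type` is a top-level key in `config.json`.

```python
import json
from pathlib import Path

# Values taken from step 2 above.
OLD_TYPE = "NemotronH_Nano_Omni_Reasoning_V3"
NEW_TYPE = "NemotronH_Nano_VL_V2"


def patch_model_type(config_path, old=OLD_TYPE, new=NEW_TYPE):
    """Rewrite the top-level `model_type` in config.json.

    Hypothetical helper: returns True if the file was changed and False
    if `model_type` did not match `old` (the file is then left untouched).
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("model_type") != old:
        return False
    config["model_type"] = new
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

Calling `patch_model_type("Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4/config.json")` before the ONNX export applies the same change as step 2. Running it a second time returns `False` and leaves the file as it is.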

**Build configuration:**
```bash
cmake .. \
    -DCMAKE_BUILD_TYPE=Release \
    -DTRT_PACKAGE_DIR=/usr \
    -DCMAKE_TOOLCHAIN_FILE=cmake/aarch64_linux_toolchain.cmake \
    -DEMBEDDED_TARGET=auto-thor \
    -DCUDA_CTK_VERSION=12.8
```

### Expected behavior
The LLM engine build should complete, and I should be able to run this Omni model with `llm_inference`.

## System information (Edge Device)

- Platform (e.g., NVIDIA Jetson Thor): Drive Thor U
- Software release (e.g., JetPack 7.1): DriveOS 7.0.3
- CPU architecture: aarch64
- GPU compute capability (e.g., SM110 for Jetson Thor): SM101
- Total device memory: 59045MB
- Build type (e.g., Release, Debug): Release
- Library versions:
  - TensorRT Edge-LLM version or commit hash: 0.7.0
  - CUDA: 12.8
  - TensorRT: 10.10
  - C++ compiler (e.g., GCC 11.4): GCC 13.3 (x86 docker dev env)

