Commit 55d8730

Support dynamic batch.
1 parent d0f7150 commit 55d8730

18 files changed

Lines changed: 91 additions & 57 deletions

README.md

Lines changed: 8 additions & 4 deletions
@@ -35,6 +35,7 @@ This repo use TensorRT-8.x to deploy well-trained models, both image preprocessi
 + 2023.05.16 🚀 Support cuda box postprocess.
 + 2023.05.19 🚀 Support cuda mask postprocess and support rtdetr.
 + 2023.05.21 🚀 Support yolov6.
++ 2023.05.26 🚀 Support dynamic batch inference.
 </details>
 
 ## 3.Support Models
@@ -57,13 +58,13 @@ All speed tests were performed on RTX 3090 with COCO Val set.The time calculated
 | Models | BatchSize | Mode | Resolution | FPS |
 |-|-|:-:|:-:|:-:|
 | YOLOv5-s v7.0 | 1 | FP32 | 640x640 | 200 |
-| YOLOv5-s v7.0 | 32 | FP32 | 640x640 | - |
+| YOLOv5-s v7.0 | 32 | FP32 | 640x640 | 246 |
 | YOLOv5-seg-s v7.0 | 1 | FP32 | 640x640 | 155 |
 | YOLOv6-s v3 | 1 | FP32 | 640x640 | 163 |
 | YOLOv7 | 1 | FP32 | 640x640 | 107 |
 | YOLOv8-s | 1 | FP32 | 640x640 | 171 |
 | YOLOv8-seg-s | 1 | FP32 | 640x640 | 122 |
-| RT-DETR | 1 | FP32 | 640x640 | - |
+| RT-DETR | 1 | FP32 | 640x640 | 106 |
 </div>
 
@@ -97,8 +98,10 @@ mkdir build && cd build
 cmake ..
 make -j$(nproc)
 ```
-4. Download the TRT engine or ONNX model and put them in `weights/MODEL_NAME`. Then modify the configuration file in `configs`.
-
+4. Get the ONNX model from the official repository and put them in `weights/MODEL_NAME`. Then modify the configuration file in `configs`.Take yolov5 as an example:
+```
+python export.py --weights=yolov5s.pt --dynamic --simplify --include=onnx --opset 11
+```
 5. The executable file will be generated in `bin` in the repo directory if compile successfully.Then enjoy yourself with command like this:
 ```
 cd bin
@@ -107,6 +110,7 @@ cd bin
 
 > Notes:
 > 1. The output of the model is required for post-processing is num_bboxes (imageHeight x image Width) x num_pred(num_cls + coordinates + confidence),while the output of YOLOv8 is num_pred x num_bboxes,which means the predicted values of the same box are not contiguous in memory.For convenience, the corresponding dimensions of the original pytorch output need to be transposed when exporting to ONNX model.
+> 2. The dynamic shape engine is convenient but sacrifices some inference speed compared with the static model of the same batchsize.Therefore, if you want to pursue faster inference speed, it is better to export the ONNX model of fixed batchsize, such as batchsize 32.
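
Note 1 in the README hunk describes a layout mismatch: post-processing expects `num_bboxes x num_pred`, while YOLOv8 emits `num_pred x num_bboxes`. According to the note, the repository resolves this by transposing at ONNX export time. The C++ sketch below only illustrates what that transpose does to a row-major buffer; `transposePredictions` is a hypothetical helper and is not part of this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repository): transpose a row-major
// [numPred x numBoxes] buffer into [numBoxes x numPred], so that all
// predicted values belonging to one box end up contiguous in memory.
std::vector<float> transposePredictions(const std::vector<float>& src,
                                        std::size_t numPred,
                                        std::size_t numBoxes) {
    assert(src.size() == numPred * numBoxes);
    std::vector<float> dst(src.size());
    for (std::size_t p = 0; p < numPred; ++p) {
        for (std::size_t b = 0; b < numBoxes; ++b) {
            // Element (p, b) of the source becomes element (b, p) of the result.
            dst[b * numPred + p] = src[p * numBoxes + b];
        }
    }
    return dst;
}
```

After the transpose, box `b` occupies the index range `[b * numPred, (b + 1) * numPred)`, which is the contiguous layout the note asks for.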
configs/rtdetr.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ rtdetr:
 engine_file: "../weights/rtdetr/rtdetr_hgnetl.trt"
 type: "coco"
 mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 640
 imageHeight: 640

configs/yolov5-seg.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ yolov5-seg:
 engine_file: "../weights/yolov5/yolov5s-seg.trt"
 type: "coco"
 mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 640
 imageHeight: 640

configs/yolov5.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ yolov5:
 engine_file: "../weights/yolov5/yolov5s.trt"
 type: "coco"
 mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 640
 imageHeight: 640

configs/yolov6.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ yolov6:
 engine_file: "../weights/yolov6/yolov6s.trt"
 type: "coco"
 mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 640
 imageHeight: 640

configs/yolov7-p6.yaml

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@ yolov7:
 onnx_file: "../weights/yolov7/yolov7-w6.onnx"
 engine_file: "../weights/yolov7/yolov7-w6.trt"
 type: "coco"
-mode: "fp32"
+mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 1280
 imageHeight: 1280

configs/yolov7.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ yolov7:
 engine_file: "../weights/yolov7/yolov7.trt"
 type: "coco"
 mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 640
 imageHeight: 640

configs/yolov8-seg.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ yolov8-seg:
 engine_file: "../weights/yolov8/yolov8s-seg.trt"
 type: "coco"
 mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 640
 imageHeight: 640

configs/yolov8.yaml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@ yolov8:
 engine_file: "../weights/yolov8/yolov8s.trt"
 type: "coco"
 mode: "fp32"
+dynamic: 1
 batchSize: 1
 imageWidth: 640
 imageHeight: 640
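
Every config touched by this commit gains a `dynamic: 1` key next to `batchSize`. The code that consumes the flag is not shown in this part of the diff, so the following is only one plausible reading, sketched under that assumption: a dynamic engine is built with a TensorRT optimization profile (which needs a minimum, optimum and maximum shape) whose batch dimension ranges from 1 up to `batchSize`, while a static engine pins all three to `batchSize`. `BatchRange` and `batchRangeFromConfig` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical type (not from the repository): the batch dimension of the
// min / opt / max shapes that an optimization profile would be given.
struct BatchRange {
    int min;
    int opt;
    int max;
};

// Assumption: `dynamic` and `batchSize` are the integer values read from the
// YAML config. A non-zero `dynamic` allows any batch from 1 to batchSize;
// otherwise the batch dimension is fixed at batchSize.
BatchRange batchRangeFromConfig(int dynamic, int batchSize) {
    assert(batchSize > 0);
    if (dynamic != 0) {
        return BatchRange{1, batchSize, batchSize};
    }
    return BatchRange{batchSize, batchSize, batchSize};
}
```

Choosing `opt == max` here is a guess that favours throughput at the largest batch; a real build may prefer a different optimum.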

include/basemodel.h

Lines changed: 3 additions & 3 deletions
@@ -19,14 +19,14 @@ class Model {
 std::string onnx_file;
 std::string engine_file;
 std::string mode;
-std::vector<AffineMatrix> dst2src;
+int dynamic;
 int batchSize;
 int imageWidth;
 int imageHeight;
-std::string names[10];
-float** cpu_buffers = new float* [10];
+float* cpu_buffer;
 float* gpu_buffers[10]{};
 std::vector<int64_t> bufferSize;
+std::vector<AffineMatrix> dst2src;
 std::shared_ptr<nvinfer1::ICudaEngine> engine;
 std::unique_ptr<nvinfer1::IExecutionContext> context;
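
The header change adds a `dynamic` member, keeps `bufferSize`, and replaces the array of host buffers with a single `cpu_buffer`. How the sizes are computed is not visible in this hunk. With a dynamic batch axis, TensorRT reports that dimension as -1, so a size computation has to substitute a concrete batch size first. The sketch below shows that step in isolation; `bindingVolume` is a hypothetical helper, and plain `std::vector<std::int64_t>` stands in for TensorRT's dimension type.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the repository): number of elements in one
// binding. A dimension of -1 marks the dynamic axis and is replaced by
// batchSize, so buffers can be allocated once for the largest batch.
std::int64_t bindingVolume(const std::vector<std::int64_t>& dims,
                           std::int64_t batchSize) {
    assert(batchSize > 0);
    std::int64_t volume = 1;
    for (std::int64_t d : dims) {
        volume *= (d == -1) ? batchSize : d;
    }
    return volume;
}
```

Multiplying the result by `sizeof(float)` would give a byte count for an FP32 binding; that allocation step is left out here because the commit does not show it.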