Commit 7c83f2b

update readme
1 parent 2f5de0c commit 7c83f2b

7 files changed

Lines changed: 73 additions & 61 deletions

README.md

Lines changed: 42 additions & 25 deletions
@@ -13,39 +13,48 @@
 </div>

 ## 1.Introduction
-This repo use TensorRT-8.x to deploy well-trained models.
-
+This repo uses TensorRT-8.x to deploy well-trained models. Both image preprocessing and postprocessing are performed with CUDA, which enables high-speed inference.
 ## 2.Update
+<details open>
+<summary>update process</summary>

-- [x] [YOLOv5](https://github.com/ultralytics/yolov5) (sd)
-- [x] [YOLOv5-seg](https://github.com/ultralytics/yolov5)
-- [x] [YOLOv7](https://github.com/WongKinYiu/yolov7)
-- [x] [YOLOv8](https://github.com/ultralytics/ultralytics)
-- [x] [YOLOv8-seg](https://github.com/ultralytics/ultralytics)
-
-
++ 2023.05.01 🚀 Create the repo.
++ 2023.05.03 🚀 Support yolov5 detection.
++ 2023.05.05 🚀 Support yolov7 and yolov5 instance-segmentation.
++ 2023.05.10 🚀 Support yolov8 detection and instance-segmentation.
++ 2023.05.12 🚀 Support cuda preprocess for speed up.
++ 2023.05.16 🚀 Support cuda box postprocess.
++ 2023.05.19 🚀 Support cuda mask postprocess and support rtdetr.
+</details>

 ## 3.Support Models
-All speed tests were performed on RTX 3090 with COCO Val set.The time calculated here is the sum of the time of image preprocess, inference and postprocess, since image loading and visualizing are not counted in, the actual spedd will be a little slower.
+<details open>
+<summary>supported models</summary>
+- [x] [YOLOv5](https://github.com/ultralytics/yolov5)<br>
+- [x] [YOLOv5-seg](https://github.com/ultralytics/yolov5)<br>
+- [x] [YOLOv7](https://github.com/WongKinYiu/yolov7)<br>
+- [x] [YOLOv8](https://github.com/ultralytics/ultralytics)<br>
+- [x] [YOLOv8-seg](https://github.com/ultralytics/ultralytics)<br>
+- [x] [RT-DETR](https://github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rtdetr)<br>
+- [ ] [YOLOv6](https://github.com/meituan/YOLOv6) (to be continued)<br>
+- [ ] [YOLO-NAS](https://github.com/Deci-AI/super-gradients) (to be continued)<br>
+</details>
+
+All speed tests were performed on an RTX 3090 with the COCO val set. The reported time is the sum of image loading, preprocessing, inference and postprocessing, so the speeds are slower than those reported in the papers.
+<div align='center'>

-| Models | BatchSize | Mode | Input Shape(HxW) | FPS* | FPS |
-|-|-|:-:|:-:|:-:|:-:|
-| YOLOv5-n v7.0 | 1 | FP32 | 640x640 | 724 |
+| Models | BatchSize | Mode | Input Shape(HxW) | FPS |
+|-|-|:-:|:-:|:-:|
 | YOLOv5-s v7.0 | 1 | FP32 | 640x640 | 468 |
 | YOLOv5-s v7.0 | 32 | FP32 | 640x640 | - |
-| YOLOv5-m v7.0 | 1 | FP32 | 640x640 | 270 |
-| YOLOv5-l v7.0 | 1 | FP32 | 640x640 | 151 |
-| YOLOv5-x v7.0 | 1 | FP32 | 640x640 | 94 |
+| YOLOv5-seg-s v7.0 | 1 | FP32 | 640x640 | - |
 | YOLOv7 | 1 | FP32 | 640x640 | 154 |
-| YOLOv7x | 1 | FP32 | 640x640 | - | - |
-| YOLOv8-n | 1 | FP32 | 640x640 | 390 | 127 |
-| YOLOv8-s | 1 | FP32 | 640x640 | 171 | 101 |
-| YOLOv8-m | 1 | FP32 | 640x640 | 122 |
-| YOLOv8-l | 1 | FP32 | 640x640 | 88 |
-| YOLOv8-x | 1 | FP32 | 640x640 | 68 |
-| RT-DETR | 1 | FP32 | 640x640 | - | - |
-| RT-DETR | 1 | FP32 | 640x640 | - | - |
-+ FPS* means that the time of image loading, image processing and visualization are taken into account when calculating.FPS only counts image processing time(preprocess, inference, postprocess).
+| YOLOv8-s | 1 | FP32 | 640x640 | 171 |
+| YOLOv8-s | 1 | FP32 | 640x640 | - |
+| RT-DETR | 1 | FP32 | 640x640 | - |
+| RT-DETR | 1 | FP32 | 640x640 | - |
+</div>
+

 ## 4.Usage
 1. Clone the repo.
@@ -65,4 +74,12 @@ cd bin
 ./object_detection yolov5 /path/to/input/dir
 ```

+## 5.Reference
+[0].https://github.com/NVIDIA/TensorRT<br>
+[1].https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#c_topics<br>
+[2].https://github.com/linghu8812/tensorrt_inference<br>
+[3].https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#<br>
+[4].https://blog.csdn.net/bobchen1017?type=blog<br>
+
+
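
The updated README describes the reported time as the sum of image loading, preprocess, inference and postprocess. A minimal sketch of how an FPS figure could be derived from such per-stage latencies follows. `StageTimes` and `fps_from_stage_times` are hypothetical names for illustration and do not appear in the repo.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for per-image stage latencies, in milliseconds.
struct StageTimes {
    double load_ms;
    double preprocess_ms;
    double inference_ms;
    double postprocess_ms;
};

// Frames per second when the four stages run back to back for each image.
// Returns 0 when no time was measured, to avoid dividing by zero.
inline double fps_from_stage_times(const StageTimes& t) {
    const double total_ms =
        t.load_ms + t.preprocess_ms + t.inference_ms + t.postprocess_ms;
    return total_ms > 0.0 ? 1000.0 / total_ms : 0.0;
}
```

Because every stage is in the denominator, counting image loading lowers the reported FPS relative to a figure that counts only preprocess, inference and postprocess.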

include/build.h

Lines changed: 1 addition & 3 deletions
@@ -2,13 +2,11 @@
 #define BUILD_H


-// #include "Swin-Transformer.h"
 #include "yolov5.h"
-// #include "YOLOv6.h"
+// #include "yolov6.h"
 #include "yolov7.h"
 #include "yolov8.h"
 #include "rtdetr.h"

 std::shared_ptr<Model> build_model(std::string model_arch, std::string cfg);
-// char **argv
 #endif

include/detection.h

Lines changed: 21 additions & 19 deletions
@@ -17,32 +17,34 @@ namespace Category {
 };
 const std::vector<std::string> voc = {
 "aeroplane","bicycle","bird","boat","bottle","bus","car","cat","chair","cow","diningtable",
-"dog","horse","motorbike","person","pottedplant","sheep","sofa","train","tvmonitor"
+"dog","horse","motorbike","person","pottedplant","sheep","sofa","train","tv/monitor"
 };
 }

 namespace Color {
 const std::vector<cv::Scalar> coco {
-cv::Scalar(128, 77, 207),cv::Scalar(65, 32, 208),cv::Scalar(0, 224, 45),cv::Scalar(3, 141, 219),cv::Scalar(80, 239, 253),cv::Scalar(239, 184, 12),
-cv::Scalar(7, 144, 145),cv::Scalar(161, 88, 57),cv::Scalar(0, 166, 46),cv::Scalar(218, 113, 53),cv::Scalar(193, 33, 128),cv::Scalar(190, 94, 113),
-cv::Scalar(113, 123, 232),cv::Scalar(69, 205, 80),cv::Scalar(18, 170, 49),cv::Scalar(89, 51, 241),cv::Scalar(153, 191, 154),cv::Scalar(27, 26, 69),
-cv::Scalar(20, 186, 194),cv::Scalar(210, 202, 167),cv::Scalar(196, 113, 204),cv::Scalar(9, 81, 88),cv::Scalar(191, 162, 67),cv::Scalar(227, 73, 120),
-cv::Scalar(177, 31, 19),cv::Scalar(133, 102, 137),cv::Scalar(146, 72, 97),cv::Scalar(145, 243, 208),cv::Scalar(2, 184, 176),cv::Scalar(219, 220, 93),
-cv::Scalar(238, 253, 234),cv::Scalar(197, 169, 160),cv::Scalar(204, 201, 106),cv::Scalar(13, 24, 129),cv::Scalar(40, 38, 4),cv::Scalar(5, 41, 34),
-cv::Scalar(46, 94, 129),cv::Scalar(102, 65, 107),cv::Scalar(27, 11, 208),cv::Scalar(191, 240, 183),cv::Scalar(225, 76, 38),cv::Scalar(193, 89, 124),
-cv::Scalar(30, 14, 175),cv::Scalar(144, 96, 90),cv::Scalar(181, 186, 86),cv::Scalar(102, 136, 34),cv::Scalar(158, 71, 15),cv::Scalar(183, 81, 247),
-cv::Scalar(73, 69, 89),cv::Scalar(123, 73, 232),cv::Scalar(4, 175, 57),cv::Scalar(87, 108, 23),cv::Scalar(105, 204, 142),cv::Scalar(63, 115, 53),
-cv::Scalar(105, 153, 126),cv::Scalar(247, 224, 137),cv::Scalar(136, 21, 188),cv::Scalar(122, 129, 78),cv::Scalar(145, 80, 81),cv::Scalar(51, 167, 149),
-cv::Scalar(162, 173, 20),cv::Scalar(252, 202, 17),cv::Scalar(10, 40, 3),cv::Scalar(150, 90, 254),cv::Scalar(169, 21, 68),cv::Scalar(157, 148, 180),
-cv::Scalar(131, 254, 90),cv::Scalar(7, 221, 102),cv::Scalar(19, 191, 184),cv::Scalar(98, 126, 199),cv::Scalar(210, 61, 56),cv::Scalar(252, 86, 59),
-cv::Scalar(102, 195, 55),cv::Scalar(160, 26, 91),cv::Scalar(60, 94, 66),cv::Scalar(204, 169, 193),cv::Scalar(126, 4, 181),cv::Scalar(229, 209, 196),
-cv::Scalar(195, 170, 186),cv::Scalar(155, 207, 148)
+cv::Scalar(220, 20, 60), cv::Scalar(119, 11, 32), cv::Scalar(0, 0, 142), cv::Scalar(0, 0, 230), cv::Scalar(106, 0, 228),
+cv::Scalar(0, 60, 100), cv::Scalar(0, 80, 100), cv::Scalar(0, 0, 70), cv::Scalar(0, 0, 192), cv::Scalar(250, 170, 30),
+cv::Scalar(100, 170, 30), cv::Scalar(220, 220, 0), cv::Scalar(175, 116, 175), cv::Scalar(250, 0, 30), cv::Scalar(165, 42, 42),
+cv::Scalar(255, 77, 255), cv::Scalar(0, 226, 252), cv::Scalar(182, 182, 255), cv::Scalar(0, 82, 0), cv::Scalar(120, 166, 157),
+cv::Scalar(110, 76, 0), cv::Scalar(174, 57, 255), cv::Scalar(199, 100, 0), cv::Scalar(72, 0, 118), cv::Scalar(255, 179, 240),
+cv::Scalar(0, 125, 92), cv::Scalar(209, 0, 151), cv::Scalar(188, 208, 182), cv::Scalar(0, 220, 176), cv::Scalar(255, 99, 164),
+cv::Scalar(92, 0, 73), cv::Scalar(133, 129, 255), cv::Scalar(78, 180, 255), cv::Scalar(0, 228, 0), cv::Scalar(174, 255, 243),
+cv::Scalar(45, 89, 255), cv::Scalar(134, 134, 103), cv::Scalar(145, 148, 174), cv::Scalar(255, 208, 186), cv::Scalar(197, 226, 255),
+cv::Scalar(171, 134, 1), cv::Scalar(109, 63, 54), cv::Scalar(207, 138, 255), cv::Scalar(151, 0, 95), cv::Scalar(9, 80, 61),
+cv::Scalar(84, 105, 51), cv::Scalar(74, 65, 105), cv::Scalar(166, 196, 102), cv::Scalar(208, 195, 210), cv::Scalar(255, 109, 65),
+cv::Scalar(0, 143, 149), cv::Scalar(179, 0, 194), cv::Scalar(209, 99, 106), cv::Scalar(5, 121, 0), cv::Scalar(227, 255, 205),
+cv::Scalar(147, 186, 208), cv::Scalar(153, 69, 1), cv::Scalar(3, 95, 161), cv::Scalar(163, 255, 0), cv::Scalar(119, 0, 170),
+cv::Scalar(0, 182, 199), cv::Scalar(0, 165, 120), cv::Scalar(183, 130, 88), cv::Scalar(95, 32, 0), cv::Scalar(130, 114, 135),
+cv::Scalar(110, 129, 133), cv::Scalar(166, 74, 118), cv::Scalar(219, 142, 185), cv::Scalar(79, 210, 114), cv::Scalar(178, 90, 62),
+cv::Scalar(65, 70, 15), cv::Scalar(127, 167, 115), cv::Scalar(59, 105, 106), cv::Scalar(142, 108, 45), cv::Scalar(196, 172, 0),
+cv::Scalar(95, 54, 80), cv::Scalar(128, 76, 255), cv::Scalar(201, 57, 1), cv::Scalar(246, 0, 122), cv::Scalar(191, 162, 208)
 };
 const std::vector<cv::Scalar> voc {
-cv::Scalar(128, 77, 207),cv::Scalar(65, 32, 208),cv::Scalar(0, 224, 45),cv::Scalar(3, 141, 219),cv::Scalar(80, 239, 253),cv::Scalar(239, 184, 12),
-cv::Scalar(7, 144, 145),cv::Scalar(161, 88, 57),cv::Scalar(0, 166, 46),cv::Scalar(218, 113, 53),cv::Scalar(193, 33, 128),cv::Scalar(190, 94, 113),
-cv::Scalar(113, 123, 232),cv::Scalar(69, 205, 80),cv::Scalar(18, 170, 49),cv::Scalar(89, 51, 241),cv::Scalar(153, 191, 154),cv::Scalar(27, 26, 69),
-cv::Scalar(20, 186, 194),cv::Scalar(210, 202, 167),cv::Scalar(196, 113, 204),cv::Scalar(9, 81, 88),cv::Scalar(191, 162, 67),cv::Scalar(227, 73, 120)
+cv::Scalar(106, 0, 228), cv::Scalar(119, 11, 32), cv::Scalar(165, 42, 42), cv::Scalar(0, 0, 192), cv::Scalar(197, 226, 255),
+cv::Scalar(0, 60, 100), cv::Scalar(0, 0, 142), cv::Scalar(255, 77, 255), cv::Scalar(153, 69, 1), cv::Scalar(120, 166, 157),
+cv::Scalar(0, 182, 199), cv::Scalar(0, 226, 252), cv::Scalar(182, 182, 255), cv::Scalar(0, 0, 230), cv::Scalar(220, 20, 60),
+cv::Scalar(163, 255, 0), cv::Scalar(0, 82, 0), cv::Scalar(3, 95, 161), cv::Scalar(0, 80, 100), cv::Scalar(183, 130, 88)
 };
 };

object_detection/CMakeLists.txt

Lines changed: 0 additions & 3 deletions
@@ -61,9 +61,6 @@ list(APPEND ALL_INCLUDE ${PROJECT_INCLUDE})

 include_directories(${ALL_INCLUDE})

-
-# add_subdirectory(nanodet)
-# add_subdirectory(Swin-Transformer)
 add_subdirectory(yolov5)
 # add_subdirectory(yolov6)
 add_subdirectory(yolov7)

src/basemodel.cpp

Lines changed: 1 addition & 7 deletions
@@ -116,13 +116,7 @@ void Model::PreProcess(std::vector<cv::Mat>& img_batch) {
 cv::Mat d2s = cv::Mat::zeros(2, 3, CV_32FC1);
 cv::invertAffineTransform(s2d, d2s);

-// memcpy(d2s.value, dst2src.ptr<float>(0), sizeof(d2s.value));
-dst2src.v0 = d2s.ptr<float>(0)[0];
-dst2src.v1 = d2s.ptr<float>(0)[1];
-dst2src.v2 = d2s.ptr<float>(0)[2];
-dst2src.v3 = d2s.ptr<float>(1)[0];
-dst2src.v4 = d2s.ptr<float>(1)[1];
-dst2src.v5 = d2s.ptr<float>(1)[2];
+memcpy(&dst2src, d2s.ptr(), sizeof(dst2src));
 preprocess(img_batch[i].ptr(), dst2src, width, height, &gpu_buffers[0][bufferSize[0] * i], imageWidth, imageHeight, stream);
 CUDA_CHECK(cudaStreamSynchronize(stream));
 }
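
The change above swaps six element-wise assignments for a single `memcpy`. That is only equivalent when the destination is exactly six tightly packed floats and the source is a contiguous row-major 2x3 float buffer. A minimal sketch of that pattern, with the layout assumptions checked at compile time, follows. `AffineMatrix` is a hypothetical stand-in for the type of `dst2src`, whose definition is not shown in this diff, and `load_affine` is an illustrative helper.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the dst2src type: six floats in row-major order.
struct AffineMatrix {
    float v0, v1, v2;  // first row of the 2x3 matrix
    float v3, v4, v5;  // second row of the 2x3 matrix
};

// memcpy into the struct is only well defined for trivially copyable types,
// and only matches the element-wise copy if there is no padding.
static_assert(std::is_trivially_copyable<AffineMatrix>::value,
              "memcpy requires a trivially copyable type");
static_assert(sizeof(AffineMatrix) == 6 * sizeof(float),
              "struct must be exactly six packed floats");

// Copies a contiguous row-major 2x3 float matrix into the struct in one call.
inline AffineMatrix load_affine(const float* row_major_2x3) {
    AffineMatrix m;
    std::memcpy(&m, row_major_2x3, sizeof(m));
    return m;
}
```

With OpenCV, the same reasoning additionally assumes that the `cv::Mat` holds `CV_32F` data and is continuous. `cv::invertAffineTransform` gives its output the same element type as its input, so this holds only if `s2d` is also a float matrix, which this hunk does not show.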

src/build.cpp

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ std::shared_ptr<Model> build_model(std::string model_arch, std::string cfg) {
 model = std::make_shared<YOLOv5>(root[model_arch]);
 else if (model_arch == "yolov5-seg")
 model = std::make_shared<YOLOv5_seg>(root[model_arch]);
-// else if (model_arch == "YOLOv6")
+// else if (model_arch == "yolov6")
 // model = std::make_shared<YOLOv6>(root[model_arch]);
 else if (model_arch == "yolov7")
 model = std::make_shared<YOLOv7>(root[model_arch]);
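
The hunk above selects a model class by comparing `model_arch` against lowercase string keys. A minimal sketch of that dispatch pattern follows. All types and the `build_model_sketch` function are hypothetical, and the config argument is omitted. It illustrates that `std::string` comparison is case-sensitive, which would explain normalizing the commented-out `"YOLOv6"` key to `"yolov6"`.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical minimal model hierarchy; the repo's real classes take a config node.
struct ModelSketch {
    virtual ~ModelSketch() = default;
    virtual std::string name() const = 0;
};
struct YoloV5Sketch : ModelSketch {
    std::string name() const override { return "yolov5"; }
};
struct YoloV7Sketch : ModelSketch {
    std::string name() const override { return "yolov7"; }
};

// Returns the model matching `arch`, or nullptr when the key is unknown.
// The comparison is exact, so "YOLOv5" does not match "yolov5".
inline std::shared_ptr<ModelSketch> build_model_sketch(const std::string& arch) {
    if (arch == "yolov5") return std::make_shared<YoloV5Sketch>();
    if (arch == "yolov7") return std::make_shared<YoloV7Sketch>();
    return nullptr;
}
```

Returning `nullptr` for an unknown key is a choice made for this sketch; how the repo handles an unmatched `model_arch` is not visible in the diff.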

src/rtdetr.cpp

Lines changed: 7 additions & 3 deletions
@@ -21,9 +21,9 @@ std::vector<Detections> RTDETR::InferenceImages(std::vector<cv::Mat> &imgBatch)
 auto boxes = PostProcess(imgBatch, gpu_buffers[1], gpu_buffers[2]);
 auto t_end_post = std::chrono::high_resolution_clock::now();
 float total_post = std::chrono::duration<float, std::milli>(t_end_post - t_start_post).count();
-std::cout << "preprocess time: "<< total_pre << "ms " <<
-"detection inference time: " << total_inf << "ms "
-"postprocess time: " << total_post << "ms " << std::endl;
+// std::cout << "preprocess time: "<< total_pre << "ms " <<
+// "detection inference time: " << total_inf << "ms "
+// "postprocess time: " << total_post << "ms " << std::endl;
 return boxes;
 }

@@ -38,8 +38,12 @@ std::vector<Detections> RTDETR::PostProcess(const std::vector<cv::Mat> &imgBatch
 float* box_per_img = output1 + index * predboxSize;
 float* score_per_img = output2 + index * predscoreSize;
 cuda_postprocess_init(6, imageWidth, imageHeight);
+auto t_start_post = std::chrono::high_resolution_clock::now();
 rtdetr_postprocess_box(box_per_img, score_per_img, num_bboxes, num_classes, 6,
 conf_thr, imageWidth, imageHeight, dst2src, stream, cpu_buffers[2]);
+auto t_end_post = std::chrono::high_resolution_clock::now();
+float total_post = std::chrono::duration<float, std::milli>(t_end_post - t_start_post).count();
+std::cout << "postprocess time: " << total_post << "ms " << std::endl;
 int num_boxes = std::min((int)cpu_buffers[2][0], 300);
 for (int i = 0; i < num_boxes; i++) {
 Box box;
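
The second hunk brackets the postprocess call with two `std::chrono` timestamps. A minimal sketch of the same measurement as a reusable helper follows. `time_ms` is a hypothetical name, and the sketch uses `std::chrono::steady_clock` rather than the `high_resolution_clock` seen in the diff, since a monotonic clock is a common choice for interval timing.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Runs `fn` once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
float time_ms(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<float, std::milli>(end - start).count();
}
```

A host-side timer like this measures only what happens before the wrapped call returns. If the call merely enqueues work on a CUDA stream, the result reflects launch overhead, not GPU execution time. Whether `rtdetr_postprocess_box` synchronizes the stream before returning is not visible in this diff.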
