# SDG Pallet Model
This repository contains code for performing optimized TensorRT inference with a pre-trained
pallet detection model that was trained on synthetic data generated with NVIDIA Omniverse Replicator.
The model takes a monocular RGB image as input and outputs pallet box estimates. The box estimates
are defined for each pallet side face, so a single pallet may have multiple box
estimates.
If you have any questions, please feel free to reach out by opening an issue!
## Instructions
### Step 1 - Install dependencies
These steps assume you've already set up your system with OpenCV, PyTorch, and numpy.

Install einops for some utility functions:

```bash
pip3 install einops
```
Install torch2trt. This is used for the `TRTModule` class, which simplifies engine inference:

```bash
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
python3 setup.py develop
```
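For reference, here is a minimal sketch of how `TRTModule` can wrap a serialized TensorRT engine for PyTorch-style inference. The engine filename and the tensor names below are assumptions for illustration; `predict.py` in this repository handles this for you.

```python
# Minimal sketch: wrapping a serialized TensorRT engine with TRTModule.
# The engine filename and tensor names ('input', 'output') are assumptions;
# check predict.py for the names this repository actually uses.
import tensorrt as trt
import torch
from torch2trt import TRTModule

logger = trt.Logger(trt.Logger.INFO)
with open('pallet_model.engine', 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

model = TRTModule(engine, input_names=['input'], output_names=['output'])

x = torch.zeros(1, 3, 256, 256).cuda()  # dummy RGB image at 256x256
y = model(x)                            # pallet box estimates
```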
### Step 2 - Download the ONNX model
Download the pallet model ONNX file.
| Model | Notes | Links |
|---|---|---|
| pallet_model_v1_all | Trained for wood and other pallets (metal, plastic). | onnx |
| pallet_model_v1_wood | Trained only for wood pallets. | onnx |
### Step 3 - Build the TensorRT engine
#### Option 1 (recommended) - Build the FP16 engine

To build the FP16 engine, call the following:

```bash
./build_trt_fp16.sh
```
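The script is the supported path, but for reference, here is a minimal sketch of an equivalent FP16 build using the TensorRT Python API. The ONNX and engine filenames are assumptions.

```python
# Minimal sketch of building an FP16 TensorRT engine from the ONNX model.
# Filenames are assumptions; use build_trt_fp16.sh for the supported flow.
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open('pallet_model_v1_all.onnx', 'rb') as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # enable FP16 kernels

engine_bytes = builder.build_serialized_network(network, config)
with open('pallet_model_fp16.engine', 'wb') as f:
    f.write(engine_bytes)
```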
#### Option 2 - Build the INT8 engine

The INT8 instructions do not yet include calibration, so please use this engine only for throughput profiling; its accuracy is likely to differ from that of the FP32/FP16 models. Once calibration is included, this may become the recommended option given the improved throughput results.

To build the INT8 engine, call the following:

```bash
./build_trt_int8.sh
```
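For reference, the build sketch above needs only a different builder flag for INT8. Without an attached calibrator, the engine won't have properly calibrated scales, which is why the result is suitable only for throughput profiling. A hedged fragment (the calibrator class is hypothetical):

```python
# Continuing the build sketch above: swap the FP16 flag for INT8.
config.set_flag(trt.BuilderFlag.INT8)
# For accurate INT8 inference a calibrator must be attached, e.g.:
# config.int8_calibrator = PalletEntropyCalibrator(...)  # hypothetical class
```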
We hope to provide instructions for using the Deep Learning Accelerator (DLA) on Jetson AGX Orin, as well as INT8 calibration instructions, soon.
### Step 4 - Profile the engine
To profile the engine with the trtexec tool, call the following:

```bash
./profile_engine.sh
```
Here are the results for model inference at 256x256 resolution, profiled on Jetson AGX Orin.
| Precision | Throughput (FPS) |
|---|---|
| FP16 | 465 |
| INT8 | 710 |
Notes:

- Called `jetson_clocks` before running
- Using MAXN power mode by calling `sudo nvpmodel -m0`
- Batch size 1
- `--useSpinWait` flag enabled to stabilize timings
- `--useCudaGraph` flag enabled to use CUDA graph optimizations. CUDA graph isn't yet used in the predict function.
### Step 5 - Run inference on an example image

```bash
python3 predict.py --output=
```

For more options:

```bash
python3 predict.py --help
```
## Next Steps

Try modifying the `predict.py` code to visualize inference on a live camera feed, as sketched below.
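Here is a minimal starting point for a live-camera loop with OpenCV. The `load_model` and `predict` helpers and the box format are hypothetical stand-ins; adapt them to the actual functions and output format in `predict.py`.

```python
# Minimal live-camera sketch. `load_model` and `predict` are hypothetical
# stand-ins for the actual interface in predict.py; the output is assumed
# to be a list of (x0, y0, x1, y1) boxes for illustration only.
import cv2

model = load_model('pallet_model_fp16.engine')  # hypothetical helper

cap = cv2.VideoCapture(0)  # default camera
while True:
    ok, frame = cap.read()
    if not ok:
        break
    boxes = predict(model, frame)  # hypothetical; one box per pallet side face
    for x0, y0, x1, y1 in boxes:
        cv2.rectangle(frame, (int(x0), int(y0)), (int(x1), int(y1)),
                      (0, 255, 0), 2)
    cv2.imshow('pallets', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```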