All benchmarks were run on a T4 GPU; each figure is the mean latency in milliseconds over 30 inferences.

Inference was run through the pipeline, so the timings include pre-processing and post-processing alongside the model forward pass.
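
For reference, a measurement loop along these lines could look like the following minimal sketch. The checkpoint, input image (`bee.jpg`), and warmup count are illustrative assumptions, not the exact script behind the numbers below.

```python
import time

import torch
from transformers import pipeline

pipe = pipeline(
    task="depth-estimation",
    model="Intel/dpt-large",  # any of the checkpoints in the tables
    device=0,
)

images = ["bee.jpg"] * 16  # batch of 16; swap for 4 or 1 (hypothetical input)

# Warmup so one-time CUDA initialization doesn't pollute the timings.
for _ in range(3):
    _ = pipe(images, batch_size=16)

latencies = []
for _ in range(30):
    torch.cuda.synchronize()
    start = time.perf_counter()
    _ = pipe(images, batch_size=16)
    torch.cuda.synchronize()
    latencies.append((time.perf_counter() - start) * 1000)

print(f"mean latency: {sum(latencies) / len(latencies):.3f} ms")
```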

| Model/Batch Size              | 16       | 4        | 1       |
|-------------------------------|----------|----------|---------|
| intel/dpt-large               | 2709.652 | 667.799  | 172.617 |
| facebook/dpt-dinov2-small-nyu | 2534.854 | 654.822  | 159.754 |
| facebook/dpt-dinov2-base-nyu  | 4316.873 | 1090.824 | 266.699 |
| Intel/dpt-beit-large-512      | 7961.386 | 2036.743 | 497.656 |
| depth-anything-small          | 1692.368 | 415.915  | 143.379 |

Benchmarks with torch.compile in reduce-overhead mode: to keep the comparison fair, we compiled the model and loaded it into the same pipeline.
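
A minimal sketch of that setup, assuming the same pipeline as above: compile the underlying model in reduce-overhead mode and place it back into the pipeline, so the pre-/post-processing path stays identical to the eager benchmark.

```python
import torch
from transformers import pipeline

pipe = pipeline(
    task="depth-estimation",
    model="Intel/dpt-large",  # any of the checkpoints in the tables
    device=0,
)

# Compile the forward pass; reduce-overhead mode uses CUDA graphs to cut
# per-call launch overhead, which helps most at small batch sizes.
pipe.model = torch.compile(pipe.model, mode="reduce-overhead")

# The timing loop is then the same as in the sketch above.
```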

| Model/Batch Size              | 16       | 4        | 1       |
|-------------------------------|----------|----------|---------|
| intel/dpt-large               | 2556.668 | 645.750  | 155.153 |
| facebook/dpt-dinov2-small-nyu | 2415.250 | 610.967  | 148.526 |
| facebook/dpt-dinov2-base-nyu  | 4057.909 | 1035.672 | 245.692 |
| Intel/dpt-beit-large-512      | 7417.388 | 1795.882 | 426.546 |
| depth-anything-small          | 1664.025 | 384.688  | 97.865  |