XIMEA API can enable the use of NVIDIA's GPUDirect RDMA feature on supported configurations.
The XIMEA software package must be installed with the -pcie option.
First, you need to enable GPUDirect support in the XIMEA kernel driver:
sudo /opt/XIMEA/src/ximea_cam_pcie/enable_gpudirect.sh
Then add the following lines to /etc/rc.local or an analogous file and reboot the system:
    modprobe ximea_cam_pcie
    major="$(grep ximea_cam_pcie /proc/devices | cut -f1 -d\ )"
    for i in $(seq "$(lspci -nd deda:|wc -l)")
    do
        minor="$((i - 1))"
        devname="/dev/ximea$(printf "%02d" ${minor})"
        mknod -m 660 "${devname}" c "${major}" "${minor}"
        chgrp plugdev "${devname}"
    done
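The script above creates one device node per detected camera, with minor numbers starting at 0 and names of the form /dev/ximeaNN. As a minimal sketch of that naming scheme only, the helper below builds the same path in C; the function name ximea_node_path is hypothetical and is not part of xiAPI.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of xiAPI): builds the device node path
 * that the rc.local script creates for a given minor number,
 * e.g. minor 0 gives "/dev/ximea00".
 * Returns 0 on success, -1 if the path does not fit into buf. */
int ximea_node_path(unsigned minor, char *buf, size_t len)
{
    assert(buf != NULL);
    /* Same format as printf "%02d" in the script: zero-padded, two digits. */
    int n = snprintf(buf, len, "/dev/ximea%02u", minor);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

An application could use such a path, for example with access(), to check that the nodes exist before opening a camera.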
In the application, set the following parameters before starting acquisition:
    xiSetParamInt(handle, XI_PRM_BUFFER_POLICY, XI_BP_UNSAFE);
    xiSetParamInt(handle, XI_PRM_IMAGE_DATA_FORMAT, XI_FRM_TRANSPORT_DATA);
    xiSetParamInt(handle, XI_PRM_TRANSPORT_DATA_TARGET, XI_TRANSPORT_DATA_TARGET_GPU_RAM);
    // Lower the size of the acquisition buffer because it is limited by the GPU's BAR size, e.g. (256 - 32) MB
    xiSetParamInt(handle, XI_PRM_ACQ_BUFFER_SIZE, 200000000);
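The acquisition buffer size in the snippet above follows from simple arithmetic on the BAR size. The sketch below works through the example figures from the comment (a 256 MB BAR with 32 MB left free) and checks that 200000000 bytes fits. It assumes "MB" means MiB (1024 * 1024 bytes); the function names are hypothetical, and the actual BAR size depends on your GPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest acquisition buffer in bytes that fits into
 * a GPU BAR of bar_mib MiB when reserved_mib MiB are kept free.
 * Returns 0 if the reservation leaves no room. */
uint64_t max_acq_buffer_bytes(uint64_t bar_mib, uint64_t reserved_mib)
{
    assert(bar_mib <= (UINT64_MAX >> 20)); /* guard against overflow */
    if (reserved_mib >= bar_mib)
        return 0;
    return (bar_mib - reserved_mib) * 1024u * 1024u;
}

/* Returns 1 if the requested buffer size fits under that limit, else 0. */
int acq_buffer_fits(uint64_t requested_bytes, uint64_t bar_mib,
                    uint64_t reserved_mib)
{
    return requested_bytes <= max_acq_buffer_bytes(bar_mib, reserved_mib);
}
```

With the example figures, (256 - 32) MiB is 234881024 bytes, so the value 200000000 passed to XI_PRM_ACQ_BUFFER_SIZE stays below the limit with some margin.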
When XI_PRM_TRANSPORT_DATA_TARGET is set to XI_TRANSPORT_DATA_TARGET_GPU_RAM, xiGetImage returns a pointer to GPU memory in the bp field of the XI_IMG structure.
This saves one cudaMemcpy operation (copying the data from CPU to device memory) for each acquired frame.
GPU memory is allocated using cudaMalloc in the xiStartAcquisition function and deallocated (using cudaFree) in xiStopAcquisition.
Select the required CUDA device using cudaSetDevice before calling xiStartAcquisition.
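The call order described above can be sketched as follows. This is a minimal illustration, not the attached sample: it needs the XIMEA SDK, a CUDA toolkit (10 or newer, for the type field of cudaPointerAttributes) and supported hardware to build and run. The device index 0, camera index 0 and the 5000 ms timeout are assumptions chosen for the example.

```c
#include <stdio.h>
#include <string.h>
#include <cuda_runtime_api.h>
#include <m3api/xiApi.h>

/* Return-on-error helpers keep the sketch short; real code should also
 * release the camera handle on failure. */
#define XI_CHECK(call) do { XI_RETURN s_ = (call); if (s_ != XI_OK) { \
    fprintf(stderr, "%s failed: %d\n", #call, (int)s_); return 1; } } while (0)
#define CU_CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
    fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(e_)); \
    return 1; } } while (0)

int main(void)
{
    HANDLE handle = NULL;
    XI_IMG image;
    struct cudaPointerAttributes attr;

    /* Select the CUDA device first: xiStartAcquisition allocates the
     * frame buffers on the device that is current at that time. */
    CU_CHECK(cudaSetDevice(0));          /* device 0 is an assumption */

    XI_CHECK(xiOpenDevice(0, &handle));  /* first camera, an assumption */
    XI_CHECK(xiSetParamInt(handle, XI_PRM_BUFFER_POLICY, XI_BP_UNSAFE));
    XI_CHECK(xiSetParamInt(handle, XI_PRM_IMAGE_DATA_FORMAT,
                           XI_FRM_TRANSPORT_DATA));
    XI_CHECK(xiSetParamInt(handle, XI_PRM_TRANSPORT_DATA_TARGET,
                           XI_TRANSPORT_DATA_TARGET_GPU_RAM));
    XI_CHECK(xiSetParamInt(handle, XI_PRM_ACQ_BUFFER_SIZE, 200000000));

    XI_CHECK(xiStartAcquisition(handle));

    memset(&image, 0, sizeof(image));
    image.size = sizeof(XI_IMG);
    XI_CHECK(xiGetImage(handle, 5000, &image)); /* timeout in ms */

    /* image.bp should now refer to GPU memory; ask the CUDA runtime. */
    CU_CHECK(cudaPointerGetAttributes(&attr, image.bp));
    printf("frame buffer %p is %s memory\n", image.bp,
           attr.type == cudaMemoryTypeDevice ? "device" : "non-device");

    /* The GPU buffers are freed here, so image.bp must not be used after. */
    XI_CHECK(xiStopAcquisition(handle));
    XI_CHECK(xiCloseDevice(handle));
    return 0;
}
```

In a real application the pointer in image.bp would be passed directly to a CUDA kernel or library call instead of being inspected. Linking typically requires the CUDA runtime and the XIMEA library (for example -lcudart -lm3api); the exact include and library paths depend on your installation.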
Attached is an example demonstrating the xiAPI GPUDirect feature: xiCUDASample.tar.bz2.
It is based on 3_Imaging/histogram from the CUDA samples, which computes a 64-bin histogram on the GPU.
Results are displayed in the terminal as ASCII art.
Also, this application prints time measurements, so you can compare the time needed for running the computation with and without GPUDirect enabled.
Please refer to readme.txt included in the tarball for build instructions.
More information about GPUDirect RDMA technology can be found on the NVIDIA website:
http://docs.nvidia.com/cuda/gpudirect-rdma/