Abstract: Broadcast is a common operation in machine learning, widely used for adding a bias or subtracting the maximum
for normalization in convolutional neural networks. A broadcast
operation is required when two tensors, possibly with different
numbers of dimensions and hence different numbers of elements,
are input to an element-wise function. The tensors are expanded in
the process so that they match in size and dimension.
In this research, we introduce a new broadcast functionality for
matrices on CUDA-enabled GPU devices. We further
extend this operation to multidimensional arrays and measure its
performance against the implementation available in the Knet
deep learning framework. Our final implementation provides
up to a 2x performance improvement over the Knet broadcast implementation,
which supports only vector broadcast. Our implementation can
handle broadcast operations with any number of dimensions.
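
To make the broadcast semantics described in the abstract concrete, here is a minimal CPU-side sketch in plain Python. It is not the paper's CUDA implementation. It assumes tensors stored as flat row-major lists and the NumPy-style convention of aligning shapes from the last dimension (Julia-based frameworks such as Knet align from the first dimension instead). The helper names `broadcast_shape`, `flat_index` and `broadcast_apply` are hypothetical and chosen only for illustration.

```python
from itertools import product


def broadcast_shape(shape_a, shape_b):
    """Common shape of two shapes, aligned from the last dimension (assumed convention)."""
    ndim = max(len(shape_a), len(shape_b))
    # Pad the shorter shape with leading 1s so both have ndim dimensions.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    out = []
    for da, db in zip(a, b):
        if da != db and da != 1 and db != 1:
            raise ValueError(f"incompatible dimensions: {da} vs {db}")
        # A size-1 dimension stretches to match the other operand.
        out.append(db if da == 1 else da)
    return tuple(out)


def flat_index(index, shape):
    """Row-major offset of an output index into a tensor of the given shape.

    Size-1 dimensions always map to position 0, which is what lets the same
    element be reused without materializing an expanded copy.
    """
    offset = 0
    # Only the trailing len(shape) coordinates apply to this operand.
    for i, d in zip(index[len(index) - len(shape):], shape):
        offset = offset * d + (0 if d == 1 else i)
    return offset


def broadcast_apply(op, a, shape_a, b, shape_b):
    """Apply a binary element-wise op to two flat row-major tensors with broadcasting."""
    out_shape = broadcast_shape(shape_a, shape_b)
    out = [
        op(a[flat_index(idx, shape_a)], b[flat_index(idx, shape_b)])
        for idx in product(*(range(d) for d in out_shape))
    ]
    return out, out_shape


# Adding a bias vector of shape (3,) to a 2x3 matrix.
x = [1, 2, 3, 4, 5, 6]
biased, shape = broadcast_apply(lambda p, q: p + q, x, (2, 3), [10, 20, 30], (3,))

# Subtracting the per-row maximum, stored with shape (2, 1), for normalization.
shifted, _ = broadcast_apply(lambda p, q: p - q, x, (2, 3), [3, 6], (2, 1))
```

The two calls at the end mirror the use cases named in the abstract: adding a bias and subtracting a maximum. On a GPU, a natural mapping would assign one thread per output element and have it perform the same index computation, though that is an assumption here and not a description of the kernels in the paper.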
September 14, 2017
Multidimensional Broadcast Operation on the GPU
Enis Berk Çoban, Deniz Yuret and Didem Unat. 2017. In 5. Ulusal Yüksek Başarımlı Hesaplama Konferansı, İstanbul, September. (PDF).