[1] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition. arXiv Preprint, arXiv:1409.1556, 2014.
[2] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'16), 2016, Jun 27-30, Las Vegas, NV, USA. Piscataway, NJ, USA: IEEE, 2016: 770-778.
[3] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks. Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS'12): Vol 1, 2012, Dec 3-6, Lake Tahoe, NV, USA. Red Hook, NY, USA: Curran Associates, Inc, 2012: 1097-1105.
[4] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'14), 2014, Jun 23-28, Columbus, OH, USA. Piscataway, NJ, USA: IEEE, 2014: 580-587.
[5] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'16), 2016, Jun 27-30, Las Vegas, NV, USA. Piscataway, NJ, USA: IEEE, 2016: 779-788.
[6] DONG C, LOY C C, HE K M, et al. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(2): 295-307.
[7] CHELLAPILLA K, PURI S, SIMARD P. High performance convolutional neural networks for document processing. Proceedings of the 10th International Workshop on Frontiers in Handwriting Recognition (IWFHR'06), 2006, Oct 23-26, La Baule, France. Paris, France: Publisoft, 2006: 1-7.
[8] GEORGANAS E, AVANCHA S, BANERJEE K, et al. Anatomy of high-performance deep learning convolutions on SIMD architectures. Proceedings of the 2018 International Conference for High Performance Computing, Networking, Storage and Analysis (SC'18), 2018, Nov 11-16, Dallas, TX, USA. Piscataway, NJ, USA: IEEE, 2018: 830-841.
[9] HONG S, PARK D. Differential image-based fast and compatible convolutional layers for multi-core processors. Proceedings of the 2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC'23), 2023, Feb 20-23, Bali, Indonesia. Piscataway, NJ, USA: IEEE, 2023: 86-90.
[10] CONG J, XIAO B J. Minimizing computation in convolutional neural networks. Proceedings of the 24th International Conference on Artificial Neural Networks (ICANN'14), 2014, Sep 15-19, Hamburg, Germany. LNTCS 8681. Berlin, Germany: Springer, 2014: 281-290.
[11] ZLATESKI A, JIA Z, LI K, et al. FFT convolutions are faster than Winograd on modern CPUs, here is why. arXiv Preprint, arXiv:1809.07851, 2018.
[12] LAVIN A, GRAY S. Fast algorithms for convolutional neural networks. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'16), 2016, Jun 27-30, Las Vegas, NV, USA. Piscataway, NJ, USA: IEEE, 2016: 4013-4021.
[13] LU L Q, LIANG Y. SpWA: an efficient sparse Winograd convolutional neural networks accelerator on FPGAs. Proceedings of the 55th ACM/ESDA/IEEE Design Automation Conference (DAC'18), 2018, Jun 24-28, San Francisco, CA, USA. Piscataway, NJ, USA: IEEE, 2018: 1-6.
[14] ABTAHI T, SHEA C, KULKARNI A, et al. Accelerating convolutional neural network with FFT on embedded hardware. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2018, 26(9): 1737-1749.
[15] ALBERICIO J, JUDD P, HETHERINGTON T, et al. Cnvlutin: ineffectual-neuron-free deep neural network computing. ACM SIGARCH Computer Architecture News, 2016, 44(3): 1-13.
[16] HAN S, MAO H Z, DALLY W J. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv Preprint, arXiv:1510.00149, 2015.
[17] ZHANG C, SUN G Y, FANG Z M, et al. Caffeine: toward uniformed representation and acceleration for deep convolutional neural networks. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018, 38(11): 2072-2085.
[18] WANG L M, GUO S, HUANG W L, et al. Places205-VGGNet models for scene recognition. arXiv Preprint, arXiv:1508.01667, 2015.
[19] BENEDETTI A, PRATI A, SCARABOTTOLO N. Image convolution on FPGAs: the implementation of a multi-FPGA FIFO structure. Proceedings of the 24th Conference on EUROMICRO (EUROMICRO'98): Vol 1, 1998, Aug 25-27, Vasteras, Sweden. Washington, DC, USA: IEEE Computer Society, 1998: 123-130.
[20] LIANG Y, LU L Q, XIAO Q C, et al. Evaluating fast algorithms for convolutional neural networks on FPGAs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020, 39(4): 857-870.
[21] HAN S, LIU X Y, MAO H Z, et al. EIE: efficient inference engine on compressed deep neural network. ACM SIGARCH Computer Architecture News, 2016, 44(3): 243-254.
[22] MA Y F, CAO Y, VRUDHULA S, et al. Optimizing loop operation and dataflow in FPGA acceleration of deep convolutional neural networks. Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA'17), 2017, Feb 22-24, Monterey, CA, USA. New York, NY, USA: ACM, 2017: 45-54.
[23] SUDA N, CHANDRA V, DASIKA G, et al. Throughput-optimized OpenCL-based FPGA accelerator for large-scale convolutional neural networks. Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA'16), 2016, Feb 21-23, Monterey, CA, USA. New York, NY, USA: ACM, 2016: 16-25.
[24] XIE B, ZHANG G D, SHEN Y J, et al. Fast FFT-based inference in 3D convolutional neural networks. Proceedings of the 12th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS'18), 2018, Jul 4-6, Matsue, Japan. AISC 773. Berlin, Germany: Springer, 2019: 420-431.