Data augmentation via joint multi-scale CNN and multi-channel attention for bumblebee image generation

doi:10.19682/j.cnki.1005-8885.2023.1001

The Journal of China Universities of Posts and Telecommunications, 2023, Vol. 30, Issue (3): 32-40.

• Artificial Intelligence •

Du Rong, Chen Shudong, Li Weiwei, Zhang Xueting, Wang Xianhui, Ge Jin

1. Intelligent Manufacturing Electronics Research and Development Center, Institute of Microelectronics of the Chinese Academy of Sciences, Beijing 100029, China
2. School of Integrated Circuits, University of Chinese Academy of Sciences, Beijing 100049, China
3. Institute of Zoology, Chinese Academy of Sciences, Beijing 100080, China
4. Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China

Received: 2022-05-20; Revised: 2023-03-14; Online: 2023-06-30; Published: 2023-06-30
Contact: Chen Shudong, E-mail: chenshudong@ime.ac.cn

Abstract

The difficulty of collecting bumblebee data and the laborious nature of annotating it often result in a shortage of training data, which impairs the effectiveness of deep learning based counting methods. Because current data augmentation methods struggle to produce detailed background information in generated bumblebee images, this paper proposes a joint multi-scale convolutional neural network and multi-channel attention based generative adversarial network (MMGAN). MMGAN generates a bumblebee image from the corresponding density map that marks the bumblebee positions. Specifically, the multi-scale convolutional neural network (CNN) module uses multiple convolution kernels to extract features at different scales from the input bumblebee image and density map. To generate the various targets in the output image, the multi-channel attention module builds multiple intermediate generation layers and attention maps. These targets are then stacked to produce a bumblebee image containing a specific number of bumblebees. The proposed model achieves the best performance on bumblebee image generation tasks, and the generated bumblebee images considerably improve the efficiency of deep learning based counting methods in bumblebee counting applications.
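The stacking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the intermediate generations and attention maps are combined, so the per-pixel softmax weighting below is an assumption, and the names `softmax` and `fuse_with_attention` are hypothetical. Images are plain nested lists so the example has no dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_with_attention(candidates, attention_logits):
    """Combine N intermediate images into one using per-pixel attention.

    candidates:       list of N images, each an H x W nested list of floats
    attention_logits: list of N unnormalised attention maps, same H x W shape

    Assumption (not stated in the abstract): the N attention values at each
    pixel are normalised with a softmax and used as weights in a sum over
    the N candidate pixels.
    """
    n = len(candidates)
    height, width = len(candidates[0]), len(candidates[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weights = softmax([attention_logits[k][y][x] for k in range(n)])
            fused[y][x] = sum(weights[k] * candidates[k][y][x] for k in range(n))
    return fused


# Two 1 x 2 candidate images with equal attention everywhere:
# every output pixel is the mean of the two candidates.
candidates = [[[1.0, 0.0]], [[0.0, 1.0]]]
logits = [[[0.0, 0.0]], [[0.0, 0.0]]]
fused = fuse_with_attention(candidates, logits)
```

With equal logits the result is a pixel-wise average. When one attention map dominates at a pixel, the output there is taken almost entirely from the matching candidate, which is how separate targets could be selected from separate intermediate layers.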

Key words: data augmentation, image generation, attention mechanism

CLC number: TP183

Du Rong, Chen Shudong, Li Weiwei, Zhang Xueting, Wang Xianhui, Ge Jin. Data augmentation via joint multi-scale CNN and multi-channel attention for bumblebee image generation[J]. The Journal of China Universities of Posts and Telecommunications, 2023, 30(3): 32-40.

