In frequency division duplex (FDD) massive multiple-input multiple-output (MIMO) systems, a bidirectional positional attention network (BPANet) was proposed to address the high computational complexity and low accuracy of existing deep learning-based channel state information (CSI) feedback methods. Specifically, a bidirectional position attention module (BPAM) was designed in BPANet to improve network performance. The BPAM captures the distribution characteristics of the CSI matrix by integrating channel and spatial dimension information, thereby enhancing the feature representation of the CSI matrix. Furthermore, channel attention is decomposed into two one-dimensional (1D) feature encoding processes, effectively reducing computational cost. Simulation results demonstrate that, compared with the representative existing method, the complex input lightweight neural network (CLNet), BPANet reduces computational complexity by an average of 19.4% and improves accuracy by an average of 7.1%. In addition, it performs better in terms of running time delay and cosine similarity.
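The following is a minimal sketch of the key idea named in the abstract, channel attention decomposed into two 1D feature encoding processes, written in the coordinate-attention style (pooling along height and width separately). The exact BPANet layer sizes and wiring are not given in the abstract, so the class name, the reduction factor, and the input shape are illustrative assumptions.

```python
# Hedged sketch: channel attention factored into two 1D encodings.
# The real BPAM may differ; this only illustrates the decomposition.
import torch
import torch.nn as nn

class BidirectionalPositionAttention(nn.Module):
    def __init__(self, channels, reduction=4):  # reduction is an assumption
        super().__init__()
        mid = max(channels // reduction, 1)
        # Shared 1x1 convolution compresses the channel dimension.
        self.compress = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        # Separate 1x1 convolutions restore channels for each direction.
        self.attn_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.attn_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Two 1D encodings: average-pool along width and along height.
        feat_h = x.mean(dim=3, keepdim=True)                       # (n, c, h, 1)
        feat_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)   # (n, c, w, 1)
        y = self.compress(torch.cat([feat_h, feat_w], dim=2))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        # Position-wise attention weights for each spatial direction.
        a_h = torch.sigmoid(self.attn_h(y_h))                      # (n, c, h, 1)
        a_w = torch.sigmoid(self.attn_w(y_w)).permute(0, 1, 3, 2)  # (n, c, 1, w)
        return x * a_h * a_w

x = torch.randn(2, 2, 32, 32)   # e.g., real/imaginary CSI channels
print(BidirectionalPositionAttention(2)(x).shape)
```

Because each pooled encoding is 1D, the attention cost grows with h + w rather than h * w, which is the source of the computational saving the abstract refers to.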
In convolutional neural networks (CNNs), the traditional convolutional layer involves an enormous amount of floating-point computation, and the execution speed of the network is limited by this intensive computing, which makes it difficult to meet the real-time response requirements of complex applications. This work exploits the principle that convolution in the time domain equals pointwise multiplication in the frequency domain to reduce the amount of floating-point computation in convolution. The input feature map and the convolution kernel are converted to the frequency domain by the fast Fourier transform (FFT) and multiplied pointwise; the frequency domain result is then converted back to the time domain to obtain the convolution output. In common CNNs, the input feature map is much larger than the convolution kernel, which results in many invalid operations; the overlap-add method is adopted to reduce these invalid calculations and further speed up network execution. This work designs a hardware accelerator for frequency domain convolution and verifies its efficiency on the Xilinx Zynq UltraScale+ MPSoC ZCU102 board. Comparing the computation time of visual geometry group 16 (VGG16) on the ImageNet dataset, the frequency domain convolution hardware accelerator is 8.5 times faster than traditional time domain convolution.
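A small NumPy check of the two principles above: linear convolution in the time domain equals pointwise multiplication in the frequency domain, and overlap-add processes a long input in short blocks so the kernel is never padded to the full input length. The 1D signal and sizes are illustrative, not those of the paper's VGG16 layers or the hardware design.

```python
# Hedged sketch: frequency-domain convolution and overlap-add, 1D case.
import numpy as np

x = np.random.rand(4096)   # "input feature map" (1D for clarity)
k = np.random.rand(11)     # small convolution kernel

# Direct frequency-domain convolution: zero-pad both to the full
# output length, multiply the spectra, transform back.
n = len(x) + len(k) - 1
y_fft = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(k, n), n)

# Overlap-add: convolve short blocks independently and sum the
# overlapping tails, avoiding one huge FFT over the whole input.
block = 256
y_ola = np.zeros(n)
for start in range(0, len(x), block):
    seg = x[start:start + block]
    m = len(seg) + len(k) - 1
    seg_out = np.fft.irfft(np.fft.rfft(seg, m) * np.fft.rfft(k, m), m)
    y_ola[start:start + m] += seg_out

# Both match the time-domain convolution result.
assert np.allclose(y_fft, np.convolve(x, k))
assert np.allclose(y_ola, np.convolve(x, k))
```

With overlap-add, each FFT is sized to block + kernel - 1 instead of the full input, which is exactly the "invalid operation" the abstract says is eliminated when the feature map is much larger than the kernel.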
Due to the diversity of graph computing applications, the power-law distribution of graph data, and the high memory access-to-computation ratio, traditional architectures face significant challenges of poor flexibility, imbalanced workload distribution, and inefficient memory access when executing graph computing tasks. A graph computing accelerator, GraphApp, based on a reconfigurable processing element (PE) array was proposed to address these challenges. GraphApp uses 16 reconfigurable PEs for parallel computation and partitions the graph data into tiles; by dividing the data reasonably, load balancing is achieved and the overall efficiency of parallel computation is improved. In addition, it preprocesses graph data with the compressed sparse columns independently (CSCI) data compression format to alleviate the low memory access efficiency caused by the high memory access-to-computation ratio. Finally, GraphApp is evaluated with triangle counting (TC) and depth-first search (DFS) algorithms, measuring their execution time on GraphApp against two existing representative graph frameworks, Ligra and GraphBIG, using six datasets from the Stanford Network Analysis Project (SNAP) database. The results show that GraphApp achieves a maximum performance improvement of 30.86% over Ligra and 20.43% over GraphBIG when processing the same datasets.
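The abstract does not define the CSCI variant, so the sketch below shows plain compressed-sparse-column (CSC) construction from an edge list, which is the layout CSCI builds on: each column stores one vertex's in-edges contiguously, so a PE can stream a vertex's neighbors with sequential rather than random memory accesses. The toy graph and variable names are illustrative.

```python
# Hedged sketch: plain CSC layout of a small directed graph.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]  # (src, dst) pairs
num_vertices = 4

# Count the in-degree per destination vertex, then prefix-sum into col_ptr.
counts = [0] * num_vertices
for _, dst in edges:
    counts[dst] += 1
col_ptr = [0]
for c in counts:
    col_ptr.append(col_ptr[-1] + c)

# Fill row indices (edge sources) column by column.
row_idx = [0] * len(edges)
fill = col_ptr[:-1].copy()
for src, dst in edges:
    row_idx[fill[dst]] = src
    fill[dst] += 1

# In-neighbors of vertex 2 occupy one contiguous slice: sources 0, 1, 3.
print(row_idx[col_ptr[2]:col_ptr[3]])
```

Tiling then amounts to assigning contiguous column ranges (and their slices of row_idx) to different PEs, which is one plausible way the load balancing described above could be realized.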
To address the low pulse identification rate for low probability of intercept (LPI) radar signals under low signal-to-noise ratio (SNR) conditions, this paper investigates a new deep learning method to recognize the modulation types of LPI radar signals efficiently. A novel algorithm combining a dual efficient network (DEN) with non-local means (NLM) denoising was proposed for the identification of LPI radar signals. Time-domain signals of 12 radar modulation types were simulated, with Gaussian white noise added at various SNRs to replicate complex electronic countermeasure scenarios. On this basis, the noisy radar signals undergo a Choi-Williams distribution (CWD) time-frequency transformation, which converts them into two-dimensional (2D) time-frequency images (TFIs). The TFIs are then denoised with the NLM algorithm. Finally, the denoised data are fed into the designed DEN for training and testing, and the classification results are output through a softmax classifier. Simulation results demonstrate that at an SNR of -8 dB the algorithm achieves a recognition accuracy of 97.22% for LPI radar signals, exhibiting excellent performance under low SNR conditions. Comparative experiments show that the DEN has good robustness and generalization under small-sample conditions. This research provides a novel and effective solution for further improving the accuracy of LPI radar signal identification.
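A hedged sketch of the preprocessing chain above: a noisy pulse is mapped to a 2D time-frequency image and smoothed with non-local means before classification. scipy's spectrogram is used here as a stand-in for the CWD transform (which SciPy does not provide), and the chirp parameters, noise level, and NLM settings are all illustrative.

```python
# Hedged sketch: noisy pulse -> TFI -> NLM denoising.
import numpy as np
from scipy.signal import spectrogram, chirp
from skimage.restoration import denoise_nl_means, estimate_sigma

fs = 1e4
t = np.arange(0, 0.1, 1 / fs)
clean = chirp(t, f0=500, t1=0.1, f1=3000)       # LFM-like radar pulse
noisy = clean + 0.8 * np.random.randn(len(t))   # low-SNR condition

# 2D time-frequency image (TFI); the paper uses CWD instead.
_, _, tfi = spectrogram(noisy, fs=fs, nperseg=64, noverlap=48)
tfi = tfi / tfi.max()                           # normalize to [0, 1]

# Non-local means denoising of the TFI before it is fed to the classifier.
sigma = float(np.mean(estimate_sigma(tfi)))
tfi_dn = denoise_nl_means(tfi, patch_size=5, patch_distance=6,
                          h=1.15 * sigma, sigma=sigma)
print(tfi.shape, tfi_dn.shape)
```

NLM suits TFIs because the modulation trace repeats similar local patches along its ridge, so averaging over similar patches suppresses noise while preserving the ridge structure the classifier depends on.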
Naive-LSTM enabled service identification of edge computing in the power Internet of Things
Increasing edge computing services present great challenges and demands for the current power Internet of Things (Power IoT), which must deal with the considerable diversity and complexity of these services. To improve the match between edge computing and complex services, a service identification function is necessary for the Power IoT. In this paper, a naive long short-term memory (Naive-LSTM) based service identification scheme for edge computing devices in the Power IoT was proposed, where the Naive-LSTM model adopts the most simplified structure and a discretized form of the long short-term memory (LSTM) model. Moreover, the Naive-LSTM based service identification scheme generates a probability output that determines the task schedule policy of the Power IoT. After training, the Naive-LSTM classification engine modules in the edge computing devices of the Power IoT can perform service identification by extracting key characteristics from various types of service traffic. Testing results show that the Naive-LSTM based service identification scheme is feasible and efficient in improving the edge computing ability of the Power IoT.
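The following is a minimal sketch of an LSTM service classifier with a softmax probability output, in the spirit of the Naive-LSTM engine described above. The single-layer, small-hidden-size structure reflects the "most simplified" design the abstract mentions; the feature count, class count, and window-statistics input are assumptions, and the paper's discretization step is not reproduced here.

```python
# Hedged sketch: lightweight LSTM classifier for traffic windows.
import torch
import torch.nn as nn

class NaiveLSTMClassifier(nn.Module):
    def __init__(self, num_features=8, hidden_size=16, num_services=5):
        super().__init__()
        # One LSTM layer keeps the model light enough for edge devices.
        self.lstm = nn.LSTM(num_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, num_services)

    def forward(self, x):
        # x: (batch, time_steps, num_features) of per-window traffic
        # statistics (e.g., packet counts, mean sizes, inter-arrivals).
        _, (h_n, _) = self.lstm(x)
        logits = self.head(h_n[-1])     # last layer's final hidden state
        # Probability output used to drive the task-schedule policy.
        return torch.softmax(logits, dim=-1)

model = NaiveLSTMClassifier()
probs = model(torch.randn(4, 20, 8))    # 4 flows, 20 time windows each
print(probs.shape, probs.sum(dim=-1))   # each row sums to 1
```

Emitting class probabilities rather than a hard label matches the scheme's use of the output to weigh scheduling decisions: an uncertain identification can be deferred or handled conservatively.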