Abstract: In recent years, diseases and pests have caused huge losses in agricultural production. Accurate identification of crop diseases and timely protection are important measures for ensuring crop yield. Traditional methods of diagnosing agricultural diseases typically depend on the expertise and judgment of specialists. This approach relies on subjective human perception, which is prone to error and cannot guarantee timeliness; the optimal time to treat a disease may be missed, resulting in financial losses. The development of deep learning and neural networks has brought new technologies to the diagnosis of agricultural diseases. However, some large-scale neural networks cannot be deployed on mobile terminals to detect crop diseases in realistic settings because of their low identification accuracy and huge number of parameters. To address the large size and low accuracy of traditional crop disease recognition models, we proposed a Lightweight Multi-scale Attention Convolutional Neural Network (LMA-CNNs). First, to reduce the number of parameters and keep the model lightweight, depthwise separable convolution was adopted as the main structure of the network; second, a residual attention module and a multi-scale feature fusion module were designed on the basis of depthwise separable convolution; at the same time, the Leaky ReLU activation function was introduced to enhance the extraction of negative-valued features. By embedding channel and spatial attention mechanisms, the residual attention module increased the weight of useful feature information, weakened the weight of interference such as noise, and improved the network's recognition of important features. Residual connections effectively prevented network degradation.
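The parameter savings from replacing a standard convolution with a depthwise separable one can be illustrated with a quick count. This is a minimal sketch; the 3×3 kernel and the channel sizes are illustrative choices, not the paper's actual layer configuration:

```python
# Parameter counts (weights only, no biases) for a standard convolution
# versus a depthwise separable convolution. Sizes are illustrative.
def standard_conv_params(k, c_in, c_out):
    # One k x k filter per (input channel, output channel) pair.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # Depthwise step: one k x k filter per input channel,
    # followed by a 1 x 1 pointwise convolution that mixes channels.
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)        # 73728
sep = depthwise_separable_params(k, c_in, c_out)  # 8768
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

For these sizes the depthwise separable layer needs roughly an eighth of the weights, which is why stacking such layers keeps the overall model small.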
The multi-scale feature fusion module used convolution kernels of different scales to extract disease features at multiple scales, which improved the richness of the features. The experimental results showed that the accuracy of the LMA-CNNs model on a test set of 59 types of disease images was 88.08%, with only 0.14×10^7 parameters. In comparative experiments, the LMA-CNNs model outperformed ResNet34, ResNeXt, ShuffleNetV2, MobileNetV3, and the recently popular Vision Transformer. This study further verified the effectiveness of the LMA-CNNs model by comparing it with network models designed by other researchers on the same dataset. The comparative experiments showed that the LMA-CNNs model reduced the number of parameters while improving accuracy. To address the poor interpretability of neural network models, this study used Grad-CAM to visualize the features extracted by the intermediate layers of the model, using the visualization results to explain the feature information captured by different convolutional layers. As the number of layers increased, the LMA-CNNs model paid more attention to the diseased area. In summary, the LMA-CNNs model could extract more disease feature information, better balance model complexity against recognition accuracy, and provide a reference for mobile crop disease recognition. In the future, we will continue to optimize the algorithm, deploy the model on mobile terminals to detect crop diseases in real-field scenarios, and improve detection accuracy and efficiency.