Deep learning image-based automated application on classification of tomato leaf disease by pre-trained deep convolutional neural networks

Agriculture is one of the major sectors in India, which is well known for producing many varieties of spices, fruits, vegetables, and herbs. Plant diseases are increasing for a variety of reasons, pollution among them. Tomato is a high-demand crop that is produced in large quantities, and it is affected by many diseases caused by viruses, fungi, and bacteria. In this project, we propose a model that identifies diseases of tomato plants from images of their leaves. Our main goal is to develop a model with good accuracy and a mobile application that works with or without an internet connection for users, especially farmers. A Convolutional Neural Network (CNN) based approach is used to create the model. The proposed model achieves 98% accuracy and is converted to a TF Lite model for use in the application. The application can accurately predict the disease of a tomato leaf and suggest a treatment for it.


Introduction
India is a country known for agriculture, with 38% of its land area suitable for cultivation. It is the world's second-largest producer of both rice and wheat. Cereals account for the largest share of the Indian agriculture market, at almost 46%. India also ranks second among nations producing tomato (Solanum lycopersicum), with a cultivation area of 852 thousand hectares [1][2][3]. Tomato ranks second among the crops affected by disease, with 12.8% of incidents. Tomato generally grows on almost any well-drained soil; it is a Rabi crop in the plains, whereas in hilly regions it can be grown in the summer and rainy seasons [4]. Andhra Pradesh has the highest tomato production, at 21 million metric tons, and it is the second most affected state, with 13.9% of incidents [5,6]. Along with production, the frequency with which crops are affected by diseases is also increasing. Plants contract diseases in different ways, and not all of them can be identified through a farmer's expertise alone. Traditionally, farmers monitored plants manually and treated them based on those observations [7]. This requires a lot of work and time, and the resulting diagnosis is not always accurate. Deep learning is widely used across many sectors, and it allows diseases to be predicted from visually observable patterns [8,9]. Crop monitoring is very important, and deep learning makes it easier. Using images of various diseases together with deep learning, we can predict the disease with good accuracy [10,11]. A mobile application makes this capability handy for farmers. The application lets users capture or choose an image on their phone and predicts the disease. In addition, symptoms and treatments are provided for each disease to give basic knowledge to newcomers to agriculture.

Objectives
The objectives of the proposed project are as follows:
• To design an efficient system that detects tomato diseases from leaf pictures and can be used in farms and nurseries.
• To perform inference using the TensorFlow Lite Java API.
• To develop an app that works without an internet connection, which is helpful in farms and in places with low network bandwidth.
• To provide users with disease symptoms and their treatments.

Background and Literature Survey
There has been considerable research on using deep learning for automated disease detection and classification in various crops, including tomato plants.
In this literature survey, we focus on recent studies that use pre-trained deep convolutional neural networks (CNNs) for tomato leaf disease classification.

Methodology
The block diagram below illustrates the system architecture of this work.

Dataset
The tomato leaf disease images are taken from the PlantVillage dataset. This dataset consists of 14,529 images belonging to 10 different classes. All images are in the RGB color space. Sample images from each class, along with the disease names, are shown in Fig. 2. In this project we consider nine tomato diseases: (1) Bacterial Spot, (2) Early Blight, (3) Late Blight, (4) Leaf Mold, (5) Septoria Leaf Spot, (6) Mosaic Virus, (7) Yellow Leaf Curl Virus, (8) Target Spot, and (9) Spider Mites (Two-spotted Spider Mite).

Dataset Configuration
Dataset performance is improved with buffered prefetching, which yields data from disk without blocking on I/O. The cache() method keeps the leaf images in memory after they are loaded from disk during the first epoch. The prefetch() method overlaps data pre-processing with model execution during training.
We use the AUTOTUNE setting so that TensorFlow dynamically chooses the number of batches to prefetch.
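The configuration described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the random in-memory tensors stand in for the leaf images, and the batch size, shuffle buffer, and the helper name `configure_for_performance` are assumptions.

```python
import numpy as np
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE  # lets tf.data choose the prefetch buffer size at runtime


def configure_for_performance(ds, shuffle_buffer=0):
    """Cache, optionally shuffle, and prefetch a batched dataset (hypothetical helper)."""
    ds = ds.cache()  # keep elements in memory after the first full pass
    if shuffle_buffer > 0:
        ds = ds.shuffle(shuffle_buffer)  # shuffles whole batches here
    return ds.prefetch(buffer_size=AUTOTUNE)  # overlap input preparation with training


# Stand-in data: 8 random RGB "images" of 180x180 pixels with labels in 0..9.
images = np.random.rand(8, 180, 180, 3).astype("float32")
labels = np.random.randint(0, 10, size=(8,)).astype("int64")

train_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)
train_ds = configure_for_performance(train_ds, shuffle_buffer=8)

batch_shapes = [tuple(x.shape.as_list()) for x, _ in train_ds]
print(batch_shapes)
```

In a real pipeline the dataset would more likely come from an image directory loader than from in-memory arrays; the caching and prefetching calls stay the same.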

Pre-processing of the Dataset
To increase the diversity of the data during training, we use data augmentation. Applying transformations such as rotation, cropping, flipping, zooming, and contrast changes generates new samples from the existing training samples, which exposes the model to more aspects of the data. This is an effective way to reduce overfitting, and it can be applied by adding preprocessing layers as part of the model. In the proposed model, Keras preprocessing layers such as random zoom, random rotation, and random flip are used. Fig. 3 shows augmented samples obtained by applying data augmentation several times to the same plant image.
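A minimal sketch of such an augmentation stack is given below. The layer types follow the description above, but the specific factors, the flip mode, and the random stand-in image are assumptions chosen only for illustration.

```python
import numpy as np
import tensorflow as tf

# Augmentation stack built from Keras preprocessing layers.
# The factors (0.1) and the flip mode are assumptions, not values from the paper.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),  # up to +/-10% of a full turn (36 degrees)
    tf.keras.layers.RandomZoom(0.1),      # zoom in or out by up to 10%
])

# Stand-in for one 180x180 RGB leaf image, with a batch dimension of 1.
image = np.random.rand(1, 180, 180, 3).astype("float32")

# training=True makes the layers apply their random transforms outside of fit().
augmented = [data_augmentation(image, training=True) for _ in range(9)]

shapes = {tuple(int(d) for d in a.shape) for a in augmented}
print(len(augmented), shapes)
```

Each call draws new random parameters, so applying the stack repeatedly to one image yields several different variants with the same shape, which is how a grid of augmented samples can be produced.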

Classification
In the proposed method, a CNN architecture is used, and all layers and filters are built from scratch. The dataset has 14,529 images belonging to 10 classes. It is split in an 80:20 ratio, giving 11,624 images for training and 2,905 images for validation. The model was trained for 50 epochs on images resized to 180x180 pixels. Convolutional and pooling layers are used for feature extraction, whereas fully connected layers are used for classification. Activation layers introduce non-linearity into the network: ReLU is used in the intermediate layers and softmax in the last layer. To improve on earlier results, the model is trained with the Adam optimizer and the sparse categorical cross-entropy loss. The architecture of the best model for this project is shown in Fig. 4. Finally, the model is converted to a TF Lite model and embedded into a mobile application developed using Android Studio. Table 1 gives the details of the model used, and Table 2 gives the model summary.
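The description above can be illustrated with the following sketch. It is not the architecture from Fig. 4 or Table 2: the number of blocks, the filter counts, the dense width, and the dropout rate are assumptions, and `split_counts` and `build_model` are hypothetical helper names. The split helper floors the validation share, which is one rounding convention that reproduces the reported 11,624 / 2,905 counts.

```python
import tensorflow as tf

NUM_IMAGES = 14529  # dataset size reported in the text
NUM_CLASSES = 10
IMG_SIZE = 180


def split_counts(total, val_fraction=0.2):
    """Return (train, validation) sizes, flooring the validation share."""
    num_val = int(val_fraction * total)
    return total - num_val, num_val


def build_model(dropout_rate=0.2):
    """Small CNN: conv/pool blocks extract features, dense layers classify."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3)),
        tf.keras.layers.Rescaling(1.0 / 255),  # map pixel values to [0, 1]
        tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Dropout(dropout_rate),  # assumed rate
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


model = build_model()
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

train_count, val_count = split_counts(NUM_IMAGES)
print(train_count, val_count)
print(model.output_shape)
```

Because the last layer already applies softmax, the loss is used with its default setting that expects probabilities rather than raw logits, and integer class labels can be passed directly.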

Deployment of the Model
This section describes the integration of the best CNN model into a mobile application, where it supports decision-making. The final model is deployed in a mobile application so that farmers and others can classify tomato leaf diseases on their smartphones. TF Lite is used to deliver user-friendly and efficient software that works with or without the internet. Because CNN models are complex, heavy, and memory-intensive, the proposed model is converted to a lighter format using TensorFlow Lite.
Converting the CNN model to the TF Lite format does not affect accuracy, reduces the file size, and increases execution speed. Inference is the process of executing a TF Lite model on a device to make predictions from the given input data. The TF Lite model can be imported directly into Android Studio, and the Java programming language is used for the app's functionality. XML is used for the app interface, and the disease class is predicted after inference runs. A mobile application was chosen because smartphones are now widespread and easy to carry. The app needs no internet connection, so the class is predicted as soon as a picture is captured or uploaded.
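The conversion and inference steps described above can be sketched in Python as follows. The tiny untrained model is only a stand-in for the trained CNN, and the random input replaces a real leaf image; the Android application itself would run the converted model through the TF Lite Java API rather than the Python interpreter shown here.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in classifier; in practice this would be the trained CNN.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(180, 180, 3)),
    tf.keras.layers.Conv2D(4, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Convert the Keras model to the TF Lite flatbuffer format (no quantization).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()

# Load the converted model and run one inference.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
input_info = interpreter.get_input_details()[0]
output_info = interpreter.get_output_details()[0]

image = np.random.rand(1, 180, 180, 3).astype(np.float32)
interpreter.set_tensor(input_info["index"], image)
interpreter.invoke()
tflite_probs = interpreter.get_tensor(output_info["index"])

# Compare against the original Keras model on the same input.
keras_probs = model.predict(image, verbose=0)
max_diff = float(np.max(np.abs(keras_probs - tflite_probs)))
print(tflite_probs.shape, max_diff)
```

For a plain float conversion like this one, the two sets of probabilities should agree up to small numerical differences. Writing `tflite_bytes` to a file would give the model file that is bundled with the app.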

Implementation Analysis
The software details of the proposed system are described in this section. The environments used for this project are Android Studio and Google Colab Pro, with Java and Python as their respective programming languages.

• Google Colab Pro
Python is the programming language, and TensorFlow, Matplotlib, NumPy, and OpenCV are the packages used for developing the model. Matplotlib is used for plots, and OpenCV is used for reading images from storage. TensorFlow is used for developing the model and for converting it into a TF Lite model. Standard Google Colab is not sufficient to process about 14,000 images and train a model, so Colab Pro, which costs around $10 (including all taxes), is required because it provides more RAM, GPU, and storage.
• Android Studio
XML is used for designing the application interface, and the Java programming language is used for the app functionality. Permissions such as camera, storage, and location are declared in the manifest file.

Experimental Results
The final model is chosen based on the accuracy and loss on the training and validation datasets. Initially, the experiment runs for 30 epochs, with a steady increase in accuracy and a steady decrease in loss for both the training and validation datasets. To overcome overfitting, data augmentation is applied and a dropout layer is added. Along with the number of epochs, we experimented with the dropout value, the number of layers, and the image size in pixels. The observations are shown in Table 3, and their plots are shown in Fig. 5. The accuracy and loss curves are for models trained using the Adam optimizer and sparse categorical cross-entropy. The loss is high initially and decreases steadily as the number of epochs increases (Table 3).
The proposed work is a significant contribution to the existing research on automated tomato leaf disease detection using deep learning techniques. As shown in the comparison table, the proposed work achieved an accuracy of 98% using a custom CNN model trained on a dataset of 14,529 tomato leaf images. This accuracy is comparable to the results reported in previous studies that used pre-trained CNN models such as VGG-16, AlexNet, MobileNet, Inception-v3, ResNet-50, and GoogLeNet. Furthermore, the proposed work offers additional features, such as real-time disease detection, symptom display, and treatment recommendations, that are not present in the existing studies. The proposed mobile application can be used offline, making it a suitable tool for farmers and crop cultivators in remote areas. Overall, the proposed work addresses the limitations of existing studies and offers a promising solution for accurate and efficient tomato leaf disease detection.
The model corresponding to Fig. 5(e) in the above plots is selected for further performance evaluation. The selected final model is converted into the TF Lite format and embedded into a mobile application using Android Studio.
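As a small worked example of the two quantities tracked during these experiments, the sketch below computes sparse categorical cross-entropy and accuracy by hand. The three-class probabilities are made-up numbers standing in for model outputs early and late in training; they are not results from the paper.

```python
import math


def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability assigned to the true class) over the batch."""
    losses = [-math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)


# Hypothetical predictions for two samples whose true classes are 0 and 2.
labels = [0, 2]
early = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3]]      # uncertain, one mistake
late = [[0.9, 0.05, 0.05], [0.05, 0.05, 0.9]]   # confident and correct

early_loss = sparse_categorical_crossentropy(early, labels)
late_loss = sparse_categorical_crossentropy(late, labels)
early_acc = accuracy(early, labels)
late_acc = accuracy(late, labels)

print(round(early_loss, 3), early_acc)
print(round(late_loss, 3), late_acc)
```

The loss falls as the probability assigned to the correct class rises, which is the pattern expected in loss curves as training progresses.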
With the user-friendly app design, users can easily understand how to use the application. The Signup activity handles user registration, and the Login activity handles user login. The home screen contains a camera icon for capturing a picture of the diseased leaf and a file icon for choosing an existing image. As soon as a picture is uploaded, the classification result is shown immediately along with a confidence percentage, as shown in Fig. 8. Information such as the symptoms and treatments related to the disease is also available. The Settings feature has several sub-features: Edit Profile, Manage Accounts, and Logout are for personal information management, while About, Privacy Policy, and Terms and Conditions provide more information about the app.
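The step from the model's output to the displayed class and confidence percentage can be sketched as below. The label order is an assumption, as is the presence of a "Healthy" class as the tenth label (the text names only the nine diseases); `top_prediction` is a hypothetical helper, and the probability vector is a made-up example.

```python
# Assumed label order; "Healthy" as the tenth class is also an assumption.
CLASS_NAMES = [
    "Bacterial Spot", "Early Blight", "Late Blight", "Leaf Mold",
    "Septoria Leaf Spot", "Mosaic Virus", "Yellow Leaf Curl Virus",
    "Target Spot", "Two-spotted Spider Mite", "Healthy",
]


def top_prediction(probs):
    """Return the most likely class name and its confidence in percent."""
    if len(probs) != len(CLASS_NAMES):
        raise ValueError("expected one probability per class")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CLASS_NAMES[best], 100.0 * probs[best]


# Made-up softmax output for one image (values sum to 1).
probs = [0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

name, confidence = top_prediction(probs)
message = f"{name}: {confidence:.1f}%"
print(message)
```

Since the final layer is a softmax, the output values can be read directly as class probabilities, so the confidence shown to the user is simply the largest one expressed as a percentage.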

Conclusion and Future Work
A mobile application for real-time tomato leaf disease detection is the final system presented by this project. To build this system, a CNN model is first trained on 14,529 tomato leaf images and optimized for mobile deployment with good accuracy. The performance of the model is evaluated using the training accuracy, the validation accuracy (around 98%), and the validation loss. The application classifies tomato leaf diseases with high accuracy and, after detection, displays the symptoms and treatments. Various organic and chemical controls are displayed to treat the disease. For chemical control, products are suggested according to their toxicity level, along with their components. Farmers and other crop cultivators can use the application without the internet, so it can also be used in remote areas. The future scope of this project is to continuously improve the existing model by adding more tomato diseases and diseases of other crops. With location access, we can add voice support that translates the text into the language spoken locally. Based on the accessed location, we can add further beneficial features such as suggesting the next seasonal crop, productivity improvements, pesticides and their effects, and call support from nearby agriculture officers. By contacting local store owners, we can add purchase support so that farmers can buy the required products from a local or nearby store.