Modeling and estimation of solar radiation of Karachi through artificial neural network (ANN) using temperature and dew-point

The most influential source of energy in our lives is solar energy. Solar energy reaches the earth in three different forms, i.e., global, diffused, and direct solar radiation. The solar flux at the earth's surface depends on the intensity of these radiations and is a function of latitude and longitude. The earth's temperature, and hence the dew point, is greatly affected by solar flux. This idea is used to predict solar radiation from four input parameters: temperature, dew point, day number, and month. The prediction method used in the study is an Artificial Neural Network (ANN). The ANN has four input variables, ten neurons in the hidden layer, and three output parameters: GSR, DSR, and DBR. Six criteria, namely Root Mean Square Error (RMSE), Mean Absolute Error (MABE), Mean Percent Error (MAPE), Chi-square, Coefficient of Determination, and Kolmogorov-Smirnov, have been calculated for the training, testing, and validation modes to check the accuracy of estimation. The values of all the errors are low, which indicates that the prediction of solar radiation is reliable.


Introduction
Solar radiation has a socio-economic impact on our daily lives and on various industries, such as photovoltaics, manufacturing, farming, and architectural design. Its forecasting is therefore of great importance to scientists and users, for both research and routine tasks. Researchers have developed several strategies to predict solar radiation. Qiu et al. proposed an XGBoost model, which combines temperature and geographical data to estimate daily radiation for areas where historical data is not accessible [1]. Using a long short-term memory (LSTM) network and a gated recurrent unit (GRU) network, Singla et al. developed two deep learning models for forecasting solar irradiance globally; these networks are trained using the climatic variables dew point, pressure, temperature, solar zenith angle, relative humidity, wind speed, and precipitation. The experimental outcomes demonstrated the effectiveness of the GRU and LSTM networks [2]. Gouda et al. found that temperature and dew point enhance the performance of the models in humid environments [3]. Choudhary et al. trained networks with the Levenberg-Marquardt, Bayesian Regularization, and Scaled Conjugate Gradient algorithms and found that, in all three instances, the network's overall performance in predicting solar radiation is quite good [4]. Li et al. compared different models and suggested that the model based on temperature, precipitation, and dew point performed better in spring [5]. The solar radiation data can also be utilized to calculate the solar energy potential at Damak. According to Shrestha et al., the correlation matrix reveals that the dew point is highly correlated with air temperature, as opposed to relative humidity [6]. Among several methods, Munir A. et al. conclude that the Artificial Neural Network (ANN) is a very useful tool for accurate solar radiation forecasting. Their analysis revealed that temperature and dew point are the most suitable and accurate parameters for predicting solar radiation [7]. Akhlaghi et al. developed an understandable and interpretable Deep Neural Network (DNN) model for a Guideless Irregular Dew Point Cooler (GIDPC) in diverse climates in 2020 and 2050 [8]. Qazi et al. concluded that neural networks and adaptive neuro-fuzzy inference systems improve prediction accuracy for hourly and monthly solar radiation estimates, respectively. They determined that more study on ANN and its applications is necessary, and that the encouraging outcomes produced so far may aid the employment of ANN in industry [9].
Temperature-based models are well suited to forecasting global solar radiation on a horizontal surface at various locations. The findings of Hassan et al. demonstrate the importance and applicability of novel temperature-based models for the quick and precise estimation of the monthly average daily global solar radiation on a horizontal surface [10]. Ekici and Teke modeled total global solar radiation using parameters such as dew point temperature, visibility, and maximum and minimum air temperatures, and obtained accurate results [11]. Dong et al. concluded that the most fundamental meteorological variables were temperature and humidity, and that adding unnecessary variables adversely affected the model's ability to make predictions. At hourly scales, the factor Month was more significant than the factor Time [12]. Ukhurebor et al. discovered a linear relationship between air temperature and dew point: the air temperature has a considerable impact on the dew point temperature, and a rise in the air temperature causes an increase in the dew point temperature [13]. Sein et al. concluded that winter saw the strongest positive seasonal correlation between daily mean air temperature and dew point temperature, whereas the rainy season saw the lowest [14].

Artificial Neural Network
Neural networks have been successfully employed in various fields of science and technology for the last two decades and have proved more useful than traditional statistical tools. In particular, ANN is widely used in atmospheric science, environmental chemistry, and climatology to predict short- and long-term changes with time for locations with known or unknown meteorological data.
The architecture of the neural network model is based on multiple layers of interconnected nodes (neurons). Each neuron has its own activation function, weights, and bias. The layers must include an input layer, one or more hidden layers, and an output layer. The input layer is fed with the data set, which is further divided into training, testing, and validation data sets. Hidden layers perform calculations, while the output layer is responsible for model generation.
An ANN architecture has been built to predict three types of solar radiation in Karachi (fig. 1). The input layer takes the dew point, the temperature, and the day number and month as variables, and each is connected to the neurons in the hidden layer through weights and a bias. The Levenberg-Marquardt algorithm (LMA), which fits nonlinear least-squares curves, is applied to the training data set fed to the input layer. Once the proposed model has been tested against the meteorological data and validated, it is used to make predictions for new input values.
These weights are known as the gradient or coefficient of the variable. The neuron receives data using the following equation:

$$z = \sum_{i=1}^{n} w_i x_i + b \qquad (1)$$

Here $x_i$ are the $n$ variables, $w_i$ are the $n$ weights, and $b$ is the input bias. A non-linear transformation through an activation function is applied to eq. (1) to obtain the final information at the neurons. One such activation function is the sigmoidal function, which is used in the proposed ANN model.
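The computation at a single neuron, i.e., the weighted sum of eq. (1) followed by a sigmoidal activation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the logistic form of the sigmoid, the input scaling, and all numerical values are assumptions made for the example.

```python
import math

def sigmoid(z):
    """Logistic sigmoid (assumed form), mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Hypothetical, already-scaled inputs: day, month, temperature, dew point.
x = [0.5, 0.25, 0.8, 0.6]
# Illustrative weights and bias, not the fitted values of the paper.
w = [0.1, -0.2, 0.4, 0.3]
b = 0.05

a = neuron_output(x, w, b)  # activation passed on to the next layer
```

Because the sigmoid is bounded, each hidden neuron passes a value between 0 and 1 to the next layer regardless of the magnitude of the weighted sum.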
The hidden layer is connected to the output layer, and its neurons transfer information to the output layer. The output data is compared with the known data, and the error is calculated; if the error does not meet the convergence criterion, the process is repeated using backpropagation. In backpropagation, new weights are calculated until the optimal weights, which minimize the difference between the ANN output and the actual values, are obtained.
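The compare, check, and update cycle described above can be sketched with a small training loop. Note the assumptions: this sketch uses plain per-sample gradient descent rather than the Levenberg-Marquardt algorithm used in the study, a single linear output neuron, and a made-up toy data set; the function names, learning rate, and tolerance are all illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Return the hidden activations and the (linear) network output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sum(w * hj for w, hj in zip(W2, h)) + b2
    return h, y

def mse(data, W1, b1, W2, b2):
    """Mean squared difference between network output and known values."""
    return sum((forward(x, W1, b1, W2, b2)[1] - t) ** 2 for x, t in data) / len(data)

def train(data, n_hidden=3, lr=0.1, tol=1e-4, max_epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = [mse(data, W1, b1, W2, b2)]
    for _ in range(max_epochs):
        if history[-1] < tol:              # convergence criterion met
            break
        for x, t in data:
            h, y = forward(x, W1, b1, W2, b2)
            err = y - t                    # output compared with the known value
            for j in range(n_hidden):
                # error propagated back through the sigmoid of hidden neuron j
                delta = err * W2[j] * h[j] * (1.0 - h[j])
                W2[j] -= lr * err * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * delta * x[i]
                b1[j] -= lr * delta
            b2 -= lr * err
        history.append(mse(data, W1, b1, W2, b2))
    return (W1, b1, W2, b2), history

# Toy data (inputs, target), invented for illustration only.
data = [([0.0, 0.0], 0.1), ([0.0, 1.0], 0.6), ([1.0, 0.0], 0.4), ([1.0, 1.0], 0.9)]
params, history = train(data)
```

In its standard textbook form, the Levenberg-Marquardt method replaces the fixed-step update above with $\mathbf{w}_{k+1} = \mathbf{w}_k - (\mathbf{J}^{\top}\mathbf{J} + \mu \mathbf{I})^{-1}\mathbf{J}^{\top}\mathbf{e}$, where $\mathbf{e}$ is the vector of output errors, $\mathbf{J}$ is its Jacobian with respect to the weights, and $\mu$ is a damping factor; these symbols are not defined in the original text.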

Estimation of solar radiations
Accurate estimation of solar energy is difficult with conventional statistical methods. ANNs offer various architectures to solve this problem, and promising results have been obtained. These architectures may use a minimal set of variables as input data. This paper uses the day, the month, the dew point, and the temperature to estimate three types of solar radiation. An ANN model is developed to estimate diffused solar radiation (DSR), global solar radiation (GSR), and direct beam radiation (DBR) for Karachi city. The model is built with one hidden layer of ten neurons, the number for which the network performance was best.
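The shape of such a network (four inputs, ten hidden neurons, three outputs) can be sketched as a forward pass. The sketch assumes a single network with three linear outputs, random untrained weights, and inputs already scaled to comparable ranges; none of these details is stated in the text, and the helper names are hypothetical.

```python
import math
import random

N_INPUTS, N_HIDDEN, N_OUTPUTS = 4, 10, 3   # sizes stated in the text

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def identity(z):
    return z

def init_layer(n_out, n_in, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def layer(x, weights, biases, activation):
    """Apply one fully connected layer to the input vector x."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def predict(x, hidden, output):
    h = layer(x, *hidden, sigmoid)
    return layer(h, *output, identity)     # linear output layer is an assumption

rng = random.Random(42)
hidden = init_layer(N_HIDDEN, N_INPUTS, rng)
output = init_layer(N_OUTPUTS, N_HIDDEN, rng)

# Hypothetical scaled inputs: day, month, temperature, dew point.
gsr, dbr, dsr = predict([0.5, 0.25, 0.8, 0.6], hidden, output)

# Number of adjustable parameters (weights plus biases) in this layout.
n_params = N_HIDDEN * (N_INPUTS + 1) + N_OUTPUTS * (N_HIDDEN + 1)
```

With these sizes the network has 83 adjustable parameters, which is small relative to three years of daily records.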

Data
Temperature and dew point data for three years (2016, 2018, and 2019; the data for 2017 were incomplete) were used in this study and were provided by the Pakistan Meteorological Department. Three types of solar radiation (DSR, GSR, and DBR) have been estimated from these two meteorological parameters.
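As an illustration of how daily records could be turned into the four network inputs, the sketch below builds one input row per day and drops records outside the three study years. The record layout, the use of day of the month (rather than day of the year), and all numerical values are assumptions; the values are placeholders, not measured data.

```python
from datetime import date

STUDY_YEARS = {2016, 2018, 2019}   # 2017 is excluded in the text as incomplete

def make_input(day, temperature, dew_point):
    """Return the four network inputs: day of month, month, temperature, dew point."""
    return [day.day, day.month, temperature, dew_point]

# Made-up placeholder records (date, temperature, dew point); not measured data.
records = [
    (date(2016, 1, 15), 20.0, 9.0),
    (date(2017, 6, 1), 32.0, 25.0),    # dropped: outside the study years
    (date(2019, 7, 30), 31.0, 26.0),
]

inputs = [make_input(d, t, dp) for d, t, dp in records if d.year in STUDY_YEARS]
```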

Results and Discussion
Every ANN has three components: an input layer, hidden layer(s), and an output layer. The input part of this network consists of four variables: day, month, temperature, and dew point. The earth's temperature and dew point are taken as input parameters because of their close link to solar radiation. A single hidden layer is used, which consists of 10 neurons. Three different types of solar radiation were estimated as the output of the network. Sixty percent of the input data was used for training the network; the remaining 40 percent was used to test the trained network. A random 50 percent of the data was used for validation purposes. The residues have been calculated by taking the absolute difference between the estimated and recorded values of solar radiation. Three years (2016, 2018, and 2019) of daily solar radiation data were used in the study; the data for 2017 were incomplete. Three types of solar radiation, GSR, DSR, and DBR, were estimated and compared for these three years. Figs. 1-3 show daily global radiation, figs. 4-6 show daily direct beam solar radiation, and figs. 7-9 show daily diffused solar radiation for 2016, 2018, and 2019. Each figure has three parts (a, b, c), for the training, testing, and validation modes, along with the corresponding residues. It can be seen that the values of the residues are sufficiently small, which is characteristic of a good ANN.
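The 60/40 partition and the residue calculation described above can be sketched as follows. Whether the original split was random or chronological is not stated, so the shuffling, the seed, and the function names here are assumptions made for illustration.

```python
import random

def split_train_test(samples, train_fraction=0.6, seed=0):
    """Shuffle a copy of the samples and cut it into training and testing parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def residues(estimated, recorded):
    """Absolute difference between estimated and recorded values."""
    return [abs(e - r) for e, r in zip(estimated, recorded)]

samples = list(range(10))              # stand-in for ten daily records
train_set, test_set = split_train_test(samples)

# Two hypothetical estimated/recorded pairs.
example = residues([1.0, 2.5], [1.5, 2.0])
```

Fixing the seed makes the partition reproducible, so the same records land in the training and testing sets on every run.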
For the 2016 training data, the absolute differences for DSR, DBR, and GSR are less than $4 \times 10^{-2}$ %, $4 \times 10^{-1}$ %, and $7 \times 10^{-2}$ %, respectively. For 2018, the same values are less than $5 \times 10^{-2}$ %, $4 \times 10^{-1}$ %, and $8 \times 10^{-2}$ %, respectively, while for 2019 the absolute differences are less than $6 \times 10^{-2}$ %, $4.5 \times 10^{-1}$ %, and $8 \times 10^{-2}$ %, respectively. Similarly, these values are also estimated for testing and validation for the above years (figs. 1-9). For each year, the weight table has five columns; the first four columns give the weights that link the input values to the neurons in the hidden layer, and the fifth column gives the weights that connect the neurons in the hidden layer to the output variable.

Table 1. Weights for Global Solar Radiation Estimation

Conclusion
A four-input ANN has been developed and implemented to estimate three types of solar radiation. Three years of solar radiation data (2016, 2018, and 2019) were used to carry out this research. Daily solar radiation values were used for training, testing, and validation. The ANN has a single hidden layer with ten neurons that receive information from four input parameters, i.e., day of the month, month of the year, temperature, and dew point. Three types of radiation (GSR, DBR, and DSR) are obtained as output from the neurons in the hidden layer. For every year and every type of solar radiation, three plots with residuals were generated, one each for training, testing, and validation. There are 27 such plots; in each case, the correlation coefficient was 0.99. Six criteria were used to check the reliability of the ANN estimation: Root Mean Square Error (RMSE), Mean Absolute Error (MABE), Mean Percent Error (MAPE), Chi-square, Coefficient of Determination, and Kolmogorov-Smirnov. These errors are given in tables 1-3. The low values of the errors support the goodness of the network estimation. The Kolmogorov-Smirnov criterion, which shows the largest difference between estimated and recorded solar radiation values, also has a low value.
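The six criteria can be computed along the following lines. The paper does not give its formulas, so the definitions below are assumptions: MAPE is taken as the mean absolute percentage error, Chi-square as the Pearson form with the recorded value in the denominator, and Kolmogorov-Smirnov as the standard two-sample statistic (largest gap between the two empirical distribution functions). The sample values are invented for illustration.

```python
import math
from bisect import bisect_right

def rmse(est, rec):
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, rec)) / len(rec))

def mabe(est, rec):
    return sum(abs(e - r) for e, r in zip(est, rec)) / len(rec)

def mape(est, rec):
    """Mean absolute percentage error (assumed definition); rec must be non-zero."""
    return 100.0 * sum(abs((e - r) / r) for e, r in zip(est, rec)) / len(rec)

def chi_square(est, rec):
    """Pearson-type chi-square (assumed definition)."""
    return sum((e - r) ** 2 / r for e, r in zip(est, rec))

def r_squared(est, rec):
    """Coefficient of determination."""
    mean_rec = sum(rec) / len(rec)
    ss_res = sum((r - e) ** 2 for e, r in zip(est, rec))
    ss_tot = sum((r - mean_rec) ** 2 for r in rec)
    return 1.0 - ss_res / ss_tot

def ks_statistic(est, rec):
    """Largest gap between the empirical CDFs of the two samples (assumed definition)."""
    s_est, s_rec = sorted(est), sorted(rec)
    return max(abs(bisect_right(s_est, x) / len(s_est) - bisect_right(s_rec, x) / len(s_rec))
               for x in s_est + s_rec)

# Invented values for illustration only.
rec = [10.0, 20.0, 30.0, 40.0]
est = [11.0, 19.0, 31.0, 39.0]
scores = {f.__name__: f(est, rec)
          for f in (rmse, mabe, mape, chi_square, r_squared, ks_statistic)}
```

Five of the six measures approach zero for a perfect estimate, while the coefficient of determination approaches one, so it is read in the opposite direction from the others.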