Wheat yield prediction based on weather parameters using multiple linear, neural network and penalised regression models

Wheat (Triticum aestivum) is the second most consumed staple food grain after rice and is grown widely in the northern part of India. The wheat crop is thermo-sensitive: adverse changes in weather parameters affect crop growth and development and lead to declining trends in yield. Crop yield forecasts are essential for decisions on storage, import and export, and they support government planning and policy-making for managing the produce. Traditionally, crop cutting experiments were widely used for crop yield forecasting. Models provide an alternative: they are fast and cost-effective, and they give an understanding of the factors that affect crop yield. Statistical methods are widely used for crop yield prediction from weather data (Lobell and Burke, 2010; Shi et al., 2013). Singh et al. (2014) used weather indices such as minimum and maximum temperature, rainfall and relative humidity to forecast rice and wheat yield in nine districts of eastern Uttar Pradesh by stepwise regression. Garde et al. (2015)

To overcome the various challenges in crop yield prediction, models were developed in the present investigation using SMLR, PCA-SMLR, ANN, PCA-ANN, LASSO and Elastic Net techniques to improve the accuracy of wheat yield prediction.

Data collection
Weather data, viz. maximum temperature, minimum temperature, rainfall, morning and evening relative humidity, and sunshine hours during the wheat growing period, together with wheat yield data, were collected for Hisar, Ludhiana (1971-2017), Amritsar (1972-2017), Patiala (1972-2016) and IARI, New Delhi (1985-2018). The range of each weather parameter during the wheat growing period at the different locations is given in Table 1. Daily weather data were converted into simple and weighted composite indices before analysis. For each location, 70% of the dataset was used for calibration of the models and the remaining 30% for validation.
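The 70/30 division can be sketched as follows. The text does not say whether the split was chronological or random, nor how fractional counts were rounded, so this sketch assumes a chronological split rounded to the nearest year; the function name is hypothetical.

```python
def split_calibration_validation(records, calib_fraction=0.7):
    """Split a chronologically ordered sequence into calibration and
    validation parts. Assumption: the earliest years calibrate the model
    and the latest years validate it."""
    n_calib = round(len(records) * calib_fraction)
    return records[:n_calib], records[n_calib:]


# Ludhiana series, 1971-2017 (47 years).
years = list(range(1971, 2018))
calib, valid = split_calibration_validation(years)
print(len(calib), len(valid))  # 33 calibration years, 14 validation years
```

With a random split the counts would be the same, but the validation years would be scattered through the record rather than taken from its end.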

Development of wheat yield prediction model using different techniques
Simple and weighted weather indices were developed for each station. Simple weather indices were generated by summing an individual weather variable, or the interaction (product) of two weather variables at a time, over the weeks. Weighted weather indices were generated as the sum product of an individual weather variable, or of the interaction of two weather variables, with its correlation coefficient with adjusted crop yield. The indices were computed using the following formulae, and those used for model development are given in Table 2.

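Simple weather indices:
$$Z_{ij} = \sum_{w=1}^{m} X_{iw}, \qquad Z_{ii'j} = \sum_{w=1}^{m} X_{iw} X_{i'w} \qquad (j = 0)$$
Weighted weather indices:
$$Z_{ij} = \sum_{w=1}^{m} r^{j}_{iw} X_{iw}, \qquad Z_{ii'j} = \sum_{w=1}^{m} r^{j}_{ii'w} X_{iw} X_{i'w} \qquad (j = 1)$$
Where $X_{iw}$ / $X_{i'w}$ = value of the $i$th / $i'$th weather variable in the $w$th week; $r^{j}_{iw}$ / $r^{j}_{ii'w}$ = correlation coefficient of yield with the $i$th weather variable, or with the product of the $i$th and $i'$th weather variables, in the $w$th week; $m$ = week at which the forecast is made; $P$ = number of weather variables ($i, i' = 1, \dots, P$).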
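As an illustration of the index construction described above, the following minimal sketch computes one simple and one weighted index from weekly values of a single weather variable. The function names and the toy data are assumptions made for illustration, and the yields are assumed to be already adjusted (detrended). An interaction index is obtained the same way by passing the week-wise product of two variables.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def weather_indices(weekly, yields):
    """weekly[year][week] holds one weather variable (or the product of two
    variables); yields[year] is the adjusted crop yield.
    Returns (simple, weighted), each with one index value per year."""
    n_weeks = len(weekly[0])
    # Correlation of yield with the variable in each week, taken across years.
    r = [pearson([row[w] for row in weekly], yields) for w in range(n_weeks)]
    simple = [sum(row) for row in weekly]
    weighted = [sum(r[w] * row[w] for w in range(n_weeks)) for row in weekly]
    return simple, weighted


# Toy data: 3 years x 2 weeks of maximum temperature, and yields.
tmax = [[20, 25], [22, 24], [24, 23]]
yields = [30, 32, 34]
simple, weighted = weather_indices(tmax, yields)
print(simple)    # [45, 46, 47]
print(weighted)  # [-5.0, -2.0, 1.0]
```

In the toy data, week 1 correlates perfectly positively with yield and week 2 perfectly negatively, so the weighted index is simply the week 1 value minus the week 2 value.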
The impact of the important weather indices was determined by the stepwise multiple linear regression (SMLR) technique, and wheat yield prediction models were developed from the selected simple and weighted weather indices.
Stepwise multiple linear regression with principal component analysis (PCA-SMLR) combines feature extraction with a variable selection method. Principal component (PC) scores or factors are calculated from the data and used as input variables for stepwise multiple linear regression. PCA is a multivariate technique for data reduction that lessens multicollinearity problems by transforming the original set of correlated variables into a new set of uncorrelated variables. PCs were selected on the basis of their eigenvalues: the components with eigenvalues greater than 1 were retained, and these described more than 90 percent of the variability in the data. The PCA scores were used as input for the SMLR analysis.
An artificial neural network (ANN) consists of many artificial neurons connected together in a specific network architecture. Neural networks have various architectures, such as feed-forward, feedback and lateral networks, to approximate any non-linear function. An ANN is composed of three layers, namely the input layer, hidden layer and output layer. The multilayer perceptron (MLP) is one of the popular neural network types. This network can be interpreted as a form of input-output model, with the weights and thresholds (biases) as the free parameters of the model. Through the learning process it attains optimised weights for the variables and tries to produce the output corresponding to the input provided. The main objective of the neural network is to produce output with minimal discrepancy from the target output value, thereby transforming the input into meaningful output.
In the principal component analysis-artificial neural network (PCA-ANN) technique, data analysis was done by combining feature extraction with the neural network.
Principal component scores or factors calculated from the data are used as the input variables for the ANN.
Least Absolute Shrinkage and Selection Operator (LASSO) is a model selection technique. LASSO models are used to overcome the shortcomings of ordinary least squares (OLS) and ridge regression. LASSO estimators provide consistent regression coefficients and automatic variable selection. By imposing an L1 penalty, LASSO continuously shrinks some coefficients and sets others exactly to zero; it therefore helps to reduce multicollinearity and retains good features of both subset selection and ridge regression. With a large number of predictors, a smaller subset that exhibits the strongest effects makes the data easier to interpret. Subset selection is a discrete and variable process in which regressors are either retained or eliminated from the model in order to give a more interpretable model.
Elastic Net penalises the size of the regression coefficients using both the L1 norm and the L2 norm. The L1 penalty generates a sparse model, while the L2 penalty removes the limitation on the number of selected variables, encourages a grouping effect and stabilises the L1 regularisation path. Alpha and lambda are the two model parameters, which need to be optimised by minimising the average mean square error in cross-validation. The tuning parameter alpha was set to 1 for LASSO and 0.5 for Elastic Net. The "glmnet" package in R was used to fit the LASSO and Elastic Net (ENET) models.
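To make the role of the L1 and L2 penalties concrete, here is a minimal pure-Python sketch of coordinate descent for the penalised least-squares objective in the form glmnet documents. It is not the authors' code and not a substitute for glmnet: the function names and the toy data are assumptions, predictors are only centred (glmnet also standardises them by default), and the penalty strength `lam` is supplied by hand rather than chosen by cross-validation.

```python
def soft_threshold(z, g):
    """Soft-thresholding operator S(z, g) = sign(z) * max(|z| - g, 0)."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
        (1/2N) * sum_i (y_i - b0 - x_i.b)^2
        + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2).
    alpha = 1 gives LASSO; alpha = 0.5 mixes the L1 and L2 penalties equally.
    X is a list of rows; columns of X and y are centred internally."""
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]  # residual with all coefficients at zero
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            xj = [row[j] for row in Xc]
            xx = sum(v * v for v in xj) / n
            if xx == 0.0:
                continue  # constant column carries no information
            # Covariance of x_j with the partial residual
            # (the residual with x_j's own contribution added back).
            rho = sum(v * r for v, r in zip(xj, resid)) / n + xx * beta[j]
            new = soft_threshold(rho, lam * alpha) / (xx + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, xj)]
                beta[j] = new
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_mean))
    return intercept, beta


# Toy data: y depends only on the first predictor (y = 1 + 2 * x1).
X = [[1, 0], [2, 1], [3, 0], [4, 1]]
y = [3, 5, 7, 9]
print(elastic_net(X, y, lam=0.5, alpha=1.0))  # LASSO
print(elastic_net(X, y, lam=0.5, alpha=0.5))  # Elastic Net
```

On this toy data both fits drop the second predictor and shrink the first coefficient below its least-squares value of 2, which is the shrinkage-plus-selection behaviour described above.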
The performance of the statistical models was assessed by calculating R², root mean square error (RMSE), normalized root mean square error (nRMSE) and percentage deviation using the following formulae.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_i - O_i)^2}$$
$$\mathrm{nRMSE} = \frac{\mathrm{RMSE}}{M} \times 100$$
$$\text{Percentage deviation} = \frac{(P_i - O_i) \times 100}{O_i}$$
Where $P_i$ is the predicted value, $O_i$ is the observed value, $N$ is the number of observations and $M$ is the mean of the observed values. Model performance is rated excellent when nRMSE is less than 10%, good when it is between 10 and 20%, and fair when it is between 20 and 30%.
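The evaluation measures defined above can be sketched in a few lines. The function names are hypothetical. The handling of the class boundaries (for example, whether exactly 10% counts as excellent or good) and the label for values above 30% are assumptions, since the text does not specify them.

```python
from math import sqrt


def rmse(pred, obs):
    """Root mean square error between predicted and observed values."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def nrmse(pred, obs):
    """RMSE as a percentage of the mean observed value."""
    mean_obs = sum(obs) / len(obs)
    return 100.0 * rmse(pred, obs) / mean_obs


def percentage_deviation(pred, obs):
    """Signed deviation of each prediction from its observation, in percent."""
    return [100.0 * (p - o) / o for p, o in zip(pred, obs)]


def rating(nrmse_pct):
    """Map nRMSE (%) to a performance class (boundary handling assumed)."""
    if nrmse_pct < 10:
        return "excellent"
    if nrmse_pct < 20:
        return "good"
    if nrmse_pct < 30:
        return "fair"
    return "unrated"  # the text does not name a class above 30%


# Toy example: two predicted and observed yields in kg/ha.
pred = [4100, 3900]
obs = [4000, 4000]
print(rmse(pred, obs))                  # 100.0
print(nrmse(pred, obs))                 # 2.5
print(percentage_deviation(pred, obs))  # [2.5, -2.5]
print(rating(nrmse(pred, obs)))         # excellent
```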

Performance of wheat yield prediction model developed by different techniques
Yield prediction models for wheat were developed using long-term crop yield data together with long-period daily weather data for the crop growing period (46th to 15th standard meteorological week) at the respective locations. The coefficient of determination (R²) was significant at the 1% probability level for all locations. Model performance was categorised on the basis of RMSE and nRMSE during validation and is presented below for each location.

Hisar
The performance of the models developed using different techniques for wheat yield prediction at Hisar is shown in Table 3. During calibration, the coefficient of determination (R²) ranged from 0.75 for PCA-SMLR to 0.96 for PCA-ANN, and RMSE ranged from 82.0 kg ha⁻¹ for PCA-ANN to 217.2 kg ha⁻¹ for PCA-SMLR. The lowest calibration nRMSE was found for PCA-ANN (2.15%), followed by ANN (3.49%), SMLR (3.75%), LASSO (4.27%), Elastic Net (4.41%) and PCA-SMLR (5.69%). During validation, RMSE ranged from 313.9 kg ha⁻¹ for SMLR to 586.3 kg ha⁻¹ for PCA-SMLR. Based on nRMSE values during validation, the model predictions were excellent for SMLR (3.75%), LASSO (4.27%), Elastic Net

Ludhiana
The performance of the models developed using different techniques for wheat yield prediction at Ludhiana is shown in Table 4. During calibration, R² ranged from 0.84 for PCA-ANN to 0.94 for Elastic Net. The calibration RMSE was lowest for SMLR (159.5 kg ha⁻¹), followed by Elastic Net (175.3 kg ha⁻¹), LASSO (182.9 kg ha⁻¹), PCA-SMLR (208.2 kg ha⁻¹), ANN (286.4 kg ha⁻¹) and PCA-ANN (312.8 kg ha⁻¹), and the calibration nRMSE ranged from 4.3% for SMLR to 8.0% for PCA-ANN. During validation, RMSE ranged from 318.5 kg ha⁻¹ for PCA-SMLR to 1070.6 kg ha⁻¹ for ANN. Based on nRMSE values during validation, the model predictions were excellent for PCA-SMLR (6.66%), Elastic Net (7.80%) and LASSO (9.54%), good for SMLR (10.11%) and PCA-ANN (12.11%), and fair for ANN (22.36%). The most important weather parameters identified using SMLR for Ludhiana were Z151 and Z361. For the PCA-SMLR model, time was the most important parameter influencing crop yield, followed by PC2. For the ANN model, the optimum number of hidden neurons was 9. For the PCA-ANN model, the number of PCs and the optimum number of hidden neurons were 7 and 1, respectively. Using LASSO the

Amritsar
The performance of the models developed using different techniques for wheat yield prediction at Amritsar is shown in Table 5. During calibration, R² ranged from 0.81 for ANN to 0.95 for SMLR, and RMSE ranged from 150.7 kg ha⁻¹ for SMLR to 366.3 kg ha⁻¹ for ANN. The nRMSE was lowest for SMLR (4.79%), followed by Elastic Net (5.38%), LASSO (5.56%), PCA-SMLR (6.46%), PCA-ANN (7.56%) and ANN (11.28%). During validation, RMSE was lowest for Elastic Net (423.9 kg ha⁻¹), followed by LASSO (427.4 kg ha⁻¹), PCA-SMLR (529.0 kg ha⁻¹), SMLR (573.8 kg ha⁻¹), PCA-ANN (606.3 kg ha⁻¹) and ANN (853.6 kg ha⁻¹). Based on nRMSE values during validation, the model predictions were excellent for Elastic Net (9.50%) and LASSO (9.58%), and good for PCA-SMLR (11.85%), SMLR (12.85%), PCA-ANN (13.66%) and ANN (19.23%). The most important weather parameters identified using SMLR for Amritsar were Z121, Z131 and Z250. For the PCA-SMLR model, time was the most important parameter influencing crop yield, followed by PC2. For the ANN model, the optimum number of hidden neurons was 7. For the PCA-ANN model, the number of PCs and the optimum number of hidden neurons were 6 and 1, respectively. Using LASSO, the most influential weather parameters for wheat yield prediction at Amritsar were Z121, Z131, Z150, Z151 and Z50. Using Elastic Net, the most influential weather parameters were Z21, Z31, Z121, Z131, Z150, Z151 and Z341. On the basis of RMSE and nRMSE during validation, Elastic Net performed best for Amritsar, followed by LASSO, PCA-SMLR, SMLR, PCA-ANN and ANN.

Patiala
The performance of the models developed using different techniques for wheat yield prediction at Patiala is shown in Table 6. During calibration, R² was lowest for ANN (0.92), followed by PCA-ANN (0.93), PCA-SMLR (0.94), SMLR (0.95), LASSO (0.98) and Elastic Net (0.98). The calibration RMSE ranged from 109.3 kg ha⁻¹ for Elastic Net to 264.4 kg ha⁻¹ for PCA-ANN. The nRMSE was lowest for Elastic Net (3.2%), followed by LASSO (3.28%), SMLR (4.77%), PCA-SMLR (5.75%), ANN (7.12%) and PCA-ANN (7.67%). During validation, RMSE was lowest for ANN (365.6 kg ha⁻¹), followed by Elastic Net (740.0 kg ha⁻¹), PCA-SMLR (743.5 kg ha⁻¹), PCA-ANN (749.4 kg ha⁻¹), LASSO (772.6 kg ha⁻¹) and SMLR (931.2 kg ha⁻¹). Based on nRMSE values during validation, the model predictions were excellent for ANN (7.87%), good for Elastic Net (15.96%), PCA-SMLR (15.99%), PCA-ANN (16.14%) and LASSO (16.66%), and fair for SMLR (20.03%). The most important weather parameters identified using SMLR for Patiala were Z141 and Z20. For the PCA-SMLR model, time was the most important parameter influencing crop yield, followed by PC1. For the ANN model, the optimum number of hidden neurons was 6. For the PCA-ANN model, the number of PCs and the optimum number of hidden neurons were 5 and 1, respectively. Using LASSO, the most influential weather parameters for wheat yield prediction at Patiala were Z11, Z20, Z21, Z41, Z130 and Z141, while Z120 had a negative influence on wheat yield. Using Elastic Net, the most influential weather parameters were Z11, Z21, Z41, Z51, Z141, Z241 and Z451. On the basis of RMSE and nRMSE during validation, ANN performed best for Patiala, followed by Elastic Net, PCA-SMLR, PCA-ANN, LASSO and SMLR.

IARI, New Delhi
The performance of the models developed using different techniques for wheat yield prediction at IARI, New Delhi is shown in Table 7. During calibration, R² ranged from 0.80 for PCA-SMLR to 0.98 for LASSO. The calibration RMSE was lowest for LASSO (45.8 kg ha⁻¹), followed by Elastic Net (79.1 kg ha⁻¹), ANN (122.9 kg ha⁻¹), SMLR (136.9 kg ha⁻¹), PCA-SMLR (157.6 kg ha⁻¹) and PCA-ANN (161.8 kg ha⁻¹). During calibration, all models had nRMSE values below 10%, with the lowest value for LASSO (1.35%), followed by Elastic Net (2.33%), ANN (3.57%), SMLR (4.04%), PCA-SMLR (4.65%) and PCA-ANN (4.7%). During validation, RMSE was lowest for LASSO (258.0 kg ha⁻¹), followed by PCA-SMLR (260.6 kg ha⁻¹), Elastic Net (351.9 kg ha⁻¹), SMLR (382.7 kg ha⁻¹), PCA-ANN (618.4 kg ha⁻¹) and ANN (656.8 kg ha⁻¹). Based on nRMSE values during validation, the model predictions were excellent for LASSO (6.11%), PCA-SMLR (6.2%), Elastic Net (15.96%) and SMLR (9.06%), and good for PCA-ANN (14.58%) and ANN (15.48%). The most important weather parameter identified using SMLR for IARI, New Delhi was Z341. For the PCA-SMLR model, time was the most important parameter influencing crop yield, followed by PC3. For the ANN model, the optimum number of hidden neurons was 8. For the PCA-ANN model, the number of PCs and the optimum number of hidden neurons were 10 and 2, respectively. Using LASSO, the most influential weather parameters for wheat yield prediction at IARI, New Delhi were Z10, Z120, Z131, Z141, Z151, Z240, Z241, Z261, Z271, Z470 and Z670, while Z50, Z60, Z70, Z71, Z160, Z360, Z450 and Z560 had a negative influence on wheat yield prediction.
Using Elastic Net, the most influential weather parameters for wheat yield prediction were Z121, Z171 and Z471, while Z60, Z160, Z360 and Z460 had a negative influence on wheat yield prediction. On the basis of RMSE and nRMSE during validation, LASSO performed best for IARI, New Delhi, followed by PCA-SMLR, Elastic Net, SMLR, PCA-ANN and ANN.
In our study, performance based on RMSE and nRMSE during validation of the different models across locations showed that Elastic Net and LASSO gave excellent predictions for Hisar, Ludhiana, Amritsar and IARI, New Delhi, and good predictions for Patiala. Tibshirani (1996) proposed the LASSO method for shrinkage and selection in regression and generalized regression problems. He reported that LASSO does not focus on subsets but rather defines a continuous shrinking operation that can produce coefficients that are exactly zero. Singh et al. (2014) used eighteen years of weather and yield data for rice and wheat in nine districts of Eastern Uttar Pradesh to develop yield prediction equations. They indicated that the models explained 51 to 79 percent of the variation in rice yield and 65 to 92 percent of the variation in wheat yield in different districts. The performance of ANN was good during calibration, whereas it was the worst model during validation, which indicated overfitting. The overall ranking based on RMSE and nRMSE during validation revealed that LASSO and Elastic Net performed best compared with the other models. Our result is in line with the findings of Das et al. (2018), who used the six models SMLR, PCA-SMLR, ANN, PCA-ANN, LASSO and Elastic Net to predict rice yield from weather parameters for the west coast of India and found that LASSO performed best, followed by Elastic Net. LASSO and Elastic Net performed well because penalisation prevents overfitting: shrinking the magnitude of the regression coefficients and selecting features reduces model complexity. These penalised models also have a computational advantage over SMLR and ANN, as the features with zero coefficients can be eliminated from the model.
Feature selection algorithms such as LASSO, Elastic Net and SMLR performed better than methods that use all the weather indices, such as ANN, because feature selection reduces overfitting and avoids the multicollinearity present in the dataset. Vashisth and Aravind (2020) reported that, on the basis of percentage deviation and model accuracy, the Elastic Net model was best, followed by LASSO and SMLR, for multistage mustard yield estimation at the vegetative, flowering and grain filling stages during Rabi 2018-19 and 2019-20. Kumar et al. (2019) evaluated the performance of stepwise and LASSO regression in variable selection and in developing a wheat yield forecast model using weather data and wheat yield for 1984-2015 at IARI, New Delhi. They reported that LASSO regression performed better than stepwise regression.

CONCLUSION
In the present study, six models were developed for the prediction of wheat yield at five locations using long-term weather data. The results showed that LASSO and Elastic Net performed excellently for Hisar, Ludhiana, Amritsar and IARI, New Delhi, and well for Patiala. PCA-SMLR was excellent for Ludhiana and IARI, New Delhi, and good for Hisar, Amritsar and Patiala. SMLR was excellent for Hisar and IARI, New Delhi, good for Ludhiana and Amritsar, and fair for Patiala. PCA-ANN was good for all five locations. ANN was excellent for Hisar and Patiala, good for Amritsar and IARI, New Delhi, and fair for Ludhiana. Hence, of the six models, Elastic Net and LASSO were found to be the best for wheat yield prediction, followed by PCA-SMLR, SMLR, PCA-ANN and ANN, respectively.