Weather based fuzzy regression models for prediction of rice yield

Fuzzy regression models for forecasting rice yield in Kanpur district were developed and compared with the weather indices-based regression model. For this, weekly (23-35 SMW) weather data (1971, 1973-2011) were utilized. Significant variables in the fuzzy approach were selected based on the index of confidence (IC), and the adequacy of the models was compared with that of the weather indices-based regression models. The variables found significant on the basis of their IC and SSE (sum of squared errors) values were the total accumulation of minimum temperature, the weighted interaction of bright sunshine hours and rainfall, the weighted interaction of minimum and maximum temperature, the unweighted interaction of maximum temperature and morning relative humidity, and the weighted interaction of morning and evening relative humidity. The models were also validated for three years (2008-09, 2010-11 and 2011-2012). The study also reveals that the adequacy measures of the fuzzy models are much higher than those of their linear regression counterparts for all values of the fitness criterion (h). Thus, the fuzzy regression methodology is more efficient than the linear regression technique.

Reliable and timely forecasts of crop production are required for various policy decisions relating to storage, distribution, pricing, marketing, import-export, etc. Fisher et al. (1924) did pioneering work in developing models for crop-weather relationships and yield forecasting. Hendricks et al. (1943) modified Fisher's technique: they divided the crop season into 'n' weekly intervals and assumed that a second-degree polynomial in week number would be sufficiently flexible to express the relationship. This model was extended to two weather variables to study joint effects. Further, since the data for such studies extended over a long period of years, an additional variate 'T' representing the year was included to allow for time trend. Multiple linear regression methodology has been widely employed for forecasting the growth and yield of crops (Agarwal and Mehta 2007) and insect pest populations (Sagar et al., 2017; Kumar et al. 2018). The limitation of a statistical regression model is that it can be applied only if the given data are distributed according to a statistical model. Zadeh et al. (1975) characterized fuzzy uncertainty in terms of ambiguity and vagueness and introduced fuzzy set theory to build systems able to deal with ambiguous and vague information. Tanaka et al. (1988) explained the fuzzy uncertainty of dependent variables through the fuzziness of the response functions or regression coefficients in the regression model and first introduced the fuzzy regression model. Fuzzy linear regression (FLR) is a fuzzy type of classical regression analysis in which some elements of the model are represented by fuzzy numbers. It is used to evaluate the functional relationship between the dependent and independent variables in a fuzzy environment. Balve et al. (2016) predicted evapotranspiration using a fuzzy inference system and found that the fuzzy inference system performed better.
Therefore, fuzzy regression approaches have been attempted for the development of a model for forecasting rice yield in Kanpur district as a case study.

MATERIALS AND METHODS
In this study, rice yield data for Kanpur district for the years 1971 and 1973 to 2011 were procured from the Directorate of Economics and Statistics, Department of Agriculture, Cooperation and Farmers Welfare, India, and weekly weather data were procured from the India Meteorological Department, Pune. District-wise data on rice productivity and weekly weather data for the Kanpur location on maximum temperature (X1), minimum temperature (X2), morning relative humidity (X3), evening relative humidity (X4) and rainfall (X5) were considered for model development. As the objective was to forecast yield well in advance of harvest, weather data from the 23rd standard meteorological week (SMW) to the 35th SMW were used for development of the models. Models were developed through the weather indices (WI) based regression approach and the fuzzy regression approach. The R-square statistic for multiple linear regression and the index of confidence (IC) for the fuzzy regression approach were used to check the adequacy of the developed models.

Weather indices (WI) based regression model
The regression model is of the form developed by Fisher et al. (1924).
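The index names used later in the paper (Z20, Z451, Z121, Z130, Z341) are consistent with the usual weather-indices construction, in which an unweighted index (last digit 0) is the plain sum of the weekly values and a weighted index (last digit 1) is the sum of the weekly values multiplied by the correlation between yield and that week's value; interaction indices apply the same rule to the product of two variables. The model equation itself is not reproduced in this text, so the following Python sketch rests on that assumption. The function names, the data layout and the toy numbers are hypothetical, and the correlations are taken with raw yield rather than detrended yield for brevity.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant series: treat the correlation as 0
    return sxy / sqrt(sxx * syy)


def weather_indices(weather, yields):
    """Build unweighted (suffix 0) and weighted (suffix 1) weather indices.

    weather: {variable number: list of per-year lists of weekly values}
    yields:  list of yearly yields, in the same year order
    Returns {index name: list of yearly index values}.
    """
    indices = {}
    n_years = len(yields)

    def build(name, weekly):
        n_weeks = len(weekly[0])
        # Correlation between yield and the weather value of each week.
        r = [pearson([weekly[t][w] for t in range(n_years)], yields)
             for w in range(n_weeks)]
        indices[f"Z{name}0"] = [sum(row) for row in weekly]
        indices[f"Z{name}1"] = [sum(rw * v for rw, v in zip(r, row))
                                for row in weekly]

    # Indices for each individual variable.
    for i, data in weather.items():
        build(str(i), data)
    # Indices for each pairwise interaction (week-by-week product).
    for i, k in combinations(sorted(weather), 2):
        product = [[a * b for a, b in zip(row_i, row_k)]
                   for row_i, row_k in zip(weather[i], weather[k])]
        build(f"{i}{k}", product)
    return indices


# Hypothetical toy data: 5 variables (X1..X5), 4 years, 3 weeks per year.
toy_weather = {i: [[i + t + 0.5 * w for w in range(3)] for t in range(4)]
               for i in range(1, 6)}
toy_yield = [2.0, 2.3, 2.1, 2.6]
Z = weather_indices(toy_weather, toy_yield)
print(len(Z))  # 30
```

With five weather variables this yields 5 x 2 indices for the individual variables and 10 x 2 for the pairwise interactions, i.e. the 30 candidate explanatory variables.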

Fuzzy regression approach
Fuzzy linear regression (FLR) of the following form has been used. The basic model assumes a fuzzy linear function

$$\tilde{Y} = \tilde{A}_0 X_0 + \tilde{A}_1 X_1 + \dots + \tilde{A}_N X_N = \tilde{A} X \tag{2}$$

where $X = [X_0, X_1, \dots, X_N]^T$ is a vector of independent variables and $\tilde{A} = [\tilde{A}_0, \tilde{A}_1, \dots, \tilde{A}_N]$ is a vector of fuzzy coefficients presented in the form of symmetric triangular fuzzy numbers denoted by $\tilde{A}_j = (\alpha_j, c_j)$, with membership function

$$\mu_{\tilde{A}_j}(a_j) = \begin{cases} 1 - \dfrac{|\alpha_j - a_j|}{c_j}, & \alpha_j - c_j \le a_j \le \alpha_j + c_j \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

where $\alpha_j$ is the central value and $c_j$ is the spread value. Thus, the equation can be rewritten as

$$\tilde{Y} = (\alpha_0, c_0) X_0 + (\alpha_1, c_1) X_1 + \dots + (\alpha_N, c_N) X_N \tag{4}$$

The algorithm developed by Wang et al. (2000) for variable selection was utilized in this study. The sets of variables considered for model development were those with the lowest error sum of squares (SSE) and the highest index of confidence (IC), which is the ratio of the regression sum of squares (SSR) to the total sum of squares (SST). The IC in fuzzy regression is analogous to the coefficient of determination ($R^2$) in multiple linear regression: the greater the value of IC or $R^2$, the better the prediction results. The IC measures the degree of variation of $Y$ between its lower and upper limits, represented by $Y^L$ and $Y^U$ respectively. In fuzzy linear regression, the regression line $Y^{h=1}$ has the best ability to interpret the given data $Y_i$ when the membership function of the fuzzy parameter $\tilde{A}_j$ is symmetric, as proved by Wang et al. (2000). The IC may also be calculated as $IC = 1 - SSE/SST$, where SSE and SST are the error sum of squares and total sum of squares respectively. When SSE is very low, IC is close to 1, while when SSE equals SST, IC is 0. Thus, the higher the value of IC, the lower the value of SSE and the better $Y_i^{h=1}$ represents $Y_i$.
The analysis was carried out using SAS (Statistical Analysis System) version 9.3, available at ICAR-IARI, New Delhi. Procedures for generating the weather indices and for fitting the fuzzy regression models were written as required for model development and for comparing the adequacy of the weather indices and fuzzy regression approaches.
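To make the interval and the IC concrete, the sketch below evaluates a symmetric triangular fuzzy model for given centres and spreads and then computes SST, SSR, SSE and IC. The text does not print the sum-of-squares formulas, so the definitions used here are an assumption: SST measures the distance of each observation from both limits, SSR the distance of the centre line from both limits, and SSE twice the squared distance between the centre line and the observation. With symmetric limits these satisfy SST = SSR + SSE, which is what makes IC = SSR/SST and IC = 1 - SSE/SST agree. All function names and numbers are hypothetical.

```python
def fuzzy_predict(alpha, c, x, h=0.0):
    """Return (lower, centre, upper) of the fuzzy output for one observation.

    alpha, c: centres and spreads of the symmetric triangular coefficients
    x:        regressor values, with x[0] = 1 for the intercept
    h:        membership level; h = 1 collapses the interval to the centre
    """
    centre = sum(a * xi for a, xi in zip(alpha, x))
    spread = (1.0 - h) * sum(cj * abs(xi) for cj, xi in zip(c, x))
    return centre - spread, centre, centre + spread


def index_of_confidence(y, lower, centre, upper):
    """Return (IC, SSE, SSR, SST) under the assumed definitions above."""
    sst = sum((yi - lo) ** 2 + (up - yi) ** 2
              for yi, lo, up in zip(y, lower, upper))
    ssr = sum((ce - lo) ** 2 + (up - ce) ** 2
              for lo, ce, up in zip(lower, centre, upper))
    sse = 2.0 * sum((ce - yi) ** 2 for yi, ce in zip(y, centre))
    return ssr / sst, sse, ssr, sst


# Hypothetical toy model with one regressor plus intercept.
alpha = [1.0, 0.5]   # centres
c = [0.2, 0.1]       # spreads
xs = [[1.0, 2.0], [1.0, 4.0], [1.0, 6.0]]
y = [2.1, 2.9, 4.2]  # observed values

bounds = [fuzzy_predict(alpha, c, x) for x in xs]
lower, centre, upper = (list(col) for col in zip(*bounds))
ic, sse, ssr, sst = index_of_confidence(y, lower, centre, upper)
print(round(ic, 3), round(1.0 - sse / sst, 3))  # the two IC formulas agree
```

The centre line returned by `fuzzy_predict` corresponds to the h = 1 regression line of the text, and the lower and upper values at h = 0 play the role of the lower and upper limits.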

RESULTS AND DISCUSSION
Weather indices were generated for each weather variable and for their interactions using the available data (1971, 1973-2011), of which three years (2008-09, 2010-11 and 2011-2012) were used for validation of the models.
In this study, the objective function in fuzzy linear regression is to minimize the range (the width between the upper and lower limits) of the predicted values, i.e. to minimize the total spread of the fuzzy number Yi. Weather indices for the 5 weather variables, together with the interactions of two weather variables at a time, gave 30 explanatory variables (5 individual variables and 5C2 = 10 pairwise interactions, each weighted and unweighted). Not all of these variables could be used to develop a model for predicting crop yield. Thus, SSE and IC values were calculated for all individual variables and combinations of variables, and only those sets of variables with the highest IC and smallest SSE values were selected. The results for 30 such combinational sets are shown in Table 2. Among these combinations, the variable set (Z20, Z451, Z121, Z130, Z341) has the smallest SSE, with a value of 10.22, and the highest IC, with a value of 0.932. Z20 is the unweighted minimum temperature index; Z451 the weighted interaction of evening relative humidity with rainfall; Z121 the weighted interaction of maximum and minimum temperature; Z130 the unweighted interaction of maximum temperature and morning relative humidity; and Z341 the weighted interaction of morning and evening relative humidity. The subset (Z20, Z451, Z121, Z130, Z341) has the highest IC value among all subsets reported, which means that the centre regression line Yi h=1 of this subset has the best ability to predict Yi. Using this set of variables for model development should therefore give high accuracy in predicting rice yield.
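The minimum-spread objective described above is conventionally written as a linear programme. The text does not print its programme, so the formulation below is the standard Tanaka-type one, given here as an assumption; $n$ is the number of years, $X_{ij}$ the value of the $j$-th explanatory variable in year $i$ (with $X_{i0} = 1$), and $h$ the fitness criterion.

```latex
\begin{aligned}
\min_{\alpha,\,c}\quad & J = \sum_{j=0}^{N} c_j \sum_{i=1}^{n} |X_{ij}| \\
\text{subject to}\quad
& \sum_{j=0}^{N} \alpha_j X_{ij} + (1-h) \sum_{j=0}^{N} c_j |X_{ij}| \ge Y_i,
  \qquad i = 1, \dots, n, \\
& \sum_{j=0}^{N} \alpha_j X_{ij} - (1-h) \sum_{j=0}^{N} c_j |X_{ij}| \le Y_i,
  \qquad i = 1, \dots, n, \\
& c_j \ge 0, \qquad j = 0, \dots, N .
\end{aligned}
```

The two constraint families require every observed yield $Y_i$ to lie inside the $h$-level interval of its fuzzy prediction, while the objective keeps the total width of those intervals as small as possible.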
A comparison between the multiple linear regression technique (R 2 value) and fuzzy linear regression (IC value) is shown in Table 2 for the selected sets of variables. The table shows that all the IC values calculated from fuzzy regression are much greater than the R 2 values of linear regression, which means that the fuzzy regression approach is much more efficient than the multiple linear regression technique and has greater model stability.
The weather-based fuzzy regression model for prediction of rice yield is presented as Equation (5). The model developed using the set of variables (Z20, Z451, Z121, Z130, Z341) at serial number 25 was also validated for prediction of rice yield for three years, viz. 2008-09, 2010-11 and 2011-2012. Table 3 presents the observed values, the values forecasted by the multiple linear regression methodology and the range forecasted by the weather indices-based fuzzy regression methodology. The table also reveals that the fuzzy regression approach has a narrower width than multiple regression, indicating predictions closer to the observed values.

CONCLUSION
Comparison of the fuzzy regression and multiple linear regression methods revealed a considerable difference in the adequacy of the models in terms of the values of IC and R 2 . The model developed using the variable set (Z20, Z451, Z121, Z130, Z341), which has the highest IC and lowest SSE values, was also validated. It was found that the fuzzy regression approach has a narrower width than the multiple linear regression methodology. The forecasted range of the fuzzy approach indicates that this approach is more efficient, has more potential for predicting crop yield in agriculture, and may be considered a good alternative to multiple linear regression.