How to Use Data Science in Agriculture to Predict Yield and Diseases

Map of the countries that took part in the trial, with colour intensity showing average wheat yield (0.9–8.8 t/ac)

Data Analytics in Agriculture

In most applications of data science in agriculture, the machine learning model is just the tip of the iceberg. Most of the work lies in cleaning, understanding, preparing, and manipulating the data before it is fed to a model.

a. Visualization of Main Features

Amounts of N, P, and K in the first round of fertilizer, N in the second round of fertilizer, accumulated precipitation, space between rows, sowing month, and harvest month and year for the trials.
Yield, 1000-grain weight, agronomic score, and duration of farming for wheat.
The number of farms affected by different percentages of yellow rust, leaf rust, and lodging. The distributions are concentrated at the low end with a long tail toward high percentages (right-skewed), meaning most of the farms did not report any significant issue.
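The shape of these distributions can be checked numerically. The snippet below is a minimal sketch, not the analysis behind the figure: the `rust_pct` values are made up for illustration, and `sample_skewness` is a hypothetical helper. When most farms sit at or near zero and a few report heavy damage, the sample skewness comes out positive.

```python
import statistics


def sample_skewness(values):
    """Return the biased Fisher-Pearson skewness coefficient (m3 / m2**1.5)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data: percentage of each farm affected by yellow rust.
rust_pct = [0, 0, 0, 0, 0, 0, 5, 0, 0, 10, 0, 0, 0, 40, 0, 0, 5, 0, 0, 80]

skew = sample_skewness(rust_pct)
share_unaffected = sum(1 for v in rust_pct if v == 0) / len(rust_pct)
print(f"skewness = {skew:.2f}, unaffected farms = {share_unaffected:.0%}")
```

A positive skewness together with a large share of zero values is the numeric counterpart of a histogram piled up against the left edge.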
Insect and weed problems were the most common types of damage affecting the wheat farms.
Foliar disease was a more widespread problem than root or spike disease in the wheat farms.

b. Interaction Between Features

Effect of the amount of phosphate fertilizer on height, yield, and leaf rust spread in wheat farms
Distribution of 1000-grain weight for the upper and lower 15% of N fertilizer amounts in the 1st and 2nd rounds. Dotted lines represent the median values of the distributions (P-value < 0.01).
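A comparison like this one can be sketched in a few lines. The article does not say which statistical test produced the P-value, so the permutation test on the difference of medians below is an assumption, as are the synthetic data, the field names `n_rate` and `tgw`, and both helper functions.

```python
import random
import statistics


def split_extremes(rows, key, fraction=0.15):
    """Return (lower, upper): the bottom and top `fraction` of rows ranked by `key`."""
    ordered = sorted(rows, key=lambda r: r[key])
    k = max(1, round(len(ordered) * fraction))
    return ordered[:k], ordered[-k:]


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the absolute difference of medians."""
    rng = random.Random(seed)
    observed = abs(statistics.median(a) - statistics.median(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.median(pooled[:len(a)])
                   - statistics.median(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic farms (illustration only): 1000-grain weight rises with N rate.
rng = random.Random(42)
farms = []
for _ in range(200):
    n_rate = rng.uniform(0, 150)
    farms.append({"n_rate": n_rate, "tgw": 30 + 0.05 * n_rate + rng.gauss(0, 2)})

low, high = split_extremes(farms, "n_rate")
low_tgw = [f["tgw"] for f in low]
high_tgw = [f["tgw"] for f in high]
p = permutation_p_value(low_tgw, high_tgw)
print(f"median low = {statistics.median(low_tgw):.1f}, "
      f"median high = {statistics.median(high_tgw):.1f}, p = {p:.4f}")
```

Comparing only the extremes discards the middle 70% of farms, which makes the contrast easy to see in a plot but says nothing about the shape of the response in between.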
Effect of space between rows (lower than 15 cm vs. higher than 50 cm) on 1000-grain weight and grain yield. The dotted lines represent the median values of the distributions.
Space between rows (cm) used in wheat farms in the participating countries
The distributions of yield and 1000-grain weight for wheat farmed in Canada and Japan

Machine Learning in Agriculture

1. Predicting Yield in Wheat

Independent (controlled or uncontrolled) farming data are the inputs to the neural net model, and yield is the predicted output.
Cost function error vs. epochs for the train and test sets. The train set shows higher error because of the dropout layer in the model.
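The reason dropout can push training error above test error is that it only perturbs the network while training. Below is a minimal pure-Python sketch of inverted dropout; the `dropout` function, the rate of 0.5, and the activation values are illustrative assumptions, not details taken from the model in the figure.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout.

    During training, zero each unit with probability `rate` and rescale the
    survivors by 1 / (1 - rate). At evaluation time, return inputs unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
hidden = [0.5, 1.2, -0.3, 0.8, 2.0, -1.1]  # hypothetical hidden-layer outputs

train_out = dropout(hidden, rate=0.5, training=True, rng=rng)
eval_out = dropout(hidden, rate=0.5, training=False, rng=rng)

print("train:", train_out)  # some units zeroed, the rest doubled
print("eval: ", eval_out)   # identical to `hidden`
```

The training loss is measured on the noisy, partially masked network, while the test loss is measured on the full network, so a gap in this direction is not by itself a sign of a problem.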
Prediction of yield and 1000-grain weight of wheat vs. true value using the neural net model

2. Predicting Diseases in Wheat

The structure of the neural net used to predict foliar disease extent in wheat farms
Confusion matrix of true labels vs. predicted labels of foliar disease extent in wheat farms
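For readers unfamiliar with the format, a confusion matrix simply counts how often each true class was predicted as each class. The sketch below builds one by hand; the class names `low`, `medium`, and `high` and the label lists are hypothetical, since the article does not list the disease-extent categories.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical classes and predictions, for illustration only.
labels = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "high", "medium", "low", "high", "medium"]
y_pred = ["low", "medium", "medium", "high", "low", "low", "medium", "medium"]

matrix = confusion_matrix(y_true, y_pred, labels)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / len(y_true)

for label, row in zip(labels, matrix):
    print(f"{label:>6}: {row}")
print(f"accuracy = {accuracy}")
```

The diagonal holds the correct predictions; off-diagonal cells show which disease levels the model confuses with each other, which matters when the classes are as imbalanced as the disease histograms suggest.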

3. Feature Importance for Data Science in Agriculture

Top 5 attributes that affected the yield of wheat farms
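The article does not state how feature importance was computed. One common, model-agnostic option is permutation importance: shuffle one feature at a time and measure how much the prediction error grows. The sketch below is written under that assumption; `toy_model` is a hypothetical stand-in for a trained model, and the feature names and data are synthetic.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = {}
    for name in X[0]:
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[name] for row in X]
            rng.shuffle(shuffled)
            X_perm = [dict(row, **{name: v}) for row, v in zip(X, shuffled)]
            increases.append(mse(y, [predict(r) for r in X_perm]) - baseline)
        importances[name] = sum(increases) / n_repeats
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained yield model (ignores row_spacing)."""
    return 2.0 + 0.03 * row["precipitation"] + 0.01 * row["n_first_round"]


# Synthetic farms, for illustration only.
rng = random.Random(1)
X = [{"precipitation": rng.uniform(100, 600),
      "n_first_round": rng.uniform(0, 150),
      "row_spacing": rng.uniform(10, 60)} for _ in range(300)]
y = [toy_model(r) + rng.gauss(0, 0.5) for r in X]

scores = permutation_importance(toy_model, X, y)
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name:>15}: {scores[name]:.3f}")
```

A feature the model never uses, such as `row_spacing` here, gets an importance of zero, because shuffling it leaves every prediction unchanged.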

Data Science in Agriculture and Data Leakage

Bottom Line
