Back to Library

Machine learning for high-throughput field phenotyping and image processing provides insight into the association of above and below-ground traits in cassava (Manihot esculenta Crantz)

Published by:
Online Location
Publication date
Number of Pages
Type of Publication:
Articles & Journals
Focus Region:
Focus Topic:
Agricultural Value Chains / Agri-Businesses
Information Technologies
Type of Risk:
Biological & environmental
Michael Gomez Selvaraj, Manuel Valderrama, Diego Guzman, Milton Valencia, Henry Ruiz & Animesh Acharjee


Background: Rapid non-destructive measurements to predict cassava root yield over the full growing season through large numbers of germplasm and multiple environments is a huge challenge in Cassava breeding programs. As opposed to waiting until the harvest season, multispectral imagery using unmanned aerial vehicles (UAV) are capable of measuring the canopy metrics and vegetation indices (VIs) traits at diferent time points of the growth cycle. This resourceful time series aerial image processing with appropriate analytical framework is very important for the automatic extraction of phenotypic features from the image data. Many studies have demonstrated the usefulness of advanced remote sensing technologies coupled with machine learning (ML) approaches for accurate prediction of valuable crop traits. Until now, Cassava has received little to no attention in aerial image-based phenotyping and ML model testing.

Results: To accelerate image processing, an automated image-analysis framework called CIAT Pheno-i was developed to extract plot level vegetation indices/canopy metrics. Multiple linear regression models were constructed at diferent key growth stages of cassava, using ground-truth data and vegetation indices obtained from a multispectral sensor. Henceforth, the spectral indices/features were combined to develop models and predict cassava root yield using diferent Machine learning techniques. Our results showed that (1) Developed CIAT pheno-i image analysis framework was found to be easier and more rapid than manual methods. (2) The correlation analysis of four phenological stages of cassava revealed that elongation (EL) and late bulking (LBK) were the most useful stages to estimate above-ground biomass (AGB), below-ground biomass (BGB) and canopy height (CH). (3) The multi-temporal analysis revealed that cumulative image feature information of EL+early bulky (EBK) stages showed a higher signifcant correlation (r=0.77) for Green Normalized Diference Vegetation indices (GNDVI) with BGB than individual time points. Canopy height measured on the ground correlated well with UAV (CHuav)-based measurements (r=0.92) at late bulking (LBK) stage. Among diferent image features, normalized diference red edge index (NDRE) data were found to be consistently highly correlated (r=0.65 to 0.84) with AGB at LBK stage. (4) Among the four ML algorithms used in this study, k-Nearest Neighbours (kNN), Random Forest (RF) and Support Vector Machine (SVM) showed the best performance for root yield prediction with the highest accuracy of R2=0.67, 0.66 and 0.64, respectively.

Conclusion: UAV platforms, time series image acquisition, automated image analytical framework (CIAT Pheno-i), and key vegetation indices (VIs) to estimate phenotyping traits and root yield described in this work have great potential for use as a selection tool in the modern cassava breeding programs around the world to accelerate germplasm and varietal selection. The image analysis software (CIAT Pheno-i) developed from this study can be widely applicable to any other crop to extract phenotypic information rapidly.