Abstract
Plant identification has applications in ethnopharmacology and agriculture. Since leaves are one of the distinguishing features of a plant, they are routinely used for identification. Recent developments in deep learning have made it possible to accurately identify the majority of samples in five publicly available leaf datasets. However, each of these datasets captures its images in a highly controlled environment. This paper evaluates the performance of EfficientNet and several other convolutional neural network (CNN) architectures when applied to a combination of the LeafSnap, Middle European Woody Plants 2014, Flavia, Swedish, and Folio datasets. To mitigate the class imbalance that results from combining the original datasets, we used oversampling, undersampling, and transfer learning techniques to construct an end-to-end CNN classifier. We placed greater emphasis on metrics appropriate for a diverse, imbalanced dataset than on high performance on any one of the original datasets. A model from the EfficientNet family of CNNs achieved an F-score of 0.9861 on the combined dataset.
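As a rough illustration of two ideas named in the abstract, the sketch below shows random oversampling of minority classes and a macro-averaged F-score in plain Python. It is not the authors' pipeline: the abstract does not say which oversampling variant or which F-score averaging was used, so both choices here are assumptions, and the function names `random_oversample` and `macro_f1` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class (an assumed variant)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra copies with replacement to reach the target count.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an
    assumption; the abstract only reports 'F-score')."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels standing in for leaf images; "B" is the minority class.
    samples = ["a1", "a2", "a3", "b1"]
    labels = ["A", "A", "A", "B"]
    _, balanced_labels = random_oversample(samples, labels)
    print(Counter(balanced_labels))

    # A classifier that always predicts the majority class looks good on
    # accuracy (0.75 here) but is penalized by the macro-averaged F-score.
    print(macro_f1(labels, ["A", "A", "A", "A"]))
```

The toy example shows why an accuracy figure alone can mislead on an imbalanced dataset: a per-class average gives every species equal weight regardless of how many images it contributes.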
Recommended Citation
V. K. Gajjar et al., "Plant Identification in a Combined-Imbalanced Leaf Dataset," IEEE Access, vol. 10, pp. 37882-37891, Institute of Electrical and Electronics Engineers (IEEE), Apr 2022.
The definitive version is available at https://doi.org/10.1109/ACCESS.2022.3165583
Department(s)
Electrical and Computer Engineering
Keywords and Phrases
Leaf dataset; Imbalanced dataset; Convolutional neural networks; Transfer learning; Plant identification
International Standard Serial Number (ISSN)
2169-3536
Document Type
Article - Journal
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2022 The Authors. All rights reserved.
Creative Commons Licensing
This work is licensed under a Creative Commons Attribution 4.0 License.
Publication Date
07 Apr 2022