Comparing regression models with count data to artificial neural network and ensemble models for prediction of generic escherichia coli population in agricultural ponds based on weather station measurements Measurements

Yükleniyor...
Küçük Resim

Tarih

Dergi Başlığı

Dergi ISSN

Cilt Başlığı

Yayıncı

Elsevier B.V.

Erişim Hakkı

info:eu-repo/semantics/closedAccess

Özet

Abstract View references (58) Indicator microorganisms are monitored in agricultural waters to foster produce safety. Various prediction models are used to estimate the population of indicator microorganisms and pathogens when no observation is available. The purpose of this study was to compare the performance of regression models with count data (zero-inflated Poisson and hurdle negative binomial) to artificial neural network and ensemble models (random forest and AdaBoost) for the prediction of generic Escherichia coli population in agricultural surface waters in relation with weather station measurements. Two-part count data models were built on E. coli population count frequencies (0, [1,10), [10,100), [100,1000), [1000, 10000), (>=10000)) based on the data structure. The use of artificial neural network, AdaBoost, and random forest were determined based on the mean absolute error (MAE) value over pre-tested six models. The MAE was also used to compare the performance of two-part count data models with artificial neural network and ensemble models. Over-dispersed E. coli population count frequencies was calculated between 2.2 and 52.2% for all ponds. Observed and predicted zero E. coli population counts for all ponds were matched from 82 to 100% for zero-inflated Poisson and 100% for hurdle negative binomial regression models. Overdispersion reduced the performance of tested models. AdaBoost-Twelve Estimators had the best performance with the lowest MAE values for all ponds (from 0.87 to 46.60). The ensemble models used in this study provided more promising performance when compared to tested regression models with count data. © 2021

Açıklama

Anahtar Kelimeler

AdaBoost, Hurdle model, Negative binomial model, Produce safety, Random forest, Zero inflated Poisson model

Kaynak

Microbial Risk Analysis

WoS Q Değeri

Scopus Q Değeri

Cilt

Sayı

Künye

Onay

İnceleme

Ekleyen

Referans Veren