Detecting the Spatial Patterns of Blue-green Algae in Harsha Lake using Landsat 8 Imagery
Siok Kun Sek
- 16 Jun 2022
TL;DR: This thesis compares machine learning methods with traditional regression models for detecting and mapping the blue-green algae distribution in low-to-medium biomass waters (Chl-a < approx. 20 µg/L), using a Landsat 8 image and in situ Chl-a measurements from Harsha Lake, Ohio.
Abstract: The incidence of harmful algal blooms (HABs) caused by blue-green algae has been increasing in coastal and freshwater ecosystems of the United States in recent years, with major consequences for ecosystems, the economy, and public health. This thesis tests the feasibility of using machine learning methods, in comparison to traditional regression models, to detect and map the blue-green algae distribution in low-to-medium biomass waters (Chl-a < approx. 20 µg/L) from a Landsat 8 image, supported by in situ Chl-a measurements in Harsha Lake, Ohio. Two algorithms were compared. The first is a conventional empirical method, Stepwise Multiple Linear Regression, used to test whether there is a strong linear relationship between measured Chl-a concentrations and the Landsat 8 spectral data in the study area. The second is Random Forests, one of the most popular machine learning methods. Major findings include: (1) both the conventional linear regression model and the Random Forests model worked well in mapping the extent and biomass of blue-green algae in Harsha Lake on September 21, 2015, but the Random Forests model outperformed the linear regression model; (2) the prediction surface from the Random Forests method showed that 89.30% of Harsha Lake's area had Chl-a values below 10 µg/L on the sampling date, while only 10.70% of the study area had Chl-a concentrations between 10 µg/L and 20 µg/L. Higher Chl-a values (especially those above 10 µg/L) were mostly found at the mouths of rivers and streams, which may reflect nutrient inflow from agricultural or urban land carried by those rivers and streams. The results show the utility of the Random Forests approach, based on Landsat 8 imagery, for detecting and quantitatively mapping low-biomass HABs, which is considered a challenging task.
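The area shares reported in finding (2) can be read as a tally over the prediction surface. The sketch below is a minimal illustration, not the thesis's actual workflow: it assumes the surface is a grid of predicted Chl-a values in µg/L, that non-water pixels are marked with None, and that every pixel covers the same ground area (Landsat 8 multispectral pixels are 30 m), so pixel share equals area share. The function name `area_share_by_class`, the class breaks, and the toy grid are all hypothetical.

```python
from typing import List, Optional, Sequence


def area_share_by_class(
    surface: Sequence[Sequence[Optional[float]]],
    breaks: Sequence[float] = (10.0, 20.0),
) -> List[float]:
    """Return the percent of water pixels falling in each Chl-a class.

    Classes are [0, breaks[0]), [breaks[0], breaks[1]), ..., [breaks[-1], inf).
    Pixels set to None (land or otherwise masked) are ignored.
    """
    counts = [0] * (len(breaks) + 1)
    for row in surface:
        for value in row:
            if value is None:
                continue
            # Class index = number of breaks the value meets or exceeds.
            idx = sum(1 for b in breaks if value >= b)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("surface contains no water pixels")
    return [100.0 * c / total for c in counts]


# Hypothetical 3 x 4 prediction surface in µg/L; None marks non-water pixels.
toy_surface = [
    [None, 4.2, 5.1, 12.3],
    [3.8, 6.0, 7.5, 9.9],
    [2.1, 8.4, None, None],
]
shares = area_share_by_class(toy_surface)
print([round(s, 1) for s in shares])  # [88.9, 11.1, 0.0]
```

On a real surface the same tally would run over the raster of model predictions clipped to the lake boundary; the toy values here are invented and are not the thesis data.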
Citations
Community Structures of Phytoplankton with Emphasis on Toxic Cyanobacteria in an Ohio Inland Lake during Bloom Season.
Ke Chen, Joel Allen, Jingrang Lu +2 more
TL;DR: The succession patterns of phytoplankton and toxin production over a season will be documented and data will be provided to predict risk occurrence to both human and ecological factors.
Responses of phytoplankton community structure and association to variability in environmental drivers in a tropical coastal lagoon.
Lipika Tarafdar, Ji Yoon Kim, Suchismita Srichandan, Madhusmita Mohapatra, Pradipta R. Muduli, Abhishek Kumar, Deepak R. Mishra, Gurdeep Rastogi
TL;DR: In this paper, the authors examined phytoplankton communities' spatiotemporal dynamics from a 5-year dataset (n = 780) collected from 13 sampling stations in Chilika Lagoon, India, where the salinity gradient defined the spatial patterns in environmental variables.