Congcong Li

Tsinghua University

39 Papers

25 Citations

Congcong Li is an academic researcher from Tsinghua University. The author has contributed to research on the topics of land cover and computer science, has an h-index of 19, and has co-authored 34 publications. Previous affiliations of Congcong Li include Beijing Normal University and the United States Geological Survey.


Papers

•Journal Article•10.1080/01431161.2012.748992

Finer resolution observation and monitoring of global land cover: first mapping results with Landsat TM and ETM+ data

Peng Gong, +42 more

- 01 Apr 2013

- International Journal of Remote Sensing

TL;DR: In this article, the first 30 m resolution global land cover maps were produced using Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) data. Four freely available classifiers were employed: the conventional maximum likelihood classifier (MLC), the J4.8 decision tree classifier, the Random Forest (RF) classifier, and the support vector machine (SVM) classifier.


1.6K

•Journal Article•10.1016/J.SCIB.2019.03.002

Stable classification with limited sample: transferring a 30-m resolution sample set collected in 2015 to mapping 10-m resolution global land cover in 2017

Peng Gong, +31 more

- 02 Mar 2019

- Science Bulletin

Abstract: © 2019 Science China Press. Published by Elsevier B.V. and Science China Press. All rights reserved. P. Gong et al. / Science Bulletin 64 (2019) 370–373. As the world strives to reduce the impact of population growth, urbanization, agricultural expansion, and climate change on food security, energy and water shortage, resource over-exploration, biodiversity loss, environmental pollution, and ultimately human health, timely and higher resolution land cover information is urgently needed to achieve the sustainable development goals of the United Nations. Finer than 100-m resolution land cover mapping of the entire world was not in place until the 2010s, when 30-m resolution Landsat images covering the world were made freely available [1]. However, more and more applications such as crop field mapping, solar energy planning, wildlife management, and urban planning require not only higher resolution but also more frequent global land cover maps. The large volume of data required makes it more labor and computation intensive to develop finer resolution and more frequent maps. The labor intensiveness is not only a restrictive factor to visual interpretation of high resolution images but also a huge burden to training sample collection in automated mapping. Fortunately, with continuing training sample collection and accumulation from our previous efforts [2,3], and both the complete storage and free accessibility of the 10-m resolution Sentinel-2 images and the huge computing capability provided by Google Earth Engine, we are in an advanced position to develop a 10-m resolution global land cover map. A multi-seasonal sample set including a training set and a validation set has been collected from Landsat 8 images acquired in 2014 and 2015. The training set contains approximately 340,000 sample units of various sizes (from 30 m × 30 m to 500 m × 500 m) located at approximately 93,000 sites worldwide [3].
The validation set contains approximately 140,000 sample units of land cover type in different seasons at over 38,000 locations. Our previous experiments have indicated that a random forest classifier is both computationally efficient and optimal in performance when dealing with high dimensionality of data [4]. Preliminary test results indicated that an overall classification accuracy of better than 71% could be achieved using the 2015 training and validation sample sets [3]. Our goal here is to apply the training sample set to Sentinel-2 images acquired in 2017 to produce a 10 m resolution global land cover map with the random forest classifier. The question is whether or not we can directly apply the 2015 sample sets to 2017 images acquired with a different sensor. For the purpose of sample transfer to data acquired in other years or from different sensors, we wish to know how small a sample set could be sufficient to allow us to achieve a relatively consistent classification result. At the global scale, only a small percentage of the territory in the world would change land cover types due to human activities of land clearing or natural forces such as wildland fires, volcanos, hurricanes, etc. The annual percentage of land cover change can hardly exceed a few percent of the total land area on Earth. Therefore, it is meaningful to find out at what point the training sample becomes so obsolete that it is no longer suitable for transfer to different years or to data obtained from a different sensor. On the other hand, training sample points are collected through image interpretation. The best image interpreter may still make 5%–10% of interpretation errors [2]. How tolerant is a classification algorithm to training sample errors introduced by image interpreters?
For any classifier, its sensitivity to a smaller sized training sample or its tolerance to training errors or actual land cover changes from year to year should be determined. Here we define the concept of a stable classification. We use this concept to approximately determine how much reduction in training sample and how much land cover change or image interpretation error can be acceptable. If the mean accuracy of multiple runs of a classifier trained with a random drawing of a certain percentage of sample points from the total sample is within 1% of what can be achieved with the total sample set, we regard the obtained classification result as “stable”. The 1% threshold is empirically chosen based on the fact that a loss of 1% in overall accuracy should not significantly impact the application of a global land cover map. Using a random forest algorithm with 200 trees (as explained later, this is the optimal performer from our experience), we conducted an experiment to find out how “limited” the training sample can be while a stable classification can still be maintained. For the experimentation, we used our 2015 training and validation sample sets. The classification features include 9 Landsat-8 image bands; indices of vegetation (normalized difference vegetation index, enhanced vegetation index), water (modified normalized difference water index), built-up (normalized difference built-up index), and burning (normalized burn ratio); the 25th, 50th, and 75th percentiles of the annual time-series image spectral values; mean and standard deviations of each of the previous features; elevation, slope, aspect, and hill shadow; and longitude and latitude of the sample location. We designed two sets of experiments. In the first, we gradually reduce the number of training sample points by 1% each time and randomly repeat this process 1,000 times. In the second, we randomly alter the category of a certain percent of the total sample and use the “noisy” sample to train the random forest classifier. We began to alter the land cover types in 5% increments of the total training sample. We repeated the experiments until 45% of the total sample were altered. For each increment of sample alteration, we randomly alter sample points 1,000 times. In both sets of experiments we tested the classification accuracy using the validation sample set. The results are presented in Fig. 1. It can be seen that the mean overall accuracy of the sample reduction is very stable until as few as 40% of the training sample are used (72.15% vs. 73.13% obtained with the entire sample). Therefore, it is safe to state that we need only to use approximately 40% of the total sample to keep the classification stable (Fig. 1a). On the other hand, it can be seen from Fig. 1b that when the “error” (altered classes) of the training sample reaches 20%, the mean accuracy is still within 1% of that obtained with the un-altered training sample. So the tolerance range of sample error can be set to 20%. These experiments suggest that it is possible to use 60% fewer sample points, and that even if the land cover changed by 20% or the training sample contained 20% errors, we would still be able to achieve “stable” classification with the random forest classifier in global land cover mapping. Therefore, we felt safe to transfer the entire training sample in classifying Sentinel-2 images obtained in 2017 because we assumed that the land cover types in the world did not change by more than 5% from 2015 to 2017. It should be possible to produce a stable land cover map based on our circa 2015 training sample set [3].
Fig. 1. Sample robustness to size reduction and errors in sample. (a) As sample size increases, the accuracy quickly reaches a plateau. (b) As the impurity percentage of sample increases the accuracy decreases. In both cases, the 1,000 times random drawing of sample points produced very stable overall classification accuracies with most standard deviations much lower than 0.5%.
Since its launch in 2015, Sentinel-2 acquires data in 13 spectral bands including four 10-m resolution visible and near infrared bands, six 20-m resolution red-edge and middle infrared spectral bands, and three additional bands measuring atmospheric conditions. We used all but the atmospheric bands. After tests and adjustment, Sentinel-2 acquired more images covering the world in 2017. Therefore, we used 2017 data in mapping 10-m global land cover. To extend the samples for use in 10-m resolution Sentinel-2 images, we used the center of each sample location to match the nearest locations of the Sentinel data to extract and construct spectral features. Elevation data from the Shuttle Radar Topographic Mission (SRTM) were also used as ancillary data (https://doi.org/10.1029/2005RG000183). The input features include the spectral values of the greenest time in each year, the 0, 25, 50, 75 and 100 percentile of time series, and indices of vegetation, water, building and snow as mentioned above in classify…
Table 1. Confusion matrix for the 2017 global land cover map, FROM-GLC10, obtained from Sentinel-2 data.
Classification    CR    FR    GR   SR  WE  WB   TU  IA   BL  SI  Total  PA (%)
Cropland        1864   262   629  205   2   4    0  33   53   0   3052   61.07
Forest           304  7951   628  455   5   9   48  17   25   1   9443   84.20
Grassland        441   502  4378  632  15  15  111  66  625   4   6789   64.49
Shrubland        203    68
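
As a minimal sketch of how the PA (%) column in Table 1 relates to the counts, the snippet below divides each class's diagonal count by its row total. The function name `producers_accuracy` and the dictionary layout are illustrative assumptions, not part of the source; only the three complete rows from the excerpt are used, and the truncated Shrubland row is left out.

```python
# Illustrative sketch: reproduce the PA (%) column of Table 1 from row counts.
# Column order in the excerpt: CR FR GR SR WE WB TU IA BL SI.

def producers_accuracy(row, class_index):
    """Percentage of a row's total that falls on the diagonal cell."""
    total = sum(row)
    return 100.0 * row[class_index] / total

# Only the complete rows from the excerpt; names and layout are assumptions.
rows = {
    "Cropland":  (0, [1864, 262, 629, 205, 2, 4, 0, 33, 53, 0]),
    "Forest":    (1, [304, 7951, 628, 455, 5, 9, 48, 17, 25, 1]),
    "Grassland": (2, [441, 502, 4378, 632, 15, 15, 111, 66, 625, 4]),
}

for name, (idx, counts) in rows.items():
    pa = producers_accuracy(counts, idx)
    print(f"{name}: total={sum(counts)}, PA={pa:.2f}%")
```

For the three complete rows this gives the same totals (3052, 9443, 6789) and PA values (61.07, 84.20, 64.49) as the table.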

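
The "stable classification" criterion and the two experiments described in the abstract above (sample reduction and label alteration) can be sketched roughly as follows. This is a hedged outline, not the authors' code: the helper names, the caller-supplied `train_and_score` callback (standing in for training the 200-tree random forest and scoring it on the validation set), reading "within 1%" as one percentage point, and always altering a label to a different class are all assumptions made for illustration.

```python
import random
from statistics import mean

def is_stable(mean_accuracy, full_sample_accuracy, threshold=1.0):
    """Assumed reading of the criterion: the mean accuracy of repeated runs is
    within `threshold` percentage points of the full-sample accuracy."""
    return abs(full_sample_accuracy - mean_accuracy) <= threshold

def mean_accuracy_at_fraction(samples, fraction, train_and_score, runs=1000, seed=0):
    """Draw a random subset of `samples` `runs` times and average the accuracy.
    `train_and_score` is a hypothetical callback: subset -> accuracy in percent."""
    rng = random.Random(seed)
    k = max(1, round(len(samples) * fraction))
    return mean(train_and_score(rng.sample(samples, k)) for _ in range(runs))

def alter_labels(labels, classes, fraction, rng):
    """Return a copy of `labels` with a random `fraction` of entries changed to a
    different class (assumption: an altered label never keeps its old value)."""
    noisy = list(labels)
    n_altered = round(len(noisy) * fraction)
    for i in rng.sample(range(len(noisy)), n_altered):
        noisy[i] = rng.choice([c for c in classes if c != noisy[i]])
    return noisy

# Figures quoted in the abstract: 72.15% with 40% of the sample vs. 73.13% with all of it.
print(is_stable(72.15, 73.13))
```

With the figures quoted in the abstract, the difference is 0.98 percentage points, so the check passes under this reading of the threshold.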

•Journal Article•10.3390/RS6020964

Comparison of Classification Algorithms and Training Sample Sizes in Urban Land Classification with Landsat Thematic Mapper Imagery

Congcong Li, +4 more

- 24 Jan 2014

- Remote Sensing

TL;DR: Using the spectral information from the same Landsat Thematic Mapper (TM) data set and the same classification scheme over Guangzhou City, China, this paper tested two unsupervised and 13 supervised classification algorithms, including a number of machine learning algorithms.


383

•Journal Article•10.1007/S11434-012-5093-3

Mapping wetland changes in China between 1978 and 2008

Zhenguo Niu, +32 more

- 11 Apr 2012

- Chinese Science Bulletin

TL;DR: Four wetland maps covering all of China were produced from Landsat and CBERS-02B remote sensing data. The paper analyzes the 2008 wetland distribution in China and discusses wetland changes and their drivers over the past 30 years.


348

•Journal Article•10.1007/S11434-012-5235-7

China’s urban expansion from 1990 to 2010 determined with satellite remote sensing

Lei Wang, +14 more

- 13 Jul 2012

- Chinese Science Bulletin

TL;DR: Using the same data source, Landsat TM/ETM+ imagery from the 1990s, 2000s and 2010s, all urban built-up areas in China were mapped, mainly by human interpretation.


326
