13 Papers
180 Citations
Boduo Li is an academic researcher from the University of Massachusetts Amherst. The author has contributed to research on analytics and stream processing, has an h-index of 8, and has co-authored 13 publications. Previous affiliations of Boduo Li include Harbin Institute of Technology.
Papers
PODS: a new model and processing algorithms for uncertain data streams
Thanh T. L. Tran, Liping Peng, Boduo Li, Yanlei Diao, Anna Liu
- 06 Jun 2010
TL;DR: This paper presents the PODS system that supports stream processing for uncertain data naturally captured using continuous random variables, and develops evaluation techniques for complex relational operators, i.e., aggregates and joins, by exploring advanced statistical theory and approximation.
•Proceedings Article
Capturing Data Uncertainty in High-Volume Stream Processing
Yanlei Diao, Boduo Li, Anna Liu, Liping Peng, Charles Sutton, Thanh T. L. Tran, Michael Zink
- 01 Sep 2009
TL;DR: In this paper, the authors present a data stream system that captures data uncertainty from data collection through query processing to final result generation. It uses probabilistic modeling and inference to generate uncertainty descriptions for raw data, and then a suite of statistical techniques to capture changes in uncertainty as data propagates through query operators.
Sparkler: supporting large-scale matrix factorization
Boduo Li, Sandeep Tata, Yannis Sismanis
- 18 Mar 2013
TL;DR: Sparkler provides a convenient and efficient extension to Spark for solving matrix factorization problems on very large datasets, and is faster than Spark by 4x to 21x, with bigger advantages for larger problems.
Supporting scalable analytics with latency constraints
Boduo Li, Yanlei Diao, Prashant Shenoy
- 01 Jul 2015
TL;DR: Results from real-world workloads show that the techniques, implemented in Incremental Hadoop, reduce its latency from tens of seconds to sub-second with a 2x-5x increase in throughput, and that the system outperforms the state-of-the-art distributed stream systems Storm and Spark Streaming when latency and throughput are considered together.