Dmitriy Ryaboy
5 Papers
10 Citations
Dmitriy Ryaboy is an academic researcher from Twitter. The author has contributed to research in the topics Analytics and Session (web analytics), has an h-index of 5, and has co-authored 5 publications.
Papers
Storm@twitter
Ankit Toshniwal,Siddarth Taneja,Amit Shukla,Karthik Ramasamy,Jignesh M. Patel,Sanjeev Kulkarni,Jason Jackson,Krishna Gade,Maosong Fu,Jake Donham,Nikunj Bhagat,Sailesh Mittal,Dmitriy Ryaboy +12 more
- 18 Jun 2014
TL;DR: The architecture of Storm and its methods for distributed scale-out and fault-tolerance are described, along with how queries are executed in Storm, and some operational stories based on running Storm at Twitter are presented.
1K
The unified logging infrastructure for data analytics at Twitter
George Lee,Jimmy Lin,Chuang Liu,Andrew Lorek,Dmitriy Ryaboy +4 more
- 01 Aug 2012
TL;DR: In this article, the authors present Twitter's production logging infrastructure and its evolution from application-specific logging to a unified "client events" log format, where messages are captured in common, well-formatted, flexible Thrift messages.
• Posted Content
The Unified Logging Infrastructure for Data Analytics at Twitter
TL;DR: This paper presents Twitter's production logging infrastructure and its evolution from application-specific logging to a unified "client events" log format, where messages are captured in common, well-formatted, flexible Thrift messages.
10
Full-text indexing for optimizing selection operations in large-scale data analytics
Jimmy Lin,Dmitriy Ryaboy,Kevin Weil +2 more
- 08 Jun 2011
TL;DR: It is shown that a full-text index can be leveraged to optimize selection operations on text fields within records in Hadoop, yielding moderate improvements in end-to-end query running times and substantial savings in cumulative processing time at the worker nodes.
Scaling big data mining infrastructure: the Twitter experience
Jimmy Lin,Dmitriy Ryaboy +1 more
TL;DR: This paper discusses the evolution of Twitter's infrastructure and the development of capabilities for data mining on "big data", and observes that a major challenge in building data analytics platforms stems from the heterogeneity of the various components that must be integrated together into production workflows.