TL;DR: In this article, a new paradigm for learning partial differential equations from small data is presented: data-efficient learning machines that leverage the underlying laws of physics, expressed as time-dependent and nonlinear partial differential equations, to extract patterns from high-dimensional data generated from experiments.
TL;DR: The case for open data and the economics of open data are discussed in this paper, with a focus on the benefits of data integration and the challenges of building data infrastructures.
Abstract:
Chapter 1: Conceptualising Data. What are data?; Kinds of data; Data, information, knowledge, wisdom; Framing data; Thinking critically about databases and data infrastructures; Data assemblages and the data revolution
Chapter 2: Small Data, Data Infrastructures and Data Brokers. Data holdings, data archives and data infrastructures; Rationale for research data infrastructures; The challenges of building data infrastructures; Data brokers and markets
Chapter 3: Open and Linked Data. Open data; Linked data; The case for open data; The economics of open data; Concerns with respect to opening data
Chapter 4: Big Data. Volume; Exhaustive; Resolution and indexicality; Relationality; Velocity; Variety; Flexibility
Chapter 5: Enablers and Sources of Big Data. The enablers of big data; Sources of big data; Directed data; Automated data; Volunteered data
Chapter 6: Data Analytics. Pre-analytics; Machine learning; Data mining and pattern recognition; Data visualisation and visual analytics; Statistical analysis; Prediction, simulation and optimization
Chapter 7: The Governmental and Business Rationale for Big Data. Governing people; Managing organisations; Leveraging value and producing capital; Creating better places
Chapter 8: The Reframing of Science, Social Science and Humanities Research. The fourth paradigm in science?; The re-emergence of empiricism; The fallacies of empiricism; Data-driven science; Computational social sciences and digital humanities
Chapter 9: Technical and Organisational Issues. Deserts and deluges; Access; Data quality, veracity and lineage; Data integration and interoperability; Poor analysis and ecological fallacies; Skills and human resourcing
Chapter 10: Ethical, Political, Social and Legal Concerns. Data shadows and dataveillance; Privacy; Data security; Profiling, social sorting and redlining; Secondary uses, control creep and anticipatory governance; Modes of governance and technological lock-ins
Chapter 11: Making Sense of the Data Revolution. Understanding data and the data revolution; Researching data assemblages; Final thoughts
TL;DR: In this paper, the Navier-Stokes equations, with velocity u and pressure p, are shown to be locally well-posed for sufficiently smooth initial data, provided appropriate boundary conditions are imposed on the pressure at ∞.
TL;DR: This paper provides a methodology that incorporates the governing equations of the physical model in the loss/likelihood functions, with training posed as minimization of the reverse Kullback-Leibler (KL) divergence between the model predictive density and the reference conditional density.
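For reference, the reverse KL divergence mentioned in this summary can be written in its standard form as below. The notation is an assumption made here for illustration and is not taken from the paper: p_θ denotes the model predictive density, p_ref the reference conditional density, x the input and y the output.

```latex
D_{\mathrm{KL}}\!\left(p_\theta \,\|\, p_{\mathrm{ref}}\right)
  = \int p_\theta(y \mid x)\,
    \log \frac{p_\theta(y \mid x)}{p_{\mathrm{ref}}(y \mid x)}\,\mathrm{d}y
```

The divergence is called "reverse" because the expectation is taken under the model density rather than under the reference density.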
TL;DR: A method is proposed for predicting freeway travel times using a linear model in which the coefficients vary as smooth functions of the departure time, and its effectiveness is demonstrated on two real-life loop detector data sets.
Abstract: Effective prediction of travel times is central to many advanced traveler information and transportation management systems. In this paper we propose a method to predict freeway travel times using a linear model in which the coefficients vary as smooth functions of the departure time. The method is straightforward to implement, computationally efficient and applicable to widely available freeway sensor data. We demonstrate the effectiveness of the proposed method by applying the method to two real-life loop detector data sets. The first data set, on I-880, is relatively small in scale, but very high in quality, containing information from probe vehicles and double loop detectors. On this data set the prediction error ranges from 5% for a trip leaving immediately to 10% for a trip leaving 30 min or more in the future. Having obtained encouraging results from the small data set, we move on to apply the method to a data set on a much larger spatial scale, from Caltrans District 12 in Los Angeles. On this data set, our errors range from about 8% at zero lag to 13% at a time lag of 30 min or more. We also investigate several extensions to the original method in the context of this larger data set.
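As a rough illustration of what "a linear model in which the coefficients vary as smooth functions of the departure time" can look like, the sketch below fits y ≈ a(t) + b(t)·x by kernel-weighted least squares on a grid of departure times. The abstract does not state the predictor, the smoother, or any tuning values, so everything here is an assumption: a single hypothetical predictor `x` (standing in for a travel-time estimate available at departure), a Gaussian kernel, the `bandwidth` value, the function names, and the synthetic data. It is not the authors' implementation and does not use the I-880 or Caltrans data.

```python
import numpy as np


def fit_varying_coefficients(dep_time, x, y, grid, bandwidth):
    """Fit y ~ a(t) + b(t) * x at each grid time by kernel-weighted least squares.

    Assumption: a Gaussian kernel in departure time supplies the smoothness.
    Returns an array of shape (len(grid), 2) holding [a(t), b(t)] per grid time.
    """
    X = np.column_stack([np.ones_like(x), x])
    coefs = np.empty((len(grid), 2))
    for i, t0 in enumerate(grid):
        # Observations departing near t0 get the largest weights.
        w = np.exp(-0.5 * ((dep_time - t0) / bandwidth) ** 2)
        sw = np.sqrt(w)
        # Weighted least squares via ordinary least squares on rescaled rows.
        coefs[i] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return coefs


def predict(coefs, grid, dep_time, x):
    """Interpolate a(t), b(t) to the requested departure times and predict."""
    a = np.interp(dep_time, grid, coefs[:, 0])
    b = np.interp(dep_time, grid, coefs[:, 1])
    return a + b * x


def true_a(t):
    # Hypothetical smooth intercept used only to generate synthetic data.
    return 10.0 + 5.0 * np.sin(2.0 * np.pi * t / 24.0)


def true_b(t):
    # Hypothetical smooth slope used only to generate synthetic data.
    return 1.0 + 0.5 * np.cos(2.0 * np.pi * t / 24.0)


# Synthetic data: departure time in hours, predictor and response in minutes.
rng = np.random.default_rng(0)
n = 5000
dep_time = rng.uniform(0.0, 24.0, n)
x = rng.uniform(10.0, 30.0, n)
y = true_a(dep_time) + true_b(dep_time) * x + rng.normal(0.0, 0.5, n)

# Fit on an interior grid to stay clear of kernel boundary effects.
grid = np.linspace(3.0, 21.0, 37)
coefs = fit_varying_coefficients(dep_time, x, y, grid, bandwidth=1.0)

# Predict for a few new trips and compare with the noise-free truth.
new_t = np.array([8.0, 12.5, 17.0])
new_x = np.array([25.0, 15.0, 20.0])
pred = predict(coefs, grid, new_t, new_x)
truth = true_a(new_t) + true_b(new_t) * new_x
print("mean absolute error:", np.mean(np.abs(pred - truth)))
```

The bandwidth controls the trade-off: a larger value gives smoother coefficient curves but blurs genuine time-of-day variation, while a smaller value tracks that variation at the cost of noisier estimates.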