TL;DR: The attention visualizations and case studies show that the novel structure-aware seq2seq architecture which consists of field-gating encoder and description generator with dual attention is capable of generating coherent and informative descriptions based on the comprehensive understanding of both the content and the structure of a table.
Abstract: Table-to-text generation aims to generate a description for a factual table which can be viewed as a set of field-value records. To encode both the content and the structure of a table, we propose a novel structure-aware seq2seq architecture which consists of field-gating encoder and description generator with dual attention. In the encoding phase, we update the cell memory of the LSTM unit by a field gate and its corresponding field value in order to incorporate field information into table representation. In the decoding phase, dual attention mechanism which contains word level attention and field level attention is proposed to model the semantic relevance between the generated description and the table. We conduct experiments on the WIKIBIO dataset which contains over 700k biographies and corresponding infoboxes from Wikipedia. The attention visualizations and case studies show that our model is capable of generating coherent and informative descriptions based on the comprehensive understanding of both the content and the structure of a table. Automatic evaluations also show our model outperforms the baselines by a great margin. Code for this work is available on https://github.com/tyliupku/wiki2bio.
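As a rough illustration of the dual attention idea described in this abstract, the sketch below combines a word-level and a field-level attention distribution over the same table records. The combination rule (elementwise product followed by renormalization), the function names, and the toy scores are assumptions made for illustration; the abstract itself does not specify them.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dual_attention(word_scores, field_scores):
    """Combine word-level and field-level attention over table records.

    Assumption: the two distributions are multiplied elementwise and
    renormalized so the combined weights sum to one.
    """
    alpha = softmax(word_scores)    # word-level attention
    beta = softmax(field_scores)    # field-level attention
    product = [a * b for a, b in zip(alpha, beta)]
    total = sum(product)
    return [p / total for p in product]

# Toy example with three table records (scores are made up).
weights = dual_attention([2.0, 0.5, 0.1], [1.5, 0.2, 0.3])
```

A record only receives a large combined weight when both its content (word level) and its field (field level) are relevant to the current decoding step.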
TL;DR: A review of different neuro-fuzzy systems based on the classification of research articles from 2000 to 2017 is proposed to help readers gain a general overview of the state of the art of neuro-fuzzy systems and easily find suitable methods according to their research interests.
Abstract: Neuro-fuzzy systems have attracted the growing interest of researchers in various scientific and engineering areas due to their effective learning and reasoning capabilities. Neuro-fuzzy systems combine the learning power of artificial neural networks with the explicit knowledge representation of fuzzy inference systems. This paper proposes a review of different neuro-fuzzy systems based on the classification of research articles from 2000 to 2017. The main purpose of this survey is to help readers gain a general overview of the state of the art of neuro-fuzzy systems and easily find suitable methods according to their research interests. Different neuro-fuzzy models are compared, and a table is presented summarizing the different learning structures and learning criteria with their applications.
TL;DR: This work defines the table union search problem and presents a probabilistic solution for finding tables that are unionable with a query table within massive repositories, and proposes a data-driven approach that automatically determines the best model to use for each pair of attributes.
Abstract: We define the table union search problem and present a probabilistic solution for finding tables that are unionable with a query table within massive repositories. Two tables are unionable if they share attributes from the same domain. Our solution formalizes three statistical models that describe how unionable attributes are generated from set domains, semantic domains with values from an ontology, and natural language domains. We propose a data-driven approach that automatically determines the best model to use for each pair of attributes. Through a distribution-aware algorithm, we are able to find the optimal number of attributes in two tables that can be unioned. To evaluate accuracy, we created and open-sourced a benchmark of Open Data tables. We show that our table union search outperforms in speed and accuracy existing algorithms for finding related tables and scales to provide efficient search over Open Data repositories containing more than one million attributes.
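To make the notion of unionable attributes concrete, the sketch below scores attribute pairs by the overlap of their value sets and greedily aligns the columns of two tables. Jaccard similarity is used here only as a simple stand-in for the statistical set-domain model mentioned in the abstract; the threshold, the greedy matching, and the example tables are all assumptions.

```python
def jaccard(a, b):
    """Overlap of two value collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def unionable_attributes(query, candidate, threshold=0.5):
    """Greedily pair each query column with its best unused candidate column."""
    pairs = []
    used = set()
    for q_name, q_vals in query.items():
        best_name, best_score = None, threshold
        for c_name, c_vals in candidate.items():
            if c_name in used:
                continue
            score = jaccard(q_vals, c_vals)
            if score >= best_score:
                best_name, best_score = c_name, score
        if best_name is not None:
            used.add(best_name)
            pairs.append((q_name, best_name, best_score))
    return pairs

# Hypothetical tables, given as column name -> values.
query = {"city": ["Toronto", "Ottawa", "Montreal"], "year": [2015, 2016]}
candidate = {"town": ["Toronto", "Montreal", "Calgary"], "population": [100, 200]}
pairs = unionable_attributes(query, candidate)
```

Here only `city` and `town` share enough values to be aligned, so the two tables would be unionable on one attribute.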
TL;DR: The BioTIME database contains raw data on species identities and abundances in ecological assemblages through time; its data files can be read into several software applications such as R or various database packages.
Abstract: The BioTIME database contains raw data on species identities and abundances in ecological assemblages through time. The database consists of 11 tables: one raw data table plus ten related metadata tables. For further information please see our associated data paper. This data consists of several elements: BioTIMESQL_02_04_2018.sql - an SQL file for the full public version of BioTIME which can be imported into any MySQL database. BioTIMEQuery_02_04_2018.csv - data file; although too large to view in Excel, this can be read into several software applications such as R or various database packages. BioTIMEMetadata_02_04_2018.csv - file containing the metadata for all studies. BioTIMECitations_02_04_2018.csv - file containing the citation list for all studies. BioTIMECitations_02_04_2018.xlsx - file containing the citation list for all studies (some special characters are not supported in the csv format). BioTIMEInteractions_02_04_2018.Rmd - an R Markdown page providing a brief overview of how to interact with the database and associated .csv files (this will not work until field paths and database connections have been added/updated).
TL;DR: The proposed approach makes powerful machine learning techniques more usable to those who may not have expertise in these areas, including understanding which attributes contribute to a user's subjective preferences for data, and deconstructing attributes of importance for existing rankings.
Abstract: People often rank and order data points as a vital part of making decisions. Multi-attribute ranking systems are a common tool used to make these data-driven decisions. Such systems often take the form of a table-based visualization in which users assign weights to the attributes representing the quantifiable importance of each attribute to a decision, which the system then uses to compute a ranking of the data. However, these systems assume that users are able to quantify their conceptual understanding of how important particular attributes are to a decision. This is not always easy or even possible for users to do. Rather, people often have a more holistic understanding of the data. They form opinions that data point A is better than data point B but do not necessarily know which attributes are important. To address these challenges, we present a visual analytic application to help people rank multi-variate data points. We developed a prototype system, Podium, that allows users to drag rows in the table to rank order data points based on their perception of the relative value of the data. Podium then infers a weighting model using Ranking SVM that satisfies the user's data preferences as closely as possible. Whereas past systems help users understand the relationships between data points based on changes to attribute weights, our approach helps users to understand the attributes that might inform their understanding of the data. We present two usage scenarios to describe some of the potential uses of our proposed technique: (1) understanding which attributes contribute to a user's subjective preferences for data, and (2) deconstructing attributes of importance for existing rankings. Our proposed approach makes powerful machine learning techniques more usable to those who may not have expertise in these areas.
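The step of inferring attribute weights from a user's row ordering can be sketched as pairwise learning. The code below is a simplified stand-in for Ranking SVM (hinge-style updates on pairwise feature differences, with no regularization term); the function name, the learning rate, and the toy data are assumptions and do not come from Podium.

```python
def fit_ranking_weights(rows, ranking, epochs=100, lr=0.1):
    """Learn attribute weights from a user ordering.

    rows: list of feature lists, one per data point.
    ranking: row indices in the user's order, best first.
    """
    w = [0.0] * len(rows[0])
    # Every (better, worse) pair implied by the user's ordering.
    pairs = [(ranking[i], ranking[j])
             for i in range(len(ranking))
             for j in range(i + 1, len(ranking))]
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [a - b for a, b in zip(rows[better], rows[worse])]
            margin = sum(wi * di for wi, di in zip(w, diff))
            if margin < 1.0:  # update only when the margin is violated
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w

# Toy data: the user prefers rows with a high first attribute.
rows = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
ranking = [0, 1, 2]
w = fit_ranking_weights(rows, ranking)
scores = [sum(wi * xi for wi, xi in zip(w, row)) for row in rows]
```

The learned weights reproduce the user's ordering when the rows are sorted by score, and their signs suggest which attributes drove the preference.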
TL;DR: iTTVis is proposed, a novel interactive table tennis visualization system, which, to the authors' knowledge, is the first visual analysis system for analyzing and exploring table tennis data.
Abstract: The rapid development of information technology paved the way for the recording of fine-grained data, such as stroke techniques and stroke placements, during a table tennis match. This data recording creates opportunities to analyze and evaluate matches from new perspectives. Nevertheless, the increasingly complex data poses a significant challenge to make sense of and gain insights into. Analysts usually employ tedious and cumbersome methods which are limited to watching videos and reading statistical tables. However, existing sports visualization methods cannot be applied to visualizing table tennis competitions due to different competition rules and particular data attributes. In this work, we collaborate with data analysts to understand and characterize the sophisticated domain problem of analysis of table tennis data. We propose iTTVis, a novel interactive table tennis visualization system, which to our knowledge, is the first visual analysis system for analyzing and exploring table tennis data. iTTVis provides a holistic visualization of an entire match from three main perspectives, namely, time-oriented, statistical, and tactical analyses. The proposed system with several well-coordinated views not only supports correlation identification through statistics and pattern detection of tactics with a score timeline but also allows cross analysis to gain insights. Data analysts have obtained several new insights by using iTTVis. The effectiveness and usability of the proposed system are demonstrated with four case studies.
TL;DR: In this article, the authors address the problem of ad hoc table retrieval by answering a keyword query with a ranked list of tables, and propose a method for performing semantic matching between queries and tables.
Abstract: We introduce and address the problem of ad hoc table retrieval: answering a keyword query with a ranked list of tables. This task is not only interesting on its own account, but is also being used as a core component in many other table-based information access scenarios, such as table completion or table mining. The main novel contribution of this work is a method for performing semantic matching between queries and tables. Specifically, we (i) represent queries and tables in multiple semantic spaces (both discrete sparse and continuous dense vector representations) and (ii) introduce various similarity measures for matching those semantic representations. We consider all possible combinations of semantic representations and similarity measures and use these as features in a supervised learning model. Using a purpose-built test collection based on Wikipedia tables, we demonstrate significant and substantial improvements over a state-of-the-art baseline.
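The idea of matching a query and a table in several semantic spaces can be illustrated with two features: cosine similarity over sparse term-count vectors and cosine similarity over averaged dense embeddings. This is a minimal sketch; the tiny embedding table, the centroid aggregation, and the function names are assumptions and are not the paper's exact representations or similarity measures.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def sparse_similarity(query_terms, table_terms):
    """Similarity in a discrete sparse space (term counts)."""
    vocab = sorted(set(query_terms) | set(table_terms))
    q, t = Counter(query_terms), Counter(table_terms)
    return cosine([q[w] for w in vocab], [t[w] for w in vocab])

def dense_similarity(query_terms, table_terms, embeddings):
    """Similarity in a continuous dense space (embedding centroids)."""
    def centroid(terms):
        vecs = [embeddings[w] for w in terms if w in embeddings]
        if not vecs:
            return None
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    q, t = centroid(query_terms), centroid(table_terms)
    return cosine(q, t) if q and t else 0.0

# Hypothetical two-dimensional embeddings, for illustration only.
embeddings = {"capital": [1.0, 0.0], "city": [0.8, 0.2],
              "country": [0.9, 0.1], "goal": [0.0, 1.0]}
query = ["capital", "city"]
table = ["country", "capital", "city"]
features = [sparse_similarity(query, table),
            dense_similarity(query, table, embeddings)]
```

A feature vector like `features` is the kind of input that a supervised ranking model could consume, one entry per combination of representation and similarity measure.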
TL;DR: The aim of this paper is to give a review on some of the most acknowledged methods of match analysis in table tennis, using the performance analysis classification of theoretical and practical performance analysis.
Abstract: In table tennis, many different approaches to scientifically founded match analysis have been developed since the first ones in the 1960s. The aim of this paper is to give a review on some of the most ...
TL;DR: This paper proposes a generative model to map natural language questions into SQL queries by considering the structure of the table and the syntax of the SQL language, which significantly improves the quality of the generated SQL query.
Abstract: We present a generative model to map natural language questions into SQL queries. Existing neural network based approaches typically generate a SQL query word-by-word; however, a large portion of the generated results are incorrect or not executable due to the mismatch between question words and table contents. Our approach addresses this problem by considering the structure of the table and the syntax of the SQL language. The quality of the generated SQL query is significantly improved through (1) learning to replicate content from column names, cells or SQL keywords; and (2) improving the generation of the WHERE clause by leveraging the column-cell relation. Experiments are conducted on WikiSQL, a recently released dataset with the largest number of question-SQL pairs. Our approach significantly improves the state-of-the-art execution accuracy from 69.0% to 74.4%.
TL;DR: This corrects the article DOI: 10.1038/nrd.2018.14, in which Figure 1 was created from an older version of the data set categorizing proteins into target development levels than the version used to create Table 1.
Abstract: Nature Reviews Drug Discovery (2018); 10.1038/nrd.2018.14 In the version of this article that was originally published online, an older version of the data set categorizing proteins into target development levels was used to create Figure 1 than the version used to create Table 1, and data from Figure 1 were referred to at several points in the text of the article.
TL;DR: Results indicate that both versions of Quizbot are essentially equally fun and easy to use, and can effectively support collaboration, with the tangible version outperforming the other one in helping the children reach consensus after a discussion, split and parallelize work, and treat each other with more respect.
Abstract: Gamification has been identified as an interesting technique to foster collaboration in educational contexts. However, there are not many approaches that tackle this in primary school learning environments. The most popular technologies in the classroom are still traditional video consoles and desktop computers, which complicate the design of collaborative activities since they are essentially mono-user. The recent popularization of handheld devices such as tablets and smartphones has made it possible to build affordable, scalable, and improvised collaborative gamified activities by creating a multi-tablet environment. In this paper we present Quizbot, a collaborative gamified quiz application to practice different subjects, which can be defined by educators beforehand. Two versions of the system are implemented: a tactile one for tablets laid on a table, in which all the elements are digital; and a tangible one in which the tablets are scattered on the floor and the components are both digital and physical objects. Both versions of Quizbot are evaluated and compared in a study with eighty primary school children in terms of user experience and quality of collaboration supported. Results indicate that both versions of Quizbot are essentially equally fun and easy to use, and can effectively support collaboration, with the tangible version outperforming the other one in helping the children reach consensus after a discussion, split and parallelize work, and treat each other with more respect, but also showing poorer time management.
TL;DR: An automatic pipeline for extracting references between sentence text and table cells for existing PDF documents is provided that combines structural analysis of tables with natural language processing and rule-based matching.
Abstract: Document authors commonly use tables to support arguments presented in the text. But, because tables are usually separate from the main body text, readers must split their attention between different parts of the document. We present an interactive document reader that automatically links document text with corresponding table cells. Readers can select a sentence (or tables cells) and our reader highlights the relevant table cells (or sentences). We provide an automatic pipeline for extracting such references between sentence text and table cells for existing PDF documents that combines structural analysis of tables with natural language processing and rule-based matching. On a test corpus of 330 (sentence, table) pairs, our pipeline correctly extracts 48.8% of the references. An additional 30.5% contain only false negatives (FN) errors -- the reference is missing table cells. The remaining 20.7% contain false positives (FP) errors -- the reference includes extraneous table cells and could therefore mislead readers. A user study finds that despite such errors, our interactive document reader helps readers match sentences with corresponding table cells more accurately and quickly than a baseline document reader.
TL;DR: This paper demonstrates performance improvement to proposed table detection techniques based on the observation that tables tend to contain more numeric data and hence it applies color coding/coloration as a signal for telling apart numeric and textual data.
Abstract: Table detection is an important step in many document analysis systems. It is a difficult problem due to the variety of table layouts, encoding techniques and the similarity of tabular regions with non-tabular document elements. Earlier approaches of table detection are based on heuristic rules or require additional PDF metadata. Recently proposed methods based on machine learning have shown good results. This paper demonstrates performance improvement to these table detection techniques. The proposed solution is based on the observation that tables tend to contain more numeric data and hence it applies color coding/coloration as a signal for telling apart numeric and textual data. Deep learning based Faster R-CNN is used for detection of tabular regions from document images. To gauge the performance of our proposed solution, publicly available UNLV dataset is used. Performance measures indicate improvement when compared with best in-class strategies.
TL;DR: Software-Defined Networking and OpenFlow are actively being standardized and deployed and rely on switches that come from various vendors and differ in terms of performance and design.
TL;DR: In this paper, an ad hoc customizable electronic gaming table is described, which includes an electronic game table controller, at least one display, an NFC tag operably connected to the gaming table controller and operable to communicate with a player device.
Abstract: An ad hoc customizable electronic gaming table is disclosed. The ad hoc customizable electronic gaming table includes an electronic gaming table controller, at least one display operatively coupled to the electronic gaming table controller, an NFC tag operably connected to the gaming table controller and operable to communicate with a player device, and a customization server constructed to communicate with the electronic utilizes one-time URLs to coordinate customization of a game state display by a player using the NFC tag as an interface to the player's own player device.
TL;DR: In this paper, ν-Support Vector Machine (ν-SVM) is used to explore strategies for the data-driven CHF look-up table construction, based on sparingly distributed experimental data points.
TL;DR: i2b2’s REST API can be used to query multiple healthcare data models, enabling shared tooling to have a choice of backend data stores and enables separation between data model and software tooling for some of the more popular open analytic data models in healthcare.
TL;DR: The importance of dissection in practical anatomy teaching, and the large number of body donations needed is described, and many authors have proposed different solutions, such as software with reconstructions of the human body.
Abstract: The purpose of this article was to describe and explain our experience with the Anatomage table in the process of teaching and learning anatomy for medical students who are preparing as military physicians. Anatomage combines stereoscopic images of the whole body with software in order to build a 3-dimensional (3-D) reconstruction of the different human body parts. These images were taken from two cadavers, male and female, who were frozen and cut into sections to allow for virtual dissection and reconstruction of the human body. Users can visualize anatomy exactly as they would on a fresh cadaver. The table allows for exploration and learning of human anatomy beyond the experience with a cadaver. It is possible to cut away from the body surface to the inner body using a scalpel, as well as to watch images of 3-D sections in the three spatial planes. We described the importance of dissection in practical anatomy teaching, and the large number of body donations needed. Thus, many authors have proposed different solutions, such as software with reconstructions of the human body. Anatomage allows for anatomy teaching and learning in an interactive way. Students can practice actively and take the images watched in a practical session with them on a storage device, in order to study and discuss them later in a lecture. Anatomage is also used for practical anatomy exams for students. Despite being rather costly, it stimulates the learning of anatomy by being directly used by students in various ways.
TL;DR: A new rising edge triggered D flip-flop structure with reset capability is presented and several common structures without reset ability are compared with the proposed structure and the results are indicated in the comparison table.
TL;DR: The actions of a database architect during a complex evolution of the database at the core of a software system are recorded and techniques developed by the software engineering community could be adapted to help in the development and evolution of relational databases.
Abstract: Modern relational database management systems provide advanced features allowing, for example, to include behaviour directly inside the database (stored procedures). These features raise new difficulties when a database needs to evolve (e.g. adding a new table). To get a better understanding of these difficulties, we recorded and studied the actions of a database architect during a complex evolution of the database at the core of a software system. From our analysis, problems faced by the database architect are extracted, generalized and explored through the prism of software engineering. Six problems are identified: (1) difficulty in analysing and visualising dependencies between database's entities, (2) difficulty in evaluating the impact of a modification on the database, (3) replicating the evolution of the database schema on other instances of the database, (4) difficulty in testing database's functionalities, (5) lack of synchronization between the IDE's internal model of the database and the database actual state and (6) absence of an integrated tool enabling the architect to search for dependencies between entities, generate a patch or access an up to date PostgreSQL documentation. We suggest that techniques developed by the software engineering community could be adapted to help in the development and evolution of relational databases.
TL;DR: This work uses an island metaphor, which represents every module as a distinct island, to get a first overview about the complexity of an OSGi-based software system by interactively exploring its modules as well as the dependencies between them.
Abstract: We propose the tool IslandViz for exploring modular software systems in virtual reality. We use an island metaphor, which represents every module as a distinct island. The resulting island system is displayed in the confines of a virtual table, where users can explore the software visualization on multiple levels of granularity by performing navigational tasks. Our approach allows users to get a first overview about the complexity of an OSGi-based software system by interactively exploring its modules as well as the dependencies between them.
TL;DR: In this article, the authors propose a method to detect rich semantic types, such as credit card and ISBN numbers that encode semantic validations (e.g., checksum), by leveraging code from open-source repositories like GitHub.
Abstract: Given a table of data, existing systems can often detect basic atomic types (e.g., strings vs. numbers) for each column. A new generation of data-analytics and data-preparation systems are starting to automatically recognize rich semantic types such as date-time, email address, etc., for such metadata can bring an array of benefits including better table understanding, improved search relevance, precise data validation, and semantic data transformation. However, existing approaches only detect a limited number of types using regular-expression-like patterns, which are often inaccurate, and cannot handle rich semantic types such as credit card and ISBN numbers that encode semantic validations (e.g., checksum). We developed AUTOTYPE, which leverages code from open-source repositories like GitHub. Users only need to provide a set of positive examples for a target data type and a search keyword; our system will then automatically identify relevant code and synthesize type-detection functions using execution traces. We compiled a benchmark with 112 semantic types, out of which the proposed system can synthesize code to detect 84 such types at a high precision. Applying the synthesized type-detection logic to web table columns has also resulted in a significant increase in data types discovered compared to alternative approaches.
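To show why checksum-bearing types are out of reach for regular-expression-like patterns, the sketch below validates card-like numbers with the Luhn checksum. It is a hand-written example of the kind of type-detection function the abstract refers to, not code synthesized by AUTOTYPE; the function name is an assumption.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0

# A widely used test number and a one-digit corruption of it.
ok = luhn_valid("4111 1111 1111 1111")
bad = luhn_valid("4111 1111 1111 1112")
```

Both strings match the same digit pattern, so only the checksum distinguishes a plausible card number from an arbitrary 16-digit value.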
TL;DR: Live survey applications are compared across several types of features to identify which is suitable for educational purposes; the output of this paper is a comparison table that can be used for educational purposes.
Abstract: Live survey applications are used to get opinions from others. There are several live survey applications that are currently popular, such as SurveyMonkey and Survey Gizmo. These live survey applications will be compared across several types of features to identify which is suitable for educational purposes. The comparison will cover the advantages and disadvantages of both applications, the security issues, and solutions for resolving those issues. The research methodology involves three (3) phases: searching, identifying, and reporting the result of the comparison. The output of this paper is a comparison table that can be used for educational purposes.
TL;DR: This paper proposed a method to add bonus to words worthy of recommendation, so that NMT can make correct predictions and integrate this bonus value into NMT to improve the translation results, which obtained remarkable improvements over the strong attention-based NMT.
Abstract: Neural Machine Translation (NMT) has recently drawn much attention due to its promising translation performance. However, several studies indicate that NMT often generates fluent but unfaithful translations. In this paper, we propose a method to alleviate this problem by using a phrase table as recommendation memory. The main idea is to add a bonus to words worthy of recommendation, so that NMT can make correct predictions. Specifically, we first derive a prefix tree to accommodate all the candidate target phrases by searching the phrase translation table according to the source sentence. Then, we construct a recommendation word set by matching between candidate target phrases and previously translated target words by NMT. After that, we determine the specific bonus value for each recommendable word by using the attention vector and phrase translation probability. Finally, we integrate this bonus value into NMT to improve the translation results. Extensive experiments demonstrate that the proposed methods obtain remarkable improvements over the strong attention-based NMT.
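The prefix tree over candidate target phrases can be sketched with nested dictionaries. In this simplified version, the recommendation word set is whatever can follow the most recently translated words under an exact prefix match; the matching rule, the end marker, and the example phrases are assumptions, and the bonus computation from attention and phrase translation probability is omitted.

```python
END = "<end>"  # hypothetical marker for the end of a phrase

def build_prefix_tree(phrases):
    """Store candidate target phrases in a word-level prefix tree."""
    root = {}
    for phrase in phrases:
        node = root
        for word in phrase.split():
            node = node.setdefault(word, {})
        node[END] = True
    return root

def recommend_next(tree, translated_prefix):
    """Words that continue some candidate phrase after the given prefix."""
    node = tree
    for word in translated_prefix:
        if word not in node:
            return set()
        node = node[word]
    return {w for w in node if w != END}

tree = build_prefix_tree(["the white house", "the white paper", "a house"])
recommended = recommend_next(tree, ["the", "white"])
```

With this structure, looking up the recommendable words costs one dictionary access per already-translated word of the phrase, regardless of how many candidate phrases are stored.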
TL;DR: This paper proposes a new DEA approach to allocate the resource in a bidirectional interactive parallel system and considers not only the resource allocation of a certain DMU, but also the resource allocations of all DMUs for a centralized decision maker through centralized models.
Abstract: Resource allocation is a popular and important issue in the enterprise management. Recently, data envelopment analysis (DEA) as a non-parametric method for measuring the performance of decision-mak...
TL;DR: This paper presents a machine learning based eviction approach which can identify whether a flow entry is active or inactive and thus timely evict the inactive flow entries when flow table overflow occurs and can increase the usage of flow table and reduce the number of capacity misses by up to 80%, compared with the Least Recently Used eviction policy.
Abstract: Software Defined Networking (SDN) is fundamentally changing the way networks work, enabling programmable and flexible network management and configuration. As the de facto southbound interface of SDN, OpenFlow defines how the control plane can directly interact with the forwarding plane. In OpenFlow, flow tables play a significant role in packet forwarding. However, the capacity of the flow table is limited due to power, cost, and silicon area constraints. The capacity-limited flow table cannot hold the explosive number of flows generated by the fine-grained control mechanism used in SDN. Thus the flow table frequently overflows. In the case of overflow, the eviction strategy, which replaces existing flow entries with new ones, is critical to guarantee the efficient usage of the flow table. In this paper, we present a machine learning based eviction approach which can identify whether a flow entry is active or inactive and thus timely evict the inactive flow entries when flow table overflow occurs. Our simulations based on real network packet traces show that the proposed method can increase the usage of the flow table by more than 55% and reduce the number of capacity misses by up to 80%, compared with the Least Recently Used eviction policy.
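For reference, the Least Recently Used baseline mentioned in this abstract can be sketched as a fixed-capacity table that evicts the entry touched longest ago. The class and attribute names are assumptions, and the counter below records every table miss, not only capacity misses.

```python
from collections import OrderedDict

class LRUFlowTable:
    """Fixed-capacity flow table with Least Recently Used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first
        self.misses = 0

    def lookup(self, flow_id):
        """Return True on a hit; on a miss, install the flow entry."""
        if flow_id in self.entries:
            self.entries.move_to_end(flow_id)  # mark as most recently used
            return True
        self.misses += 1
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)   # evict least recently used
        self.entries[flow_id] = True
        return False

table = LRUFlowTable(capacity=2)
hits = [table.lookup(f) for f in ["a", "b", "a", "c", "b"]]
```

In this trace, flow `b` is evicted just before it is needed again, which is the kind of case where evicting an inactive entry instead would avoid a miss.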
TL;DR: An improved algorithm based on an adjacency table is proposed, using a hash table to store the adjacency table, which considerably reduces lookup time; the experimental results show that the improved algorithm performs well, especially for mining frequent itemsets in dense data sets.
Abstract: The FP-Growth algorithm is an association rule mining algorithm based on the frequent pattern tree (FP-Tree), which doesn't need to generate a large number of candidate sets. However, constructing the FP-Tree requires two scans of the original transaction database and recursive mining of the FP-Tree to generate frequent itemsets. In addition, the algorithm can't work effectively when the dataset is dense. To solve the problems of large memory usage and low time-effectiveness of data mining in this algorithm, this paper proposes an improved algorithm based on an adjacency table, using a hash table to store the adjacency table, which considerably reduces lookup time. The experimental results show that the improved algorithm has good performance, especially for mining frequent itemsets in dense data sets.
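One possible reading of an adjacency table stored in a hash table is a nested dictionary that maps each item to the items it co-occurs with, along with their counts, built in a single scan of the transactions. The sketch below follows that reading only; the layout, the function names, and the toy transactions are assumptions and may differ from the paper's actual data structure.

```python
from collections import defaultdict
from itertools import combinations

def build_adjacency_table(transactions):
    """Count item co-occurrences in one pass over the transactions."""
    table = defaultdict(lambda: defaultdict(int))
    for transaction in transactions:
        # Sort so each unordered pair is stored under one canonical key.
        for a, b in combinations(sorted(set(transaction)), 2):
            table[a][b] += 1
    return table

def frequent_pairs(table, min_support):
    """Read frequent 2-itemsets directly from the adjacency table."""
    return {(a, b): count
            for a, neighbours in table.items()
            for b, count in neighbours.items()
            if count >= min_support}

transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
adjacency = build_adjacency_table(transactions)
pairs = frequent_pairs(adjacency, min_support=2)
```

Because both levels are hash tables, the support of any item pair can be looked up in constant expected time without traversing a tree.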
TL;DR: This work employs an island metaphor, which represents every module of an OSGi-based software system as a distinct island, and shows the resulting island system in the confines of a virtual table where users can explore the software visualization on multiple levels of granularity by performing intuitive navigational tasks.
Abstract: We present an approach for exploring OSGi-based software systems in virtual reality. We employ an island metaphor, which represents every module as a distinct island. The resulting island system is displayed in the confines of a virtual table, where users can explore the software visualization on multiple levels of granularity by performing intuitive navigational tasks. Our approach allows users to get a first overview about the complexity of an OSGi-based software system by interactively exploring its modules as well as the dependencies between them.
TL;DR: In this paper, the map update module divides each of the map segments into a plurality of sub segments, and updates a first sub segment as an updating target among the plurality of sub segments by loading the first sub segment into a map update buffer of the memory.
Abstract: A data storage device includes a nonvolatile memory device including an address map table in which a plurality of map segments including a plurality of logical-to-physical (P2L) entries are stored and a controller controlling the nonvolatile memory device. The controller includes a processor and a memory storing a map update module configured to be driven through the processor and perform map updating on the plurality of map segments. The map update module divides each of the map segments into a plurality of sub segments, updates a first sub segment as an updating target among the plurality of sub segments by loading the first sub segment into a map update buffer of the memory, and encodes second sub segments as a non-updating target among the plurality of sub segments and stores the encoded second sub segments in a page buffer of the nonvolatile memory device.