TL;DR: The experimental results demonstrate that a joint learning approach significantly outperforms a pipeline approach by incorporating global features and by selecting appropriate learning methods and search orders.
Abstract: This paper proposes a history-based structured learning approach that jointly extracts entities and relations in a sentence. We introduce a novel, simple, and flexible table representation of entities and relations. We investigate several feature settings, search orders, and learning methods with inexact search on the table. The experimental results demonstrate that a joint learning approach significantly outperforms a pipeline approach by incorporating global features and by selecting appropriate learning methods and search orders.
TL;DR: The IELab is a novel, collaborative approach to compiling large-scale environmentally extended multi-region input-output (MRIO) models, and will facilitate the harmonisation of fragmented, dispersed and misaligned raw data for the benefit of all interested parties.
TL;DR: In this article, the authors present a query optimization approach for multi-tenant database systems, in which metadata about the tenant or about the data accessible by the tenant is analyzed to determine a tenant-selective query syntax, and an improved query is then generated from the original query using that syntax.
Abstract: Methods and systems for query optimization for a multi-tenant database system are provided. Some embodiments comprise receiving at a network interface of a server in a multi-tenant database system an original query transmitted to the multi-tenant database system by a user associated with a tenant, wherein the original query is associated with data accessible by the tenant, and wherein the multi-tenant database system includes at least a first index and a second index. Metadata associated with the data is retrieved, wherein at least a portion of the data is stored in a common table within the multi-tenant database system. A tenant-selective query syntax is determined by analyzing at least one of metadata generated from information about the tenant or metadata generated from the data accessible by the tenant. An improved query is then generated using the query syntax, wherein the improved query is based at least in part upon the original query and a result of a join between a first number of rows associated with the first index and a second number of rows associated with the second index.
TL;DR: In this paper, the authors describe a mathematical graph in which nodes represent data sources and/or portions of data sources (for example, database tables), and edges represent relationships among the data sources, and the table graph enables a compact and memory efficient storage of relationships among disparate data sources.
Abstract: Embodiments of the present disclosure relate to a computer system and interactive user interfaces configured to enable efficient and rapid access to multiple different data sources simultaneously, and by an unskilled user. The unskilled user may provide simple and intuitive search terms to the system, and the system may thereby automatically query multiple related data sources of different types and present results to the user. Data sources in the system may be efficiently interrelated with one another by way of a mathematical graph in which nodes represent data sources and/or portions of data sources (for example, database tables), and edges represent relationships among the data sources and/or portions of data sources. For example, edges may indicate relationships between particular rows and/or columns of various tables. The table graph enables a compact and memory efficient storage of relationships among various disparate data sources.
TL;DR: In this paper, the authors describe a switch that includes a packet processor, a persistent storage module, and a boot-up management module; the persistent storage module stores the switch's configuration information in a first table in a local persistent storage.
Abstract: One embodiment of the present invention provides a switch. The switch includes a packet processor, a persistent storage module, and a boot-up management module. The packet processor identifies a switch identifier associated with the switch in the header of a packet. The persistent storage module stores configuration information of the switch in a first table in a local persistent storage. This configuration information is included in a configuration file, and the first table includes one or more columns for the attribute values of the configuration information. The boot-up management module loads the attribute values to corresponding switch modules from the first table without processing the configuration file.
TL;DR: The authors outline ways in which appropriate corpus resources may help students to develop competence as writers within specific academic domains and discuss the sets of knowledge which writers (in general) need in order to produce appropriate and effective texts.
Abstract: The purpose of this chapter is to outline ways in which appropriate corpus resources may help students to develop competence as writers within specific academic domains. I have discussed elsewhere (Tribble, 1997a) the sets of knowledge which writers (in general) need in order to produce appropriate and effective texts. These can be summarised as in Table 7.1.
TL;DR: This paper presents a technique called WideTable, which is built by denormalizing the database, and then converting complex queries into simple scans on the underlying (wide) table, to improve the speed of analytical data processing systems.
Abstract: This paper presents a technique called WideTable that aims to improve the speed of analytical data processing systems. A WideTable is built by denormalizing the database, and then converting complex queries into simple scans on the underlying (wide) table. To avoid the pitfalls associated with denormalization, e.g. space overheads, WideTable uses a combination of techniques including dictionary encoding and columnar storage. When denormalizing the data, WideTable uses outer joins to ensure that queries on tables in the schema graph, which are now nested as embedded tables in the WideTable, are processed correctly. Then, using a packed code scan technique, even complex queries on the original database can be answered by using simple scans on the WideTable(s). We experimentally evaluate our methods in a main memory setting using the queries in TPC-H, and demonstrate the effectiveness of our methods, both in terms of raw query performance and scalability when running on many-core machines.
TL;DR: In this paper, the authors provide systems, devices, methods and computer readable media for application installation security and privacy evaluation and indication, including an application installation module configured to receive an application package for installation on a device, wherein the package comprises a list of device resources to be accessed by the application.
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for application installation security and privacy evaluation and indication. The system may include an application installation module configured to receive an application package for installation on a device, wherein the package comprises a list of device resources to be accessed by the application. The system may also include memory configured to store an impact score table comprising one or more security impact scores, each security impact score associated with access to one of the device resources. The system may further include a security/privacy evaluation module configured to calculate a security impact indicator (SII) based on a sum of the security impact scores selected by the accessed device resources listed in the package.
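The security impact indicator described above can be illustrated with a minimal Python sketch. The resource names, the score values, and the function name `security_impact_indicator` are hypothetical; the abstract only states that the SII is based on a sum of the impact scores selected by the resources listed in the package. Treating unknown resources as a score of 0 and counting each resource once are assumptions of this sketch.

```python
from typing import Dict, Iterable

# Hypothetical impact score table: device resource -> security impact score.
IMPACT_SCORES: Dict[str, int] = {
    "camera": 3,
    "contacts": 4,
    "location": 5,
    "network": 2,
}


def security_impact_indicator(requested: Iterable[str],
                              scores: Dict[str, int]) -> int:
    """Sum the impact scores selected by the resources a package lists.

    Assumptions of this sketch: a resource missing from the table
    contributes 0, and a resource listed twice is counted once.
    """
    return sum(scores.get(resource, 0) for resource in set(requested))


# Hypothetical application package listing three device resources.
package_resources = ["camera", "location", "network"]
sii = security_impact_indicator(package_resources, IMPACT_SCORES)
print(sii)
```

A real evaluation module would presumably map the resulting sum to an indication shown to the user; that mapping is not specified in the abstract and is left out here.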
TL;DR: This work improves the well-known GAC-4 algorithm by managing the internal data structures in a different way, and proposes to reset the data structures by recomputing them from scratch whenever it saves time.
Abstract: We introduce GAC-4R, MDD-4, and MDD-4R, three new algorithms for maintaining arc consistency for table and MDD constraints. GAC-4R improves the well-known GAC-4 algorithm by managing the internal data structures in a different way. Instead of maintaining the internal data structures only by studying the consequences of deletions, we propose to reset the data structures by recomputing them from scratch whenever it saves time. This idea avoids the major drawback of the GAC-4 algorithm, i.e., its cost at a shallow search-tree depth. We also show that this idea can be exploited in MDD constraints. Experiments show that GAC-4R is competitive with the best arc-consistency algorithms for table constraints, and that MDD-4R clearly outperforms all classical algorithms for table or MDD constraints.
TL;DR: This paper proposes offline FFTA (Fast Flow Table Aggregation) and its online improver iFFTA to shrink the flow table size and to provide practical fast updates; iFFTA is the first online non-prefix aggregation scheme.
Abstract: In OpenFlow-based SDN, flow tables are TCAM-hungry and commodity switches suffer from limited concrete flow table size. One method for coping with the limitations is to use aggregation schemes to reduce the number of flow entries required to represent the same forwarding semantics. Unfortunately, aggregation slows table updates and lengthens the updating time. During that time the data plane is inconsistent with the control plane, and forwarding errors such as Reachability Failures, Forwarding Loops, Traffic Isolation and Leakage are prone to occur. Since network updates take place frequently in practice, the aggregation scheme must be efficient enough. In this paper we propose offline FFTA (Fast Flow Table Aggregation) and its online improver iFFTA to shrink the flow table size and to provide practical fast updates. iFFTA is the first online non-prefix aggregation scheme. Extensive experiments demonstrate: (1) FFTA is about 200× faster than the previously published best non-prefix aggregation scheme without loss of compression ratio on offline aggregation; and (2) iFFTA is about 3× faster than FFTA at incorporating online updates, with an acceptable loss of compression ratio per update. Thus the user could combine FFTA and iFFTA for table aggregation: call iFFTA for usual updates and call the efficient FFTA again once the switch is running out of concrete flow table space.
TL;DR: In this article, a vehicle detection method based on a convolutional neural network is proposed, which consists of steps S1 to S13: collecting vehicle and non-vehicle samples and classifying the vehicle samples, preprocessing the samples, training a CNN vehicle detector, calculating an average similarity table of a characteristic pattern, constructing a similarity characteristic pattern set, obtaining a CNN-OP vehicle detector, obtaining and preprocessing detection images, constructing an image pyramid for the detection images, extracting, scanning and classifying the characteristics, and combining detection windows for output.
Abstract: The invention discloses a vehicle detection method based on a convolutional neural network. The method includes the step S1 of collecting vehicle samples and non-vehicle samples and classifying the vehicle samples, the step S2 of preprocessing the samples, the step S3 of training a CNN vehicle detector, the step S4 of calculating an average similarity table of a characteristic pattern, the step S5 of constructing a similarity characteristic pattern set, the step S6 of obtaining a CNN-OP vehicle detector, the step S7 of obtaining detection images, the step S8 of preprocessing the obtained detection images, the step S9 of constructing an image pyramid for the detection images, the step S10 of extracting characteristics, the step S11 of scanning characteristic patterns, the step S12 of classifying the characteristics, and the step S13 of combining detection windows and conducting output. An offline optimization scheme is put forward, the convolutional neural network which is completely trained is optimized, the strategy of scanning the windows after extracting the characteristics is adopted at the detection stage, and therefore the characteristics are prevented from being repeatedly calculated, and the detection speed of the system is increased.
TL;DR: In this article, the authors propose a memory device that includes a flash memory, a memory, and a controller; the memory stores an address mapping table recording relationships between logical addresses and physical addresses of the blocks of the flash memory.
Abstract: The invention provides a memory device. The memory device includes a flash memory, a memory, and a controller. The flash memory includes a plurality of blocks for data storage. The memory stores an address mapping table recording relationships between logical addresses and physical addresses of the blocks therein. The controller divides the address mapping table stored in the memory into a plurality of mapping table units, updates relationships between the logical addresses and the physical addresses stored in the mapping table units, determines whether data access performed to the flash memory fulfills the conditions of a specific requirement, and when the data access fulfills the conditions of the specific requirement, the controller selects a target mapping table unit from the mapping table units, and stores the target mapping table unit and a corresponding time stamp as a mapping table unit data to the flash memory.
TL;DR: In this paper, a method and a system for improving large data volume query performance are presented, which consist of loading data in a disk database into distributed caches in a cache ID-entity data key value pair mode, and storing the cache ID and key information of the entity data in a cache ID table of a memory database simultaneously.
Abstract: The invention discloses a method and a system for improving the large data volume query performance and belongs to the technical field of large data volume query. The method comprises A, loading data in a disk database into distributed caches in a cache ID-entity data key value pair mode, and storing the cache ID and key information of the entity data in a cache ID table of a memory database simultaneously; B, querying the cache ID table according to a query request when the query request sent by a client is obtained to select a cache ID set meeting the query request; C, obtaining the entity data from corresponding distributed caches according to the cache ID set and returning the entity data to the client. By means of the system and the method, loads of the disk database can be effectively reduced, and the big data query performance is improved.
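Steps A to C above can be sketched in Python as follows. This is only an illustration under stated assumptions: a plain dict stands in for the distributed caches, a list of rows stands in for the cache ID table of the memory database, and the names `load`, `query`, `distributed_cache` and `cache_id_table` are hypothetical.

```python
from typing import Any, Dict, List

# Stand-ins (assumptions): a dict plays the role of the distributed caches,
# a list of small rows plays the role of the cache ID table in a memory database.
distributed_cache: Dict[str, Dict[str, Any]] = {}
cache_id_table: List[Dict[str, Any]] = []


def load(disk_rows: List[Dict[str, Any]], key_fields: List[str]) -> None:
    """Step A: cache each entity under a cache ID and index its key fields."""
    for i, row in enumerate(disk_rows):
        cache_id = f"c{i}"
        distributed_cache[cache_id] = row            # cache ID -> entity data
        entry = {"cache_id": cache_id}
        entry.update({f: row[f] for f in key_fields})
        cache_id_table.append(entry)                 # cache ID + key information


def query(**conditions: Any) -> List[Dict[str, Any]]:
    """Steps B and C: select matching cache IDs, then fetch the entities."""
    ids = [e["cache_id"] for e in cache_id_table
           if all(e.get(k) == v for k, v in conditions.items())]
    return [distributed_cache[cache_id] for cache_id in ids]


# Hypothetical disk database content.
rows = [
    {"name": "Ann", "city": "Oslo", "bio": "long text"},
    {"name": "Bob", "city": "Rome", "bio": "long text"},
    {"name": "Cid", "city": "Oslo", "bio": "long text"},
]
load(rows, key_fields=["city"])
result = query(city="Oslo")
print([r["name"] for r in result])
```

The query touches only the small cache ID table and the caches, never the disk rows, which is the load reduction the abstract claims.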
TL;DR: In this article, a system and method for virtual data warehouses having table link capabilities are provided; in particular, some embodiments include a plurality of virtual data warehouses built on top of a data center running Apache Hive.
Abstract: System and method for virtual data warehouses having table link capabilities are provided. In particular, some embodiments include a plurality of virtual data warehouses built on top of a data center running Apache Hive. Each virtual data warehouse can be modeled as a database and manage data in forms of database tables. The virtual data warehouse can include links which import tables from other virtual data warehouses by reference. Each link may contain partition metadata for the table partitions by dates of the source table and retention metadata to declare the needed retention time period for the partitions of the source table. The links can be dynamic and update when the corresponding source table receives new partitions or drops partitions. When a virtual data warehouse is migrated to another data center, the system can retain necessary table partitions to remain on the current data center based on the partition metadata and retention metadata of the links.
TL;DR: In this paper, a multi-column index is generated based on an interleaving of data bits for selectivity for efficient processing of data in a relational database system, and the entries of the relational database table may then be stored according to the index values of the multi-column index.
Abstract: A multi-column index is generated based on an interleaving of data bits for selectivity for efficient processing of data in a relational database system. Two or more columns may be identified for inclusion in the multi-column index for a relational database table. Based, at least in part, on the interleaving of data bits for selectivity from the identified columns, a multi-column index is generated for the relational database table that provides a respective index value for each entry in the relational database table. The entries of the relational database table may then be stored according to the index values of the multi-column index.
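As a rough illustration of an index built by interleaving bits from several columns, the Python sketch below uses plain round-robin (Z-order style) interleaving. This is a swapped-in, simplified technique: the abstract does not specify how bits are chosen "for selectivity", so the interleaving order, the fixed bit width, and the function name `interleave_bits` are assumptions.

```python
from typing import List, Sequence, Tuple


def interleave_bits(values: Sequence[int], bits_per_column: int) -> int:
    """Interleave column bits round-robin, most significant bit first.

    Assumption: every column value fits in `bits_per_column` bits and all
    columns contribute bits in the same fixed order.
    """
    index = 0
    for bit in range(bits_per_column - 1, -1, -1):
        for value in values:
            index = (index << 1) | ((value >> bit) & 1)
    return index


# Hypothetical two-column table with 2-bit column values.
table: List[Tuple[int, int]] = [(3, 0), (0, 1), (2, 1), (1, 3)]

# One index value per entry, then store the entries in index order.
indexed = [(interleave_bits(row, 2), row) for row in table]
sorted_rows = [row for _, row in sorted(indexed)]
print(sorted_rows)
```

Storing entries in this order tends to keep rows that are close in both columns near each other, which is the usual motivation for interleaved multi-column keys.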
TL;DR: Two query-processing algorithms are proposed: one is fast in practice for small queries (with small numbers of patterns as answers) by utilizing the indexes; and the other one is better in theory, with running time linear in the sizes of indexes and answers, which can handle large queries better.
Abstract: We aim to provide table answers to keyword queries against knowledge bases. For queries referring to multiple entities, like "Washington cities population" and "Mel Gibson movies", it is better to represent each relevant answer as a table which aggregates a set of entities or entity-joins within the same table scheme or pattern. In this paper, we study how to find highly relevant patterns in a knowledge base for user-given keyword queries to compose table answers. A knowledge base can be modeled as a directed graph called knowledge graph, where nodes represent entities in the knowledge base and edges represent the relationships among them. Each node/edge is labeled with type and text. A pattern is an aggregation of subtrees which contain all keywords in the texts and have the same structure and types on node/edges. We propose efficient algorithms to find patterns that are relevant to the query for a class of scoring functions. We show the hardness of the problem in theory, and propose path-based indexes that are affordable in memory. Two query-processing algorithms are proposed: one is fast in practice for small queries (with small patterns as answers) by utilizing the indexes; and the other one is better in theory, with running time linear in the sizes of indexes and answers, which can handle large queries better. We also conduct extensive experimental study to compare our approaches with a naive adaption of known techniques.
TL;DR: In this article, a method is provided for narrating a table using at least one narration template, wherein the table is extracted from a data source; the method may include parsing the extracted table.
Abstract: A method is provided for narrating a table using at least one narration template, wherein the table is extracted from a data source. The method may include parsing the extracted table. The method may also include performing structural analysis on the parsed extracted table. The method may further include selecting at least one structural template based on the structural analysis of the parsed extracted table. Additionally, the method may include selecting the at least one narration template based on the at least one selected structural template. The method may also include applying the at least one selected narration template to the extracted table. The method may further include narrating the extracted table based on the applying of the at least one selected narration template to the extracted table.
TL;DR: In this paper, the problem of finding highly relevant patterns in a knowledge base for user-given keyword queries to compose table answers is studied, and two query processing algorithms are proposed: one is fast in practice for small queries (with small numbers of patterns as answers) by utilizing the indexes; and the other one is better in theory, with running time linear in the sizes of indexes and answers, which can handle large queries better.
Abstract: We aim to provide table answers to keyword queries using a knowledge base. For queries referring to multiple entities, like "Washington cities population" and "Mel Gibson movies", it is better to represent each relevant answer as a table which aggregates a set of entities or joins of entities within the same table scheme or pattern. In this paper, we study how to find highly relevant patterns in a knowledge base for user-given keyword queries to compose table answers. A knowledge base is modeled as a directed graph called knowledge graph, where nodes represent its entities and edges represent the relationships among them. Each node/edge is labeled with type and text. A pattern is an aggregation of subtrees which contain all keywords in the texts and have the same structure and types on node/edges. We propose efficient algorithms to find patterns that are relevant to the query for a class of scoring functions. We show the hardness of the problem in theory, and propose path-based indexes that are affordable in memory. Two query-processing algorithms are proposed: one is fast in practice for small queries (with small numbers of patterns as answers) by utilizing the indexes; and the other one is better in theory, with running time linear in the sizes of indexes and answers, which can handle large queries better. We also conduct extensive experimental study to compare our approaches with a naive adaption of known techniques.
TL;DR: In this article, a method for creating a backup of data of a virtual environment to allow non-staged recovery is described, which may include receiving data of the virtual environment through one or more data streams for backup.
Abstract: Methods for creating backup of data of a virtual environment to allow non-staged recovery are described. The described method may include receiving data of a virtual environment through one or more data streams for backup. The method also includes generating metadata corresponding to the received data and storing the received data at a first location of a backup storage unit. Further, the method includes storing the generated metadata at a second location of the backup storage unit, where the second location is different from the first location of the backup storage unit. The method further includes mapping the at least one predefined file to the stored data to create a mapping table to allow direct access to the stored data for non-staged recovery.
TL;DR: In this paper, systems and methods are described for hunting for a hash table entry in a hash table distributed over a multi-node system: the owner node replicates each ASDR table entry onto a non-owner node as a backup, and a node that returns to the multi-node system with a possibly outdated entry requests an updated copy from one of the replicas down a replication chain.
Abstract: The present application is directed towards systems and methods of hunting for a hash table entry in a hash table distributed over a multi-node system. More specifically, when entries are created in an ASDR table, the owner node of the entry may replicate the entry onto a non-owner node. The replica can act as a backup of the ASDR table entry in the event the node leaves the multi-node system. When the node returns to the multi-node system, the node may no longer have the most up to date ASDR table entries, and may hunt to find the existence of the value associated with the entry. Responsive to receiving a request for an entry that may be outdated on the node, the node sends a request down a replication chain for an updated copy of the ASDR table entry from one of the replicas. Responsive to receiving the replica copy of the entry, the node responds to the client's request for the entry.
TL;DR: In this article, a method is presented for providing an answer to at least one analytical question containing at least one table or at least one chart, in which the answer is determined by forming and solving at least one mathematical equation and is narrated in natural language.
Abstract: A method for providing an answer to at least one analytical question containing at least one table or at least one chart is provided. The method may include receiving an input question. The method may also include extracting a plurality of information from the input question based on a natural language analysis. The method may further include forming a well-defined sentence. The method may include extracting at least one table or at least one chart associated with the input question. The method may include forming at least one mathematical equation. The method may also include solving the at least one mathematical equation. The method may include determining the answer to the input question in natural language based on the solved at least one mathematical equation. The method may further include narrating the determined answer to the input question in natural language.
TL;DR: The proposed method TimeoutX, for the first time, combines traffic characteristics, flow types and Flow Table utilization ratio to decide the timeout of each entry and it outperforms current timeout setting strategies in both metrics of table miss number and blocked packet number, which indicates TimeoutX could make the best of Flow Table and support more flows.
Abstract: In Software Defined Networks (SDN), applications on the controller could enforce fine-grained control on flows by policies employing more packet fields. These policies are converted to flow entries and stored in switch Flow Table. To store these entries, Flow Table requires large storage space because an entry consisting of more packet fields needs more storage space and the number of entries also increases significantly due to fine-granularity definition of flows. However, Flow Table has limited storage space owing to the constraints of Ternary Content Addressable Memory (TCAM). As a result, the switch Flow Table in SDN faces a scalability issue. We address this issue by means of adaptive Flow Table management, namely we manage how long the entries occupy the storage space by setting adaptive timeouts to them. Through this means, the storage space could be reused efficiently and more flows could be supported with the same Flow Table (without updating hardware devices). Our proposed method TimeoutX, for the first time, combines traffic characteristics, flow types and Flow Table utilization ratio to decide the timeout of each entry and it outperforms current timeout setting strategies in both metrics of table miss number and blocked packet number, which indicates TimeoutX could make the best of Flow Table and support more flows.
TL;DR: In this paper, a network controller is described for managing a logical network that includes at least one logical router; the controller includes a table mapping engine for generating data tuples for distribution to the managed network elements in order for the managed network elements to implement the logical router.
Abstract: Some embodiments provide a network controller for managing a logical network implemented across several managed network elements. The logical network includes at least one logical router. The network controller includes an input interface for receiving configuration state for the logical router. The network controller includes a table mapping engine for generating data tuples for distribution to the managed network elements in order for the managed network elements to implement the logical router. The network controller includes a route processing engine for receiving a set of input routes from the table mapping engine based on the configuration state for the logical router, performing a recursive route traversal process to generate a set of output routes, and returning the set of output routes to the table mapping engine. The table mapping engine uses the set of output routes to generate the data tuples for distribution to the plurality of managed network elements.
TL;DR: In this article, the authors present a method to determine a valid target address for a branch instruction from information stored in a relocation table, a linkage table, or both, the relocation table and the linkage table associated with a binary file and store the valid target addresses in a table in memory.
Abstract: Various embodiments are generally directed to an apparatus, method and other techniques to determine a valid target address for a branch instruction from information stored in a relocation table, a linkage table, or both, the relocation table and the linkage table associated with a binary file and store the valid target address in a table in memory, the valid target address to validate a target address for a translated portion of a routine of the binary file.
TL;DR: An Excel spreadsheet calculator to interpolate the bivariate data with 4 rows by 4 columns using Lagrange interpolation and the Excel command given helps in the teaching and learning process of this topic using Excel spreadsheet.
Abstract: Even though interpolating bivariate data by Lagrange interpolation is straightforward, its repetitive calculations are quite boring and complicated if the number of data is large. Hence, there is a need to have a suitable tool in teaching and learning Numerical Methods for this topic. To simplify things, we have developed an Excel spreadsheet calculator to interpolate the bivariate data with 4 rows by 4 columns using Lagrange interpolation. The spreadsheet calculator can be used by educators and students who need its full solution. In addition, users only need to enter a dataset, two independent variables, and the values of the two independent variables which are not in the dataset to obtain a bivariate approximation solutions table in the respective target cells automatically. Besides, the Excel command given helps in the teaching and learning process of this topic using Excel spreadsheet. Keywords: Excel spreadsheet, bivariate approximation, Lagrange interpolation
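The calculation the spreadsheet automates can be sketched in Python for comparison. This is the standard tensor-product form of bivariate Lagrange interpolation, not a transcription of the authors' Excel commands; the function names and the sample data (values of x² + xy on a 4 by 4 grid) are assumptions chosen for illustration.

```python
from typing import List, Sequence


def lagrange_basis(nodes: Sequence[float], i: int, t: float) -> float:
    """Value at t of the i-th Lagrange basis polynomial for the given nodes."""
    result = 1.0
    for j, node in enumerate(nodes):
        if j != i:
            result *= (t - node) / (nodes[i] - node)
    return result


def interpolate_bivariate(xs: Sequence[float], ys: Sequence[float],
                          table: Sequence[Sequence[float]],
                          x: float, y: float) -> float:
    """Tensor-product Lagrange interpolation; table[i][j] = f(xs[i], ys[j])."""
    total = 0.0
    for i in range(len(xs)):
        for j in range(len(ys)):
            total += (table[i][j]
                      * lagrange_basis(xs, i, x)
                      * lagrange_basis(ys, j, y))
    return total


# Hypothetical 4-row by 4-column dataset sampled from f(x, y) = x^2 + x*y.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 2.0, 3.0]
table: List[List[float]] = [[x * x + x * y for y in ys] for x in xs]

# Interpolate at a point that is not in the dataset.
value = interpolate_bivariate(xs, ys, table, 1.5, 2.5)
print(value)
```

Because the sample function is a polynomial of degree at most 3 in each variable, four nodes per axis reproduce it exactly up to floating-point rounding, so the printed value should be very close to 1.5² + 1.5 × 2.5 = 6.0.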
TL;DR: In this article, a replay table is generated that is populated with triggers for database operations performed on the source table for subsequent replay for the target partitions, where the database operations are replayed on the target table T subsequent to the moving of the data using the replay table.
Abstract: Partitioning of a source table of a database to a target table is initiated. Thereafter, a replay table is generated that is populated with triggers for database operations performed on the source table for subsequent replay for the target partitions. Data is later moved (e.g., asynchronously moved, etc.) from the source table to the target table. The database operations are replayed on the target table T subsequent to the moving of the data using the replay table. In addition, the source table is dropped when all of the data has been moved to the target table and there are no operations requiring replay. Related apparatus, systems, techniques and articles are also described.
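The move-then-replay sequence described above can be sketched in Python. Everything here is a simplified stand-in: dicts play the role of the source and target tables, a list plays the role of the replay table, `record_write` imitates a trigger, and partitioning of the target is omitted. None of these names come from the abstract.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, int, Optional[str]]  # (operation, key, value)


def record_write(source: Dict[int, str], replay_table: List[Op],
                 op: str, key: int, value: Optional[str] = None) -> None:
    """Apply a write to the source and log it, as a trigger would."""
    if op == "delete":
        source.pop(key, None)
    else:  # "upsert"
        source[key] = value
    replay_table.append((op, key, value))


def replay(target: Dict[int, str], replay_table: List[Op]) -> None:
    """Replay logged operations on the target in their original order."""
    while replay_table:
        op, key, value = replay_table.pop(0)
        if op == "delete":
            target.pop(key, None)
        else:
            target[key] = value


source: Dict[int, str] = {1: "a", 2: "b", 3: "c"}
target: Dict[int, str] = {}
replay_table: List[Op] = []

snapshot = dict(source)                       # data selected for the move
record_write(source, replay_table, "upsert", 2, "B")   # writes arriving
record_write(source, replay_table, "delete", 3)        # while the move
record_write(source, replay_table, "upsert", 4, "d")   # is in progress
target.update(snapshot)                       # the (asynchronous) move lands
replay(target, replay_table)                  # bring the target up to date

# Drop the source once everything is moved and nothing is left to replay.
if not replay_table and target == source:
    source.clear()
print(target)
```

Replaying in logged order matters: applying the delete of key 3 before the snapshot lands would leave a stale row in the target.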
TL;DR: In this article, a temporary anti-rollback table is provided to an electronic device requiring a replacement anti-rollback table, and the table is verified by the device, and loaded to memory following a reboot.
Abstract: A temporary anti-rollback table - which is cryptographically signed, unique to a specific device, and includes a version number - is provided to an electronic device requiring a replacement anti-rollback table. The table is verified by the device, and loaded to memory following a reboot. The memory image of the table is used to perform anti-rollback verification of all trusted software components as they are loaded. After booting, the memory image of the table is written in a secure manner to non-volatile memory as a replacement anti-rollback table, and the temporary anti-rollback table is deleted. The minimum required table version number in OTP memory is incremented. The temporary anti-rollback table is created and signed using a private key at authorized service centers; a corresponding public key in the electronic device verifies its authenticity.
TL;DR: In this article, the authors proposed a database data migration method which is applied to a distributed-system cluster environment and is used for migrating data between a first database and a second database.
Abstract: The invention provides a database data migration method which is applied to a distributed-system cluster environment and is used for migrating data between a first database and a second database. The method includes configuring table task information corresponding to multiple table tasks of a database migration task, wherein the table tasks can be dispatched in batch; reading data of to-be-migrated source data tables of the table tasks from the first database according to the dispatched task tables, subjecting the data of the source data tables to sharding to acquire multiple sharding data tables, and importing the sharding data tables into a distributed file system; reading the sharding data tables from the distributed file system, and exporting the sharding data tables into the second database. According to the migration method, different data can be migrated from one database to another database only through one configuration, so that speed and stability of data migration are increased. The invention further provides a database data migration system.