Scispace (Formerly Typeset)
Showing papers on "Performance improvement" published in 2018
Journal Article•10.1016/J.COMPBIOLCHEM.2018.03.024•
Hardware acceleration of BWA-MEM genomic short read mapping for longer read lengths.

Ernst Joachim Houtgast1, Vlad-Mihai Sima, Koen Bertels1, Zaid Al-Ars1•
Delft University of Technology1
01 Aug 2018-Computational Biology and Chemistry
TL;DR: This work presents hardware-accelerated genomics pipelines that use either FPGAs or GPUs to accelerate execution of BWA-MEM, a widely used algorithm for genomic short read mapping, and introduces methods to ameliorate the impact of longer read lengths.

144 citations

Journal Article•10.3390/MI9110557•
Microhotplates for metal oxide semiconductor gas sensor applications—towards the CMOS-MEMS monolithic approach

Haotian Liu1, Li Zhang1, King Ho Holden Li1, Ooi Kiang Tan1•
Nanyang Technological University1
29 Oct 2018-Micromachines
TL;DR: The sensing mechanism, design and operation of these sensors are reviewed, with focuses on the approaches towards performance improvement and CMOS compatibility.
Abstract: The recent development of the Internet of Things (IoT) in healthcare and indoor air quality monitoring expands the market for miniaturized gas sensors. Metal oxide gas sensors based on microhotplates fabricated with micro-electro-mechanical system (MEMS) technology dominate the market due to their balance in performance and cost. Integrating sensors with signal conditioning circuits on a single chip can significantly reduce the noise and package size. However, the fabrication process of MEMS sensors must be compatible with the complementary metal oxide semiconductor (CMOS) circuits, which imposes restrictions on the materials and design. In this paper, the sensing mechanism, design and operation of these sensors are reviewed, with focuses on the approaches towards performance improvement and CMOS compatibility.

110 citations

Proceedings Article•10.1145/3267809.3267840•
Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training

Liang Luo1, Jacob Nelson2, Luis Ceze1, Amar Phanishayee2, Arvind Krishnamurthy1 •
University of Washington1, Microsoft2
11 Oct 2018
TL;DR: PHub is proposed, a high performance multi-tenant, rack-scale PS design that co-designs the PS software and hardware to accelerate rack-level and hierarchical cross-rack parameter exchange, with an API compatible with many DDNN training frameworks.
Abstract: Distributed deep neural network (DDNN) training constitutes an increasingly important workload that frequently runs in the cloud. Larger DNN models and faster compute engines are shifting DDNN training bottlenecks from computation to communication. This paper characterizes DDNN training to precisely pinpoint these bottlenecks. We found that timely training requires high performance parameter servers (PSs) with optimized network stacks and gradient processing pipelines, as well as server and network hardware with balanced computation and communication resources. We therefore propose PHub, a high performance multi-tenant, rack-scale PS design. PHub co-designs the PS software and hardware to accelerate rack-level and hierarchical cross-rack parameter exchange, with an API compatible with many DDNN training frameworks. PHub provides a performance improvement of up to 2.7x compared to state-of-the-art cloud-based distributed training techniques for image classification workloads, with 25% better throughput per dollar.

92 citations

Journal Article•10.1364/OE.26.015445•
Waveguide-based electro-absorption modulator performance: comparative analysis

Rubab Amin1, Jacob B. Khurgin2, Volker J. Sorger1•
George Washington University1, Johns Hopkins University2
11 Jun 2018-Optics Express
TL;DR: In this paper, a holistic performance analysis for waveguide-based electro-absorption modulators is performed and the performance metric switching energy per unit bandwidth (speed) is determined by the ratio of the differential absorption cross-section of the broadening and the waveguide effective mode area.
Abstract: Electro-optic modulators perform a key function for data processing and communication. Rapid growth in data volume and increasing bits per second rates demand increased transmitter and thus modulator performance. Recent years have seen the introduction of new materials and modulator designs to include polaritonic optical modes aimed at achieving advanced performance in terms of speed, energy efficiency, and footprint. Such ad hoc modulator designs, however, leave a universal design for these novel material classes of devices missing. Here we execute a holistic performance analysis for waveguide-based electro-absorption modulators and use the performance metric switching energy per unit bandwidth (speed). We show that the performance is fundamentally determined by the ratio of the differential absorption cross-section of the switching material's broadening and the waveguide effective mode area. We find that the former shows highest performance for a broad class of materials relying on Pauli-blocking (absorption saturation), such as semiconductor quantum wells, quantum dots, graphene, and other 2D materials, but is quite similar amongst these classes. In this respect these materials are clearly superior to those relying on free carrier absorption, such as Si and ITO. The performance improvement on the material side is fundamentally limited by the oscillator sum rule and thermal broadening of the Fermi-Dirac distribution. We also find that performance scales with modal waveguide confinement. Thus, we find highest energy-bandwidth-ratio modulator designs to be graphene, QD, QW, or 2D material-based plasmonic slot waveguides where the electric field is in-plane with the switching material dimension. We show that this improvement always comes at the expense of increased insertion loss. Incorporating fundamental device physics, design trade-offs, and resulting performance, this analysis aims to guide future experimental modulator explorations.

90 citations

Proceedings Article•10.1109/EDGE.2018.00013•
Are Existing Knowledge Transfer Techniques Effective for Deep Learning with Edge Devices

Ragini Sharma1, Saman Biookaghazadeh1, Baoxin Li1, Ming Zhao1•
Arizona State University1
2 Jul 2018
TL;DR: The results show that the performance of KT does vary by architectures and transfer techniques, and a good performance improvement is obtained by transferring knowledge from both the intermediate layers and last layer of the teacher to a shallower student.
Abstract: With the emergence of edge computing paradigm, many applications such as image recognition and augmented reality require to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computational-heavy, whereas edge devices are usually equipped with limited computational and storage resources. Such models can be compressed and reduced for deployment on edge devices, but they may lose their capability and not perform well. Recent works used knowledge transfer techniques to transfer information from a large network (termed teacher) to a small one (termed student) in order to improve the performance of the latter. This approach seems to be promising for learning on edge devices, but a thorough investigation on its effectiveness is lacking. This paper provides an extensive study on the performance (in both accuracy and convergence speed) of knowledge transfer, considering different student architectures and different techniques for transferring knowledge from teacher to student. The results show that the performance of KT does vary by architectures and transfer techniques. A good performance improvement is obtained by transferring knowledge from both the intermediate layers and last layer of the teacher to a shallower student. But other architectures and transfer techniques do not fare so well and some of them even lead to negative performance impact.

57 citations
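The combined transfer described in this abstract (knowledge taken from both an intermediate layer and the last layer of the teacher) can be illustrated with a small loss function. This is a minimal sketch under stated assumptions, not the paper's method: the softened-softmax KL term, the mean-squared "hint" term, and the names `transfer_loss`, `temperature`, and `hint_weight` are all hypothetical choices made for illustration, and the sketch assumes the student and teacher hidden vectors already have the same length.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def transfer_loss(student_logits, teacher_logits,
                  student_hidden, teacher_hidden,
                  temperature=2.0, hint_weight=0.5):
    """Hypothetical loss mixing last-layer and intermediate-layer transfer."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Last-layer term: KL divergence between softened output distributions.
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))
    # Intermediate-layer term: mean squared error between hidden activations
    # (assumes both vectors have the same length).
    hint = sum((a - b) ** 2
               for a, b in zip(student_hidden, teacher_hidden)) / len(teacher_hidden)
    return kl + hint_weight * hint


# A student that matches the teacher exactly incurs zero loss.
same = transfer_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [0.5, 0.1], [0.5, 0.1])
# A mismatched student incurs a positive loss.
different = transfer_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0], [0.0, 0.0], [0.5, 0.1])
```

In a real training loop this term would typically be added to the ordinary supervised loss; how the terms are weighted is a design choice the listing does not specify.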

Journal Article•10.1109/TIE.2017.2714144•
Dynamic Performance Improvement of Five-Phase Permanent-Magnet Motor With Short-Circuit Fault

Huawei Zhou1, Guohai Liu1, Wenxiang Zhao1, Yu Xiaodong1, Menghu Gao1 •
Jiangsu University1
01 Jan 2018-IEEE Transactions on Industrial Electronics
TL;DR: A new vectorial approach to minimize pulsating torque and improve dynamic performance in a five-phase PM motor with short-circuit fault is proposed, which allows minimal reconfiguration of the control structure from healthy operation to fault-tolerant one and exhibits improved dynamic performance.
Abstract: Multiphase permanent-magnet (PM) brushless motors are popularly adopted for their high efficiency and high power density. However, short-circuit phase fault results in serious problems, such as increased torque fluctuations and deteriorated dynamic performance. This paper proposes a new vectorial approach to minimize pulsating torque and improve dynamic performance in a five-phase PM motor with short-circuit fault. The novelty of the proposed strategy is voltage feedforward compensation based on the relation of the short-circuit current and its fault-phase back electromotive force. First, the compensatory voltages are used to eliminate the impact of the short-circuit current. Then, its combination with the orthogonal reduced-order transformation matrices derived from fault-tolerant current references can improve the dynamic performance of the faulty PM motor. The effect of the short-circuit phase fault on the PM motor model under rotating synchronous frame is also discussed. This control strategy allows minimal reconfiguration of the control structure from healthy operation to fault-tolerant one and exhibits the improved dynamic performance. The simulated and experimental results are presented as validation for the proposed strategy.

54 citations

Proceedings Article•10.1145/3174243.3174260•
Accelerating Graph Analytics by Co-Optimizing Storage and Access on an FPGA-HMC Platform

Soroosh Khoram1, Jialiang Zhang1, Maxwell Strange1, Jing Li1•
University of Wisconsin-Madison1
15 Feb 2018
TL;DR: An architecture-aware graph clustering algorithm is developed that exploits the FPGA-HMC platform's capability to improve data locality and memory access efficiency, and the graph processor architecture is further improved by designing a memory request merging unit to take advantage of the increased data locality resulting from graph clustering.
Abstract: Graph analytics, which explores the relationships among interconnected entities, is becoming increasingly important due to its broad applicability, from machine learning to social sciences. However, due to the irregular data access patterns in graph computations, one major challenge for graph processing systems is performance. The algorithms, software, and hardware that have been tailored for mainstream parallel applications are generally not effective for massive, sparse graphs from real-world problems, due to their complex and irregular structures. To address the performance issues in large-scale graph analytics, we leverage the exceptional random access performance of the emerging Hybrid Memory Cube (HMC) combined with the flexibility and efficiency of modern FPGAs. In particular, we develop a collaborative software/hardware technique to perform a level-synchronized Breadth First Search (BFS) on an FPGA-HMC platform. From the software perspective, we develop an architecture-aware graph clustering algorithm that exploits the FPGA-HMC platform's capability to improve data locality and memory access efficiency. From the hardware perspective, we further improve the FPGA-HMC graph processor architecture by designing a memory request merging unit to take advantage of the increased data locality resulting from graph clustering. We evaluate the performance of our BFS implementation using the AC-510 development kit from Micron and achieve $2.8 \times$ average performance improvement compared to the latest FPGA-HMC based graph processing system over a set of benchmarks from a wide range of applications.

51 citations
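For readers unfamiliar with the level-synchronized Breadth First Search named in this abstract, the idea can be shown in plain software. This is a minimal sketch, not the paper's FPGA-HMC implementation: the adjacency-dictionary representation and the function name `level_synchronized_bfs` are assumptions made for illustration, and no clustering or memory request merging is modeled.

```python
def level_synchronized_bfs(adjacency, source):
    """Return {vertex: level} for every vertex reachable from source."""
    levels = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # Every vertex of the current level is expanded before the next
        # level starts; this is the synchronization point between levels.
        for vertex in frontier:
            for neighbor in adjacency.get(vertex, ()):
                if neighbor not in levels:
                    levels[neighbor] = depth
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return levels


graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
result = level_synchronized_bfs(graph, 0)
```

The neighbor lookups in the inner loop are the irregular memory accesses the abstract refers to: consecutive frontier vertices can point anywhere in the graph, which is why data locality matters so much for this workload.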

Journal Article•10.1007/S12063-018-0129-8•
The moderating effect of management behavior for Lean and process improvement

Marcel F. van Assen1•
Tilburg University1
31 Jan 2018-Operations Management Research
TL;DR: In this paper, the authors investigate the effect of some important Lean related management actions on the relationship between Lean and the level of process improvement: envisioning and communicating the meaning of Lean, setting goals and active steering on improvement performance metrics and encouraging continuous improvement.
Abstract: It is commonly agreed that the success of Lean management is not only determined by its technical practices, but also by the so-called soft practices such as behavior and actions of employees and management. Lean Management behavior is in itself paradoxical in nature as it incorporates technical aspects (e.g., fact-based management, analysis and adhering to the standard operating procedures for the sake of efficiency) and social, follower-related aspects (e.g., promotion of employee responsibility to continuously improve their work processes). In this paper, we investigate the (moderating) effect of some important Lean related management actions on the relationship between Lean and the level of process improvement: i) envisioning and communicating the meaning of Lean, ii) setting goals and active steering on improvement performance metrics and iii) encouraging continuous improvement. Survey data of 178 responses from Dutch organizations shows that these management actions have a positive effect on both Lean and the level of process improvement. In addition, active steering on performance improvement has a reinforcing effect on the relationship between Lean and process improvement. For respondents with a low level of steering on performance improvement Lean does not lead to process improvement, while it does for respondents with average and high levels of steering on performance improvement. The more management operates on performance improvement, the more Lean will result in a higher level of process improvement.

49 citations

Journal Article•10.1016/J.ENERGY.2018.07.067•
Study on the performance improvement of urban rail transit system

Pan Deng1, Pan Deng2, Zhao Liting2, Qing Luo2, Chuansheng Zhang, Chen Zejun2 •
University of Wisconsin-Madison1, Tongji University2
15 Oct 2018-Energy
TL;DR: The simulations are used to analyze the influence of train light-weighting, train control, and the load ratio on the energy efficiency of train operation, which can help to improve the system performance.

43 citations

Proceedings Article•10.1109/ASMS-SPSC.2018.8510728•
Geographical Scheduling for Multicast Precoding in Multi-Beam Satellite Systems

Alessandro Guidotti1, Alessandro Vanelli-Coralli1•
University of Bologna1
1 Sep 2018
TL;DR: In this paper, a Geographical Scheduling Algorithm (GSA) is proposed to improve the performance of the precoding in multi-beam SatCom systems by considering multiple channel matrices.
Abstract: Current State-of-the-Art High Throughput Satellite systems provide wide-area connectivity through multi-beam architectures. Due to the tremendous system throughput requirements that next generation Satellite Communications (SatCom) expect to achieve, traditional 4-colour frequency reuse schemes are not sufficient anymore and more aggressive solutions as full frequency reuse are being considered for multi-beam SatCom. These approaches require advanced interference management techniques to cope with the significantly increased inter-beam interference both at the transmitter, e.g., precoding, and at the receiver, e.g., Multi User Detection (MUD). With respect to the former, several peculiar challenges arise when designed for SatCom systems. In particular, multiple users are multiplexed in the same transmission radio frame, thus imposing to consider multiple channel matrices when computing the precoding coefficients. In previous works, the main focus has been on the users’ clustering and precoding design. However, even though achieving significant throughput gains, no analysis has been performed on the impact of the system scheduling algorithm on multicast precoding, which is typically assumed random. In this paper, we focus on this aspect by showing that, although the overall system performance is improved, a random scheduler does not properly tackle specific scenarios in which the precoding algorithm can poorly perform. Based on these considerations, we design a Geographical Scheduling Algorithm (GSA) aimed at improving the precoding performance in these critical scenarios and, consequently, the performance at system level as well. Through extensive numerical simulations, we show that the proposed GSA provides a significant performance improvement with respect to the legacy random scheduling.

37 citations

Journal Article•10.1049/IET-ITS.2017.0059•
Feature selection-based approach for urban short-term travel speed prediction

Liang Zheng, Chuang Zhu, Ning Zhu, Tian He, Ni Dong1, Helai Huang •
Southwest Jiaotong University1
01 Aug 2018-IET Intelligent Transport Systems
TL;DR: This study proposes a feature selection-based approach to identify reasonable spatial-temporal traffic patterns related to the target link, in order to improve the online-prediction performance and is a promising methodology for short-term traffic prediction.
Abstract: This study proposes a feature selection-based approach to identify reasonable spatial-temporal traffic patterns related to the target link, in order to improve the online-prediction performance. The prediction task is composed of two steps: one hybrid intelligent algorithm-based feature selector (FS) is proposed to optimise original state vectors, which are designed empirically during the offline process, and optimised state vectors are employed to carry out the online prediction. Numerical experiments by three non-parametric algorithms are conducted with taxis' global positioning system data in an urban road network of Changsha, China. It is concluded that: (i) under optimised state vectors, the prediction accuracies improve or remain almost the same; (ii) K-nearest neighbour (KNN) with the simplest state vectors obtains the greatest improvement of prediction performance; (iii) although the performance improvement of ε-support vector regression is limited with optimised state vectors, it always outperforms backward-propagation neural network and KNN; and (iv) three non-parametric approaches with optimised state vectors outperform auto-regressive integrated moving average in relatively longer prediction horizons. In conclusion, such an FS-based approach is able to improve or guarantee the prediction performance under the remarkably reduced model complexity, and is a promising methodology for short-term traffic prediction.
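The two-step idea in this abstract (select a subset of state-vector components offline, then predict online with a non-parametric method such as KNN) can be sketched as follows. This is an illustrative sketch only: the `selected` index tuple stands in for the output of the paper's feature selector, which is not reproduced here, and the function names and sample speeds are hypothetical.

```python
import math


def project(vector, selected):
    """Keep only the state-vector components chosen by feature selection."""
    return tuple(vector[i] for i in selected)


def knn_predict(history, query, selected, k=3):
    """Average the next-step speeds of the k nearest historical states.

    history: list of (state_vector, next_speed) pairs.
    selected: indices of the components kept after feature selection.
    """
    target = project(query, selected)
    ranked = sorted(
        history,
        key=lambda pair: math.dist(project(pair[0], selected), target),
    )
    nearest = ranked[:k]
    return sum(speed for _, speed in nearest) / len(nearest)


# Hypothetical data: each state vector holds two recent link speeds (km/h).
history = [
    ((30.0, 32.0), 31.0),
    ((50.0, 52.0), 55.0),
    ((31.0, 33.0), 33.0),
    ((10.0, 12.0), 11.0),
]
prediction = knn_predict(history, (30.5, 32.5), selected=(0, 1), k=2)
```

Shrinking `selected` reduces the dimensionality of every distance computation, which is one way the reduced model complexity mentioned in the abstract can show up in practice.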
Proceedings Article•10.1145/3208040.3208047•
PShifter: feedback-based dynamic power shifting within HPC jobs for performance

Neha Gholkar1, Frank Mueller1, Barry Rountree2, Aniruddha Marathe2•
North Carolina State University1, Lawrence Livermore National Laboratory2
11 Jun 2018
TL;DR: To the best of the authors' knowledge, PShifter is the first approach to transparently and automatically apply power capping non-uniformly across processors of a job in a dynamic manner adapting to phase changes.
Abstract: The US Department of Energy (DOE) has set a power target of 20-30MW on the first exascale machines. To achieve one exaFLOPS under this power constraint, it is necessary to manage power intelligently while maximizing performance. Most production-level parallel applications suffer from computational load imbalance across distributed processes due to non-uniform work decomposition. Other factors like manufacturing variation and thermal variation in the machine room may amplify this imbalance. As a result of this imbalance, some processes of a job reach the blocking calls, collectives or barriers earlier and wait for others to reach the same point. This waiting results in a wastage of energy and CPU cycles which degrades application efficiency and performance. We address this problem for power-limited jobs via Power Shifter (PShifter), a dual-level, feedback-based mechanism that intelligently and automatically detects such imbalance and reduces it by dynamically re-distributing a job's power budget across processors to improve the overall performance of the job compared to a naive uniform power distribution across nodes. In contrast to prior work, PShifter ensures that a given power budget is not violated. At the bottom level of PShifter, local agents monitor and control the performance of processors by actuating different power levels. They reduce power from the processors that incur substantial wait times. At the top level, the cluster agent that has the global view of the system, monitors the job's power consumption and provides feedback on the unused power, which is then distributed across the processors of the same job. Our evaluation on an Intel cluster shows that PShifter achieves performance improvement of up to 21% and energy savings of up to 23% compared to uniform power allocation, outperforms static approaches by up to 40% and 22% for codes with and without phase changes, respectively, and outperforms dynamic schemes by up to 19%. To the best of our knowledge, PShifter is the first approach to transparently and automatically apply power capping non-uniformly across processors of a job in a dynamic manner adapting to phase changes.
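The dual-level behaviour described in this abstract (local agents lower power on processors that wait, and a cluster agent hands the unused power to the rest without exceeding the budget) can be sketched in a few lines. This is a simplified illustration, not PShifter's controller: the fixed `step`, the `wait_threshold`, the `min_cap`, and the equal split of unused power are all assumptions, and hardware limits on the maximum cap are ignored.

```python
def shift_power(caps, wait_fractions, budget,
                step=5.0, min_cap=40.0, wait_threshold=0.2):
    """Return new per-processor power caps (watts) within the job budget.

    caps: current cap per processor.
    wait_fractions: share of time each processor spent waiting at barriers.
    """
    new_caps = list(caps)
    # Local step: processors that mostly wait give up some of their power.
    for i, wait in enumerate(wait_fractions):
        if wait > wait_threshold:
            new_caps[i] = max(min_cap, new_caps[i] - step)
    # Global step: unused budget is split among processors on the critical
    # path, so the total never exceeds the job's power budget.
    unused = budget - sum(new_caps)
    busy = [i for i, wait in enumerate(wait_fractions) if wait <= wait_threshold]
    if unused > 0 and busy:
        share = unused / len(busy)
        for i in busy:
            new_caps[i] += share
    return new_caps


# Hypothetical job: four processors at 60 W each under a 240 W budget.
caps = shift_power([60.0, 60.0, 60.0, 60.0], [0.5, 0.4, 0.05, 0.0], budget=240.0)
```

Calling the function once per monitoring interval gives a feedback loop: as the slow processors speed up, the wait fractions of the others fall and the shifting stops.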
Journal Article•10.1109/TCAD.2017.2766156•
DLV: Exploiting Device Level Latency Variations for Performance Improvement on Flash Memory Storage Systems

Jinhua Cui1, Youtao Zhang2, Weiguo Wu1, Jun Yang2, Yinfeng Wang3, Jianhang Huang1 •
Xi'an Jiaotong University1, University of Pittsburgh2, Shenzhen Institute of Information Technology3
01 Aug 2018-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
TL;DR: DLV improves flash access speeds based on process variations and data retention time difference across flash blocks and integrates access speed optimization with access scheduling such that the average access response time can be effectively reduced on flash memory storage systems.
Abstract: NAND flash has been widely adopted in storage systems due to its better read and write performance and lower power consumption over traditional mechanical hard drives. To meet the increasing performance demand of modern applications, recent studies speed up flash accesses by exploiting access latency variations at the device level. Unfortunately, existing flash access schedulers are still oblivious to such variations, leading to suboptimal I/O performance improvements. In this paper, we propose DLV, a novel flash access scheduler for exploring scheduling opportunities due to device level access latency variations. DLV improves flash access speeds based on process variations and data retention time difference across flash blocks. More importantly, DLV integrates access speed optimization with access scheduling such that the average access response time can be effectively reduced on flash memory storage systems. Our experimental results show that DLV achieves an average of 41.5% performance improvement over the state-of-the-art.
Proceedings Article•10.1109/HPCA.2018.00057•
SmarCo: An Efficient Many-Core Processor for High-Throughput Applications in Datacenters

Dongrui Fan1, Li Wenming1, Xiaochun Ye1, Da Wang1, Hao Zhang1, Zhimin Tang1, Ninghui Sun1 •
Chinese Academy of Sciences1
1 Feb 2018
TL;DR: This paper proposes a novel architecture, called SmarCo, which allows high-throughput applications to be processed more efficiently in datacenters, and implements large-scale many-core architecture with in-pair threads to support high-concurrency processing and introduces a hierarchical ring topology and laxity-aware task scheduler to guarantee hard real-time response.
Abstract: Fast-growing high-throughput applications, such as web services, are characterized by high-concurrency processing, hard real-time response, and high-bandwidth memory access. The newly-born applications bring severe challenges to processors in datacenters, both in concurrent processing performance and energy efficiency. To offer a satisfactory quality of services, it is of critical importance to meet these newly emerging demands of high-throughput applications in the future datacenters in a more efficient way. In this paper, we propose a novel architecture, called SmarCo, which allows high-throughput applications to be processed more efficiently in datacenters. Based on the dominant characteristics of high-throughput applications, we implement large-scale many-core architecture with in-pair threads to support high-concurrency processing; we also introduce a hierarchical ring topology and laxity-aware task scheduler to guarantee hard real-time response; furthermore, we propose high-throughput datapath to improve memory access efficiency. We verify the efficiency of SmarCo by using simulators, large-scale FPGA and prototype with TSMC 40-nm technology node. The experimental results show that, compared to Intel Xeon E7-8890V4, SmarCo achieves 10.11X performance improvement and 6.95X energy-efficiency improvement with higher throughput and a better guarantee of real-time response.
Journal Article•10.1109/TCAD.2017.2772822•
Profit: Priority and Power/Performance Optimization for Many-Core Systems

Zhuo Chen1, Dimitrios Stamoulis1, Diana Marculescu1•
Carnegie Mellon University1
01 Oct 2018-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
TL;DR: In this paper, an online distributed reinforcement learning (OD-RL)-based DVFS control algorithm for many-core system performance improvement under both power and performance constraints is presented, where a per-core RL method is used to learn the optimal control policy of the voltage/frequency (VF) levels in a model-free manner.
Abstract: As power density emerges as the main constraint for many-core systems, controlling power consumption under the thermal design power while maximizing the performance becomes increasingly critical. To dynamically save power, dynamic voltage frequency scaling techniques have proved to be effective and are widely available commercially. Meanwhile, systems have certain performance constraints that the applications should satisfy to ensure quality of service. In this paper, we present an online distributed reinforcement learning (OD-RL)-based DVFS control algorithm for many-core system performance improvement under both power and performance constraints. At the finer grain, a per-core RL method is used to learn the optimal control policy of the voltage/frequency (VF) levels in a model-free manner. At the coarser grain, an efficient global power budget reallocation algorithm is used to maximize the overall performance. The experiments show that compared to the state-of-the-art algorithms: 1) OD-RL produces up to 98% less budget overshoot; 2) up to 23% higher energy efficiency; and 3) two orders of magnitude speedup over state-of-the-art techniques for systems with hundreds of cores. Furthermore, priority-aware OD-RL can better satisfy performance constraints than OD-RL with: 1) $178\times$ more epochs satisfying the performance constraints; 2) $56\times$ better performance gain; and 3) $200\times$ better performance-power tradeoffs under similar efficiency and scalability.
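A model-free, per-core learner of the kind this abstract mentions can be illustrated with a tabular Q-learning update. This is a generic sketch, not the OD-RL algorithm: the state encoding, the reward shape in `constrained_reward`, and the values of `alpha`, `gamma`, and `penalty` are hypothetical, and the global budget reallocation step is not modeled.

```python
def constrained_reward(throughput, power, budget, penalty=10.0):
    """Reward performance, and penalize any overshoot of the core's budget."""
    overshoot = max(0.0, power - budget)
    return throughput - penalty * overshoot


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q value."""
    best_next = max(q_table[next_state])
    q_table[state][action] += alpha * (
        reward + gamma * best_next - q_table[state][action]
    )
    return q_table[state][action]


# Hypothetical core: 2 states (under / over budget) and 3 VF levels.
q_table = [[0.0, 0.0, 0.0] for _ in range(2)]
# The core picked the highest VF level (action 2) and exceeded its budget.
r = constrained_reward(throughput=2.0, power=12.0, budget=10.0)
updated = q_update(q_table, state=0, action=2, reward=r, next_state=1)
```

Because each core keeps its own small table, the cost of one update does not grow with the number of cores, which is the usual argument for distributing the learner.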
Proceedings Article•10.1109/IEEE-IWS.2018.8400845•
Dual-input driving strategies for performance enhancement of a Doherty power amplifier

Anna Piacibello1, Roberto Quaglia2, Vittorio Camarchia1, Chiara Ramella1, Marco Pirola1 •
Polytechnic University of Turin1, Cardiff University2
2 Jul 2018
TL;DR: The aim of this work is to assess the performance improvement offered by several driving strategies of a dual-input digital Doherty power amplifier with respect to the equivalent single-input topology and shows a superior performance of the digital DPA over the analog one, thus justifying the additional.
Abstract: The aim of this work is to assess the performance improvement offered by several driving strategies of a dual-input digital Doherty power amplifier with respect to the equivalent single-input topology. To offer a fair comparison, an analog amplifier and the equivalent digital version, which is equal in all parts except for the absence of the input power divider, are designed at 3.5 GHz. The flexibility of a dual-input control allows to implement power-dependent input signal splitting and phase alignment between the main and auxiliary branches, thus allowing to overcome several shortcomings of traditional analog Doherty amplifiers. The proposed analysis focuses on the gain and efficiency performance over a 6 dB back-off range. The comparison over the 3.1–3.7 GHz range shows a superior performance of the digital DPA over the analog one, thus justifying the additional.
Dissertation•
The effects of strategic attributes on organizational performance in the banking sector of Pakistan

Ammar Ahmed
1 Jan 2018
TL;DR: In this article, the authors examined the mediating effect of organizational commitment on the relationship between strategic orientation, organizational culture, organizational IMO, and organizational performance, and found significant positive direct relationships between organizational commitment and organizational culture.
Abstract: In recent times, there has been an increasing interest in the strategic attributes that aim to achieve superior organizational performance, which allows organizations, including banks, to be competitive with time. Therefore, to achieve superior organizational performance and successful bank growth, the banks need to focus on their strategic attributes. The key strategic attributes include strategic orientation, organizational culture, organizational IMO, and organizational commitment. Drawing upon the resource-based view theory (RBV) and the social exchange theory (SET), this study examined the influence of these strategic attributes on organizational performance. Moreover, this study also examined the mediating effect of organizational commitment on the relationship between strategic orientation, organizational culture, organizational IMO and organizational performance. The data was collected from 260 bank managers working in the branches of six large banks of Pakistan. The results of PLS path modeling revealed the significant positive direct relationships between strategic orientation, organizational culture, organizational IMO and organizational commitment, and organizational performance. Similarly, the study also found significant positive direct relationships between strategic orientation and organizational culture, and organizational commitment. However, no significant relationship existed between organizational IMO and organizational commitment. Furthermore, the bootstrapping results revealed that organizational commitment mediated the relationships between strategic orientation, organizational culture, and organizational performance. In contrast, the study did not find any mediation of organizational commitment between organizational IMO and organizational performance relationship. In general, the findings showcased that organizational performance can be enhanced through the examined key strategic attributes of the study. Accordingly, the study has forwarded noteworthy claims regarding the mediating effect of organizational commitment on these variables. The study offers theoretical and practical contributions. This study also highlights the crucial role of these strategic attributes for performance improvement in the banking sector. Lastly, limitations and scope of further studies are also provided.
Journal Article•10.1016/J.COMPELECENG.2018.05.012•
Dynamic performance improvement of an ultra-lift Luo DC–DC converter by using a type-2 fuzzy neural controller

Amir Sharifian, Samaneh Fathi Sasansara1, M. Jabbari Ghadi2, Sahand Ghavidel2, Li Li, Jiangfeng Zhang2 •
University of Gilan1, University of Technology, Sydney2
01 Jul 2018-Computers & Electrical Engineering
TL;DR: A new control approach based on type-2 fuzzy neural controller (T2FNC) is employed in order to improve the dynamic response of an ultra-lift Luo DC–DC converter under different operational conditions.
Proceedings Article•10.1109/TCSET.2018.8336396•
Intelligent data flows management for performance improvement of optical label switched network

Volodymyr Andrushchak1, Taras Maksymyuk1, Stepan Dumych1, Mykola Kaidan1, Oksana Urikova1 •
Lviv Polytechnic1
1 Feb 2018
TL;DR: A new algorithm is proposed for data flow management in optical label switched networks that provides the intelligence of scheduling and quality control functionality by using machine learning techniques.
Abstract: Modern optical transport networks are currently facing unprecedented traffic growth driven by the rapid development of cloud technologies, the Internet of Things and ubiquitous computing. The global data volume doubles every two years, requiring urgent improvement of the transport infrastructure around the world. In this paper, we propose a new algorithm for data flow management in optical label switched networks. Unlike existing solutions, our algorithm provides the intelligence of scheduling and quality control functionality by using machine learning techniques. Intelligence, introduced to the network, improves the accuracy of scheduling and overall performance. Although our algorithm initially does not provide the near-optimal performance of many other approaches, it is able to improve over time by learning from previous experience.
Proceedings Article•10.1109/AMC.2019.8371136•
Improving transient learning behavior in model-free inversion-based iterative control with application to a desktop printer

Robin de Rozario1, Tom Oomen1•
Eindhoven University of Technology1
1 Jun 2018
TL;DR: The Smoothed MFIIC (SMFIIC) method is developed, which avoids the undesirable learning transient behavior by adaptively regulating the learning speed to ensure smooth convergence.
Abstract: Model-Free Inversion-based Iterative Control (MFIIC) enables tracking performance improvement of systems that perform repeating tasks without using a model of the system. The aim of this paper is (i) to show that MFIIC can result in a severe loss of performance if the Signal-to-Disturbance-Ratio (SDR) approaches 1, and (ii) to propose a solution to this problem. The Smoothed MFIIC (SMFIIC) method is developed, which does not suffer from the undesirable learning transient behavior. This is achieved by adaptively regulating the learning speed to ensure smooth convergence. The existence of bad learning transients in MFIIC and the efficacy of SMFIIC are illustrated on an experimental desktop printer.
Journal Article•10.25271/SJUOZ.2018.6.3.516•
A State of Art Survey for OS Performance Improvement

Lailan M. Haji1, Subhi R. M. Zeebaree, Karwan Jacksi1, Diyar Qader Zeebaree•
University of Zakho1
30 Sep 2018-Science Journal of University of Zakho
TL;DR: A survey of the most important and state of the art approaches and models to be used for performance measurement and evaluation of different operating systems using multiple metrics is presented.
Abstract: With the rapid growth of compute-heavy applications that require a high level of performance, interest in monitoring operating system performance has grown widely. In the past several years, as OS performance has become a critical issue, many research studies have investigated and evaluated the stability of OS performance. This paper presents a survey of the most important and state-of-the-art approaches and models for performance measurement and evaluation. Furthermore, it examines the performance-improvement capabilities of different operating systems using multiple metrics. The selection of metrics used for monitoring performance depends on monitoring goals and performance requirements. Many previous works on this subject are addressed, explained in detail, and compared to highlight the most important features, which will be beneficial when selecting the best approach.
Proceedings Article•10.1109/SMARTCOMP.2018.00019•
Performance Improvement of File Operations on OverlayFS for Containers

Naoki Mizusawa1, Joichiro Kon2, Yuya Seki1, Jian Tao1, Saneyasu Yamaguchi3 •
Kogakuin University1, Texas A&M University2, Tohoku University3
1 Jun 2018
TL;DR: This work evaluates the performance of file operations on OverlayFS and discusses a method for improving the performance by disabling synchronization during copy_up, and shows that the method can improve the writing performance with copy_up by up to 680 times.
Abstract: Server consolidation with virtualization is a popular method to address the large power consumption of inter-connected computers in data centers. The more computers are consolidated, the more energy is saved. However, high consolidation, wherein many servers are consolidated into one physical computer, results in a large performance decline. In particular, I/O performance is reported to decrease severely. In this work, we focus on Docker, a popular container-based virtualization system, and OverlayFS. OverlayFS is a widely recognized method for improving I/O performance in Docker. First, we evaluate the performance of file operations on OverlayFS. In particular, we focus on the performance of file writing involving copy_up and show that the performance is severely low. Second, we investigate the performance and behavior of the filesystem during copy_up and demonstrate that synchronization is the most important issue. Third, we discuss a method for improving the performance by disabling this synchronization. Fourth, we evaluate the improvement method and show that it can improve the writing performance with copy_up by up to 680 times.
Proceedings Article•10.1109/TEST.2018.8624794•
Transmitter and Receiver Equalizers Optimization Methodologies for High-Speed Links in Industrial Computer Platforms Post-Silicon Validation

Francisco E. Rangel-Patino1, Jose E. Rayas-Sanchez1, Nagib Hakim2•
University of Guadalajara1, Intel2
1 Oct 2018
TL;DR: Direct and surrogate-based optimization methods, including space mapping, are proposed based on suitable objective functions to efficiently tune the transmitter and receiver equalizers in the physical layer (PHY) tuning process, confirming a dramatic speedup in PHY tuning and substantial performance improvement.
Abstract: As microprocessor design scales to nanometric technology, traditional post-silicon validation techniques are inadequate for achieving full system functional coverage. Physical complexity and extreme technology process variations introduce design challenges to guarantee performance over process, voltage, and temperature conditions. In addition, there is an increasing number of mixed-signal circuits within microprocessors. Many of them correspond to high-speed input/output (HSIO) links. Improvements in signaling methods, circuits, and process technology have allowed HSIO data rates to scale beyond 10 Gb/s, where undesired effects can create multiple signal integrity problems. With all of these elements, post-silicon validation of HSIO links is tough and time-consuming. One of the major challenges in electrical validation of HSIO links lies in the physical layer (PHY) tuning process, where equalization techniques are used to cancel these undesired effects. Typical current industrial practices for PHY tuning require massive lab measurements, since they are based on exhaustive enumeration methods. In this work, direct and surrogate-based optimization methods, including space mapping, are proposed based on suitable objective functions to efficiently tune the transmitter and receiver equalizers. The proposed methodologies are evaluated by lab measurements on realistic industrial post-silicon validation platforms, confirming a dramatic speedup in PHY tuning and substantial performance improvement.
Journal Article•10.1016/J.COMPIND.2018.02.008•
An integrated performance driven manufacturing management strategy based on overall system effectiveness

Boyd A. Nicholds1, John P.T. Mo1, Leigh O’Rielly•
RMIT University1
01 May 2018-Computers in Industry
TL;DR: An Overall System Efficiency (OSE) decision support model is described for use in the analysis and prediction of customer satisfaction goals, which uses customer service level in terms of stockout frequency as a trade-off parameter when optimising overall performance achievable from the production line.
Journal Article•10.1109/TR.2017.2743225•
Maintenance Optimization of Continuous State Systems Based on Performance Improvement

Zhiqiang Cai, Shubin Si, Yan Liu, Jiangbin Zhao
01 Jun 2018-IEEE Transactions on Reliability
TL;DR: The performance improvement for a continuous state system is introduced, which can be used to measure the improvement of system performance by comparing pre- and post-maintenance states.
Abstract: The continuous state system is a special kind of system in which the states of the system and its components have continuous values, ranging from perfect functioning to complete failure. This paper introduces the performance improvement for a continuous state system, which can be used to measure the improvement of system performance by comparing pre- and post-maintenance states. First, the probabilistic characteristics of performance improvement are discussed in detail. Second, the performance improvement for multicomponent maintenance and the corresponding calculation method are put forward to establish the objective function for maintenance optimization. Third, a maintenance optimization model for such a system is studied, and a corresponding genetic algorithm based on performance improvement is provided to search for a near-global optimal solution. Finally, two numerical examples and an oil transportation system case study are used to verify the effectiveness of the proposed method.
Book•
Computational Paradigm Techniques for Enhancing Electric Power Quality

L. Ashok Kumar, S. Albert Alexander
3 Dec 2018
TL;DR: In this book, power quality improvement and enhancement techniques with the aid of intelligent controllers and experimental results are discussed, which helps readers understand power quality from its fundamentals to experimental implementations.
Abstract: This book focuses on power quality improvement and enhancement techniques with the aid of intelligent controllers and experimental results. It covers topics ranging from the fundamentals of power quality indices, mitigation methods, advanced controller design and its step-by-step approach, and simulation of the proposed controllers for real-time applications with the corresponding experimental results, to performance improvement paradigms and their overall analysis, which helps readers understand power quality from its fundamentals to experimental implementations. The book also covers implementation of power quality improvement practices. Key features: provides solutions for power quality improvement with intelligent techniques; is illustrated with simulation and experimental results; discusses renewable energy integration and multiple case studies pertaining to various loads; combines the power quality literature with power electronics based solutions; includes implementation examples, datasets, and experimental and simulation procedures.
Proceedings Article•10.1109/ASAP.2018.8445131•
Compressive Sensing on Storage Data: An Effective Solution to Alleviate I/O Bottleneck in Data-Intensive Workloads

Hosein Mohammadi Makrani1, Hossein Sayadi1, Sai Manoj1, Setareh Raftirad1, Houman Homayoun1 •
George Mason University1
1 Jul 2018
TL;DR: By using Compressive Sensing (CS), a lossy data compression method, the bottleneck is lifted from the storage, increasing the bandwidth utilization of the memory to gain further performance improvement from a high-end memory.
Abstract: The gap between computation speed and I/O access on modern computing systems imposes processing limitations in data-intensive applications. Employing high-end memory has proven not to enhance the performance of I/O-bound applications, given the low utilization of memory bandwidth in such applications, as highlighted in recent studies. Despite several solutions to improve the performance of storage, none of them is able to shift the bottleneck from the I/O access to the memory subsystem for I/O-bound applications. In this paper, we show that in the case of data-intensive multimedia applications, by using Compressive Sensing (CS), a lossy data compression method, the bottleneck is lifted from the storage, increasing the bandwidth utilization of the memory to gain further performance improvement from a high-end memory. The reconstruction of compressed data is, however, time- and memory-consuming. To address this challenge, we employ and compare the hardware and software acceleration of Orthogonal Matching Pursuit (OMP), a greedy algorithm, which solves the problem by choosing the most significant variable to reduce the least-squares error. Our implementation results show that CS increases memory bandwidth utilization by 1.4x and that using high bandwidth memory results in a 24% performance improvement. Overall, the proposed solution of CS of storage data with an FPGA accelerator achieves up to 45% speedup in an end-to-end implementation with only 4.6% accuracy degradation.
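The greedy selection step that the abstract attributes to OMP can be illustrated with a minimal NumPy sketch. This is a textbook-style version for illustration only, not the authors' software or FPGA implementation; the function name `omp`, the sparsity argument `k`, and the toy orthonormal dictionary `A` are assumptions introduced here.

```python
import numpy as np


def omp(A, y, k):
    """Greedy sparse recovery: find x with at most k nonzeros so that A @ x is close to y."""
    n = A.shape[1]
    residual = y.astype(float).copy()
    support = []          # indices of the columns selected so far
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        correlations = np.abs(A.T @ residual)
        if support:
            correlations[support] = 0.0   # never pick the same column twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected coefficients by least squares, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(n)
    x[support] = coef
    return x


# Toy example (hypothetical data): an orthonormal dictionary and a 2-sparse signal.
rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.standard_normal((8, 8)))
x_true = np.zeros(8)
x_true[[1, 5]] = [2.0, -3.0]
x_hat = omp(A, A @ x_true, k=2)
```

Each iteration solves a small least-squares problem over the selected columns, which is the part of reconstruction that the abstract describes as costly enough to motivate acceleration.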
Proceedings Article•10.1109/IC2E.2018.00057•
Tuning Performance of Spark Programs

Hong Zhang1, Zixia Liu1, Liqiang Wang1•
University of Central Florida1
17 Apr 2018
TL;DR: An efficient performance optimization engine called Hedgehog is proposed to evaluate performance based on the "Law of Diminishing Marginal Utility" and give an optimal configuration setting; initial experiments show that this optimization can gain a 19.6% performance improvement compared to the naive configuration.
Abstract: Along with the explosive growth of data, there is a great demand to speed up the ability to process it. Although several platforms such as Spark have made analysis easier for developers, performance tuning for such platforms has meanwhile become complex. In this paper, we propose an efficient performance optimization engine called Hedgehog to evaluate the performance based on the "Law of Diminishing Marginal Utility" and give an optimal configuration setting. The initial experiments show that our optimization can gain a 19.6% performance improvement compared to the naive configuration by tuning only 3 parameters.
Journal Article•10.25046/AJ030321•
A Survey on Parallel Multicore Computing: Performance & Improvement

Ola Surakhi, Mohammad Khanafseh, Sami Sarhan
01 Jun 2018-Advances in Science, Technology and Engineering Systems Journal
Book•
10-Step Evaluation for Training and Performance Improvement

Seung Youn Chyung
5 Nov 2018