Showing papers presented at "Parallel Computing Technologies in 2017"
Proceedings Article•10.1109/PARCOMPTECH.2017.8068329•
Identifying pitfalls in automatic parallelization of NAS parallel benchmarks

S. Prema, R. Jehadeesan, B. K. Panigrahi
1 Feb 2017
TL;DR: This paper underlines the need for a user-interactive environment that highlights the problems raised during parallelization, and for keeping manual code changes minimal when resolving problematic code sections to make them amenable to parallelization.
Abstract: This paper examines OpenMP-based auto-parallelizers and the limitations they encounter when parallelizing the NAS parallel benchmarks. It also elucidates the issues the parallelizers face during parallelization and the resolutions that overcome them. Compute-intensive loops are pinpointed using Gprof, and the problematic loops within the hotspot area are identified. Our work concentrates on identifying the pitfalls within the located hotspots and providing solutions in such cases. The measured speedup and the reasons for it are analyzed and illustrated. This paper underlines the need for a user-interactive environment that highlights the problems raised during parallelization. It also underscores the need for minimal manual intervention in the code changes that resolve the problematic code sections and make them amenable to parallelization.

21 citations

Journal Article•10.1007/S00224-016-9699-8•
Atomic Read/Write Memory in Signature-Free Byzantine Asynchronous Message-Passing Systems

Achour Mostefaoui1, Matoula Petrolia1, Michel Raynal2, Claude Jard1•
University of Nantes1, Institut Universitaire de France2
1 May 2017
TL;DR: This article presents a signature-free distributed algorithm which builds an atomic read/write shared memory on top of a fully connected peer-to-peer n-process asynchronous message-passing system in which up to t
Abstract: This article presents a signature-free distributed algorithm which builds an atomic read/write shared memory on top of a fully connected peer-to-peer n-process asynchronous message-passing system in which up to t

16 citations

Book Chapter•10.1007/978-3-319-62932-2_27•
Automation Development Framework of Scalable Scientific Web Applications Based on Subject Domain Knowledge

Igor Bychkov1, G. A. Oparin1, Vera G. Bogdanova1, A. A. Pashinin1, S. A. Gorsky1 •
Russian Academy of Sciences1
4 Sep 2017
TL;DR: The architecture and functional capabilities of an automated toolkit for creating service-oriented applications from applied program packages, with multi-agent control of their parallel execution in an HDCE, are described.
Abstract: High-performance computing technologies for solving scientific problems are currently being actively improved. The purpose of our research is to develop a toolkit for constructing and executing scientific service-oriented applications in a heterogeneous distributed computing environment (HDCE). These tools give subject domain experts access to high-capacity computing resources, let them use these resources without extensive knowledge of computing architecture and low-level software, and provide parallel execution of the user application on the basis of service-oriented technology and multi-agent control. We describe the architecture and functional capabilities of an automated toolkit for creating service-oriented applications based on an applied program package, and the multi-agent control of the application's parallel execution in the HDCE. We demonstrate an example of using these tools to create a web application for parametric feedback synthesis of a linear dynamic object. The proposed technology simplifies service creation and provides qualitatively new opportunities for controlling parallel high-performance computations.

10 citations

Book Chapter•10.1007/978-3-319-62932-2_24•
Combining Parallelization with Overlaps and Optimization of Cache Memory Usage

S. G. Ammaev1, L. R. Gervich1, Boris Ya. Steinberg1•
Southern Federal University1
4 Sep 2017
TL;DR: The Gauss-Seidel algorithm optimized by the modified hyperplane method is 2.5 times faster than the non-optimized version; parallelized using data placement with overlaps, it achieves a speedup of 28 times on 16 processors compared with the non-optimized sequential algorithm.
Abstract: This paper presents L. Lamport's hyperplane method, modified to improve temporal data locality. The Gauss-Seidel algorithm optimized by the modified hyperplane method is 2.5 times faster than the non-optimized one. This algorithm was parallelized by the technique of data placement with overlaps, and we obtained a speedup of 28 times on 16 processors in comparison with the non-optimized sequential algorithm.
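
The two figures quoted in this abstract can be related with a small calculation. The sketch below is not taken from the paper: the function name is illustrative, and it assumes that both speedups were measured on the same problem size, so that the locality gain and the parallel gain can be separated by division.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Speedup divided by the number of processors used."""
    return speedup / processors


total_speedup = 28.0     # parallel vs. non-optimized sequential (from the abstract)
locality_speedup = 2.5   # optimized vs. non-optimized sequential (from the abstract)
processors = 16          # from the abstract

# Assumption: the two measurements are comparable, so the speedup that is
# due to parallelism alone is the ratio of the two figures.
parallel_only_speedup = total_speedup / locality_speedup

print("efficiency vs. non-optimized baseline:",
      parallel_efficiency(total_speedup, processors))
print("efficiency vs. optimized baseline:",
      round(parallel_efficiency(parallel_only_speedup, processors), 3))
```

Under that assumption, the efficiency above 1 relative to the non-optimized baseline comes from the cache optimization, while the efficiency relative to the optimized sequential code is about 0.7.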

9 citations

Book Chapter•10.1007/978-3-319-62932-2_31•
Probabilistic Causal Message Ordering

Achour Mostefaoui1, Stéphane Weiss1•
University of Nantes1
4 Sep 2017
TL;DR: A probabilistic but efficient causal broadcast mechanism for large systems with changing membership that uses few integer timestamps is proposed.
Abstract: Causal broadcast is a classical communication primitive that has been studied for more than three decades, and several implementations have been proposed. The implementation of such a primitive has a non-negligible cost, either in terms of the extra information messages have to carry or in the time delays needed for the delivery of messages. It has been proved that messages need to carry control information whose size is linear in the size of the system. This problem has gained more interest due to new application domains: collaborative applications, which are widely used and becoming massive, and the social semantic web and linked data, whose implementation needs causal ordering of messages. This paper proposes a probabilistic but efficient causal broadcast mechanism for large systems with changing membership that uses few integer timestamps.
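
For context, the linear-size control information mentioned in this abstract is typically a vector clock with one entry per process. The sketch below shows the classical deterministic vector-clock delivery rule, not the probabilistic mechanism proposed in the paper; the `Process` class and its method names are hypothetical.

```python
class Process:
    """Classical causal broadcast with one vector-clock entry per process."""

    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n   # clock[k] = number of messages from k delivered here
        self.pending = []      # received but not yet deliverable
        self.delivered = []    # payloads in delivery order

    def broadcast(self, payload):
        # The sender delivers its own message immediately and stamps it.
        self.clock[self.pid] += 1
        self.delivered.append(payload)
        return (self.pid, list(self.clock), payload)

    def _deliverable(self, sender, stamp):
        # Next message from the sender, and nothing it depends on is missing.
        return stamp[sender] == self.clock[sender] + 1 and all(
            stamp[k] <= self.clock[k] for k in range(len(stamp)) if k != sender
        )

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:  # one delivery may unblock buffered messages
            progress = False
            for m in list(self.pending):
                sender, stamp, payload = m
                if self._deliverable(sender, stamp):
                    self.clock[sender] += 1
                    self.delivered.append(payload)
                    self.pending.remove(m)
                    progress = True
```

Each message carries a stamp of n integers, which is the cost that a scheme using only a few integer timestamps aims to reduce.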

8 citations

Book Chapter•10.1007/978-3-319-62932-2_42•
Scalable Computations of GeRa Code on the Base of Software Platform INMOST

Igor N. Konshin1, Ivan Kapyrin1•
Russian Academy of Sciences1
4 Sep 2017
TL;DR: The analysis of scalability of GeRa code on different computer platforms from multicore laptop to Lomonosov supercomputer is presented and the comparison of parallel efficiency for different linear solvers in the INMOST framework is performed.
Abstract: The hydrogeological modeling code GeRa is based on the INMOST software platform, which operates on distributed mesh data and allows assembling and solving systems of linear equations. A set of groundwater flow models with filtration, transport, and chemical processes is considered. The parallel efficiency of different linear solvers in the INMOST framework is compared. An analysis of the scalability of the GeRa code on different computer platforms, from a multicore laptop to the Lomonosov supercomputer, is presented.

8 citations

Book Chapter•10.1007/978-3-319-62932-2_2•
Generating Maximal Domino Patterns by Cellular Automata Agents

Rolf Hoffmann1, Dominique Désérable2•
Technische Universität Darmstadt1, Institut national des sciences appliquées2
4 Sep 2017
TL;DR: A 2D cellular automaton with moving agents is considered, with the objective of finding agents controlled by a Finite State Program (FSP) that can form domino patterns.
Abstract: Considered is a 2D cellular automaton with moving agents. The objective is to find agents controlled by a Finite State Program (FSP) that can form domino patterns. The quality of a formed pattern is measured by the degree of order, computed by counting matching \(3 \times 3\) patterns (templates). The class of domino patterns is defined by four templates. An agent reacts to its own color, the color in front, and whether it is blocked or not. It can change the color, move or not, and turn in any direction. Four FSPs were evolved for multi-agent systems with 1, 2, 4 agents initially placed in the corners of the field. For a \(12 \times 12\) training field the target pattern could be formed with a 100% degree of order. The performance was also high with other field sizes. Livelocks are avoided by using three different variants of the evolved FSP. The degree of order usually fluctuates after reaching a certain threshold, but it can also be stable, and the agents may signal termination by running in a cycle or by stopping their activity.

8 citations

Book Chapter•10.1007/978-3-319-62932-2_32•
An Experimental Study of Workflow Scheduling Algorithms for Heterogeneous Systems

Alexey Nazarenko1, Oleg V. Sukhoroslov1•
Russian Academy of Sciences1
4 Sep 2017
TL;DR: The developed open-source simulation framework, based on the SimGrid toolkit, allowed a large number of experiments to be performed in a reasonable amount of time with reproducible results, and the accuracy of its network model helped to reveal drawbacks of simpler models commonly used for studying scheduling algorithms.
Abstract: The paper studies the efficiency of nine state-of-the-art algorithms for scheduling of workflow applications in heterogeneous computing systems (HCS). The comparison of algorithms is performed on the base of discrete-event simulation for a wide range of workflow and system configurations. The developed open source simulation framework based on SimGrid toolkit allowed us to perform a large number of experiments in a reasonable amount of time and to ensure reproducible results. The accuracy of the used network model helped to reveal drawbacks of simpler models commonly used for studying scheduling algorithms.

7 citations

Book Chapter•10.1007/978-3-319-62932-2_45•
Parallel Calculation of Diameter Constrained Network Reliability

Sergei N. Nesterov, Denis A. Migov
4 Sep 2017
TL;DR: Parallel methods for computing diameter-constrained network reliability are introduced, based on the well-known factoring method and on its modification proposed by H. Cancela and L. Petingi; analysis of the numerical experiments allowed some important parameters of the parallel algorithm to be set for speeding up calculations.
Abstract: The problem of network reliability calculation in case of the diameter constraint is studied. The problem of computing this characteristic is known to be NP-hard. We introduce the parallel methods, which are based on the well-known factoring method and on the factoring method modification proposed by H. Cancela and L. Petingi. The analysis of the numerical experiments has allowed us to set some important parameters of the parallel algorithm for speeding up calculations.

6 citations

Book Chapter•10.1007/978-3-319-62932-2_37•
Comparison of Auction Methods for Job Scheduling with Absolute Priorities

A. V. Baranov1, Pavel Telegin1, Artem Tikhomirov1•
Russian Academy of Sciences1
4 Sep 2017
TL;DR: A model of a geographically distributed computing system with absolute job priorities is described, and a decentralized scheduling algorithm based on auction methods is designed; the first-price sealed-bid auction and the English auction are compared experimentally.
Abstract: The model of geographically distributed computing system with absolute priorities of jobs is described in the paper. Authors designed the decentralized scheduling algorithm using the auction methods. Two auction methods were researched and compared: the first-price sealed-bid auction and the English auction. The paper includes results of experimental comparison of researched auction methods.

6 citations

Book Chapter•10.1007/978-3-319-62932-2_47•
Globalizer – A Parallel Software System for Solving Global Optimization Problems

Alexander Sysoyev1, Konstantin Barkalov1, Vladislav Sovrasov1, Ilya Lebedev1, Victor Gergel1 •
N. I. Lobachevsky State University of Nizhny Novgorod1
4 Sep 2017
TL;DR: The Globalizer software system is described, which implements an approach to solving global optimization problems using the block multistage scheme of dimension reduction, combining Peano curve type evolvents with the multistage reduction scheme.
Abstract: In this paper, we describe the Globalizer software system for solving global optimization problems. The system implements an approach to solving the global optimization problems using the block multistage scheme of the dimension reduction, which combines the use of Peano curve type evolvents and the multistage reduction scheme. The scheme allows an efficient parallelization of the computations and increasing the number of processors employed in the parallel solving of the global optimization problems many times.
Book Chapter•10.1007/978-3-319-62932-2_18•
Multiple-Precision Residue-Based Arithmetic Library for Parallel CPU-GPU Architectures: Data Types and Features

Konstantin Isupov, Alexander Kuvaev, Mikhail Popov, Anton Zaviyalov
4 Sep 2017
TL;DR: A new software library for multiple-precision (integer and floating-point) and extended-range computations is considered; it targets heterogeneous CPU-GPU architectures, and the residue number system (RNS) underlies its multiple-precision modules.
Abstract: In this paper a new software library for multiple-precision (integer and floating-point) and extended-range computations is considered. The library is targeted at heterogeneous CPU-GPU architectures. The residue number system (RNS), which enables effective parallelization of arithmetic operations, underlies the library's multiple-precision modules. The paper deals with the supported number formats and the library features. An algorithm for selecting an RNS moduli set for a given precision of computations is also presented.
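
To illustrate why a residue number system lends itself to parallel arithmetic, here is a minimal sketch of RNS addition and multiplication with reconstruction by the Chinese remainder theorem. It does not reproduce the library's API or its moduli-selection algorithm; the moduli set and all function names are assumptions chosen for illustration.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; representable range is 0 .. 15014.
MODULI = (7, 11, 13, 15)


def to_rns(x, moduli=MODULI):
    """Represent x by its residues modulo each modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    # Each digit is computed independently, so digits can be processed in parallel.
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    # No carries propagate between digits, unlike positional arithmetic.
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Reconstruct the integer via the Chinese remainder theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M


product = from_rns(rns_mul(to_rns(123), to_rns(45)))
print(product == 123 * 45)
```

Results are only correct while they stay below the product of the moduli, which is why the choice of moduli set depends on the required precision.
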
Book Chapter•10.1007/978-3-319-62932-2_13•
Auto-Vectorization of Loops on Intel 64 and Intel Xeon Phi: Analysis and Evaluation

Olga V. Moldovanova1, Mikhail G. Kurnosov1•
Russian Academy of Sciences1
4 Sep 2017
TL;DR: This work estimates speedup by running the loops in scalar and vector modes for different data types and determines the loop classes that the compilers used in the study fail to vectorize.
Abstract: This paper evaluates auto-vectorizing capabilities of modern optimizing compilers Intel C/C++, GCC C/C++, LLVM/Clang and PGI C/C++ on Intel 64 and Intel Xeon Phi architectures. We use the Extended Test Suite for Vectorizing Compilers consisting of 151 loops. In this work, we estimate speedup by running the loops in scalar and vector modes for different data types and determine loop classes which the compilers used in the study fail to vectorize. We use the dual CPU system (NUMA, 2 x Intel Xeon E5-2620v4, Intel Broadwell microarchitecture) with the Intel Xeon Phi 3120A co-processor for our experiments.
Book Chapter•10.1007/978-3-319-62932-2_11•
The DiamondTetris Algorithm for Maximum Performance Vectorized Stencil Computation

Vadim D. Levchenko1, Anastasia Y. Perepelkina1•
Keldysh Institute of Applied Mathematics1
4 Sep 2017
TL;DR: DiamondTetris, an algorithm from the LRnLA family for stencil computation, is constructed for Many-Integrated-Core processors of the Xeon Phi family; its strong points are locality, efficient use of the memory hierarchy, and, most importantly, seamless vectorization.
Abstract: An algorithm from the LRnLA family, DiamondTetris, for stencil computation is constructed. It is aimed at Many-Integrated-Core processors of the Xeon Phi family. The algorithm and its implementation are described for a simulation based on the wave equation. Its strong points are locality, efficient use of the memory hierarchy, and, most importantly, seamless vectorization. Specifically, only 1 vector rearrange operation is necessary per cell value update. The performance is estimated with the roofline model. The algorithm is implemented in code and tested on Xeon and Xeon Phi machines.
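
The roofline model mentioned in this abstract bounds attainable performance by the smaller of peak compute throughput and memory bandwidth times arithmetic intensity. The sketch below shows only that basic bound; the machine figures are hypothetical placeholders, not values from the paper.

```python
def roofline(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Attainable GFLOP/s for a kernel with the given arithmetic intensity
    (floating-point operations per byte moved from memory)."""
    return min(peak_gflops, bandwidth_gbs * intensity)


# Hypothetical machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
PEAK, BANDWIDTH = 1000.0, 100.0

for intensity in (0.5, 10.0, 20.0):
    print(intensity, roofline(PEAK, BANDWIDTH, intensity))
```

Kernels below the ridge point (here 10 flop/byte) are memory-bound, which is why algorithms that improve data locality raise the attainable performance of stencil codes.
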
Book Chapter•10.1007/978-3-319-62932-2_38•
Parallel Algorithm for Solving Constrained Global Optimization Problems

Konstantin Barkalov1, Ilya Lebedev1•
N. I. Lobachevsky State University of Nizhny Novgorod1
4 Sep 2017
TL;DR: An experimental assessment of parallel algorithm efficiency was conducted by finding the numeric solution to several hundred randomly generated multidimensional multiextremal problems with non-convex constraints.
Abstract: This work considers a parallel algorithm for solving multiextremal problems with non-convex constraints. The distinctive feature of this algorithm, which does not use penalty functions, is the separate consideration of each problem constraint. The search process can be conducted by reducing the original multidimensional problem to a number of related one-dimensional problems and solving this set of problems in parallel. An experimental assessment of parallel algorithm efficiency was conducted by finding the numeric solution to several hundred randomly generated multidimensional multiextremal problems with non-convex constraints.
Book Chapter•10.1007/978-3-319-62932-2_43•
Parallel Computing for Time-Consuming Multicriterial Optimization Problems

Victor Gergel1, Evgeny Kozinov1•
N. I. Lobachevsky State University of Nizhny Novgorod1
4 Sep 2017
TL;DR: An efficient method for parallel solving the time-consuming multicriterial optimization problems, where the optimality criteria can be multiextremal, and the computation of the criteria values can require a large amount of computations is proposed.
Abstract: In the present paper, an efficient method is proposed for the parallel solving of time-consuming multicriterial optimization problems, where the optimality criteria can be multiextremal and computing the criteria values can require a large amount of computation. The proposed scheme of parallel computations allows obtaining several efficient decisions of a multicriterial problem. During the computations, maximum use is made of the search information. The results of the numerical experiments demonstrate that this approach reduces the computational cost of solving multicriterial optimization problems substantially, by tens to hundreds of times.
Proceedings Article•
A Probabilistic Causal Message Ordering Mechanism

Achour Mostefaoui, Stéphane Weiss
23 May 2017
TL;DR: This paper proposes a probabilistic but efficient causal broadcast mechanism for large systems with changing membership that uses few integer timestamps.
Abstract: Causal broadcast is a classical communication primitive that has been studied for more than three decades, and several implementations have been proposed. The implementation of such a primitive has a non-negligible cost, either in terms of the extra information messages have to carry or in the time delays needed for the delivery of messages. It has been proved that messages need to carry control information whose size is linear in the size of the system. This problem has gained more interest due to new application domains: collaborative applications, which are widely used and becoming massive, and the social semantic web and linked data, whose implementation needs causal ordering of messages. This paper proposes a probabilistic but efficient causal broadcast mechanism for large systems with changing membership that uses few integer timestamps.
Book Chapter•10.1007/978-3-319-62932-2_34•
Islands-of-Cores Approach for Harnessing SMP/NUMA Architectures in Heterogeneous Stencil Computations

Lukasz Szustak1, Roman Wyrzykowski1, Ondřej Jakl2•
Częstochowa University of Technology1, Academy of Sciences of the Czech Republic2
4 Sep 2017
TL;DR: This paper faces the challenge of harnessing the heterogeneous nature of SMP/NUMA communications for a complex scientific application which implements the Multidimensional Positive Definite Advection Transport Algorithm (MPDATA), consisting of a set of heterogeneous stencil computations.
Abstract: SMP/NUMA systems are powerful HPC platforms which can be applied to a wide range of real-life applications. These systems provide a large capacity of shared memory and allow using the shared-variable programming model to take advantage of shared memory for inter-process communication and synchronization. However, as data can be physically dispersed over many nodes, access to different data items may require significantly different times. In this paper, we face the challenge of harnessing the heterogeneous nature of SMP/NUMA communications for a complex scientific application which implements the Multidimensional Positive Definite Advection Transport Algorithm (MPDATA), consisting of a set of heterogeneous stencil computations.
Book Chapter•10.1007/978-3-319-62932-2_16•
Predictive Modeling of Suffocation in Shallow Waters on a Multiprocessor Computer System

A.I. Sukhinov1, Alla V. Nikitina1, A. E. Chistyakov1, Vladimir Sumbaev1, Maksim Abramov1, Alena Semenyakina2 •
Institute of Service and Entrepreneurship of DGTU1, Southern Federal University2
4 Sep 2017
TL;DR: The model of the algal bloom causing suffocation in shallow waters takes into account the transport of the water environment; microturbulent diffusion; gravitational sedimentation of pollutants and plankton; nonlinear interaction of plankton populations; biogenic, temperature and oxygen regimes; and the influence of salinity.
Abstract: The model of the algal bloom causing suffocation in shallow waters takes into account the following: the transport of the water environment; microturbulent diffusion; gravitational sedimentation of pollutants and plankton; nonlinear interaction of plankton populations; biogenic, temperature and oxygen regimes; and the influence of salinity. The computational accuracy is significantly increased and the computational time is decreased when schemes of high order of accuracy are used for discretization of the model. The practical significance lies in the software implementation of the proposed model; the limits and prospects of its practical use are defined. Experimental software was developed for a multiprocessor computer system and is intended for mathematical modeling of possible scenarios for the development of shallow-water ecosystems, using the example of the Azov Sea in the case of suffocation. We used decomposition methods of grid domains in the parallel implementation for computationally laborious convection-diffusion problems, taking into account the architecture and parameters of the multiprocessor computer system. A further advantage of the developed software is the use of a hydrodynamical model that includes the motion equations in the three coordinate directions.
Book Chapter•10.1007/978-3-319-62932-2_39•
Parallelizing Metaheuristics for Optimal Design of Multiproduct Batch Plants on GPU

Andrey Borisenko1, Sergei Gorlatch2•
Tambov State Technical University1, University of Münster2
4 Sep 2017
TL;DR: The results of the hybrid metaheuristics approach (ACO+SA) are very near to the global optimal solutions, but they are produced much faster than using the deterministic Branch-and-Bound approach.
Abstract: We propose a metaheuristics-based approach to the optimal design of multi-product batch plants, with a particular application example of chemical-engineering systems. Our hybrid approach combines two metaheuristics: Ant Colony Optimization (ACO) and Simulated Annealing (SA). We develop a sequential implementation of the proposed method and we parallelize it on Graphics Processing Units (GPU) using the CUDA programming environment. We experimentally demonstrate that the results of our hybrid metaheuristic approach (ACO+SA) are very near to the global optimal solutions, but they are produced much faster than using the deterministic Branch-and-Bound approach.
Book Chapter•10.1007/978-3-319-62932-2_40•
The Optimization of Traffic Management for Cloud Application and Services in the Virtual Data Center

Irina Bolodurina1, Denis Parfenov1•
Orenburg State University1
4 Sep 2017
TL;DR: A simulation model is developed for the traffic in software-defined network segments of virtual data centers that process user requests to cloud applications and services; it makes it possible to implement a traffic management algorithm for cloud applications and to optimize access to storage systems through the effective use of data transmission channels.
Abstract: One of today's optimization problems is the control of traffic in cloud applications and services in the network environment of a virtual data center. Given the multitier architecture of modern data centers, this task needs special attention. The advantage of modern infrastructure virtualization is the possibility of using software-defined networks and software-defined data storage. However, existing algorithmic optimization solutions do not take into account the specific features of routing heterogeneous network traffic with multiple application types. The task of optimizing traffic distribution for cloud applications and services can be solved by using the software-defined infrastructure of virtual data centers. We have developed a simulation model for the traffic in software-defined network segments of virtual data centers involved in processing user requests to cloud applications and services within a network environment. Our model makes it possible to implement the traffic management algorithm for cloud applications and to optimize access to storage systems through the effective use of data transmission channels. In experimental studies, we found that our algorithm decreases the response time of cloud applications and services and therefore increases the productivity of user request processing and reduces the number of refusals.
Book Chapter•10.1007/978-3-319-62932-2_33•
PGAS Approach to Implement Mapreduce Framework Based on UPC Language

Shomanov Aday1, Akhmed-Zaki Darkhan1, Mansurova Madina1•
Al-Farabi University1
4 Sep 2017
TL;DR: In the years since its introduction, MapReduce has proved to be a very effective parallel programming technique for processing large volumes of data.
Abstract: In the years since its introduction, MapReduce has proved to be a very effective parallel programming technique for processing large volumes of data. Among the most prevalent implementations of MapReduce are the Hadoop framework and Google's proprietary MapReduce system.
Proceedings Article•10.1109/PARCOMPTECH.2017.8068335•
Accelerated spam filtering with enhanced KMP algorithm on GPU

Venkata Krishna Pavan Kalubandi1, M. Varalakshmi1•
VIT University1
1 Feb 2017
TL;DR: An accelerated spam filtering mechanism that uses GPUs is presented; it utilizes an enhanced version of the Knuth-Morris-Pratt pattern matching algorithm that outperforms the serial versions by up to 12x and also performs more efficiently than other parallel versions.
Abstract: Spam filtering is one of the most important applications in email services and has become increasingly sophisticated due to the enormous usage of the Internet. Traditionally, spam filters have been implemented on the CPU with a pattern matching algorithm. In this paper, an accelerated spam filtering mechanism that uses GPUs is presented. The filtering process utilizes an enhanced version of the Knuth-Morris-Pratt pattern matching algorithm that outperforms the serial versions by up to 12x and also performs more efficiently than other parallel versions. The parallel algorithm is used to develop an advanced keyword-based Naive Bayesian classifier that speeds up spam filtering by up to 2 times compared to the CPU.
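
As background for this entry, the following is a sketch of the standard serial Knuth-Morris-Pratt algorithm, which is presumably the kind of serial baseline the abstract compares against. It is not the enhanced GPU variant from the paper, and the function names are illustrative.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # fall back without re-reading the text
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # continue, allowing overlapping matches
    return matches


print(kmp_search("free money now", "money"))
```

The search reads each text character once, so a keyword filter built on it runs in time linear in the message length plus the pattern length.
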
Book Chapter•10.1007/978-3-319-62932-2_3•
Automated Parallelization of a Simulation Method of Elastic Wave Propagation in Media with Complex 3D Geometry Surface on High-Performance Heterogeneous Clusters

Nikita Andreevich Kataev1, Alexander Sergeevich Kolganov2, Alexander Sergeevich Kolganov1, Pavel Titov•
Keldysh Institute of Applied Mathematics1, Moscow State University2
4 Sep 2017
TL;DR: The application of DVM and SAPFOR to automate the mapping of a 3D elastic wave simulation method onto high-performance heterogeneous clusters is considered; the efficiency and acceleration of the parallel program are estimated, and the performance of the DVMH-based program is compared with that of a program parallelized manually using MPI.
Abstract: The paper considers the application of DVM and SAPFOR to automate the mapping of a 3D elastic wave simulation method onto high-performance heterogeneous clusters. A distinctive feature of the proposed method is the use of a curved three-dimensional grid that is consistent with the geometry of the free surface. The use of curved grids considerably complicates both manual and automated parallelization. A technique for mapping the curved grid onto a structured grid is presented to solve this problem. The sequential program, based on the finite difference method on a structured grid, has been parallelized using the Fortran-DVMH language. The SAPFOR analysis tools simplified this parallelization process. Features of automated parallelization are described. The authors estimate the efficiency and acceleration of the parallel program and compare the performance of the DVMH-based program with that of a program obtained by manual parallelization using MPI programming technology.
Book Chapter•10.1007/978-3-319-62932-2_15•
Software Implementation of Mathematical Model of Thermodynamic Processes in a Steam Turbine on High-Performance System

A.I. Sukhinov1, A. E. Chistyakov1, Alla V. Nikitina1, Irina Yakovenko2, Vladimir Parshukov, Nikolay Efimov, Vadim Kopitsa, Dmitriy Stepovoy •
Institute of Service and Entrepreneurship of DGTU1, Southern Federal University2
4 Sep 2017
TL;DR: The developed model takes into account the complex geometry of the steam turbine, does not require significant changes when design features are modified, and can be used to calculate thermal processes in other constructions such as turbines.
Abstract: The aim of this paper is the development of the mathematical model of thermal processes in steam turbine based on the modern information technologies and computational methods, with help of which the accuracy of calculations of thermal modes. The practical significance of the paper is as follows: a model of thermal processes in a steam turbine is proposed and implemented, information about the temperature modes of the steam turbine is derived, and the limits and prospects of the proposed mathematical model are defined. The thermal processes in the turbine are characterized by a strong non-uniformity of the heat flow, which significantly influences the reliability and efficiency of the facility. As a rule, the influence of these parameters on the geometry is not considered in the design of the system, which results in premature wear of the machine. The developed model takes into account the complex geometry of the steam turbine, does not require significant changes when design features are modified, and can be used to calculate thermal processes in other constructions such as turbines. A software solution was developed for two-dimensional simulation of thermal processes in a steam turbine that takes into account the occupancy of control volumes.
Book Chapter•10.1007/978-3-319-62932-2_19•
Parallel Implementation of Cellular Automaton Model of the Carbon Corrosion Under the Influence of the Electrochemical Oxidation.

Anastasiya E. Kireeva, Karl K. Sabelfeld1, N. V. Maltseva, E. N. Gribov1•
Novosibirsk State University1
4 Sep 2017
TL;DR: A cellular automaton model of the electrochemical oxidation of carbon is presented, simulating a two-dimensional sample of the electro-conductive carbon black “Ketjenblack ES DJ 600”; a parallel implementation is developed and the efficiency of the parallel code is analyzed.
Abstract: In this paper we present a cellular automaton model of the electrochemical oxidation of carbon. A two-dimensional sample of the electro-conductive carbon black “Ketjenblack ES DJ 600” is simulated. In the model the sample consists of ring-shaped granules of carbon. Under the influence of electrochemical oxidation, the carbon granules are destroyed through a few successive stages. The rates of these oxidation stages are chosen to fit the simulation result to the experiment. As a result of the computer simulation of carbon electrochemical oxidation, the portions of surface atoms and of atoms with different degrees of oxidation were calculated and compared with the experimental data. In addition, a parallel implementation of the cellular automaton simulating the carbon corrosion is developed, and the efficiency of the parallel code is analyzed.
Book Chapter•10.1007/978-3-319-62932-2_20•
A Fine-Grained Parallel Particle Swarm Optimization on Many-core and Multi-core Architectures

Nadia Nedjah1, Rogério de Moraes Calazan2, Luiza de Macedo Mourelle1•
Rio de Janeiro State University1, Brazilian Navy2
4 Sep 2017
TL;DR: A fine-grained parallelization strategy that focuses on the work done w.r.t. each of the problem dimensions and does it in parallel, which is useful in computationally demanding optimization problems wherein the objective function has a very large number of dimensions.
Abstract: Particle Swarm Optimization (PSO) is a stochastic yet very robust metaheuristic. Real-world optimizations require a high computational effort to converge to a viable solution. In general, parallel PSO implementations provide good performance, but this depends on the parallelization strategy as well as the number and/or characteristics of the exploited processors. In this paper, we propose a fine-grained parallelization strategy that focuses on the work done w.r.t. each of the problem dimensions and does it in parallel. Moreover, all particles act in parallel. This strategy is useful in computationally demanding optimization problems wherein the objective function has a very large number of dimensions. We map the computation onto three different parallel high-performance multiprocessor architectures, which are based on many-core and multi-core architectures. The performance of the proposed strategy is evaluated on four well-known benchmarks of high dimensionality and different complexity. The obtained speedups are very promising.
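To make the idea of per-dimension work concrete, here is a minimal sequential sketch in Python. It is not the authors' implementation: it uses the textbook PSO velocity update, and the function names, the coefficients `w`, `c1`, `c2`, and the sphere objective are all assumptions chosen for illustration. The point is only that `update_dimension` touches a single coordinate of a single particle, so each (particle, dimension) pair is an independent unit of work that a fine-grained scheme could hand to its own thread or core.

```python
import random


def update_dimension(x, v, pbest, gbest, r1, r2, w=0.7, c1=1.5, c2=1.5):
    """Update ONE coordinate of ONE particle (textbook PSO rule, assumed here).

    Only scalars for this dimension are read, so calls for different
    (particle, dimension) pairs do not depend on each other.
    """
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v_new, v_new


def sphere(point):
    """Illustrative objective (an assumption, not taken from the paper)."""
    return sum(c * c for c in point)


def pso(objective, n_particles=10, n_dims=5, n_iters=50, seed=0):
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(n_dims)] for _ in range(n_particles)]
    vs = [[0.0] * n_dims for _ in range(n_particles)]
    pbest = [list(x) for x in xs]
    pbest_val = [objective(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = list(pbest[g]), pbest_val[g]
    initial_best = gbest_val

    for _ in range(n_iters):
        # These two nested loops are what a fine-grained scheme would
        # distribute: every (i, d) update is independent of the others.
        for i in range(n_particles):
            for d in range(n_dims):
                xs[i][d], vs[i][d] = update_dimension(
                    xs[i][d], vs[i][d], pbest[i][d], gbest[d],
                    rng.random(), rng.random(),
                )
            val = objective(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = list(xs[i]), val
        # The global best is refreshed once per iteration (synchronous variant).
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = list(pbest[g]), pbest_val[g]

    return gbest, gbest_val, initial_best
```

In this sketch the objective evaluation and the best-position bookkeeping remain per-particle steps; only the coordinate update is split by dimension.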
Book Chapter•10.1007/978-3-319-62932-2_7•
Fragmentation of IADE method using LuNA system

Norma Alias1, Sergey Kireev•
Universiti Teknologi Malaysia1
4 Sep 2017
TL;DR: A fragmented numerical algorithm for the IADE method is designed in terms of a data-flow graph, and a performance comparison of different implementations of the algorithm, including LuNA and Message Passing Interface, is given.
Abstract: The fragmented programming system LuNA is based on the Fragmented Programming Technology. LuNA is a platform for building automatically tunable portable libraries of parallel numerical subroutines. This paper focuses on the parallel implementation of the IADE method for solving a 1D partial differential equation (PDE) of parabolic type using the LuNA programming system. A fragmented numerical algorithm for the IADE method is designed in terms of a data-flow graph. A performance comparison of different implementations of the algorithm, including LuNA and Message Passing Interface, is given.
Book Chapter•10.1007/978-3-319-62932-2_1•
Experimenting with a Context-Aware Language

Chiara Bodei1, Pierpaolo Degano1, Gian Luigi Ferrari1, Letterio Galletta1•
University of Pisa1
4 Sep 2017
TL;DR: It will be shown how applications and context interactions can be better specified, analysed and controlled, with the help of some experiments done with a preliminary implementation of \(\text {ML}_\text {CoDa}\).
Abstract: Contextual information plays an increasingly crucial role in concurrent applications in the times of mobility and pervasiveness of computing. Context-Oriented Programming languages explicitly treat this kind of information. They provide primitive constructs to adapt the behaviour of a program, depending on the evolution of its operational environment, which is affected by other programs hosted therein independently and unpredictably. We discuss these issues and the challenges they pose, reporting on our recent work on \(\text {ML}_\text {CoDa}\), a language specifically designed for adaptation and equipped with a clear formal semantics and analysis tools. We will show how applications and context interactions can be better specified, analysed and controlled, with the help of some experiments done with a preliminary implementation of \(\text {ML}_\text {CoDa}\).
Book Chapter•10.1007/978-3-319-62932-2_48•
A novel string representation and kernel function for the comparison of I/O access patterns

Raúl Botero Torres1, Julian M. Kunkel1, Manuel F. Dolz1, Thomas Ludwig1•
University of Hamburg1
4 Sep 2017
TL;DR: A conversion to a weighted string representation is proposed in this paper, together with a novel string kernel function called Kast Spectrum Kernel, which can be promisingly applied to other similarity problems involving tree-like structured data.
Abstract: Parallel I/O access patterns act as fingerprints of a parallel program. In order to extract meaningful information from these patterns, they have to be represented appropriately. Because string objects can be easily compared using Kernel Methods, a conversion to a weighted string representation is proposed in this paper, together with a novel string kernel function called Kast Spectrum Kernel. The similarity matrices, obtained after applying this kernel to a set of examples from a real application, were analyzed using Kernel Principal Component Analysis (Kernel PCA) and Hierarchical Clustering. The evaluation showed that 2 out of 4 I/O access pattern groups were completely identified, while the other 2 formed a single cluster due to the intrinsic similarity of their members. The proposed strategy is a promising candidate for other similarity problems involving tree-like structured data.
