Topic

qsort

About: qsort is a research topic. Over its lifetime, 23 publications have been published within this topic, receiving 470 citations. The topic is also known as: quick sort function.

Papers

Proceedings Article•10.5555/509058.509090•

OpenMP on Networks of Workstations

Honghui Lu¹, Y. Charlie Hu¹, Willy Zwaenepoel¹•Institutions (1)

Rice University¹

7 Nov 1998

TL;DR: This paper identifies two aspects of the current OpenMP standard that make an implementation on NOWs hard, and suggests simple modifications to the standard that remedy the situation and presents performance results of a prototype implementation of an OpenMP subset on a NOW.

Abstract: We describe an implementation of a sizable subset of OpenMP on networks of workstations (NOWs). By extending the availability of OpenMP to NOWs, we overcome one of its primary drawbacks compared to MPI, namely lack of portability to environments other than hardware shared memory machines. In order to support OpenMP execution on NOWs, our compiler targets a software distributed shared memory system (DSM) which provides multi-threaded execution and memory consistency. This paper presents two contributions. First, we identify two aspects of the current OpenMP standard that make an implementation on NOWs hard, and suggest simple modifications to the standard that remedy the situation. These problems reflect differences in memory architecture between software and hardware shared memory and the high cost of synchronization on NOWs. Second, we present performance results of a prototype implementation of an OpenMP subset on a NOW, and compare them with hand-coded software DSM and MPI results for the same applications on the same platform. We use five applications (ASCI Sweep3d, NAS 3D-FFT, SPLASH-2 Water, QSORT, and TSP) exhibiting various styles of parallelization, including pipelined execution, data parallelism, coarse-grained parallelism, and task queues. The measurements show little difference between OpenMP and hand-coded software DSM, but both are still lagging behind MPI. Further work will concentrate on compiler optimization to reduce these differences.

59 citations

Journal Article•10.1002/(SICI)1097-024X(19990410)29:4<341::AID-SPE237>3.3.CO;2-I•

A killer adversary for quicksort

M. D. McIlroy¹•Institutions (1)

Dartmouth College¹

10 Apr 1999-Software - Practice and Experience

TL;DR: The general method works against any implementation of quicksort – even a randomizing one – that satisfies certain very mild and realistic assumptions.

Abstract: Quicksort can be made to go quadratic by constructing input on-the-fly in response to the sequence of items compared. The technique is illustrated by a specific adversary for the standard C qsort function. The general method works against any implementation of quicksort – even a randomizing one – that satisfies certain very mild and realistic assumptions. Copyright © 1999 John Wiley & Sons, Ltd.

48 citations
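
The idea in the abstract above, building the input on the fly in response to the comparisons a sort makes, can be sketched roughly as follows. This is a minimal illustration written for this listing, not the code from the paper: every name in it (`adversary_cmp`, `build_adversarial_input`, `gas`, `candidate`) is an assumption. Items start out undecided ("gas"); when two undecided items are compared, one of them is frozen to the next-smallest value, so a likely pivot tends to end up near the bottom of the remaining range.

```c
#include <assert.h>
#include <stdlib.h>

/* Adversary state. All names are illustrative assumptions. */
static int *val;       /* val[i]: value currently assigned to item i      */
static int gas;        /* sentinel meaning "undecided, above all frozen"  */
static int nsolid;     /* number of items frozen so far                   */
static int candidate;  /* undecided item most recently seen (likely pivot) */
static long ncmp;      /* comparisons observed                            */

/* Fix item i to the next unused small value. */
static void freeze(int i) { val[i] = nsolid++; }

/* Comparator handed to qsort: decides values lazily as items are compared. */
static int adversary_cmp(const void *px, const void *py)
{
    int x = *(const int *)px;
    int y = *(const int *)py;

    ncmp++;
    if (val[x] == gas && val[y] == gas) {
        /* Both undecided: freeze the pivot candidate if it is x, else y. */
        if (x == candidate) freeze(x); else freeze(y);
    }
    if (val[x] == gas) candidate = x;
    else if (val[y] == gas) candidate = y;
    return val[x] - val[y];   /* values lie in [0, n-1], so no overflow */
}

/* Fills out[0..n-1] with an input constructed against the local qsort.
 * Returns the number of comparisons observed, or -1 on allocation failure. */
long build_adversarial_input(int n, int *out)
{
    assert(n > 0 && out != NULL);
    int *idx = malloc((size_t)n * sizeof *idx);
    if (idx == NULL) return -1;

    val = out; gas = n - 1; nsolid = 0; candidate = 0; ncmp = 0;
    for (int i = 0; i < n; i++) { idx[i] = i; val[i] = gas; }

    qsort(idx, (size_t)n, sizeof *idx, adversary_cmp);  /* sort item indices */
    free(idx);
    return ncmp;
}
```

Whether the resulting input actually drives a given `qsort` to quadratic behavior depends on how that library implements it; a library that uses merge sort or an introsort-style fallback, for instance, would not be expected to degrade this way.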

Journal Article•10.1145/360860.360870•

Some performance tests of “quicksort” and descendants

R. Loeser¹•Institutions (1)

Smithsonian Institution¹

01 Mar 1974-Communications of The ACM

TL;DR: Detailed performance evaluations are presented for six ACM algorithms; quickersort requires the fewest comparisons to sort random arrays, and qsort (No. 402) requires many more comparisons than its author claims.

Abstract: Detailed performance evaluations are presented for six ACM algorithms: quicksort (No. 64), Shellsort (No. 201), stringsort (No. 207), “TREESORT3” (No. 245), quickersort (No. 271), and qsort (No. 402). Algorithms 271 and 402 are refinements of algorithm 64, and all three are discussed in some detail. The evidence given here demonstrates that qsort (No. 402) requires many more comparisons than its author claims. Of all these algorithms, quickersort requires the fewest comparisons to sort random arrays.

35 citations
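
Comparison counts like those evaluated in the entry above can be collected by instrumenting the comparator. The sketch below is only an illustration of that measurement approach: it uses the C library `qsort` as a stand-in, not ACM Algorithm 402 or any of the other algorithms tested in the paper, and the names `counting_cmp` and `sort_and_count` are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Incremented on every comparator call. */
static unsigned long comparisons;

/* Three-way integer comparison that also counts how often it is called. */
static int counting_cmp(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;

    comparisons++;
    return (a > b) - (a < b);   /* avoids overflow of a - b */
}

/* Sorts data[0..n-1] in place and returns the number of comparisons made. */
unsigned long sort_and_count(int *data, size_t n)
{
    assert(data != NULL);
    comparisons = 0;
    qsort(data, n, sizeof *data, counting_cmp);
    return comparisons;
}
```

Averaging the returned count over many random arrays of the same length gives a figure that can be set against a claimed comparison count.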

Fast Out-of-Core Sorting on Parallel Disk Systems

Matthew D. Pearson

1 Jun 1999

TL;DR: The implementation of Rajasekaran's (l,m)-mergesort algorithm (LMM) for sorting on parallel disks is discussed, which is asymptotically optimal for large problems and has the additional advantage of a low constant in its I/O complexity.

Abstract: This paper discusses our implementation of Rajasekaran's (l,m)-mergesort algorithm (LMM) for sorting on parallel disks. LMM is asymptotically optimal for large problems and has the additional advantage of a low constant in its I/O complexity. Our implementation is written in C using the ViC* I/O API for parallel disk systems. We compare the performance of LMM to that of the C library function qsort on a DEC Alpha server. qsort makes a good benchmark because it is fast and performs comparatively well under demand paging. Since qsort fails when the swap disk fills up, we can only compare these algorithms on a limited range of inputs. Still, on most out-of-core problems, our implementation of LMM runs between 1.5 and 1.9 times faster than qsort, with the gap widening with increasing problem size.

15 citations

Proceedings Article•10.1109/HPCC/SMARTCITY/DSS.2019.00038•

Efficient Parallel Sort on AVX-512-Based Multi-Core and Many-Core Architectures

Zekun Yin¹, Tianyu Zhang¹, André Müller², Hui Liu¹, Yanjie Wei, Bertil Schmidt², Weiguo Liu¹•Institutions (2)

Shandong University¹, University of Mainz²

1 Aug 2019

TL;DR: This paper proposes an efficient hybrid sorting method which takes advantage of wide vector registers and the high bandwidth memory of modern AVX-512-based multi-core and many-core processors and shows the extensibility of the vectorized kernels to processing units with a varying number of vector lanes.

Abstract: Sorting kernels are a fundamental part of numerous applications. The performance of sorting implementations is usually limited by a variety of factors such as computing power, memory bandwidth, and branch mispredictions. In this paper we propose an efficient hybrid sorting method which takes advantage of wide vector registers and the high bandwidth memory of modern AVX-512-based multi-core and many-core processors. Our approach employs a combination of vectorized bitonic sorting and load-balanced multi-threaded merging. Thread-level and data-level parallelism are used to exploit both compute power and memory bandwidth. Our single-threaded implementation is ~30x faster than qsort in the C standard library and ~10x faster than C++'s std::sort. Compared with the Intel Performance Primitives (IPP) library, which is one of the most efficient CPU-based radix sort implementations, we obtain a speedup of 1.3 to 2.6. Furthermore, we achieve a peak performance of sorting 1.14 billion floats per second on a Xeon Phi 7210 processor. Moreover, we show the extensibility of our vectorized kernels to processing units with a varying number of vector lanes.

14 citations
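
The bitonic sorting mentioned in the abstract above relies on a fixed, data-independent pattern of compare-exchange steps, which is what makes it a natural fit for vector registers. The sketch below is a plain scalar bitonic sorting network in C, given only to show that pattern; it is not AVX-512 code and not the authors' kernel, and the function name `bitonic_sort` and the power-of-two length restriction are assumptions of this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Sorts a[0..n-1] ascending with a bitonic sorting network.
 * n must be a power of two (an assumption of this sketch). */
void bitonic_sort(float *a, size_t n)
{
    assert(a != NULL && n != 0 && (n & (n - 1)) == 0);

    for (size_t k = 2; k <= n; k <<= 1) {          /* size of bitonic runs   */
        for (size_t j = k >> 1; j > 0; j >>= 1) {  /* compare distance       */
            for (size_t i = 0; i < n; i++) {
                size_t partner = i ^ j;
                if (partner <= i) continue;        /* handle each pair once  */

                /* Blocks of size k alternate between ascending and descending. */
                int ascending = (i & k) == 0;
                int out_of_order = ascending ? (a[i] > a[partner])
                                             : (a[i] < a[partner]);
                if (out_of_order) {
                    float tmp = a[i];
                    a[i] = a[partner];
                    a[partner] = tmp;
                }
            }
        }
    }
}
```

Because the indices compared in each pass do not depend on the data, the innermost loop could in principle be mapped onto vector min/max operations; how the paper does this is not shown here.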

Performance Metrics

Papers

111

Citations

No. of papers in the topic in previous years
Year	Papers
2020	1
2019	1
2018	2
2016	2
2015	1
2014	2
