About: Merge algorithm is a research topic. Over its lifetime, 541 publications have been published within this topic, receiving 11606 citations. The topic is also known as: merge.
TL;DR: Tight upper and lower bounds are provided for the number of inputs and outputs (I/Os) between internal memory and secondary storage required for five sorting-related problems: sorting, the fast Fourier transform (FFT), permutation networks, permuting, and matrix transposition.
Abstract: We provide tight upper and lower bounds, up to a constant factor, for the number of inputs and outputs (I/Os) between internal memory and secondary storage required for five sorting-related problems: sorting, the fast Fourier transform (FFT), permutation networks, permuting, and matrix transposition. The bounds hold both in the worst case and in the average case, and in several situations the constant factors match. Secondary storage is modeled as a magnetic disk capable of transferring P blocks each containing B records in a single time unit; the records in each block must be input from or output to B contiguous locations on the disk. We give two optimal algorithms for the problems, which are variants of merge sorting and distribution sorting. In particular we show for P = 1 that the standard merge sorting algorithm is an optimal external sorting method, up to a constant factor in the number of I/Os. Our sorting algorithms use the same number of I/Os as does the permutation phase of key sorting, except when the internal memory size is extremely small, thus affirming the popular adage that key sorting is not faster. We also give a simpler and more direct derivation of Hong and Kung's lower bound for the FFT for the special case B = P = O(1).
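As a rough illustration of the P = 1 case mentioned above, the following sketch simulates a textbook external merge sort and counts block transfers. It is not the paper's algorithm or analysis: the function name `external_merge_sort`, the parameter `M` (internal memory size in records, a symbol the abstract does not name), and the I/O accounting are assumptions made for this example.

```python
import heapq


def external_merge_sort(records, M, B):
    """Simulate external merge sort with one disk (P = 1).

    M: hypothetical internal memory capacity, in records.
    B: records per block.
    Returns (sorted_records, io_count), where io_count is the number of
    block reads plus block writes under this simplified accounting.
    """
    assert B >= 1 and M >= 3 * B, "need room for at least two inputs and one output"

    def blocks(n):
        # Number of blocks needed to hold n records (ceiling division).
        return -(-n // B)

    ios = 0

    # Run formation: load M records, sort them in memory, write the run back.
    runs = []
    for start in range(0, len(records), M):
        run = sorted(records[start:start + M])
        ios += 2 * blocks(len(run))  # one read pass and one write pass
        runs.append(run)

    # Merge passes: one block buffer per input run, one for the output.
    fan_in = M // B - 1
    while len(runs) > 1:
        next_runs = []
        for start in range(0, len(runs), fan_in):
            group = runs[start:start + fan_in]
            merged = list(heapq.merge(*group))
            ios += sum(blocks(len(r)) for r in group)  # read every input run
            ios += blocks(len(merged))                 # write the merged run
            next_runs.append(merged)
        runs = next_runs

    return (runs[0] if runs else []), ios
```

Each merge pass reads and writes every record once, and the number of passes shrinks as the fan-in `M // B - 1` grows, which is the intuition behind counting cost in block transfers rather than comparisons.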
TL;DR: A parallel implementation of merge sort on a CREW PRAM that uses n processors and O(log n) time; the constant in the running time is small.
Abstract: We give a parallel implementation of merge sort on a CREW PRAM that uses n processors and $O(\log n)$ time; the constant in the running time is small. We also give a more complex version of the algorithm for the EREW PRAM; it also uses n processors and $O(\log n)$ time. The constant in the running time is still moderate, though not as small.
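The abstract does not describe the algorithm itself, so the sketch below shows only a generic building block often used to explain parallel merging: merging by cross-ranking, where each element computes its output position independently. This is not the paper's algorithm; the function name `merge_by_ranking` is hypothetical, and the loops here run sequentially where a PRAM would assign one processor per element.

```python
from bisect import bisect_left, bisect_right


def merge_by_ranking(a, b):
    """Merge two sorted lists by computing each element's final position.

    Every iteration is independent of the others, so on a PRAM each one
    could be handled by its own processor using concurrent reads.
    """
    out = [None] * (len(a) + len(b))
    for i, x in enumerate(a):
        # Final position = own index + number of elements of b strictly below x.
        out[i + bisect_left(b, x)] = x
    for j, y in enumerate(b):
        # bisect_right places ties from b after equal elements of a.
        out[j + bisect_right(a, y)] = y
    return out
```

Using `bisect_left` for one list and `bisect_right` for the other guarantees that equal keys never collide on the same output slot.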
TL;DR: The paper shows that the order of sorting and the types of sorting used form differences among the existing hidden-surface algorithms.
Abstract: The paper asserts that the hidden-surface problem is mainly one of sorting. The various surfaces of an object to be shown in hidden-surface or hidden-line form must be sorted to find out which ones are visible at various places on the screen. Surfaces may be sorted by lateral position in the picture (XY), by depth (Z), or by other criteria. The paper shows that the order of sorting and the types of sorting used form differences among the existing hidden-surface algorithms. (Modified author abstract)
TL;DR: A fast sorting algorithm for the Connection Machine Supercomputer model CM-2 is developed and it is shown that any Θ(lg n)-depth family of sorting networks can be used to sort n numbers in Θ(lg n) time in the bounded-degree fixed interconnection network domain.
Abstract: Sorting is arguably the most studied problem in computer science, both because it is used as a substep in many applications and because it is a simple, combinatorial problem with many interesting and diverse solutions. Sorting is also an important benchmark for parallel supercomputers. It requires significant communication bandwidth among processors, unlike many other supercomputer benchmarks, and the most efficient sorting algorithms communicate data in irregular patterns. Parallel algorithms for sorting have been studied since at least the 1960s. An early advance in parallel sorting came in 1968 when Batcher discovered the elegant Θ(lg² n)-depth bitonic sorting network [3]. For certain families of fixed interconnection networks, such as the hypercube and shuffle-exchange, Batcher’s bitonic sorting technique provides a parallel algorithm for sorting n numbers in Θ(lg² n) time with n processors. The question of existence of a o(lg² n)-depth sorting network remained open until 1983, when Ajtai, Komlos, and Szemeredi [1] provided an optimal Θ(lg n)-depth sorting network, but unfortunately, their construction leads to larger networks than those given by bitonic sort for all “practical” values of n. Leighton [15] has shown that any Θ(lg n)-depth family of sorting networks can be used to sort n numbers in Θ(lg n) time in the bounded-degree fixed interconnection network domain. Not surprisingly, the optimal Θ(lg n)-time fixed interconnection sorting networks implied by the AKS construction are also impractical. In 1983, Reif and Valiant proposed a more practical O(lg n)-time randomized algorithm for sorting [19], called flashsort. Many other parallel sorting algorithms have been proposed in the literature, including parallel versions of radix sort and quicksort [5], a variant of quicksort called hyperquicksort [23], smoothsort [18], column sort [15], Nassimi and Sahni’s sort [17], and parallel merge sort [6].
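To make the depth of Batcher's bitonic network concrete, here is a minimal sequential sketch of the standard bitonic sort for power-of-two input sizes. It is a textbook formulation, not the CM-2 implementation the paper reports; the function name `bitonic_sort` and the returned stage count are assumptions for illustration. Each pass of the inner `for` loop corresponds to one layer of compare-exchange operations that a parallel machine could perform simultaneously.

```python
def bitonic_sort(values):
    """Sort a list whose length is a power of two with a bitonic network.

    Returns (sorted_list, depth), where depth counts compare-exchange
    layers; for n = 2**k inputs this is k * (k + 1) / 2 layers.
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    a = list(values)
    depth = 0
    k = 2  # size of the bitonic sequences being merged in this phase
    while k <= n:
        j = k // 2  # distance between compared elements in this layer
        while j >= 1:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Blocks alternate direction so merged halves form bitonic sequences.
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            depth += 1
            j //= 2
        k *= 2
    return a, depth
```

The comparison pattern depends only on the indices, never on the data, which is what makes the method a sorting network and a natural fit for fixed interconnection topologies such as the hypercube.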
This paper reports the findings of a project undertaken at Thinking Machines Corporation to develop a fast sorting algorithm for the Connection Machine Supercomputer model CM-2. The primary goals of this project were: