About: Unicode collation algorithm is a research topic. Over the lifetime, 5 publications have been published within this topic receiving 15 citations. The topic is also known as: UCA.
TL;DR: This document describes "i;unicode-casemap", a simple case-insensitive collation for Unicode strings that provides equality, substring, and ordering operations.
Abstract: This document describes "i;unicode-casemap", a simple case-insensitive
collation for Unicode strings. It provides equality, substring, and
ordering operations. [STANDARDS-TRACK]
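The three operations named in this abstract can be illustrated with a rough Python sketch that builds one comparison key and derives equality, substring, and ordering from it. This is an approximation under stated assumptions, not the collation as specified: per-character uppercasing stands in for the case mapping, NFKD normalization stands in for decomposition, and the function names (`casemap_key`, `equal`, `contains`, `compare`) are hypothetical.

```python
import unicodedata


def casemap_key(s: str) -> bytes:
    """Build an approximate case-insensitive comparison key (assumption, see lead-in)."""
    # Assumption: per-character uppercasing as a stand-in for the case mapping.
    mapped = "".join(ch.upper() for ch in s)
    # Assumption: NFKD as a stand-in for full compatibility decomposition.
    decomposed = unicodedata.normalize("NFKD", mapped)
    # Keys are compared octet by octet, so encode to UTF-8.
    return decomposed.encode("utf-8")


def equal(a: str, b: str) -> bool:
    """Equality: the two keys are identical."""
    return casemap_key(a) == casemap_key(b)


def contains(haystack: str, needle: str) -> bool:
    """Substring: the needle's key occurs inside the haystack's key."""
    return casemap_key(needle) in casemap_key(haystack)


def compare(a: str, b: str) -> int:
    """Ordering: -1, 0, or 1 from an octet-wise comparison of the keys."""
    ka, kb = casemap_key(a), casemap_key(b)
    return (ka > kb) - (ka < kb)
```

With these definitions, `equal("Hello", "hELLO")` holds, and `compare("apple", "Banana")` is negative even though a plain octet comparison of the raw strings would put "Banana" first.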
TL;DR: This paper describes the four collation capabilities offered by PROC SORT in SAS and further details the applicability, the advantages, the processing requirements, and the processing implications of each approach.
Abstract: Traditionally, data is ordered to facilitate further processing or to enable you to quickly find information in a report or other form of data presentation. The SAS System’s primary means of achieving an alternative collating sequence has been to specify a translation table (TRANTAB), using the PROC SORT SORTSEQ option, with which PROC SORT can reorder individual characters. SAS® 9.2 extends the SORTSEQ option to enable the specification of an arbitrary encoding for non-native binary collation. SAS 9.2 also extends the SORTSEQ option to enable the specification of linguistic collation, which is useful for presenting data because it produces results that are more intuitive and culturally acceptable. The linguistic collation capability is highly compatible with the Unicode Collation Algorithm and adaptable to user preference using various options. In this paper, we describe the four collation capabilities offered by PROC SORT in SAS. We further detail the applicability, the advantages, the processing requirements, and the processing implications of each approach. We conclude with information regarding the future directions of collation and sorting within the SAS System.
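The difference between binary collation and reordering individual characters through a translation table can be sketched in Python. This is an illustration of the idea only, not SAS syntax or SAS behavior: the collating sequence `ORDER` and the helper `table_key` are hypothetical, and the table covers only ASCII letters.

```python
import string

# Hypothetical collating sequence: each lowercase letter sorts just before
# its uppercase form ("aAbBcC...").
ORDER = "".join(lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase))
RANK = {ch: i for i, ch in enumerate(ORDER)}


def table_key(s: str) -> list:
    """Translate each character to its rank in ORDER.

    Characters outside the table sort after all listed letters,
    in code point order among themselves.
    """
    return [RANK.get(ch, len(ORDER) + ord(ch)) for ch in s]


words = ["banana", "Cherry", "apple", "Apple"]

# Binary collation: compares code points, so every uppercase letter
# sorts before every lowercase letter.
binary_order = sorted(words)

# Table-driven collation: compares the translated ranks instead.
table_order = sorted(words, key=table_key)

print(binary_order)  # ['Apple', 'Cherry', 'apple', 'banana']
print(table_order)   # ['apple', 'Apple', 'banana', 'Cherry']
```

A per-character table like this cannot express rules that span several characters or several comparison levels, which is the kind of case linguistic collation addresses.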
TL;DR: The lexicographical order relations used within dictionaries are language-dependent, and this work explains how the authors implemented such orders in Scheme using generators of sorting orders.
Abstract: The lexicographical order relations used within dictionaries are language-dependent, and we explain how we implemented such orders in Scheme. We show how our sorting orders are derived from the Unicode collation algorithm. Since the result of a Scheme function can itself be a function, we use generators of sorting orders. Specifying a sorting order for a new natural language has been made as easy as possible and can be done by a programmer who has just basic knowledge of Scheme. We also show how Scheme data structures allow our functions to be programmed efficiently.
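The idea of a generator of sorting orders, a function whose result is itself a function, can be sketched in Python rather than Scheme. This is not the authors' implementation: `make_collator` and its `extra_letters` parameter are hypothetical, and the three-level key (base letters, then accents, then case) is only loosely modeled on the multi-level comparison of the Unicode collation algorithm.

```python
import unicodedata
from typing import Callable


def make_collator(extra_letters: str = "") -> Callable[[str], tuple]:
    """Return a sort-key function for one language.

    extra_letters (hypothetical parameter) lists accented letters that the
    language treats as distinct letters ordered after 'z'.
    """
    alphabet = "abcdefghijklmnopqrstuvwxyz" + unicodedata.normalize("NFC", extra_letters)
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def base_letters(ch: str) -> str:
        # Tailored letters stay intact; other letters lose their combining marks.
        if ch in rank:
            return ch
        decomposed = unicodedata.normalize("NFD", ch)
        return "".join(c for c in decomposed if not unicodedata.combining(c))

    def key(word: str) -> tuple:
        word = unicodedata.normalize("NFC", word)
        lowered = word.lower()
        # Level 1: base letters, ranked by the language's alphabet.
        primary = [rank.get(b, len(rank) + ord(b))
                   for ch in lowered for b in base_letters(ch)]
        # Level 2: accents break ties between words with the same base letters.
        secondary = list(lowered)
        # Level 3: case breaks remaining ties, lowercase first.
        tertiary = [ch.isupper() for ch in word]
        return (primary, secondary, tertiary)

    return key


default_key = make_collator()
swedish_key = make_collator("\u00e5\u00e4\u00f6")  # å, ä, ö as separate letters

words = ["zebra", "\u00e4rlig", "apple"]
print(sorted(words, key=default_key))  # ['apple', 'ärlig', 'zebra']
print(sorted(words, key=swedish_key))  # ['apple', 'zebra', 'ärlig']
```

Adding a language here amounts to one call to the generator with a different argument, which is the design point the abstract makes about higher-order functions.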