Journal Article, DOI: 10.1088/1742-5468/ad0222
Strahler number of natural language sentences in comparison with random trees
Kumiko Tanaka‐Ishii, Akira Tanaka, and 1 more
TL;DR: The Strahler number of natural language sentences is almost always 3 or 4, similar to that reported for river bifurcation.
Abstract: The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications. This article proposes a computation of the Strahler number’s upper and lower limits for natural language sentence tree structures. Through empirical measurements across grammatically annotated data, the Strahler number of natural language sentences is shown to be almost always 3 or 4, similar to the case of river bifurcation as reported by Strahler (1957 Eos Trans. Am. Geophys. Union 38 913–20). Based on the theory behind this number, we show that there is a kind of lower limit on the amount of memory required to process sentences. We consider the Strahler number to provide reasoning that explains reports showing that the number of required memory areas to process sentences is 3–4 for parsing (Schuler et al 2010 Comput. Linguist. 36 1–30), and reports indicating a psychological ‘magical number’ of 3–5 (Cowan 2001 Behav. Brain Sci. 24 87–114). An analytical and empirical analysis shows that the Strahler number is not constant but grows logarithmically with sentence length. Therefore, the observed Strahler number of sentences follows from the range of sentence lengths. Furthermore, the Strahler number is not different for random trees, which could suggest that its origin is not specific to natural language.
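To make the quantity in the abstract concrete, the following is a minimal sketch of the standard Strahler number on a tree: a leaf has value 1, and an internal node takes the maximum of its children's values, plus one if that maximum is attained by at least two children. The nested-tuple encoding and the helper `random_binary_tree` are illustrative assumptions; they are not the paper's procedure for deriving upper and lower limits from annotated sentences, nor its random-tree model.

```python
import random


def strahler(tree):
    """Strahler number of a tree encoded as nested tuples.

    Any non-tuple value (e.g. a word) is treated as a leaf with value 1.
    """
    if not isinstance(tree, tuple) or len(tree) == 0:
        return 1
    # Children's values, largest first.
    values = sorted((strahler(child) for child in tree), reverse=True)
    top = values[0]
    # The value increases only when the maximum is attained at least twice.
    if len(values) > 1 and values[1] == top:
        return top + 1
    return top


def random_binary_tree(n_leaves, rng):
    """Hypothetical sampler: split the leaves at a uniformly random point.

    This is only one of many possible random-tree models and is used here
    purely for illustration.
    """
    if n_leaves == 1:
        return "w"
    left = rng.randint(1, n_leaves - 1)
    return (random_binary_tree(left, rng),
            random_binary_tree(n_leaves - left, rng))


# A purely one-sided chain stays at 2 regardless of length.
chain = ((("a", "b"), "c"), "d")
# A balanced tree with 4 leaves reaches 3.
balanced = (("a", "b"), ("c", "d"))
print(strahler(chain), strahler(balanced))

# Strahler numbers of a few random binary trees of increasing size.
rng = random.Random(0)
for n in (4, 16, 64):
    print(n, strahler(random_binary_tree(n, rng)))
```

In a binary tree, a node can only reach value k if its subtree has at least 2^(k-1) leaves, so the Strahler number of a tree with n leaves is at most floor(log2 n) + 1. This is one way to see why the value can grow only logarithmically with the number of words.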
References
A mathematical theory of communication
TL;DR: This final installment of the paper considers the case where the signals or the messages or both are continuously variable, in contrast with the discrete nature assumed until now.
The magical number seven, plus or minus two: some limits on our capacity for processing information
TL;DR: Information theory provides a yardstick for calibrating stimulus materials and for measuring the performance of subjects, giving a quantitative way of getting at some of these questions.
The magical number 4 in short-term memory: a reconsideration of mental storage capacity.
TL;DR: A wide variety of data on capacity limits suggesting that the smaller capacity limit in short-term memory tasks is real is brought together and a capacity limit for the focus of attention is proposed.
Erosional development of streams and their drainage basins; hydrophysical approach to quantitative morphology
TL;DR: The most important single factor involved in erosion phenomena, particularly in the development of stream systems and their drainage basins by aqueous erosion, is called crossgrading.
Quantitative analysis of watershed geomorphology
TL;DR: In this paper, two general classes of descriptive numbers are presented: linear scale measurements and dimensionless numbers, usually angles or ratios of length measures, whereby the shapes of analogous units can be compared irrespective of scale.