Zipf's law

Originally, the term Zipf's law referred to the observation by Harvard linguist George Kingsley Zipf (SAMPA: [zIf]) that the frequency of the nth most frequently used word in any natural language is approximately inversely proportional to n. Zipf's law is an empirical law, not a theoretical one. The causes of Zipfian distributions in real data are a matter of some controversy. Nevertheless, Zipfian distributions are commonly observed in many kinds of phenomena.

Zipf's law is often demonstrated with a scatterplot of the data, with log(rank order) and log(frequency) as the axes. If the points lie close to a single straight line, the distribution follows Zipf's law. In Zipf's original formulation, where frequency is proportional to 1/n, the slope of that line is close to -1.
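The log-log check described above can be sketched in a few lines of Python. This is a minimal illustration, not a rigorous statistical test: the helper names `rank_frequency` and `loglog_slope` are hypothetical, the fit is an ordinary least-squares line through the points (log rank, log frequency), and the input at the bottom is synthetic data built to follow the 1/n rule exactly.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return the item counts sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Ranks start at 1. Requires at least two strictly positive
    frequencies. A slope near -1 matches Zipf's original observation
    that frequency is proportional to 1/n.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic (assumed) data: frequencies exactly proportional to 1/n.
ideal = [1000.0 / n for n in range(1, 1001)]
slope = loglog_slope(ideal)
print(f"fitted slope: {slope:.3f}")
```

For real text, one would pass a list of words through `rank_frequency` first and then inspect both the fitted slope and the scatter of the points around the line, since a slope alone does not show how straight the plot is.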
Related laws

The term Zipf's law has consequently come to refer to frequency distributions of "rank data" in which the relative frequency of the nth-ranked item is given by the zeta distribution, 1/(n^s ζ(s)), where s > 1 is a parameter indexing this family of probability distributions. Indeed, the term Zipf's law sometimes simply means the zeta distribution, since probability distributions are sometimes called "laws". This distribution is sometimes called the Zipfian distribution or Yule distribution. A more general law was proposed by Benoit Mandelbrot (see the Zipf-Mandelbrot law under See also).

Examples of collections approximately obeying Zipf's law:
* frequency of accesses to web pages
** in particular, the access counts on Wikipedia's list of most popular pages, with s approximately equal to 0.3
** page access counts on Polish Wikipedia (data for late July 2003), which approximately obey Zipf's law with s about 0.5
* words in the English language
** for instance, in Shakespeare's play Hamlet, with s approximately 0.5; see Shakespeare word frequency lists
* sizes of settlements
* income distribution amongst individuals
* size of earthquakes
* notes in musical performances

It has been pointed out (see external link below) that Zipfian distributions can also be regarded as Pareto distributions with an exchange of variables.

See also
* Benford's law
* Bradford's law
* harmonic number of order
* law (principle)
* Mathematical economics
* Pareto distribution
* Pareto principle
* power law
* Zipf-Mandelbrot law
* Heaps' law
* Voynich manuscript

Further reading
* George K. Zipf, Human Behavior and the Principle of Least Effort, Addison-Wesley, Cambridge MA, 1949.
* W. Li, "Random texts exhibit Zipf's-law-like word frequency distribution", IEEE Transactions on Information Theory, 38(6), pp. 1842-1845, 1992.
* Alexander Gelbukh, Grigori Sidorov, "Zipf and Heaps Laws' Coefficients Depend on Language", Proc. CICLing-2001, Conference on Intelligent Text Processing and Computational Linguistics, February 18-24, 2001, Mexico City. Lecture Notes in Computer Science N 2004, ISSN 0302-9743, ISBN 3-540-41687-0, Springer-Verlag, pp. 332-335.
* Damian H. Zanette, "Zipf's law and the creation of musical context". Online preprint at http://xxx.arxiv.org/abs/cs.CL/0406015
* R. Kali, "The city as a giant component: a random graph approach to Zipf's law", Applied Economics Letters, 15 September 2003, vol. 10, iss. 11, pp. 717-720.

External links
* Comprehensive bibliography of Zipf's law
* Zipf, Power-laws, and Pareto - a ranking tutorial
* Seeing Around Corners (Artificial societies turn up Zipf's law)
* PlanetMath article on Zipf's law
* Benford's Law and Zipf's Law, An Introduction
