A Data Science Central Community
This challenge of the week asks you to run simulations that replicate the special clustering process described in our Zipf's law article, a process that results in the Zipf distributions found in many natural and economic phenomena. More specifically, we ask you to perform the following Monte Carlo simulations (a special clustering algorithm) to assess whether the assumptions made in that article are correct. Alternatively, a mathematical proof is OK.
Figure 1: Zipf’s law and the distribution of patents among applicants
Let's assume that we have k = 1,000,000 atoms (for instance, space dust particles). Test the following algorithm:
Algorithm (write it in Perl, R, Python, C or Java)
Each particle is assigned a unique bin ID between 1 and 1,000,000. Each particle represents a cluster with one element (the particle in question), and the bin ID is its cluster ID.
Iteration: repeat 200,000,000 times:
Once the algorithm stops, the final cluster configuration represents the current solar system, or the companies in the US, as described in our original article. Does it really satisfy a Zipf distribution?
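To answer that last question, one rough diagnostic is to rank the final clusters by size and check whether log(size) falls roughly on a straight line against log(rank), with a slope near -1 for a classic Zipf distribution. Below is a minimal Python sketch of that check. The function names `cluster_sizes` and `zipf_slope` are made up for this illustration, and the input is assumed to be a list holding the final cluster ID of each particle. A least-squares fit on a log-log scale is only an informal indicator, not a formal statistical test.

```python
import math
from collections import Counter


def cluster_sizes(cluster_ids):
    """Return the cluster sizes, sorted from largest to smallest.

    cluster_ids: one entry per particle, giving its final cluster ID
    (an assumed input format for this sketch).
    """
    return sorted(Counter(cluster_ids).values(), reverse=True)


def zipf_slope(sizes):
    """Least-squares slope of log(size) against log(rank).

    sizes must be sorted in decreasing order; rank 1 is the largest cluster.
    A slope close to -1 is consistent with a classic Zipf distribution.
    """
    if len(sizes) < 2:
        raise ValueError("need at least two clusters to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(sizes) + 1)]
    ys = [math.log(size) for size in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy check on hand-made data (not simulation output):
# the cluster of rank r has exactly 1200 / r particles.
toy_ids = []
for rank, size in enumerate([1200, 600, 400, 300, 240, 200], start=1):
    toy_ids.extend([rank] * size)

toy_sizes = cluster_sizes(toy_ids)
toy_slope = zipf_slope(toy_sizes)
print(toy_sizes, round(toy_slope, 6))
```

On real simulation output, it is worth plotting log(size) against log(rank) as well, since a single fitted slope can hide curvature in the tail.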