A Data Science Central Community

Mathematical Olympiads are popular among high school students. However, there is nothing similar for college students, except maybe IMC. Even IMC is not popular. It focuses mostly on the same kind of problems as high school Olympiads, and you can not participate if you are over 23 years old. In addition, it is organized by country, as opposed to globally, thus favoring countries with a large population. Topics such as probability are never considered.

This is an opportunity to create Mathematical Olympiads for college students, with no age or country restrictions. It could be organized online, offering interesting, varied, and challenging problems, allowing participants to read literature about the problems, and have a few weeks to submit a solution. In short, something like Kaggle competitions, except that Kaggle focuses exclusively on machine learning, coding, and data processing. Not sure where the funding could come from, but if I decided to organize this kind of competition, I would be able to fund it myself.

Below are examples of problems that I would propose. They do not require knowledge beyond advanced undergrad level in math, statistics, or probabilities. They are are more difficult, and more original, than typical exam questions. Participants are encouraged to use tools such as WolframAlpha to automatically compute integrals or solve systems of equations involved in these problems.

Is anyone interested in this new initiative? I could see this helping students not enrolled in a top university, though the majority of winners would probably come from a top school.

**Problem 1**

Using complex analysis or other means, prove that

See solution here.

**Problem 2**

Points are randomly distributed on the plane, with an average of *m* points per unit area. A circle of radius *R* is drawn around each point. What is the proportion of the plane covered by these (possibly overlapping) circles? How can you use this problem to compute an approximation of exp(-Pi)?

See solution here.

**Problem 3**

What is the minimal correlation between two random variables if the marginal distributions are exponential? Prove that it can not be lower than 1 - (Pi^2 / 6) = -0.644. Provide an example where the lower bound is attained.

See solution here (in section 9.)

**Problem 4**

A special version of the logistic map is produced by an iterative algorithm, as follows: the seed *x* = *x*(1) is anywhere in [0, 1] and *x*(*n*+1) = *g*(*x*(*n*)) with *g*(*y*) = SQRT(4**y**(1-*y*)). The equilibrium distribution satisfies the stochastic integral equation P(*X* < *y*) = P(*g*(*X*) < *y*). You can look at this as if *X* is a random variable with the observed values being the successive values of *x*(*n*). The equilibrium distribution does not depend on the seed *x* except for very rare, bad seeds, that do not behave well. Solve the stochastic integral equation to derive the equilibrium distribution. Also, what is the theoretical correlation between *x*(*n*) and *x*(*n*+1), at equilibrium? (answer: -1/2.)

See solution here (in last section.)

**Problem 5**

Prove the following result, which has been used to provide one of the most elementary proofs of the prime number theorem. This is a classic algebraic result that applies to many sequences of slowly increasing positive integers, not just to prime numbers. If *Q* is an infinite set of positive integers, with *Q*(*n*) being the subset of all integers in *Q* that are less than or equal to *n*, then under rather general conditions (identify these conditions), we have

where *d*(*p*) is the difference between *p* and the largest element of *Q* that is smaller than *p*.

See solution here.

**Problem 6**

Prove the following result:

You can find the solution in this paper (PDF document.)

**Problem 7**

Prove the following result:

See solution here.

**Problem 8**

Consider the following iterative algorithm:

*p*(0) = 0, *p*(1)= 1, *e*(1) = 2

**If** 4*p*(*n*) + 1 < 2*e*(*n*) **Then**

*p*(*n*+1) = 2*p*(*n*) + 1*e*(*n*+1) = 4*e*(*n*) - 8*p*(*n*) - 2*d*(*n*+1) = 1

**Else**

*p*(*n*+1) = 2*p*(*n*)*e*(*n*+1) = 4*e*(*n*)*d*(*n*+1) = 0

Note that *d*(*n*+1) = *p*(*n*+1) - 2*p*(*n*).

Prove that *d*(*n*) is the *n*-th decimal of SQRT(2)/2 in base 2.

See solution here.

**Problem 9 **

The recursive relation *g*(*n*) = *n* − *g*(*g*(*n* − 1)), *g*(0) = 0, appears in the context of Fibonacci numbers, as you can see in Hofstadter [“Bach, Esher, Godel,” pp. 151–154, Intereditions, 1985]. Prove that

See solution here, published in *Journal of Number Theory*.

**Problem 10**

On the digits of Pi/4 in base *b*. Prove that if *b* is an even integer and *n* > 3, then the *n*-th digit of *x* = Pi/4 is equal to INT(*b* * *x*(*n*)), with

The first digit (*n* = 1) starts after the decimal point. Show that this formula is not valid in general, for other values of x, or if *b* is an odd integer. You can find more about this in the following article.

**DSC Resources**

- Subscribe to our Newsletter
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- Hire a Data Scientist | Search DSC | Classifieds | Find a Job
- Post a Blog | Forum Questions

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Statistics -- New Foundations, Toolbox, and Machine Learning Recipes
- Book: Classification and Regression In a Weekend - With Python
- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions

## You need to be a member of AnalyticBridge to add comments!

Join AnalyticBridge