A Data Science Central Community

Take any positive integer *n*. If *n* is even, divide it by 2 to get *n* / 2. If *n* is odd, multiply it by 3 and add 1 to obtain 3*n* + 1. Repeat the process indefinitely. Does the sequence eventually reach 1, regardless of the initial value? For instance, if you start with the number 75,128,138,247, you eventually reach 1 after 1,228 steps. If you start with the number 27, you climb as high as 9,232, but eventually reach 1 after 111 steps.
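These trajectories are easy to reproduce in a few lines of Python (a minimal sketch; `collatz_steps` is a helper name introduced here):

```python
def collatz_steps(n):
    """Return (number of steps to reach 1, peak value along the way)."""
    steps, peak = 0, n
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
        peak = max(peak, n)
    return steps, peak

print(collatz_steps(27))            # the sequence for 27 peaks at 9,232
print(collatz_steps(75128138247))
```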

This is a famously difficult problem. Note that if a sequence reaches any power of 2 (say, 64), or any intermediate number found in the trillions of trillions of sequences already known to reach 1, then the sequence in question obviously reaches 1 too. For a sequence not to reach 1, its first element (as well as every subsequent element) would have to differ from every initial or intermediate number found in any sequence identified as reaching 1 so far. This makes a counterexample seem highly unlikely, yet the conjecture remains unproved.


It is interesting to note that if you replace the deterministic algorithm by a probabilistic one (for instance, *n* becomes *n* / 2 with probability 0.5 and 3*n* + 1 with probability 0.5), then instead of reaching 1, the sequence grows to infinity. Also, with the deterministic algorithm, if you replace 3*n* + 1 by 2*n* + 1, you might expect to reach 1 even faster, but this is not the case: you reach 1 only if the initial value is a power of 2, and in all cases the sequence eventually grows to infinity (once a value is odd, 2*n* + 1 is odd again, so the values keep doubling forever).
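The 2*n* + 1 variant is easy to check numerically. The sketch below (helper names are mine) stops as soon as 1 is reached:

```python
def variant_step(x):
    # The modified rule: halve x if even, otherwise 2x + 1 (instead of 3x + 1).
    return x // 2 if x % 2 == 0 else 2 * x + 1

x = 64                      # a power of 2 collapses straight down to 1
while x > 1:
    x = variant_step(x)
print(x)  # 1

x = 7                       # an odd start: 2x + 1 is odd again, so the
for _ in range(50):         # sequence grows forever once it turns odd
    x = variant_step(x)
print(x)
```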

**Possible proof**

If you want to prove (or disprove) this conjecture, a possible methodology is as follows. Let T(*x*) = *x* / 2 if *x* is even, and T(*x*) = 3*x* + 1 otherwise. Recursively define f(*k* + 1, *n*) = T(f(*k*, *n*)) for *k* = 0, 1, 2 and so on, with f(0, *n*) = *n*. The conjecture states that no matter the initial value *n*, there is always a number *k* (a function of *n*) such that f(*k*, *n*) = 1: in short, you reach 1 after *k* steps.
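This recursion translates directly into code (a minimal sketch mirroring the notation above; `hitting_time` is a name introduced here for the smallest such *k*):

```python
def T(x):
    """One step: x / 2 if x is even, 3x + 1 otherwise."""
    return x // 2 if x % 2 == 0 else 3 * x + 1

def f(k, n):
    """The k-fold iterate: f(0, n) = n, f(k + 1, n) = T(f(k, n))."""
    for _ in range(k):
        n = T(n)
    return n

def hitting_time(n):
    """Smallest k with f(k, n) = 1; the conjecture says it always exists."""
    k = 0
    while n != 1:
        n, k = T(n), k + 1
    return k
```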

Consider the following four cases, each occurring with probability 0.25 (Mod stands for the modulo operator):

- Mod(*n*, 4) = 0. Then f(2, *n*) = *n* / 4.
- Mod(*n*, 4) = 1. Then f(3, *n*) = (3*n* + 1) / 4.
- Mod(*n*, 4) = 2. Then f(1, *n*) = *n* / 2.
- Mod(*n*, 4) = 3. This case is broken down into two sub-cases, see below.

The case Mod(*n*, 4) = 3 is broken down into the following two sub-cases, each occurring with probability 0.125:

- If Mod(*n*, 8) = 3, then f(2, *n*) = (3*n* + 1) / 2, and in this case we are back to case #2 above after 2 steps.
- If Mod(*n*, 8) = 7, then f(2, *n*) = (3*n* + 1) / 2, and in this case we are back to case #4 above after 2 steps.

In both sub-cases, the sequence has been increasing, though we know that if Mod(*n*, 8) = 3, it will come back down after 3 additional steps, to a value still a little above *n* (more specifically, around 9*n* / 8).
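The whole case analysis can be verified by brute force over a large range of *n* (a sketch reusing the iterate f(*k*, *n*) defined earlier):

```python
def T(x):
    return x // 2 if x % 2 == 0 else 3 * x + 1

def f(k, n):
    for _ in range(k):
        n = T(n)
    return n

for n in range(3, 100001):
    if n % 4 == 0:
        assert f(2, n) == n // 4
    elif n % 4 == 1:
        assert f(3, n) == (3 * n + 1) // 4
    elif n % 4 == 2:
        assert f(1, n) == n // 2
    elif n % 8 == 3:
        assert f(2, n) == (3 * n + 1) // 2 and f(2, n) % 4 == 1  # back to case #2
        assert f(5, n) == (9 * n + 5) // 8                       # the dip near 9n/8
    else:  # n % 8 == 7
        assert f(2, n) == (3 * n + 1) // 2 and f(2, n) % 4 == 3  # back to case #4
print("all residue cases check out")
```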

So it looks like, on average, the sequence decreases over time (thus it seems likely that we eventually reach 1), but the challenging case is when Mod(*n*, 4) = 3, and even more so when Mod(*n*, 8) = 7. Can we get stuck in a sequence where, every two steps, the residue modulo 8 is equal to 7 (the worst case, which makes the sequence grow at its fastest pace)? And for how many cycles can we get stuck in such a configuration? These are difficult issues to address if you want to prove this conjecture.
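The second question can be probed numerically. If *n* ≡ −1 mod 2^*j*, then (3*n* + 1) / 2 ≡ −1 mod 2^(*j* − 1), so a starting value of the form 2^*k* − 1 stays in the worst case for *k* − 2 consecutive double-steps: the growth phase can last arbitrarily long, though never forever from a single start (a sketch; `worst_case_run` is a name introduced here):

```python
def worst_case_run(n):
    """Count consecutive double-steps n -> (3n + 1) / 2 taken while n = 7 mod 8."""
    run = 0
    while n % 8 == 7:
        n = (3 * n + 1) // 2
        run += 1
    return run

# 2**k - 1 is -1 mod 2**k, and each double-step strips only one factor of 2,
# so the sequence stays in the worst case for k - 2 consecutive double-steps.
for k in range(3, 12):
    print(2**k - 1, worst_case_run(2**k - 1))
```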

The problem might also be approximately modeled as some kind of Markov chain, with 5 states corresponding to the first 3 cases and the 2 sub-cases discussed earlier. One iteration of the Markov chain corresponds respectively to 2, 3, 1, 5, and 2 steps of the above algorithm, reaching the next local dip in value (if any). For *n* large enough, one iteration of the Markov chain thus behaves approximately as follows:

- we reduce *n* by 75% with probability 0.25
- we reduce *n* by 25% with probability 0.25
- we reduce *n* by 50% with probability 0.25
- we increase *n* by 12.5% with probability 0.125
- we increase *n* by 50% with probability 0.125
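On a log scale, these five moves have a strictly negative expected drift, which quantifies the "decreasing on average" intuition (a quick computation under the five-move model above):

```python
import math

# (multiplicative factor per Markov-chain iteration, probability)
moves = [(1/4, 0.25), (3/4, 0.25), (1/2, 0.25), (9/8, 0.125), (3/2, 0.125)]

drift = sum(p * math.log2(factor) for factor, p in moves)
print(drift)  # about -0.76: n loses roughly 0.76 bits per iteration on average
```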

It is easy to compute the probability p(*N*) that the initial value *n* has not been reduced after *N* iterations of the Markov chain, for any positive *N*. However, even for very large *N*, this probability remains strictly positive, albeit very close to zero. Also, it is not clear whether the memory-less property of Markov chains is violated here, which would either invalidate this approach, or make the problem more difficult to handle. Most likely, if this approach results in a proof, it would be a heuristic one.
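p(*N*) can also be estimated by simulating the chain directly (a Monte Carlo sketch under the same five-move model; `p_not_reduced` is a name introduced here):

```python
import math
import random

moves = [(1/4, 0.25), (3/4, 0.25), (1/2, 0.25), (9/8, 0.125), (3/2, 0.125)]
factors = [factor for factor, _ in moves]
weights = [p for _, p in moves]

def p_not_reduced(N, trials=20_000, seed=42):
    """Estimate p(N): the chance the running product of factors is
    still >= 1 (i.e., n has not been reduced) after N iterations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        log_value = sum(math.log2(f) for f in rng.choices(factors, weights, k=N))
        if log_value >= 0:
            hits += 1
    return hits / trials

for N in (1, 5, 10, 20):
    print(N, p_not_reduced(N))   # positive for every N, but shrinking fast
```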

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
