Collatz conjecture

From Wikipedia, the free encyclopedia

List of unsolved problems in mathematics
Does the Collatz sequence from initial value n eventually reach 1, for all n > 0?

The Collatz conjecture is a conjecture in mathematics named after Lothar Collatz, who first proposed it in 1937. The conjecture is also known as the 3n + 1 conjecture, the Ulam conjecture (after Stanislaw Ulam), Kakutani's problem (after Shizuo Kakutani), the Thwaites conjecture (after Sir Bryan Thwaites), Hasse's algorithm (after Helmut Hasse), or the Syracuse problem;[1][2] the sequence of numbers involved is referred to as the hailstone sequence or hailstone numbers (because the values are usually subject to multiple descents and ascents like hailstones in a cloud),[3][4] or as wondrous numbers.[5]

Take any natural number n. If n is even, divide it by 2 to get n / 2. If n is odd, multiply it by 3 and add 1 to obtain 3n + 1. Repeat the process (which has been called "Half Or Triple Plus One", or HOTPO[6]) indefinitely. The conjecture is that no matter what number you start with, you shall always eventually reach 1. The property has also been called oneness.[7]

Paul Erdős said, allegedly, about the Collatz conjecture: "Mathematics is not yet ripe for such problems." He also offered $500 for its solution.[8]

In 1972, J.H. Conway proved that a natural generalization of the Collatz problem is algorithmically undecidable.[9]

Statement of the problem

Statistic for the numbers 1 to 100 million, showing the total stopping time on the x-axis against the number of occurrences on the y-axis.
Numbers from 1 to 9999 and their corresponding total stopping time.

Consider the following operation on an arbitrary positive integer: if the number is even, divide it by two; if the number is odd, triple it and add one.

In modular arithmetic notation, define the function f as follows:

 f(n) = \begin{cases} n/2 &\text{if } n \equiv 0 \pmod{2}\\ 3n+1 & \text{if } n\equiv 1 \pmod{2} \end{cases}

Now, form a sequence by performing this operation repeatedly, beginning with any positive integer, and taking the result at each step as the input at the next.

In notation:

 a_i = \begin{cases}n & \text{for } i = 0 \\ f(a_{i-1}) & \text{for } i > 0. \end{cases}

(that is: a_i is the value of f applied to n recursively i times; a_i = f^i(n)).

The Collatz conjecture is: This process will eventually reach the number 1, regardless of which positive integer is chosen initially.

The smallest i such that a_i = 1 is called the total stopping time of n.[10] The conjecture asserts that every n has a well-defined total stopping time. If, for some n, no such i exists, we say that n has infinite total stopping time, and the conjecture is false.
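The definitions of f and of the total stopping time translate directly into code. The following is a minimal Python sketch; the function names and the max_steps safety cap are illustrative assumptions, not part of the definition.

```python
def collatz_step(n: int) -> int:
    """One application of f: halve an even n, map an odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def total_stopping_time(n: int, max_steps: int = 10_000) -> int:
    """Return the smallest i with a_i = 1, where a_0 = n and a_i = f(a_{i-1}).

    max_steps is an arbitrary cap (an assumption of this sketch) so that the
    loop terminates even if a starting value never reached 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            raise RuntimeError("1 not reached within max_steps")
        n = collatz_step(n)
        steps += 1
    return steps
```

With these definitions, total_stopping_time(1) is 0, since a_0 is already 1.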

If the conjecture is false, it can only be because there is some starting number which gives rise to a sequence which does not contain 1. Such a sequence might enter a repeating cycle that excludes 1, or increase without bound. No such sequence has been found.

Examples

For instance, starting with n = 6, one gets the sequence 6, 3, 10, 5, 16, 8, 4, 2, 1.

n = 11, for example, takes longer to reach 1: 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1.

The sequence for n = 27, listed and graphed below, takes 111 steps, climbing to 9232 before descending to 1.

{ 27, 82, 41, 124, 62, 31, 94, 47, 142, 71, 214, 107, 322, 161, 484, 242, 121, 364, 182, 91, 274, 137, 412, 206, 103, 310, 155, 466, 233, 700, 350, 175, 526, 263, 790, 395, 1186, 593, 1780, 890, 445, 1336, 668, 334, 167, 502, 251, 754, 377, 1132, 566, 283, 850, 425, 1276, 638, 319, 958, 479, 1438, 719, 2158, 1079, 3238, 1619, 4858, 2429, 7288, 3644, 1822, 911, 2734, 1367, 4102, 2051, 6154, 3077, 9232, 4616, 2308, 1154, 577, 1732, 866, 433, 1300, 650, 325, 976, 488, 244, 122, 61, 184, 92, 46, 23, 70, 35, 106, 53, 160, 80, 40, 20, 10, 5, 16, 8, 4, 2, 1 }

Numbers with a total stopping time longer than any smaller starting value form a sequence beginning with:

1, 2, 3, 6, 7, 9, 18, 25, 27, 54, 73, 97, … (sequence A006877 in OEIS).
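The record-setting starting values can be reproduced by brute force. The sketch below is one straightforward way to do it, not an efficient one; the helper names are hypothetical.

```python
def total_stopping_time(n: int) -> int:
    """Number of steps of the Collatz map needed to reach 1 from n."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def stopping_time_records(limit: int) -> list[int]:
    """Starting values up to limit whose total stopping time exceeds
    that of every smaller starting value."""
    records = []
    best = -1  # below the stopping time of 1, which is 0
    for n in range(1, limit + 1):
        t = total_stopping_time(n)
        if t > best:
            best = t
            records.append(n)
    return records
```

Calling stopping_time_records(100) returns the twelve terms listed above.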

Among starting numbers less than 100 million, the one with the longest total stopping time is 63,728,127, with 949 steps. For starting numbers less than 1 billion it is 670,617,279, with 986 steps, and for numbers less than 10 billion it is 9,780,657,630, with 1132 steps.[11][12]

The powers of two converge to one quickly because 2^n is halved n times to reach one, and is never increased.

Program to calculate Hailstone sequences

A specific Hailstone sequence can be easily computed, as is shown by this pseudocode example:

    function hailstone(n)
        while n > 1
            show n
            if n is odd then
                set n = 3n + 1
            else
                set n = n / 2
            endif
        endwhile
        show n

This program halts when the sequence reaches 1, in order to avoid printing an endless cycle of 4, 2, 1. If the Collatz conjecture is true, the program will always halt (stop) no matter what positive starting integer is given to it.
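A runnable counterpart of the pseudocode might look as follows in Python. This is a sketch that yields the values instead of printing them; the generator form is a design choice made here, not something the pseudocode prescribes.

```python
from typing import Iterator


def hailstone(n: int) -> Iterator[int]:
    """Yield the Hailstone sequence starting at n and stopping at 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    while n > 1:
        yield n
        if n % 2 == 1:
            n = 3 * n + 1   # odd: triple and add one
        else:
            n = n // 2      # even: halve
    yield n                 # the final 1
```

For example, list(hailstone(6)) gives [6, 3, 10, 5, 16, 8, 4, 2, 1], and max(hailstone(27)) gives the peak value 9232 mentioned in the examples.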

Cycles

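The conjecture could be proven, indirectly, as a consequence of the following two statements:

  1. No trajectory increases without bound.
  2. No cycle exists other than the trivial cycle through 1.

These being true, all natural numbers would have a trajectory down to one.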

In 1977, R. Steiner, and in 2000 and 2002, J. Simons and B. de Weger (based on Steiner's work), proved the nonexistence of certain types of cycles.

Notation

To explain this we refer to the "shortcut" definition of the Collatz map, f(n)=(3n+1)/2 for odd n and f(n)=n/2 for even n.

A cycle is a sequence (a_0,a_1,\ldots,a_q) where f(a_0)=a_1, f(a_1)=a_2, and so on, up to f(a_q)=a_0 in a closed loop. The only known cycle is (1,2).

If the cycle consists of an increasing sequence of odd numbers followed by a decreasing sequence of even numbers, it is called a 1-cycle. If it consists of an increasing sequence of odd numbers, then a decreasing sequence of even numbers, then another increasing sequence of odd numbers, then another decreasing sequence of even numbers, it is called a 2-cycle. In general, an m-cycle has m subsequences of one or more odd numbers, interlaced with m subsequences of one or more even numbers.[13]

Supporting arguments

Although the conjecture has not been proven, most mathematicians who have looked into the problem think the conjecture is true because experimental evidence and heuristic arguments support it.

The power of three argument

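Each positive integer s can be written as

   s = 2 ^ {n} \left( 2^{m} (2k-1) - 1 \right)

with integers n ≥ 0 and m, k > 0.

The first n steps halve s down to the odd number 2^m (2k-1) - 1. The next 2m steps (m times "3x + 1" followed by a halving) produce 3^m (2k-1) - 1, which is even, and one further halving transforms the number to

   s' = \frac{ 3^m (2k-1) - 1}{2}

after n+2m+1 steps of iteration in total.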
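The decomposition and the resulting value s' can be checked numerically. The sketch below counts every halving and every "3x + 1" as one step of the standard map; under that counting, s' appears after n + 2m + 1 steps (for s = 7: n = 0, m = 3, k = 1, and 13 is reached at step 7). The function names are illustrative.

```python
def decompose(s: int) -> tuple[int, int, int]:
    """Return (n, m, k) with s = 2**n * (2**m * (2*k - 1) - 1),
    where n >= 0 and m, k >= 1."""
    n = 0
    while s % 2 == 0:       # strip the factor 2**n
        s //= 2
        n += 1
    t = s + 1               # even, because s is now odd
    m = 0
    while t % 2 == 0:       # strip the factor 2**m
        t //= 2
        m += 1
    k = (t + 1) // 2        # t = 2k - 1 is odd
    return n, m, k


def collatz_step(x: int) -> int:
    return x // 2 if x % 2 == 0 else 3 * x + 1


def reaches_s_prime(s: int) -> bool:
    """Check that n + 2m + 1 standard steps map s to (3**m * (2k-1) - 1) / 2."""
    n, m, k = decompose(s)
    target = (3**m * (2 * k - 1) - 1) // 2
    x = s
    for _ in range(n + 2 * m + 1):
        x = collatz_step(x)
    return x == target
```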


Experimental evidence

The conjecture has been checked by computer for all starting values up to 5 × 2^60 ≈ 5.764 × 10^18.[15] All initial values tested so far eventually end in the repeating cycle {4,2,1}, which has only three terms. From this lower bound on the starting value, a lower bound can also be obtained for the number of terms a repeating cycle other than {4,2,1} must have.[16] When this relationship was established in 1981, the formula gave a lower bound of 35,400 terms.[16]

Such computer evidence is not a proof that the conjecture is true. As shown in the cases of the Pólya conjecture, the Mertens conjecture and Skewes' number, sometimes a conjecture's only counterexamples are found among very large numbers. Since sequentially examining all natural numbers is a process which can never be completed, such an approach can never demonstrate that the conjecture is true, merely that no counterexamples have yet been discovered.

A probabilistic heuristic

If one considers only the odd numbers in the sequence generated by the Collatz process, then each odd number is on average 3/4 of the previous one.[17] (More precisely, the geometric mean of the ratios of outcomes is 3/4.) This yields a heuristic argument that every Hailstone sequence should decrease in the long run, although this is not evidence against other cycles, only against divergence. The argument is not a proof because it assumes that Hailstone sequences are assembled from uncorrelated probabilistic events. (It does rigorously establish that the 2-adic extension of the Collatz process has two division steps for every multiplication step for almost all 2-adic starting values.)
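The 3/4 figure can be observed empirically. The sketch below takes every odd n below a bound, maps it to the next odd number in its sequence, and computes the geometric mean of the ratios. The bound 2^17 used in the usage note is an arbitrary choice, and the helper names are hypothetical.

```python
import math


def next_odd(n: int) -> int:
    """Next odd number after the odd number n in its Collatz sequence."""
    n = 3 * n + 1
    while n % 2 == 0:
        n //= 2
    return n


def geometric_mean_ratio(limit: int) -> float:
    """Geometric mean of next_odd(n) / n over all odd n below limit."""
    logs = [math.log(next_odd(n) / n) for n in range(1, limit, 2)]
    return math.exp(sum(logs) / len(logs))
```

For limit = 2**17 the result is within a fraction of a percent of 0.75. This only illustrates the averaging statement; it says nothing about individual trajectories.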

Rigorous bounds

Although it is not known rigorously whether all positive numbers eventually reach one according to the Collatz iteration, it is known that many numbers do so. In particular, Krasikov and Lagarias showed that the number of integers in the interval [1,x] that eventually reach one is at least proportional to x^0.84.[18]

Other formulations of the conjecture

In reverse

The first 21 levels of the Collatz graph generated in bottom-up fashion. The graph includes all numbers with an orbit length of 21 or less.

There is another approach to prove the conjecture, which considers the bottom-up method of growing the so-called Collatz graph. The Collatz graph is a graph defined by the inverse relation

 R(n) = \begin{cases} \{2n\} & \text{if } n\equiv 0,1,2,3,5 \\ \{2n,(n-1)/3\} & \text{if } n\equiv 4 \end{cases} \pmod{6}.

So, instead of proving that all natural numbers eventually lead to 1, we can prove that 1 leads to all natural numbers. For any integer n, n ≡ 1 (mod 2) iff 3n + 1 ≡ 4 (mod 6). Equivalently, (n − 1)/3 ≡ 1 (mod 2) iff n ≡ 4 (mod 6). Conjecturally, this inverse relation forms a tree except for the 1–2–4 loop (the inverse of the 1–4–2 loop of the unaltered function f defined in the statement of the problem above). When the relation 3n + 1 of the function f is replaced by the common substitute "shortcut" relation (3n + 1)/2, the Collatz graph is defined by the inverse relation,

 R(n) = \begin{cases} \{2n\} & \text{if } n\equiv 0  \\ \{2n,(4^{m}2n-1)/3\} & \text{if } n\equiv 2 , m=0,1,2 ...  \\ \{2n,(4^{m}4n-1)/3\} & \text{if } n\equiv 1 , m=0,1,2 ... \end{cases}   \pmod{3}.

Conjecturally, this inverse relation forms a tree except for a 1–2 loop (the inverse of the 1–2 loop of the function f(n) revised as indicated above).
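The bottom-up construction for the unaltered function f can be sketched as a breadth-first search from 1 using the first inverse relation R (the one defined modulo 6). Function names are illustrative.

```python
def predecessors(n: int) -> set[int]:
    """R(n): all m with f(m) = n for the unaltered Collatz function f."""
    preds = {2 * n}
    if n % 6 == 4:                  # only then is (n - 1) / 3 an odd integer
        preds.add((n - 1) // 3)
    return preds


def collatz_graph(depth: int) -> set[int]:
    """All numbers that reach 1 in at most depth steps, grown upward from 1."""
    seen = {1}
    frontier = {1}
    for _ in range(depth):
        frontier = {p for n in frontier for p in predecessors(n)} - seen
        seen |= frontier
    return seen
```

For instance, collatz_graph(5) is {1, 2, 4, 5, 8, 16, 32}. The conjecture says that every positive integer appears in collatz_graph(depth) for some depth.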

As an abstract machine that computes in base two

Repeated applications of the Collatz function can be represented as an abstract machine that handles strings of bits. The machine will perform the following three steps on any odd number until only one "1" remains:

  1. Append 1 to the (right) end of the number in binary (giving 2n + 1);
  2. Add this to the original number by binary addition (giving 2n + 1 + n = 3n + 1);
  3. Remove all trailing "0"s (i.e. repeatedly divide by two until the result is odd).

This prescription is plainly equivalent to computing a Hailstone sequence in base two.

Example

The starting number 7 is written in base two as 111. The resulting Hailstone sequence is:

          111
         1111
        10110
       10111
      100010
     100011
     110100
    11011
   101000
  1011
 10000
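The three-step machine can be simulated on bit strings. In the sketch below the binary addition of step 2 is delegated to Python integers for brevity (an implementation shortcut, not part of the machine's description), and the trace records the strings produced by steps 1 and 2.

```python
def binary_machine_trace(n: int) -> list[str]:
    """Trace the base-two machine from an odd start n until only '1' remains."""
    if n < 1 or n % 2 == 0:
        raise ValueError("start value must be a positive odd integer")
    bits = bin(n)[2:]
    trace = [bits]
    while bits != "1":
        appended = bits + "1"                                # step 1: 2n + 1
        total = bin(int(bits, 2) + int(appended, 2))[2:]     # step 2: 3n + 1
        trace += [appended, total]
        bits = total.rstrip("0")                             # step 3: drop trailing 0s
    return trace
```

For the starting number 7, the trace reproduces the eleven strings of the example, from 111 to 10000.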

As a parity sequence

For this section, consider the Collatz function in the slightly modified form

 f(n) = \begin{cases} n/2 &\text{if } n \equiv 0 \\ (3n +1)/2 & \text{if } n \equiv 1. \end{cases} \pmod{2}

This can be done because when n is odd, 3n + 1 is always even.

If P(…) is the parity of a number, that is P(2n) = 0 and P(2n + 1) = 1, then we can define the Hailstone parity sequence for a number n as p_i = P(a_i), where a_0 = n, and a_{i+1} = f(a_i).

Using this form for f(n), it can be shown that the parity sequences for two numbers m and n will agree in the first k terms if and only if m and n are equivalent modulo 2^k. This implies that every number is uniquely identified by its parity sequence, and moreover that if there are multiple Hailstone cycles, then their corresponding parity cycles must be different.

The proof is simple: it is easy to verify by hand that applying the f function k times to the number a·2^k + b will give the result a·3^c + d, where d is the result of applying the f function k times to b, and c is how many odd numbers were encountered during that sequence. So the parity of the first k numbers is determined purely by b, and the parity of the (k + 1)th number will change if the least significant bit of a is changed.

The Collatz Conjecture can be rephrased as stating that the Hailstone parity sequence for every number eventually enters the cycle 0 → 1 → 0.
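The agreement property of parity sequences is easy to test on small numbers. The following sketch uses the modified f of this section; the names are illustrative.

```python
def shortcut(n: int) -> int:
    """The modified Collatz function: n/2 for even n, (3n + 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def parity_sequence(n: int, k: int) -> list[int]:
    """First k terms of the Hailstone parity sequence of n."""
    out = []
    for _ in range(k):
        out.append(n % 2)
        n = shortcut(n)
    return out
```

For example, 7 and 39 are congruent modulo 2^5 and share the first five parities 1, 1, 1, 0, 1, while 23 (congruent to 7 only modulo 2^4) agrees with 7 in the first four terms and differs in the fifth.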

As a tag system

For the Collatz function in the form

 f(n) = \begin{cases} n/2 &\text{if } n \equiv 0 \\ (3n +1)/2 & \text{if } n \equiv 1. \end{cases} \pmod{2}

Hailstone sequences can be computed by the extremely simple 2-tag system with production rules a → bc, b → a, c → aaa. In this system, the positive integer n is represented by a string of n a's, and iteration of the tag operation halts on any word of length less than 2. (Adapted from De Mol.)

The Collatz conjecture equivalently states that this tag system, with an arbitrary finite string of a's as the initial word, eventually halts (see Example: Computation of Collatz sequences for a worked example).
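A small simulation makes the tag system concrete. The sketch below assumes the production rules a → bc, b → a, c → aaa and the usual 2-tag step (read the first symbol, delete the first two symbols, append the production of the symbol read). It reports the length of the word each time the word consists of a's only; the function name is hypothetical.

```python
RULES = {"a": "bc", "b": "a", "c": "aaa"}


def tag_system_collatz(n: int) -> list[int]:
    """Run the 2-tag system on n copies of 'a' and return the lengths of
    all intermediate words that consist solely of a's."""
    word = "a" * n
    values = [n]
    while len(word) >= 2:                       # halt on words shorter than 2
        word = word[2:] + RULES[word[0]]        # one tag operation
        if set(word) == {"a"}:
            values.append(len(word))
    return values
```

Starting from aaa, the words made only of a's have lengths 3, 5, 8, 4, 2, 1, which is the sequence generated from 3 by the modified function f above.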

Extensions to larger domains

Iterating on all integers

An obvious extension is to include all integers, not just positive integers. In this case there are a total of 5 known cycles, which all integers seem to eventually fall into under iteration of f. These cycles are listed here, starting with the well-known cycle for positive n.

Each cycle is listed with its member of least absolute value (which is always odd or zero) first.

Cycle | Odd-value cycle length | Full cycle length
1 → 4 → 2 → 1 | 1 | 3
0 → 0 | 0 | 1
−1 → −2 → −1 | 1 | 2
−5 → −14 → −7 → −20 → −10 → −5 | 2 | 5
−17 → −50 → −25 → −74 → −37 → −110 → −55 → −164 → −82 → −41 → −122 → −61 → −182 → −91 → −272 → −136 → −68 → −34 → −17 | 7 | 18

The Generalized Collatz Conjecture is the assertion that every integer, under iteration by f, eventually falls into one of these five cycles.
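The five cycles can be found experimentally by iterating f from a starting integer until a value repeats. The sketch below relies on Python's floor-based % and // operators, which behave as needed for negative integers (an odd negative n has n % 2 == 1, and halving an even negative n is exact). The function names are illustrative.

```python
def f(n: int) -> int:
    return n // 2 if n % 2 == 0 else 3 * n + 1


def eventual_cycle(n: int) -> list[int]:
    """Iterate f from n until a value repeats; return the cycle entered,
    rotated so that its member of least absolute value comes first."""
    index = {}
    seq = []
    while n not in index:
        index[n] = len(seq)
        seq.append(n)
        n = f(n)
    cycle = seq[index[n]:]
    start = min(range(len(cycle)), key=lambda i: abs(cycle[i]))
    return cycle[start:] + cycle[:start]
```

For example, eventual_cycle(-7) returns [-5, -14, -7, -20, -10]. A scan of a small range such as −200 to 200 finds only the five cycles in the table, which is consistent with the Generalized Collatz Conjecture but of course does not prove it.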

Iterating with odd denominators or 2-adic integers

The standard Collatz map can be extended to (positive or negative) rational numbers which have odd denominators when written in lowest terms. The number is taken to be odd or even according to whether its numerator is odd or even. A closely related fact is that the Collatz map extends to the ring of 2-adic integers, which contains the ring of rationals with odd denominators as a subring.

The parity sequences as defined above are no longer unique for fractions. However, it can be shown that any possible parity cycle is the parity sequence for exactly one fraction: if a cycle has length n and includes odd numbers exactly m times at indices k_0, …, k_{m−1}, then the unique fraction which generates that parity cycle is

\frac{3^{m-1} 2^{k_0} + \cdots + 3^0 2^{k_{m-1}}}{2^n - 3^m}.

For example, the parity cycle (1 0 1 1 0 0 1) has length 7 and has 4 odd numbers at indices 0, 2, 3, and 6. The unique fraction which generates that parity cycle is

\frac{3^3 2^0 + 3^2 2^2 + 3^1 2^3 + 3^0 2^6}{2^7 - 3^4} = \frac{151}{47},

the complete cycle being: 151/47 → 250/47 → 125/47 → 211/47 → 340/47 → 170/47 → 85/47 → 151/47

Although the cyclic permutations of the original parity sequence are unique fractions, the cycle is not unique, each permutation's fraction being the next number in the loop cycle:

(0 1 1 0 0 1 1) → \frac{3^3 2^1 + 3^2 2^2 + 3^1 2^5 + 3^0 2^6}{2^7 - 3^4} = \frac{250}{47}
(1 1 0 0 1 1 0) → \frac{3^3 2^0 + 3^2 2^1 + 3^1 2^4 + 3^0 2^5}{2^7 - 3^4} = \frac{125}{47}
(1 0 0 1 1 0 1) → \frac{3^3 2^0 + 3^2 2^3 + 3^1 2^4 + 3^0 2^6}{2^7 - 3^4} = \frac{211}{47}
(0 0 1 1 0 1 1) → \frac{3^3 2^2 + 3^2 2^3 + 3^1 2^5 + 3^0 2^6}{2^7 - 3^4} = \frac{340}{47}
(0 1 1 0 1 1 0) → \frac{3^3 2^1 + 3^2 2^2 + 3^1 2^4 + 3^0 2^5}{2^7 - 3^4} = \frac{170}{47}
(1 1 0 1 1 0 0) → \frac{3^3 2^0 + 3^2 2^1 + 3^1 2^3 + 3^0 2^4}{2^7 - 3^4} = \frac{85}{47}

Also, for uniqueness, the parity sequence should be "prime", i.e., not partitionable into identical sub-sequences. For example, parity sequence (1 1 0 0 1 1 0 0) can be partitioned into two identical sub-sequences (1 1 0 0)(1 1 0 0). Calculating the 8-element sequence fraction gives

(1 1 0 0 1 1 0 0) → \frac{3^3 2^0 + 3^2 2^1 + 3^1 2^4 + 3^0 2^5}{2^8 - 3^4} = \frac{125}{175}

But when reduced to lowest terms {5/7}, it is the same as that of the 4-element sub-sequence

(1 1 0 0) → \frac{3^1 2^0 + 3^0 2^1}{2^4 - 3^2} = \frac{5}{7}.

And this is because the 8-element parity sequence actually represents two circuits of the loop cycle defined by the 4-element parity sequence.

In this context, the Collatz conjecture is equivalent to saying that (0 1) is the only cycle which is generated by positive whole numbers (i.e. 1 and 2).
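The formula for the fraction generating a given parity cycle can be evaluated with exact rational arithmetic. The sketch below uses Python's fractions module; the function names are illustrative.

```python
from fractions import Fraction


def parity_cycle_fraction(parity: list[int]) -> Fraction:
    """Fraction generating the given parity cycle (1 = odd, 0 = even)."""
    n = len(parity)
    odd_indices = [i for i, p in enumerate(parity) if p == 1]
    m = len(odd_indices)
    numerator = sum(3 ** (m - 1 - j) * 2 ** k for j, k in enumerate(odd_indices))
    return Fraction(numerator, 2 ** n - 3 ** m)


def shortcut(x: Fraction) -> Fraction:
    """Modified Collatz map on fractions with odd denominator:
    parity is decided by the numerator in lowest terms."""
    return x / 2 if x.numerator % 2 == 0 else (3 * x + 1) / 2
```

parity_cycle_fraction([1, 0, 1, 1, 0, 0, 1]) gives 151/47, and seven applications of shortcut return to 151/47. The non-prime sequence (1 1 0 0 1 1 0 0) yields the same value, 5/7, as (1 1 0 0), because Fraction reduces 125/175 to lowest terms.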

Iterating on real or complex numbers

Cobweb plot of the orbit 10-5-8-4-2-1-2-1-2-1-etc. in the real extension of the Collatz map (optimized by replacing "3n + 1" with "(3n + 1)/2" )

The Collatz map can be viewed as the restriction to the integers of the smooth real and complex map

f(z)=\frac 1 2 z \cos^2\left(\frac \pi 2 z\right)+(3z+1)\sin^2\left(\frac \pi 2 z\right),

which simplifies to \frac{1}{4}(2 + 7z - (2 + 5z)\cos(\pi z)).

If the standard Collatz map defined above is optimized by replacing the relation 3n + 1 with the common substitute "shortcut" relation (3n + 1)/2, it can be viewed as the restriction to the integers of the smooth real and complex map

f(z)=\frac 1 2 z \cos^2\left(\frac \pi 2 z\right)+\frac 1 2 (3z+1)\sin^2\left(\frac \pi 2 z\right),

which simplifies to \frac{1}{4}(1 + 4z - (1 + 2z)\cos(\pi z)).
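Both simplified closed forms can be checked against the integer maps numerically. The sketch below evaluates them in floating point, so agreement on integers holds only up to rounding error; the function names are illustrative.

```python
import math


def smooth_collatz(z: float) -> float:
    """Smooth extension of the standard map: (2 + 7z - (2 + 5z) cos(pi z)) / 4."""
    return (2 + 7 * z - (2 + 5 * z) * math.cos(math.pi * z)) / 4


def smooth_shortcut(z: float) -> float:
    """Smooth extension of the shortcut map: (1 + 4z - (1 + 2z) cos(pi z)) / 4."""
    return (1 + 4 * z - (1 + 2 * z) * math.cos(math.pi * z)) / 4
```

At z = 7, for instance, cos(7π) = −1, so the first expression gives (2 + 49 + 37)/4 = 22 = 3·7 + 1 and the second gives (1 + 28 + 15)/4 = 11 = (3·7 + 1)/2.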

Collatz fractal

Iterating the above optimized map in the complex plane produces the Collatz fractal.

The point of view of iteration on the real line was investigated by Chamberland (1996), and on the complex plane by Letherman, Schleicher, and Wood (1999).

Collatz map fractal in a neighbourhood of the real line

Optimizations

The section "As a parity sequence" above gives a way to speed up simulation of the sequence. To jump ahead k steps on each iteration (using the f function from that section), break up the current number into two parts, b (the k least significant bits, interpreted as an integer), and a (the rest of the bits as an integer). The result of jumping ahead k steps can be found as:

f^k(a·2^k + b) = a·3^{c(b)} + d(b).

The c and d arrays are precalculated for all possible k-bit numbers b, where d(b) is the result of applying the f function k times to b, and c(b) is the number of odd numbers encountered on the way. For example, if k=5, you can jump ahead 5 steps on each iteration by separating out the 5 least significant bits of a number and using:

c(0..31) = {0,3,2,2,2,2,2,4,1,4,1,3,2,2,3,4,1,2,3,3,1,1,3,3,2,3,2,4,3,3,4,5}
d(0..31) = {0,2,1,1,2,2,2,20,1,26,1,10,4,4,13,40,2,5,17,17,2,2,20,20,8,22,8,71,26,26,80,242}.

This requires 2^k precomputation and storage to speed up the resulting calculation by a factor of k, a space-time tradeoff.
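The tables and the k-step jump can be sketched in a few lines of Python. This is a plain illustration of the idea with hypothetical function names, not an optimized implementation.

```python
def shortcut(n: int) -> int:
    """The f of the parity-sequence section: n/2 or (3n + 1)/2."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def build_tables(k: int) -> tuple[list[int], list[int]]:
    """c[b] = number of odd values met in k steps from b, d[b] = f^k(b),
    for every k-bit number b."""
    c, d = [], []
    for b in range(2 ** k):
        odd, x = 0, b
        for _ in range(k):
            odd += x % 2
            x = shortcut(x)
        c.append(odd)
        d.append(x)
    return c, d


def jump(n: int, k: int, c: list[int], d: list[int]) -> int:
    """Apply f k times at once: split n = a * 2**k + b, return a * 3**c[b] + d[b]."""
    a, b = n >> k, n & ((1 << k) - 1)
    return a * 3 ** c[b] + d[b]
```

With k = 5, build_tables reproduces the c and d arrays listed above, and jump(39, 5, c, d) returns 101, the same value as five applications of f to 39.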

For the special purpose of searching for a counterexample to the Collatz conjecture, this precomputation leads to an even more important acceleration due to Tomás Oliveira e Silva and is used in the record confirmation of the Collatz conjecture. If, for some given b and k, the inequality

f^k(a·2^k + b) = a·3^{c(b)} + d(b) < a·2^k + b

holds for all a, then the first counterexample, if it exists, cannot be b modulo 2^k. For instance, the first counterexample must be odd because f(2n) = n; and it must be 3 mod 4 because f^2(4n + 1) = 3n + 1. For each starting value a which is not a counterexample to the Collatz conjecture, there is a k for which such an inequality holds, so checking the Collatz conjecture for one starting value is as good as checking an entire congruence class. As k increases, the search only needs to check those residues b that are not eliminated by lower values of k. On the order of 3^{k/2} residues survive.[citation needed] For example, the only surviving residues mod 32 are 7, 15, 27, and 31; only 573,162 residues survive mod 2^25 = 33,554,432.
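The sieve over residues can be sketched as follows. As a simplifying assumption, this version compares only the coefficients of a, that is, it eliminates b as soon as 3^c < 2^j after j ≤ k steps, which decides the inequality for all sufficiently large a; it is an illustration of the idea, not the program used for the record verification.

```python
def surviving_residues(k: int) -> list[int]:
    """Residues b mod 2**k not eliminated by any j <= k, where b is
    eliminated at step j if 3**c < 2**j (c = odd values met so far)."""
    survivors = []
    for b in range(2 ** k):
        x, odd, eliminated = b, 0, False
        for j in range(1, k + 1):
            odd += x % 2
            x = x // 2 if x % 2 == 0 else (3 * x + 1) // 2
            if 3 ** odd < 2 ** j:       # a * 3**c + d falls below a * 2**j + b
                eliminated = True
                break
        if not eliminated:
            survivors.append(b)
    return survivors
```

surviving_residues(5) returns [7, 15, 27, 31], matching the residues mod 32 quoted above.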

Syracuse function

If k is an odd integer, then 3k + 1 is even, so we can write 3k + 1 = 2^a k′, with k′ odd and a ≥ 1. We define a function f from the set I of odd integers into itself, called the Syracuse function, by taking f(k) = k′ (sequence A075677 in OEIS).
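In code, the Syracuse function amounts to one "3k + 1" followed by removing every factor of two. A minimal sketch (the function name is an illustrative choice):

```python
def syracuse(k: int) -> int:
    """Syracuse function: the odd part k' of 3k + 1, for odd k."""
    if k % 2 == 0:
        raise ValueError("k must be odd")
    n = 3 * k + 1
    while n % 2 == 0:       # divide out the full power of two 2**a
        n //= 2
    return n
```

For k = 1, 3, 5, 7, 9, 11, 13, 15 this gives 1, 5, 1, 11, 7, 17, 5, 23.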

Some properties of the Syracuse function are:

The Syracuse conjecture is that for all k in I, there exists an integer n ≥ 1 such that f^n(k) = 1. Equivalently, let E be the set of odd integers k for which there exists an integer n ≥ 1 such that f^n(k) = 1. The problem is to show that E = I. The following is the beginning of an attempt at a proof by induction:

1, 3, 5, 7, and 9 are known to be elements of E. Let k be an odd integer greater than 9. Suppose that the odd numbers up to and including k − 2 are in E and let us try to prove that k is in E. As k is odd, k + 1 is even, so we can write k + 1 = 2^p h for p ≥ 1, h odd, and k = 2^p h − 1. Now we have:

The problematic case is that where p ≥ 2, h is not a multiple of 3 and h ≡ (−1)^{p+1} (mod 4). Here, if we manage to show that for every odd integer k′ with 1 ≤ k′ ≤ k − 2 we have 3k′ ∈ E, we are done.

Notes

  1. Maddux, Cleborne D.; Johnson, D. Lamont (1997). Logo: A Retrospective. New York: Haworth Press. p. 160. ISBN 0-7890-0374-0. "The problem is also known by several other names, including: Ulam's conjecture, the Hailstone problem, the Syracuse problem, Kakutani's problem, Hasse's algorithm, and the Collatz problem."
  2. According to Lagarias (1985, p. 4), the name "Syracuse problem" was proposed by Hasse in the 1950s, during a visit to Syracuse University.
  3. Pickover, Clifford A. (2001). Wonders of Numbers. Oxford: Oxford University Press. pp. 116–118. ISBN 0-19-513342-0.
  4. "Hailstone Number". MathWorld. Wolfram Research, Inc.
  5. Hofstadter, Douglas R. (1979). Gödel, Escher, Bach. New York: Basic Books. pp. 400–402. ISBN 0-465-02685-0.
  6. Friendly, Michael (1988). Advanced Logo: A Language for Learning. Hillsdale, NJ: Lawrence Erlbaum Associates. ISBN 0-89859-933-4.
  7. Bourke, Paul (December 1992). "Decision Procedure for 'Oneness'". University of West Alabama.
  8. Guy, R. K. (1983). "Don't try to solve these problems". American Mathematical Monthly 90: 35–41. By this Erdős means that there aren't powerful tools for manipulating such objects.
  9. "J. H. Conway proved the remarkable result that a simple generalization of the problem is algorithmically undecidable." Quoted from Lagarias (1985).
  10. Lagarias, Jeffrey C. (January 1985). "The 3x + 1 problem and its generalizations". American Mathematical Monthly 92 (1): 3–23. doi:10.2307/2322189. JSTOR 2322189.
  11. Leavens, Gary T.; Vermeulen, Mike (December 1992). "3x+1 Search Programs". Computers & Mathematics with Applications 24 (11): 79–99. doi:10.1016/0898-1221(92)90034-F.
  12. Roosendaal, Eric. "3x+1 Delay Records". Retrieved 27 November 2011. (Note: "Delay records" are total stopping time records.)
  13. Simons, J.; de Weger, B. (2005). "Theoretical and computational bounds for m-cycles of the 3n + 1 problem". Acta Arithmetica (online version 1.0, November 18, 2003).
  14. http://groups.google.com/group/sci.math/msg/1ed7cd277079efdd
  15. Oliveira e Silva, Tomás. "Computational verification of the 3x+1 conjecture". Retrieved 27 November 2011.
  16. Garner, Lynn E. (1981). "On the Collatz 3n + 1 Algorithm". Proceedings of the American Mathematical Society 82 (1): 19–22. doi:10.2307/2044308. JSTOR 2044308.
  17. http://www.cecm.sfu.ca/organics/papers/lagarias/paper/html/node3.html
  18. Krasikov, Ilia; Lagarias, Jeffrey C. (2003). "Bounds for the 3x + 1 problem using difference inequalities". Acta Arithmetica 109 (3): 237–258. doi:10.4064/aa109-3-4. MR 1980260.
