The Doctor Who episode titled "42" introduced its viewers to the concept of Happy Numbers. Briefly, you take a positive integer and separate and square each of its digits, then sum the squares to get a new number. Call this operation a Happy Sum. Repeat the happy sum until you either reach 1 or reach the repeating sequence 4, 16, 37, 58, 89, 145, 42, 20, 4. If a number reaches 1, it's a "Happy Number." If it doesn't, it's not a happy number.

e.g. 129 becomes 1^2 + 2^2 + 9^2 = 86.

86 becomes 8^2 + 6^2 = 100.

100 becomes 1^2 + 0^2 + 0^2 = 1.

So 129 is a happy number.

An open question is what percentage of all numbers are happy numbers. Thus far, percentages have been calculated and published up to 10^122. That's a phenomenally large number. How was the number of happy numbers from 1 to 10^122 calculated? By realizing that the order of the digits in a number doesn't matter. If 129 is a happy number, so it 192, 219, 291, 912, and 921. Likewise, so is any variation that only inserts zeros, such as 1029, 90210, etc. With 122 digits, there are only 23,640,810,986,000 unique selections of digits. Each of these combinations was tested to determine if it's a happy number. If so, the number of ways to arrange those digits into a number was added to the count of happy numbers with 122 digits. Computing the number of happy numbers at N digits this way is O(Log(N)^9).

But, there is a much faster way to count happy numbers. It relies on two realizations:

1) If 86 is a happy number, then every number whose happy sum is 86 is a happy number. 129 is a happy number it's happy sum 86 and 86 is a happy number. Likewise, 911111, 555311, and 66321 are happy numbers, because their happy sums are all 86.

2) Appending a single digit to an number increases its happy sum by the square of that digit.

With these, we can now write a formula for the number of numbers with N digits which have a happy sum of X:

H(N,X) = 0 for X < 0

H(N,X) = Sum for i = 0 to 9 of H(N-1,X-i^2)

We seed the count with H(0,0) = 1

In the first iteration we get

H(1,0) = 1

H(1,1) = 1

H(1,4) = 1

H(1,9) = 1

H(1,16) = 1

H(1,25) = 1

H(1,36) = 1

H(1,49) = 1

H(1,64) = 1

H(1,81) = 1

H(1,X) = 0 for all other values of X

For N digits, we only need to calculate H(N,X) up to X = 81N. Each computation of H(N,X) takes only 10 operations - collecting and summing the 10 values for H(N-1,X-i^2). With the counts known, the final operation is to add all the counts where X is a happy number.

With this approach, the number of operations to compute the number of 122-digit happy numbers is reduced from 23,640,810,986,000 to 6,077,430. Generally, this approach counts the number of happy numbers at N digits in O(Log(N)^2), instead of O(Log(N)^9).

I have used this approach to calculate the number (and percentage) of happy numbers up to 6,900 digits (that's just where my PC ran out of memory). I expect that others who have more time to work on this problem, and access to more computing power, will take this approach and extend the calculations much, much farther.

The interesting thing is that the percentage of happy numbers doesn't seem to converge. It oscillates between the extremes of 11.38% and 18.58%. The rate of oscillation seems to decrease as the digits increase, but with an infinite number of digits to add, the percentage can oscillate ever more slowly without ever converging.

The data is too large to include in this post, but here are a few sample values.

Number of happy numbers at the given number of digits:

N,Number of happy numbers

122,15544165655506391433684532303507685962217819565438704589001871991655459268216971175017952179595699881844518065420199858912

123,154485360845377664560973323837187366361428123300901954765448516303896594048403580319330850652889429925442411460554271489982

124,1534847004045147685904927341344183409044530364026064204411943494566662876130413227072620036419933782270174313123324336832815

125,15244593711682632352497273136169026311730421692813607912467354763254777757599100545721691881341183226700203445461184893053338

126,151374495542428661767707601614894539344499845653628322250474866334934846694486023804299925710916428070638010332077829610892465

127,1502752508393542025226048312991041934169287778778327571436271037047289517543589927010953832993410475450328077812076578706895739

128,14915285568001444315457587881830205823412383736697062777993396206355786330352327469265675663937406269551831517509536418065836791

129,148012072305605304157614902430948585259906458022371749912672972431288364833015956791011857076086142539333135797466519836183925798

130,1468580782316250337097725668112987220223784374713025106971398378487880498734546788259857859168494040144154444049065807068676864909

131,14569671860699052981557279925595393102599321310862050393639883547348874619528564167878799007539152969578465503205800690426113641476

132,144534684261826321239137350787262313306862079110647699168620243006088946027279051575588905818163761791765900039382791843732221218102

133,1433796014323529559268775155632772971373796122766235785276639719712597897783191701911675173017160388045798163980092693035013979682841

134,14224002641950045836974539646971579126868586450173722737778296290358352290662069548763690533185070032197451841899323923036643125922965

135,141125345482053742668626248505255315302105312713800325930819206905813704490018245885794088855898951456701964831557621819140891798203735

136,1400455866481458809270529396034094412472845640352080047932836612627695718141460347471225051377434407589772710973226035303774903364137463

137,13901115553303493150794782523701198143468240912504737084347126306894801460562206603412464619833751403076095231053435368480375845391177310

138,138032511480024972721340769531170892635334354516522682562981589991367517023542584138672051925966724428168970188800873521432766362565286198

139,1371198033093892896205129640484161632375581004512112567725099620321813098843721749620344121861844961470462663308430382578703704619628239660

140,13628248567244149265788686217497402831643146805152576812114204233973776959018793420100552113121495694589308835344933408582266132782342590189

141,135528937536641071964109787117144605026447788491359433699160727225955326092572352467011518353187028474566763845239038048169511933447465000977

142,1348661037876474618617414502654775482756497710663505944417272144567897085830957827297954417755577580155248816988453596796593311811464934922678

143,13429917946185159928323213612288610761800534156351041679052899039090052690368074394961399315955785682518527480907191838790069467229604336229219

144,133831353859975545767709060607659446530266204591279523001086825384741866117594880204303954290362064157213038919769426907073072529837172347657077

145,1334636288766538416868281373779332289665753030394917891206686582767377496491268344799141394237011685821199870719659782381410298845586456343014232

146,13319472244281350240314823692626714628493897884524267058262013387718947212357499688741611929648126091977862094021458113238450455491894788364483772

147,133021064141570959195385984627379052988536854188105221055775355159034970883343927052111448159033233664513762978873189162998669399758243317530831833

148,1329368859476717026028217919998016166268271659316887885144696698215301019151368050598289725738293855474060611120099628509284759287029846631184465633

149,13293440678374106345780938464345329773748484807555176459057623031849245793005164060149371982315807938833666152855601192817046537663648570423332898649

150,133004105322798589095229726818914132326479332066439924016567579436794365881550142133009760309643812973232815584618752212033808970768433004588287088993

Percentage of happy numbers at the given number of digits:

N,Percentage of happy numbers

100,15.21944023%

200,15.12995138%

300,14.63159060%

400,18.56506223%

500,16.90527241%

600,15.18765928%

700,14.42772706%

800,17.17536692%

900,16.62088861%

1000,14.67705935%

1100,15.27152393%

1200,14.41333187%

1300,13.55160038%

1400,14.24285858%

1500,16.54517766%

1600,16.67110306%

1700,13.46856922%

1800,15.54233525%

1900,15.91286476%

2000,14.24458227%

2100,11.76708033%

2200,13.27517060%

2300,12.24818225%

2400,11.72111717%

2500,15.18004135%

2600,15.01885103%

2700,13.15618170%

2800,14.25773404%

2900,14.65954304%

3000,12.52142474%

3100,12.03006897%

3200,12.62685213%

3300,12.17304447%

3400,12.56740858%

3500,15.62492461%

3600,17.22054088%

3700,15.54455857%

3800,15.38215233%

3900,16.29814112%

4000,15.64229625%

4100,14.01423502%

4200,15.20124847%

4300,16.43395210%

4400,15.58002249%

4500,15.43396975%

4600,16.48597547%

4700,15.82383788%

4800,14.87475039%

4900,14.95818778%

5000,14.53188325%

5100,14.22129580%

5200,13.99971056%

5300,14.11526813%

5400,13.74381805%

5500,13.54871710%

5600,14.12562143%

5700,15.29334245%

5800,15.05585464%

5900,14.46388302%

6000,14.11902622%

6100,14.34864916%

6200,14.81341784%

6300,15.10462711%

6400,14.49529915%

6500,14.17316293%

6600,14.48525284%

6700,13.99989288%

6800,13.50063539%

6900,13.88878029%

## Happy Numbers

**Moderators:** gmalivuk, Moderators General, Prelates

### Re: Don't they teach recreational math anymore?

This is fairly interesting.

EvanED wrote:be aware that when most people say "regular expression" they really mean "something that is almost, but not quite, entirely unlike a regular expression"

### Re: Happy Numbers

Edit: Oh, my mistake. You have to check all (<= 81*N) H(N,X) individually afterwards to get the number of happy numbers.

I see that your formula was added at OEIS (with the assumption that Bryan Wolf = BryanWolf :p).

Did you check the situation for different bases?

2 and 4 have a fraction of 100%, but maybe others are different.

Can you provide the fraction of happy numbers up to 10^N for all N as data somewhere? Maybe there is some structure in the oscillations.

I see that your formula was added at OEIS (with the assumption that Bryan Wolf = BryanWolf :p).

Did you check the situation for different bases?

2 and 4 have a fraction of 100%, but maybe others are different.

Can you provide the fraction of happy numbers up to 10^N for all N as data somewhere? Maybe there is some structure in the oscillations.

- eta oin shrdlu
**Posts:**450**Joined:**Sat Jan 19, 2008 4:25 am UTC

### Re: Happy Numbers

I assume you ran out of memory at N~6900 because you had to tabulate O(N) numbers of up to O(N) digits, to get your exact counts H(N,X). You can get substantial memory (and time) savings, if all you want is to see roughly how the fraction of happy numbers behaves with N, by using floating-point numbers instead. (To avoid machine-precision floating-point overflow, the values you save would then instead be these fractional counts 10

With this approximation I've gotten estimates up to about N=200000. Here's a graph of this happy fraction h(N):

(I can upload code or raw data if there's interest.)

In order to count the happy N-digit numbers, your algorithm requires that you explicitly compute the happy sums H(n,X) for each n=1,...,N. It's maybe worth noting that there's an algorithm that doesn't do this. The idea is that computation of H(n+1,X) given H(n,X) is just a convolution (represented by *),

H(n+1) = H(n) * K,

writing H(n) = [H(n,0), H(n,1), H(n,2), ...] and where the kernel K consists of 10 unit spikes, at the squared digits,

K(n) = 1, n = 0, 1, 4, 9, ..., 81 ;

K(n) = 0, otherwise.

Convolutions turn into multiplications in the Fourier domain, so

FFT[H(n)] = FFT[H(0)] FFT[K]

and since H(0) is just a delta function, its FFT is 1, and

H(N) = IFFT[ FFT[K]

For large N I think this should be faster than the straightforward implementation of the convolution, requring two O(N log(N)) FFTs and O(N) Nth powers instead of O(N

--

I have a vague heuristic explanation for the size and frequency of the oscillations. The first thing to notice is that almost all of the contribution to the count of happy numbers at N digits comes from a set of O(sqrt(N)) happy sums, which is basically just the central limit theorem. From the multinomial distribution of digit counts you can work out the mean (obviously m=28.5N; 28.5=(1+4+...+81)/10) and variance (v=721.05N) of the happy sum for a number randomly sampled from all numbers less than 10

So the happy fraction of N-digit numbers is found, roughly, by basically counting the fraction of a set of ~sqrt(v) numbers (centered around m) which are happy. Now if we pretend that a number is happy at random with some probability p, mostly independent of the happiness of its neighbors, then this count is roughly binomial, with probability p and ~sqrt(v) trials; so it has variance ~p(1-p)sqrt(v) = O(sqrt(N)), relative variance ~(1-p)/(p*sqrt(v)) = O(N

Because the happy fraction for the (N+1)-digit numbers, for large N, is sampling mostly the same ~sqrt(v) happy sums as for the N-digit numbers, there is a high correlation between these two happy fractions. You have to move to N+O(sqrt(N)) to make the overlap between the peaks of the happy-sum distribution small; so the period of these fluctuations should go as roughly O(sqrt(N)). Here's the same graph as before, but now scaling the horizontal axis as sqrt(N) (I still labeled the ticks with N though); now the horizontal scale of the fluctuations is approximately constant across the graph:

In this graph it looks like the amplitude of the fluctuations is decreasing with N. Subtracting something close to the middle and rescaling by N

Here it looks like the fluctuations might be getting larger, but I'm not sure.

^{-N}H(N,X).) I'm pretty sure this doesn't cause significant roundoff errors until at least N ~ 1/epsilon.With this approximation I've gotten estimates up to about N=200000. Here's a graph of this happy fraction h(N):

(I can upload code or raw data if there's interest.)

In order to count the happy N-digit numbers, your algorithm requires that you explicitly compute the happy sums H(n,X) for each n=1,...,N. It's maybe worth noting that there's an algorithm that doesn't do this. The idea is that computation of H(n+1,X) given H(n,X) is just a convolution (represented by *),

H(n+1) = H(n) * K,

writing H(n) = [H(n,0), H(n,1), H(n,2), ...] and where the kernel K consists of 10 unit spikes, at the squared digits,

K(n) = 1, n = 0, 1, 4, 9, ..., 81 ;

K(n) = 0, otherwise.

Convolutions turn into multiplications in the Fourier domain, so

FFT[H(n)] = FFT[H(0)] FFT[K]

^{n}and since H(0) is just a delta function, its FFT is 1, and

H(N) = IFFT[ FFT[K]

^{N}] .For large N I think this should be faster than the straightforward implementation of the convolution, requring two O(N log(N)) FFTs and O(N) Nth powers instead of O(N

^{2}) additions. I'm not sure how the accuracy compares with the straightforward approach, though.--

I have a vague heuristic explanation for the size and frequency of the oscillations. The first thing to notice is that almost all of the contribution to the count of happy numbers at N digits comes from a set of O(sqrt(N)) happy sums, which is basically just the central limit theorem. From the multinomial distribution of digit counts you can work out the mean (obviously m=28.5N; 28.5=(1+4+...+81)/10) and variance (v=721.05N) of the happy sum for a number randomly sampled from all numbers less than 10

^{N}. So for the N-digit numbers, most of the contribution to the count of happy numbers comes from the happy numbers within about sqrt(v)=sqrt(721.05N) of m=28.5N.So the happy fraction of N-digit numbers is found, roughly, by basically counting the fraction of a set of ~sqrt(v) numbers (centered around m) which are happy. Now if we pretend that a number is happy at random with some probability p, mostly independent of the happiness of its neighbors, then this count is roughly binomial, with probability p and ~sqrt(v) trials; so it has variance ~p(1-p)sqrt(v) = O(sqrt(N)), relative variance ~(1-p)/(p*sqrt(v)) = O(N

^{-1/2}) and so relative standard deviation O(N^{-1/4}). So the amplitude of the fluctuations in the happy fraction should be at most roughly O(N^{-1/4}), a pretty slow convergence. (I think it's even slower; for extremely large N you can repeat the same argument on the distribution of the happy sums to find an even narrower distribution. There are also a couple of questionable assumptions here, so I'm not too sure about this part.)Because the happy fraction for the (N+1)-digit numbers, for large N, is sampling mostly the same ~sqrt(v) happy sums as for the N-digit numbers, there is a high correlation between these two happy fractions. You have to move to N+O(sqrt(N)) to make the overlap between the peaks of the happy-sum distribution small; so the period of these fluctuations should go as roughly O(sqrt(N)). Here's the same graph as before, but now scaling the horizontal axis as sqrt(N) (I still labeled the ticks with N though); now the horizontal scale of the fluctuations is approximately constant across the graph:

In this graph it looks like the amplitude of the fluctuations is decreasing with N. Subtracting something close to the middle and rescaling by N

^{1/4}as argued above (i.e., the vertical axis is now (h(N)-0.14)*N^{1/4}) gives this:Here it looks like the fluctuations might be getting larger, but I'm not sure.

### Re: Happy Numbers

Very nice.

As you already noted, this is just an approximation, and p itself is fluctuating. p for N-digit numbers influences p' for numbers with [imath]\frac{10^N}{285}[/imath] to [imath]\frac{10^N}{28.5}[/imath] digits - as those p(N) are correlated, the fluctuations are even more spread out. No way to see them in existing data. 2*10^5-digit-numbers have relevant happy sums of ~28.5*2*10^5 or ~6 million, and the happy fraction is quite constant for numbers up to 10^3, 10^4, 10^5, 10^6 and 10^7.

While the short-term-oscillations go down with [imath]O(N^{-1/4})[/imath], I don't know if this is true for those long-term oscillations which are induced by short-term oscillations.

Now if we pretend that a number is happy at random with some probability p, mostly independent of the happiness of its neighbors

As you already noted, this is just an approximation, and p itself is fluctuating. p for N-digit numbers influences p' for numbers with [imath]\frac{10^N}{285}[/imath] to [imath]\frac{10^N}{28.5}[/imath] digits - as those p(N) are correlated, the fluctuations are even more spread out. No way to see them in existing data. 2*10^5-digit-numbers have relevant happy sums of ~28.5*2*10^5 or ~6 million, and the happy fraction is quite constant for numbers up to 10^3, 10^4, 10^5, 10^6 and 10^7.

While the short-term-oscillations go down with [imath]O(N^{-1/4})[/imath], I don't know if this is true for those long-term oscillations which are induced by short-term oscillations.

### Who is online

Users browsing this forum: No registered users and 11 guests