“Soooo big!” Counting on bit.ly, Part II

In Part I, I calculated that bit.ly is currently using a pool of 62^6 = 56,800,235,584 six-character strings that it can assign as re-directs, for example, to provide URL-shortening for tweets that are up against Twitter’s 140-character limit.

After I finished the calculations in Part I and those below, I looked for other discussions of this topic. There may be many, but they are not easy to find, because searches that include “bit.ly” are dominated by pages in which people have given re-directs to other pages using, well, bit.ly. (Duh.) One that did rise very high on Google was Scott Herold’s post on VMGuru from June 2009 [now vanished due to link rot or domain transfer]. He recognized, of course, the 26 lower-case and 26 upper-case letters plus 10 digits available for the suffixed strings, which at that time were (apparently) mostly five characters long. He had not recognized, however, that a character can repeat within a string, so his calculation of the size of the URL pool required a more complex permutation formula and gave a smaller result than the simple approach I was able to use in Part I. In the first comment following Scott’s post, Arnim van Lieshout addressed both of these differences (multiple uses of a character and the appearance of 6-character strings) and did the appropriate N^R calculation for 5 characters. (I say all of this to emphasize that there was not much new in my Part I, at least at the conceptual level, even if I was newly discovering it for myself.)
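To make the gap between the two counts concrete, here is a minimal Python sketch. It assumes that the “more complex permutation formula” was the standard count of arrangements without repetition, n!/(n−r)!; Scott’s post is no longer available, so that is my guess rather than a record of what he did. The names ALPHABET_SIZE, without_repetition, and with_repetition are mine, for illustration only.

```python
import math

ALPHABET_SIZE = 26 + 26 + 10  # lower-case + upper-case letters + digits = 62


def without_repetition(length: int) -> int:
    """Strings in which no character is reused: n! / (n - r)!  (assumed formula)."""
    return math.perm(ALPHABET_SIZE, length)  # math.perm needs Python 3.8+


def with_repetition(length: int) -> int:
    """Strings in which any character may repeat: n ** r."""
    return ALPHABET_SIZE ** length


for length in (5, 6):
    print(length, without_repetition(length), with_repetition(length))
```

For 5 characters this gives 776,520,240 strings without repetition versus 916,132,832 with it, and for 6 characters with repetition it reproduces the 56,800,235,584 from Part I.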

Scott concluded his post with this statement: “I think it[‘]s safe to say that as long as the bit.ly database can handle the load, we don’t need to worry about them running out of URLs any time soon.” The conclusion of my Part I, however, was both an echo of and a contrast with that comment: “Surely, then, 56 billion re-directs should last… us for a very long time, right? Hold that comforting thought for Part II….” [Cue the faintly ominous organ music.]

There were two important sources of information that I used in what follows (which as far as I know is new, at least outside the corridors of bit.ly):

  • According to Dan Frommer in Business Insider, bit.ly went live on July 8, 2008. I take this as the initial condition (time = 0, assigned URLs = 0).
  • From compete.com, I got a count of the number of unique visitors per month to bit.ly for May 2009 through May 2010.

Taken together, that’s 14 data points (the launch date plus 13 monthly counts) for the growth in traffic at bit.ly since inception. Put it all in a spreadsheet, create a graph, fit a curve, calculate a correlation coefficient, and we have the following:

    Here are a couple of thoughts about this graph. Firstly, the fit of the curve to the data is pretty impressive at R^2=0.86 (R=0.93) and would rate a correspondingly high value on Chip Morningstar’s “Goodenoughness meter.” Secondly, it is possible that a major change in visitors’ habits occurred at the beginning of 2010, with a resultant steepening in the growth curve. But the general nature of the question I am going to ask and, more importantly, the quality of the answer I can get do not justify trying to be more precise right now. Think “goodenoughness.”

    The outcome in Part I was that bit.ly is currently using a pool of more than 56 billion unique 6-character strings for shortened URLs. The question here in Part II is how long it will be before that pool is exhausted. Any answer is model-dependent, requires significant assumptions, and is inherently inexact and imperfect. So I’m going to do the simplest thing I can with the data: assume that each unique visitor in a given month solicits one shortened URL from bit.ly, and calculate how many months will elapse until the total number of URLs assigned equals the number available.

    Theoretically, that means recognizing that the area under the visitors-per-month curve, extrapolated to the point of exhaustion, equals the size of the pool. Operationally, that means integrating the two terms in the best-fit polynomial between 0 (July 2008) and M, the number of months elapsed since bit.ly went live; setting the sum of the two integrated terms equal to 56 billion and change; and solving for M. That leads to this equation:

    2232 M^3 + 116858 M^2 = 56,800,235,584.
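Spelled out, the integration step looks like this. The sketch assumes the fitted curve has the form V(t) = a t^2 + b t, where V is unique visitors per month at t months after launch and there is no constant term. The “two terms” and the (0, 0) initial condition suggest that form, but the symbols V, a, b, and t are mine.

```latex
\int_0^M \left(a\,t^2 + b\,t\right)\,dt
  = \frac{a}{3}\,M^3 + \frac{b}{2}\,M^2
  = 62^6 = 56{,}800{,}235{,}584
```

Matching coefficients with the equation above would imply a = 3 × 2232 = 6696 and b = 2 × 116858 = 233,716. Those values are back-calculated from the integrated equation, not read off the spreadsheet.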

    The same spreadsheet that produced the graph and best-fit curve can also solve this equation numerically. The result is M ≈ 278 months, which is 23 years and 2 months. In fact, I could round to 20 years (one significant figure) without much effect. Either way, and whether I count from 2008 (when 5-character strings were apparently the norm), 2009, or 2010, the conclusion is pretty much the same: All other things being equal, bit.ly could exhaust its current URL pool somewhere around 2030.
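For anyone without the spreadsheet handy, the same root can be found with a few lines of Python. This is only a sketch: it uses plain bisection rather than whatever solver the spreadsheet applies, and the names POOL_SIZE, urls_assigned, and months_to_exhaustion are mine.

```python
POOL_SIZE = 62 ** 6  # 56,800,235,584 six-character strings (from Part I)


def urls_assigned(months: float) -> float:
    """Cumulative URLs assigned after `months` months, per the equation above."""
    return 2232 * months ** 3 + 116858 * months ** 2


def months_to_exhaustion(pool: float, tolerance: float = 1e-6) -> float:
    """Bisection; urls_assigned increases for months >= 0, so there is one root."""
    low, high = 0.0, 1.0
    while urls_assigned(high) < pool:  # widen the bracket until it holds the root
        high *= 2
    while high - low > tolerance:
        mid = (low + high) / 2
        if urls_assigned(mid) < pool:
            low = mid
        else:
            high = mid
    return (low + high) / 2


m = months_to_exhaustion(POOL_SIZE)
years, extra_months = divmod(round(m), 12)
print(f"M = {m:.1f} months, about {years} years and {extra_months} months")
```

The root lands just under 278 months, which rounds to the 23 years and 2 months quoted above.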

    But of course all other things are never equal. So in Part III I will look at bit.ly as a “renewable resource.”
