Sun Microsystems’ x86 Strategy

Over the years Sun Microsystems has been extremely apprehensive of Linux, and x86 based servers. Sun did recognize cheap x86 hardware as a threat long time ago, but the company didn’t know what to do about it. Sun was afraid of entering the x86 market as the margins in the x86 market were not healthy, and such a move would have cannibalized its high-margin UltraSparc business.

After the dotcom bubble burst, Sun made a number of bizarre moves with regards to the x86 market, but these days Sun looks firmly committed to a game plan. Solaris x86 which Sun was once planning to discontinue has become central to the company’s plans. Sun is cutting back on UltraSparc development, and is readying itself to pursue life as a major x86 server vendor.

According to a report by The Register, Sun will be rolling-out a number of in-house engineered Opteron based servers and storage solutions in 2005. Another report by The Register claims that, Sun is planning to sell 414,000 Opteron based servers in 2007, and the company is aiming for a double-digit share of the x86 server market.

Some of Sun’s x86 gains will surely come at the expense of its high margin Sparc business so the company has to compensate for that loss. But, Sun can’t expect to gain market share quickly if it doesn’t price its x86 hardware competitively. Sun is faced with two conflicting goals in the x86 market, but the company has figured out a way to extract decent margins while pricing its x86 hardware competitively.

Sun has calculated that if it generates reasonable Opteron server volume, it can get massive discounts from AMD. Sun anticipates that none of the major x86 servers vendors will move away from Intel’s Xeon processors any time soon. For pragmatic reasons, Dell, HP, and IBM can’t afford a duplication of their server lines; IBM and HP are offering a few Opteron based servers, but they are not seriously marketing them. Consequently, Sun can own the Opteron based server market, and push AMD for the volume discounts it needs to make money on x86 sales.

If Sun succeeds Dell and others will be tempted to enter the Opteron market, but by that time Sun will have developed massive Opteron server design expertise, big volumes, and a complete range of Opteron server offerings. These first mover advantages will act as entry barriers and will stop Sun’s competitors from getting in the Opteron server market.

Sun also expects that Intel will not initiate a price war with AMD. A price war will only hurt Intel’s profitability as AMD isn’t making too much from server processors. Moreover, none of Intel’s major customers can force Intel to get into a price war with AMD. Intel’s major customers Dell, HP, and IBM can only try to pressure Intel, but that is the limit of what they can do.

Dell does get special treatment from Intel but not in the form of better pricing. If Dell was getting better pricing, IBM and HP would have reported Intel for anti-competitive behavior. Intel simply allocates large quotas of premium desktop processors (processors in short supply) to Dell, and Dell sells these processors at good margins. The only thing exclusive about the arrangement is that only Dell could have benefited from any such arrangement. Intel would have gladly duplicated the same arrangement with IBM and HP, but IBM didn’t have much of a PC business, and HP likely caters to a customer-base which isn’t too interested in acquiring the latest at a premium.

Intel will only get into a price war with AMD if Sun tries to take more than 25-30 percent of the x86 server market. Of course, Sun won’t hit that kind of market share any time soon, and the company has plenty of room for profitable growth.

Sun has the hardware costs under control, but it also needs to control Solaris development costs. According to an article by Wired, Sun spent $500 million developing Solaris 10. Solaris 9 was released in April 2002 so Sun must be spending roughly $200 million a year developing new functionality for Solaris.

An operating system also requires support infrastructure and maintenance, and these costs can be quite significant. Apart from the usual, compilers/assemblers/debuggers, utilities, patches/updates, drivers, code reviews, etc., Sun also has to support multiple Solaris ports. Sun is already committed to Sparc, Opteron, and Xeon ports, and the company will likely need to support an Itanium port as well. This suggests that Sun’s total Solaris development costs are no less than $400 million a year.

This amount of money is no pocket change, Dell spent $464 million on R&D in all of 2003, and Dell is much bigger in terms of revenues than Sun. The development cost is all the more significant for Sun as in the x86 market Solaris will be competing against Linux, and Linux development costs get shared by thousands of volunteers and businesses. Right now Redhat and Novell are charging good money for Linux, but this doesn’t mean they will continue to do so even when Solaris becomes a threat. The Linux development model allows Redhat and Novell a lot of leverage in setting prices, and they will use this leverage if required.

Any attempts to duplicate the Linux development model won’t work for Sun. Sun is releasing Solaris under an open source license, but it is mostly a marketing ploy to gain mind-share, and win Microsoft’s blessing. Sun knows Linux developers are not going to switch to Solaris in the short-term, and many Linux features are being maintained by commercial companies anyway.

Sun intends to combat the Linux advantage by assuring that the money it puts in Solaris yields a competitive advantage in the form of clear technical superiority over the competition. Sun will also attempt to tightly integrate software and hardware development in order to quickly bring advanced functionality to the market. But, the real key to Sun’s success will be volumes.

If Sun manages to sell millions of servers, Solaris development costs will get dispersed over the large number of units shipped and become irrelevant. Moreover, Sun will be able to make money from add-on sales, and service/maintenance contracts. Also, Solaris will displace Linux as the open source operating system of choice, and this will allow Sun to steal IBM and HP’s Unix customers.

Microsoft will be very supportive of Sun’s attempts to push Solaris as the open source operating system of choice. In the long-term a popular and technically superior Solaris will divert development effort from Linux and hurt Linux vendors. Solaris will be open source but it will still be controlled by Sun and will not pose a threat to Microsoft as Sun’s ambitions are quite limited. Sun wants to make money and is not interested in starting margin squeezing price wars with Intel and Microsoft.

Sun’s overall strategy suggests that UltraSparc has no future. Sun has poured large sums into UltraSparc development in the past and is being forced to do so even now, but UltraSparc has been clinically dead for a long time now. It lags benchmarks and likely costs the company more than 50 percent of its R&D budget. If Sun’s Opteron sales takeoff, the company will become less reliant on UltraSparc revenue, and the incentive to keep wasting money on UltraSparc will diminish.

Sun has placed a very bold bet on x86, and the company will emerge highly profitable and competitive if it manages to execute its game plan effectively. The downside is that if the game plan fails so will Sun. In that case, Sun will get swamped by hardware and software development costs and quickly go out of business.



by Usman Latif  [Dec 20, 2004]

Why Good Ideas Fail, Part II

In recent times the software industry has come under a lot of pressure from open source developers. Open source software (OSS) has the tendency to persevere and continue attacking a market in spite of failure. This characteristic is precisely the prerequisite for success in the software business. Consequently, OSS is the most unsettling competition to the software industry.

The success of Linux has emboldened open source developers, and people are now openly questioning the viability of the whole software business. Is the commercial software industry really necessary or will majority of software development become open source?

One would expect the software industry to innovate in the face of competition from OSS, but this has not happened so far. Microsoft is the most vulnerable company, but Microsoft has only wasted money on research; Microsoft research is focused on producing papers and not products. Moreover, the company has more than 50 billion dollars sitting idle in the bank; this is money which should have been used for product development, but Microsoft simply hasn’t figured out what it wants to develop.

Software is important to users and this importance is increasing with time; users care about software because it directly contributes towards productivity. Inspite of this utility, the average computer user is spending more on junk food than on software. If things continue the way they are going, soon investors will be convinced that all the money the software industry was capable of making has been made, and it is time to move on.

Software industry has to innovate as growth of the lucrative industrialized world PC market has slowed, and software companies can no longer rely on new PC sales for growth. Innovating in the software industry is not easy as existing software users tend to play it safe by sticking with products that have demonstrated clear productivity benefits. New and innovative products do sell, but in order to be successful they require patience and persistence on the part of software companies. This necessarily implies bigger investments, more risk, and a longer wait on returns.

New startups are always poorly funded, and can not be expected to try breaking into mature software markets. However, well funded companies capable of introducing innovative products are also taking a relaxed attitude. Instead of going the tough route of selling new software products, these companies want to make easy money by releasing endless upgrades.

Software companies believe they can out-innovate OSS products. This is wistful thinking; once an open source software project takes roots, it starts improving at a pace similar to those of commercial offerings. Moreover, software upgrades suffer from diminishing returns: users care about the first few updates, but after a while additional functionality stops being of interest to them. This behavior is not acknowledged by software companies as they have managed to sell upgrades in the past. However, past experience is not a good guide in an immature industry. Products that entered the market at the inception of the software industry are only now starting to experience diminishing returns; continued dependence on upgrades will ultimately lead to disaster.

The software industry has to address two problems in order to avert the crisis situation which is developing: the industry needs to justify long term investments in innovative products, and it needs to keep OSS at bay.

If the software industry manages to address the first problem it will automatically address the second problem. OSS has become a threat precisely because the software industry has not been innovating for the last 10 years. Currently, the software industry is making most of its money from a few well understood products that are not too hard to clone. This situation has given open source developers well defined targets that are easy to attain.

The bigger problem facing the software industry is that of justifying long term investments. A sophisticated software product such as an operating system can require a long time to develop and gain acceptance in the marketplace. Such projects are inherently high risk, and can fail even at the implementation stage. Because of the long time to profitability, opportunity costs become a big issue, and the total cost of the project can run into hundreds of millions of dollars.

There does not seem to be an easy way out; OSS is not going to go away and long term investments in new products are inherently risky. Fortunately, other industries have faced the same problems and successfully risen to the challenges; there is no reason the software industry can’t do the same.

The pharmaceutical drugs industry faces problems that are similar to those of the software industry but an order of magnitude worse. On average a new pharmaceutical drug requires $800 million to develop, and the development process can last 15 years. Once a drug is on the market it has a limited window of opportunity to make money. This window of opportunity is the period during which the drug has patent protection. After the patents for the drug expire, generic drug manufacturers start producing the drug and the product stops being a big money maker.

In both the software and the pharmaceutical industries, the primary threat to profitability comes from the eventual easy replicability of the ideas behind a product. Non-generic competition can be a threat at times, but does not alter the pricing structure of a market catastrophically.

Pharmaceutical companies understand the threat of generic competition very well. Pharmaceutical companies make highly risky long term investments in innovative products to counteract the effects of their big money makers becoming generic; failure to do so typically results in a massive loss of profitability. This is precisely what the software industry has to do to stay ahead of OSS (no advocation of software patents implied here).

Generic drugs constitute a majority of the drugs available to patients, but this has not stopped pharmaceutical companies from growing and staying extremely profitable. Pharmaceutical companies make their money by targeting lucrative market segments; thus, they are able to charge fat margins. Trends in the software industry suggest the same thing will eventually happen with software. Most of the software in popular use now will eventually become open source, and software companies will need to target market segments to make money.

In the pharmaceutical industry generic and non-generic drugs are complementary. Generic drugs play an important role, they ensure cheap availability of drugs to patients and force pharmaceutical companies to innovate. OSS seems to be assuming an analogous role in the software industry. Generic drugs are cheap but not free; generic drug manufacturers charge nominal profits on top of the manufacturing costs. Linux OSS distributions are starting to exhibit the same trend.

The profit making window of opportunity for a product varies vastly in the pharmaceutical industry, and depends on patent expirations. Software sales behave somewhat similarly but for entirely different reasons. Software products do not stop yielding fat margins abruptly, but they do stop making money eventually. In the software industry the profit making window of opportunity is a function of the popularity and implementation complexity of a product. This is a consequence of the limited resources available for OSS development.

OSS developers cannot afford to waste development effort on everything that comes along; they have to wait and watch to discover the successful products, and then develop implementations. Depending on the complexity of the product, this whole process can easily take more than 10 years. Also, a good open source implementation does not immediately kill commercial products; software migration is a very slow process and commercial products can continue to make money for quite a long while.

The pharmaceutical industry uses a lot of tricks to maintain consistent profitability. The industry uses diversification to guard against risk, it uses pipelining to guarantee a steady flow of new products, and it uses market research to direct development funds. All of these concepts are totally alien to the software industry.

Most software companies are one product companies, and have nothing in the pipeline apart from upgrades. As OSS developers get better organized, one product software companies will find it tough to survive. Only companies with diversified product portfolios, big R&D; budgets, and large pipelines will manage to deliver consistent results. Companies unable to restructure themselves will become roadkill.

Microsoft is a classic case of a company in dire need of restructuring. All of Microsoft’s products are high volume products; consequently, they are high priority targets for OSS developers. The OS functionality is already generic and MS Office is quickly headed that way. Microsoft has absolutely nothing in the pipeline to make up for the loss of revenue that is imminent. The company has so little confidence in the growth of its software business that it is diversifying out of software and into products like XBox.

Microsoft can only avoid the confrontation with OSS developers by segmenting the high volume markets. This is not too hard to do in case of MS Office. For instance, the $50 spreadsheet is being used for everything from financial analysis, and statistical analysis to Monte-Carlo simulations; this is akin to selling cheap Aspirin as a cure-all. Microsoft can easily create specialized products for the bigger segments, and make good money doing so. People will gladly pay a lot more than $50 for the productivity benefits such specialization will bring. OSS will try to match Microsoft, but if Microsoft continuously pipelines new products, the company can keep growing and stay profitable.

Microsoft’s big problem is the OS market. The OS market can be segmented as well, but the company does not have the expertise to do so. Currently, the OS design expertise of Microsoft is no greater than that of OSS developers. Microsoft immediately needs to start four-five OS projects with different design goals, and pipeline more of them in the future. A good chunk of these projects will fail, but Microsoft will eventually have some successful products, and the expertise to combat open source operating systems.

Microsoft might be a slow learner but it has adapted to changing market conditions in the past, and can be reasonably expected to do the same this time as well. Other big software companies will likely follow suit. This restructuring will be good for innovation in the software industry, and it will also be good for the ‘failed good ideas’ of the past.

Innovation is something the software industry simply cannot do without. The competitive nature of the industry ensures that good ideas cannot permanently die in this industry. Good ideas of the past are always waiting for smart and enterprising entrepreneurs to resurrect them.


by Usman Latif  [Mar 14, 2004]

Why Good Ideas Fail

Sometimes an excellent software product comes to market, gets rave reviews, and then disappears. This happens regardless of a clear need and want for the product. The problem is not specific to any particular software domain; BeOS, and Lotus Improv were both great products but seem to have nothing in common apart from being good ideas that failed.

Lotus Improv introduced a radical spreadsheet metaphor, and although BeOS was not as revolutionary it too brought plenty of good ideas with it. Both products received extensive media coverage and favorable reviews, but made no impact on the software market.

Why does software that everyone seems to like and want die? Surely, the ideas behind the products and the implementations can not be at fault. Ideas from such software have been recycled successfully in vastly inferior forms by competing products: Microsoft Excel’s pivot tables were inspired by Lotus Improv, but have been a big success. BeOS too has yielded many nice ideas; some of these ideas have been copied and others are being copied.

Was poor marketing the reason for the demise of BeOS and Lotus Improv? Marketing is always a factor in the success of a product; however, these products did receive extensive favorable media coverage. Better marketing might have enabled these products to do slightly better, but it is unlikely that marketing alone could have rescued them. So, what exactly did go wrong?

Software users tend to stick with software they are familiar with. This is not irrational behavior; software is complex to learn and use productively, and people prefer not having to relearn computer skills over and over again. Most people are inquisitive about new products but can not see straight away if these products carry any productivity benefits. Moreover, the utility of new functionality often becomes apparent only after individuals get accustomed to it.

People do adopt new software but may require a very long time to do so. Rapid adoption of new software occurs only if there are no pre-existing substitutes for the software. When the PC software industry was young many software companies went through rapid expansion because their products had no counterparts. As the PC software industry matured, software companies found it much harder to sell new ideas. For instance, Lotus 123 was the first good spreadsheet available on the PC and it became an instant hit, but MS Excel took a long time to displace Lotus 123, even though it was a superior product. Excel would have taken even longer to displace Lotus 123 if Windows 3.0 had not come along. The popularity of Windows 3.0 changed the market in favor of Excel, and thus accelerated Excel’s acceptance.

Good ideas fail mostly because they are not allowed sufficient time to succeed. More time allows an idea to mature, and also enables it to benefit from favorable marketplace changes. Additionally, the backers behind the idea get to learn from failure and are able to market the idea better.

The time needed for a new product to establish itself is mainly a function of the effort required of users to switch to the new product and be productive. Image viewers and the like can be instant successes, but more sophisticated applications need many years to gain a foothold. The experience of Linux suggests that a new OS can require more than a decade to gain credibility.

Lotus and Be both blundered by not allowing their products sufficient time on the market. Be was financially constrained and had to call it quits; however, Lotus acted in an incredibly stupid manner by killing the company’s only product with any chance of success.

The practice of discontinuing good ideas prematurely is widespread in the software industry. The story of BeOS and Lotus Improv has been repeated so often that innovation in the PC software industry has stalled. Software companies are convinced that users are indifferent to good ideas, and money spent on developing new ideas is a waste. The OS scene is one big casualty of this attitude of the software industry.

The lack of innovation in the OS scene is certainly not due to a lack of ideas. Good ideas are so abundant that people need not even look for them. An entire book, Unix Hater’s Handbook, is dedicated to pointing out all the things Unix got wrong. Some of the shortcomings mentioned in that book have been addressed, but others simply cannot be addressed in the framework of current generation operating systems. Of course, no software company has the courage to explore these ideas.

Things are hardly any better in the application software market. Only recently, a new startup, Quantrix, reintroduced the ideas pioneered by Lotus Improv (see the Quantrix tour for an overview of Lotus Improv ideas) to the Windows market. For almost a decade nobody had the courage to touch an insanely cool idea.

Software companies love to blame ideas instead of management for product failures. The industry simply does not understand that ideas need time to succeed, and has a hit or miss attitude towards products. This attitude is a consequence of the initial success of the PC software industry. The PC software industry is barely 25 years old and in the initial 10 years of its inception many products became instant hits. This initial phase of the software industry established a culture of impatience. Although, the PC software market has changed and products rarely become instant hits nowadays, the old habits continue to persist.

Only Microsoft seems to be immune from this culture of impatience. Unlike, Lotus and most other software companies who achieved instant success, Microsoft went through 5-6 extremely lean years after the founding of the company; this seems to have shaped the management culture at Microsoft. Unfortunately, Microsoft has primarily limited itself to copying successful ideas, and is not interested in radically different ideas.

Apart from commercial software another plausible source of innovation in the software industry is non-profit software development; however, non-profit development is biased towards cloning commercial software. Non-profit software development is about creating the most amount of utility for the most amount of people, and this goal is best served by copying successful ideas. Going for radical innovation is a suboptimal use of the very limited resources available for non-profit development. Linux exists because Linus wisely chose not to be ambitious. Had Linus tried fancy ideas, Linux might not have existed today.

The future of software looks bleak; Microsoft and other big software companies are unwilling to back good ideas, and non-profit developers are unable to do so. Can the software innovation stalemate be broken or is software innovation dead for good? Part II of this article will examine the issues the software industry needs to tackle in order to bring innovation back to the software scene.

LAST UPDATED by Usman Latif  [Feb 29, 2004]

Minesweeper Cascade Algorithm

A cascade in Minesweeper (Windows version) occurs when a single click uncovers many adjacent squares. A cascade can expand and uncover large portions of the board. Understanding the cascade algorithm is not only interesting from a programming point of view but can also provide a slight advantage when playing the game.

Inferring the behavior of the cascade algorithm just by playing Minesweeper is fairly difficult. However, it should be obvious that a cascade occurs only when all adjacent squares (next to the square clicked) do not contain mines. An interesting situation occurs when all the mines adjacent to the square clicked are marked as mines. One would expect a cascade to occur under this situation but this is not the case and can be verified.

Intuitively it seems that a cascade expands by uncovering adjacent mine-free blocks of 3×3 squares. To verify this conjecture I disassembled Minesweeper executable. The functions, StepBox, StepXY, and CountBombs were found to be relevant to the cascade algorithm. The cascade algorithm is summarized below:

  1. Initialize a queue
  2. If current square is non-mine uncover it and add to queue, otherwise gameover
  3. Remove a square from queue
  4. Count mines adjacent to it
  5. If adjacent mine count is zero, add any adjacent covered squares to queue and uncover them
  6. Go to step 3 if queue is not empty, otherwise finish

The description above is very high level and slightly simplified as compared to the actual algorithm. The actual data-structure used in the algorithm is a circular queue, implemented using an array, with a capacity of 100. Of course this assumes that there will never be more than a 100 elements in the queue. To justify this assumption Windows Minesweeper forces a maximum board size.

In the algorithm, step 5 looks to be fairly complex. The number of adjacent squares differs from place to place on the Minesweeper board. A corner square has only three adjacent squares, a square next to an edge of the board has five adjacent squares, and every other square has eight adjacent squares. It seems Minesweeper needs three cases dealing with each of these scenarios. Even counting the number of adjacent squares seems like quite a lot of code, so what does Minesweeper do?

The solution is quite clever. The visible Minesweeper board is the center of a larger board. For example, the 9×9 board is the center of an 11×11 board. The extra squares in the 11×11 board form a border around the 9×9 board. This allows every square of the visible board to have 8 adjacent squares, and therefore no special cases result. The cascade algorithm would break down if a non-visible square was put in the queue. This is avoided by initializing the extra squares as non-mine and uncovered. Now, these squares can never be put in the queue as step 5 of the algorithm only puts squares in the queue which are covered.

Another interesting detail is the iterative nature of the cascade algorithm. It is clear that the queue data-structure can be replaced with any container data-structure. A stack is an obvious choice. Which begs the question: why the implementors did not use recursion to implement the algorithm? The use of recursion in this case would have considerably simplified the algorithm as it would obviate the need for the queue data-structure. The recursive algorithm would work as follows:

  1. If current square is a mine gameover, otherwise uncover square
  2. Count mines adjacent to current square
  3. If adjacent mine count is zero, uncover all adjacent covered squares and make a recursive call for every one of them (steps 2-3)

Understanding the cascade algorithm yields some hints for gameplay as well. A cascade is more likely when the square clicked has fewer adjacent squares. This is true of the corners and edges. On the flip side, the cascade will have less room to expand when there are fewer adjacent blocks. Consequently the area uncovered by the cascade will be smaller.


LAST UPDATED by Usman Latif  [Nov 24, 2003]

Digital Implementation of the Library of Babel

In the well-known story “The Library of Babel” author Jorge Luis Borges envisions a library that contains all possible books of 410 pages. The library (Library of Babel) contains all books that have ever been written and also includes all possible books that can ever be written. The library is not limited to books that make sense. It stores all books consisting of any combination of a 25 symbol alphabet.

On first thought it seems that a digital implementation of the Library of Babel is impossible as it contains almost unimaginable amount of content. Surprisingly, this is not the case and digital implementations of such a library can exist.

In the digital Library of Babel every book is stored in ascii. Ascii allows more symbols than the 25 used by the Library of Babel, but is more convenient to use and avoids the effort of defining a new characterset.

A digital Library of Babel needs to store all the books in a data-structure and provide an interface for people to search for a particular book. To locate a particular book in the Library of Babel, searches based on titles, author names, and ISBN numbers cannot be used.

Titles don’t provide any useful information about the books in the library. The library contains gazillions of books with the same title but different contents. Exactly the same argument applies to author names as every author has written every possible book.

ISBN numbers in order to be practical must identify a book uniquely but not contribute to the book’s uniqueness. Otherwise, the library will contain many copies of every book which differ only in ISBN numbers.

Even with this restriction ISBN numbers are useless in the Library of Babel. The ISBN number identifying a book (on average) has to be as big as the contents of the book. ISBN numbers are practical only if the ratio of books that exist to books that are possible is very small. Suppose the world has only two books, written with symbols a and b, and having the contents:abab,baaa

It is easy to identify the first book with the tag a and the second book with the tag b. The Library of Babel is not so sparsely populated. It contains all possible books which includes the following list:a,b,c..,aa,ab,ac..,ba,bb,bc..

It is possible to identify the books consisting of abab and baaa with a and b respectively, but then the books with contents a and b have to be identified with a larger tag than their length. On average an identification tag as big as the book itself will be required.

Without loss of generality it can be assumed that books in the Library of Babel don’t have titles, author names, ISBN numbers, and any other identifying tags. Such tags as noted above don’t contribute anything towards easing searching for books and are only a waste of space.

The only choice for uniquely identifying the books in Library of Babel is the text of the books. Specifying only partial text does not identify books uniquely. The library is too densely populated and even leaving one bit out of the full-text results in non-unique matches. Therefore, the full-text of a book is required in order to locate a particular book.

Unlike the original library the digital version does not impose any restrictions on the length of the books contained in it. The digital Library of Babel contains the books as raw text. It does not store any formatting for the books. The users are free to format the books as they wish after they have retrieved them.

The following data-structure given as Haskell code can be used to implement the library:data Tree = Branch (Tree,Tree)
treeOfBabel = Branch (treeOfBabel,treeOfBabel)

The implementation uses a circular data-structure, treeOfBabel, to store all the data in the library using O(1) storage space. Searching for a book involves traversing the correct path in the tree. At each branch in the tree the first tree corresponds to a 0 bit and the second tree corresponds to a 1 bit. Given a search string, the search function navigates the tree using the bits of the search string. It accumulates the bits of the result while navigating the tree and after processing the search string it returns the accumulated bits as the result of the search. These bits can then be interpreted as ascii text.

The following implementation of the search function uses a slight optimization:searchBabel = id -- id is Haskell's identity function (id x = x)

The optimization is based on the fact that the user is always going to provide the full-text of the book so the input can be returned to the user as an answer to his/her query. This optimization does not work for ordinary libraries as they might or might not contain a book but the Library of Babel is assured to have any book.

The library of Babel exists, there can be no doubt about it. The big question that needs to be answered is whether the library of Babel is full of information or is it devoid of any information? The perspective of a philosopher suggests that the library is as full as it can be. On the other hand intuition suggests that the library is empty, it carries no useful information. Is our intuition right or wrong?

I will examine this issue in detail, using insights from information theory, in another article.


LAST UPDATED by Usman Latif  [Nov 08, 2003]

Part 3 – Google’s Bid-for-placement Patent Settlement Cover-up

Google always had excellent search engine indexing technology, but Google’s search technology by itself never generated profits for the company. Google’s profitability comes from its search technology combined with text ads and an ad placement mechanism that allows advertisers to bid for the placement of their ads (bid-for-placement mechanism). From a profitability perspective, the bid-for-placement mechanism is as valuable as Google’s indexing technology. In the absence of the bid-for-placement mechanism, ad pricing can at best be inefficient. The bid-for-placement mechanism frees up extensive resources that would otherwise be required to set ad prices, and it allows Google to charge ad sponsors in proportion to the value Google is delivering to the sponsors.

The bid-for-placement mechanism was pioneered by Overture, a paid search specialist company. In July 2001, the US patent office issued Overture a patent covering the mechanism. Patent 6,269,361 also known as the ‘361 patent was bad news for Google: it threatened Google’s core business model. It was imperative for Google to have access to the ‘361 patent, but Google never managed to negotiate a satisfactory licensing agreement with Overture. Consequently, in April 2002 Overture sued Google over patent infringement.

Google greatly miscalculated the threat posed by Overture and was eventually forced to settle at a most inopportune time. However, at the time, Google likely believed Overture was totally dependent on the ‘361 patent and could not afford any risk of having the patent invalidated in a court. Google calculated that it will win outright, if the courts dismiss Overture’s lawsuit early, but if things don’t go its way, it will still manage to cut a palatable deal. Unfortunately for Google, Overture did happen to have a few other options.

Overture started life as a paid listing search engine. The company was known as GoTo.com when it first began operations. Advertisers placed ads with GoTo.com and in response to queries the GoTo.com website produced a listing of ads ranked by the ad sponsors’ bids for the search keywords. The GoTo.com website never made it big, and the company was forced to pitch the idea of bid-for-placement to outside search engines. The company even changed its name to Overture to better reflect its new business model.

Overture had a bit more success with this new business model. Under this new business model Overture signed up affiliates and managed their ad sales via its bid-for-placement system. Overture recognized the ad sales of its affiliates as revenue on its books. The portion of the ad sales that went to its affiliates was recorded as traffic acquisition costs.

The paid-search market took off in 2001, and Overture prospered in the rapidly growing market. Overture earned $73 million on revenues of $667 million in 2002, but then a disturbing trend became apparent: Overture’s traffic acquisition costs started growing rapidly. Overture’s traffic acquisition costs grew from 53 percent of revenue to over 62 percent of revenue in just the last three quarters of 2002.

Even after handing over 62 percent to affiliates, a 38 percent cut of ad sales for providing access to the bid-for-placement mechanism is hardly unimpressive, and reflects the revenue generating power of the bid-for-placement mechanism. The trouble was the 38 percent number was just an average, and not every affiliate was paying the same amount to Overture. Overture was increasingly growing dependent on fewer and fewer affiliates for more and more of its revenue. Overture’s smaller customers were in decline as advertisers were focusing primarily on highly trafficked websites. Overture’s 2002 annual report reveals that Microsoft and Yahoo were responsible for 60 percent of Overture’s revenue that year. This dependency allowed the big affiliates to claim bigger and bigger chunks of ad sales as their share of the pie.

Overture certainly had a very valuable patent in the ‘361 patent, but the patent had not been tested in court. Overture knew if either Yahoo or Microsoft walked, the company could expect major layoffs and a severely weakened bargaining position with the rest of its affiliates. Worse, Overture could lose it all, if the company suffered a setback in court.

Overture’s management realized that the ‘361 patent was much more valuable to a company not so severely dependent on a single source of revenue. Such a company could bargain better, and could handle setbacks in courts. Yahoo made a lot of sense as Yahoo handled more web-traffic than anyone else. If the paid-search market grew as anticipated, Yahoo was going to generate most of its revenue from paid-search. Essentially, Yahoo would become what Overture wanted to be, but without the handicap of having to hand over vast chunks of revenue as traffic acquisition costs to affiliates. Consequently, it was reasonable to expect that over the long term Yahoo’s stock will perform as well as or better than Overture’s stock.

Yahoo too realized the value of Overture. Yahoo had always outsourced its search functionality to outside search engines. With Google becoming all powerful and no new contenders emerging in the search arena, Yahoo was in a tight spot. Yahoo realized that to be competitive in the search engine space, it needed access to Google’s intellectual property. Overture’s claim on the core of Google’s business model was just the bargaining chip Yahoo needed.

In July 2003 Yahoo acquired Overture for $1.63 billion. This was an expensive deal as Yahoo’s stock was not flying very high at the time. (Yahoo’s stock has more than doubled since then.) Also, Overture did not have anything valuable apart from the ‘361 patent. Yahoo and Microsoft counted for the bulk of Overture’s revenue, and were it not for the ‘361 patent, Microsoft would have certainly walked.

The Yahoo Overture deal meant that Google could no longer expect to cut a deal whenever things turned against it. Yahoo was now in a position to cut a very tough bargain with Google.

Google and Yahoo settled the ‘361 patent dispute in August 2004. Google disclosed the settlement in an SEC filing just before its IPO. The relevant excerpt from Google’s SEC filing reads:

… Overture will dismiss its patent lawsuit against us and has granted us a fully-paid, perpetual license to the patent that was the subject of the lawsuit and several related patent applications held by Overture. The parties also mutually released any claims against each other concerning the warrant dispute. In connection with the settlement of these two disputes, we issued to Yahoo 2,700,000 shares of Class A common stock.

At the time of the patent settlement disclosure, 2.7 million shares of Google represented roughly 1 percent of the company. Google estimated that the shares were worth somewhere between $260 and $290 million. (The estimate was based on Google’s proposed initial public offering price range of $108 to $135.) Yahoo had spent billions to corner Google so why would the company settle for such a paltry sum?

Actually, Google exaggerated the value of the shares it issued to Yahoo. Only a few days after the disclosure of the patent settlement, Google lowered its proposed initial public offering price range to between $85 and $95. Moreover, Google attempted to muddle up the math even further by jumbling together the numbers of the patent licensing settlement with the settlement of a separate second dispute with Yahoo.

In the second dispute, Yahoo claimed that a warrant it held, in connection with a June 2000 services agreement between the two companies, entitled it to 3.7 million shares of Google. Google disputed that claim and argued that it compensated Yahoo fully on the warrant account by issuing 1.2 million shares in June 2003.

Google’s 2004 annual report has the settlement value pinned down to $229.5 million, and it sheds some light on how much was paid for what. The section about the Yahoo settlement reads:

In the year ended December 31, 2004, the Company [Google] recognized the $201.0 million non-recurring charge related to the settlement of the warrant dispute [with Yahoo] and other items. The non-cash charge associated with these shares was required because the shares were issued after the warrant was converted. The Company realized a related income tax benefit of $82.0 million. The Company also capitalized $28.5 million related to certain intangible assets obtained in this settlement.

In the IPO filing the focus was on the patent dispute, but here the emphasis is clearly on the warrant dispute and ‘other items’. The attempt to fudge is there, but it is obvious the ‘other items’ do not cover the patent settlement. This because the $201.0 million amount was expensed all at once, whereas patents have a life and are expensed over that life. Google recorded the rest $28.5 million as intangible assets on its books. Patents are recorded as intangibles, so the $28.5 million amount is all Google paid for the patent licenses.

Why did Yahoo charge only $28.5 million for patents it acquired for $1.63 billion? The $28.5 million seems to cover Yahoo’s legal expenses associated with the patent litigation and likely does not represent any payment for patent licenses. Google must have compensated Yahoo in some other way, and the company is not being forthright about the settlement terms.

Yahoo certainly craved Google’s intellectual property, but the terms of the settlement make no mention of Yahoo licensing any of Google’s patents. Obviously, Google could not expect to get away with hiding an IP licensing agreement with its biggest competitor; therefore, it is reasonable to assume that Yahoo did not get such an agreement. But, Yahoo had an indirect way of achieving access to Google’s intellectual property.

Google’s SEC filings mention a “fully-paid, perpetual license,” but they omit the word non-revocable. Nowhere is there any mention of the patent license being, “fully-paid, non-revocable, and perpetual.” It is unreasonable to expect Google to inadvertently miss the word non-revocable, so Google’s license to the ‘361 patent has to be revocable. (Non-revocable patent licensing deals are nothing exotic.)

Now, there remains the question of the terms which if violated cause Google’s patent license to become revocable. Yahoo was in a very strong bargaining position as its patents covered the core of Google’s business model, so it must have asked for something big. The only obvious thing that makes sense was for Yahoo to have conditioned the revocability of Google’s patent license on Google’s not litigating against Yahoo. Such a condition puts Google on a leash, and effectively grants Yahoo the authority to use Google’s IP with complete immunity. Of course, there are other possibilities, but again there is no reason for Yahoo to let Google off the hook. Whatever Google is hiding is more than a little embarrassing.

The disclosure of any embarrassing patent licensing deal could have derailed Google’s IPO, so Google chose to cover it up. But why did Google settle for such poor terms? Why not go for an IPO without settling with Yahoo?

At the time the deal was being negotiated, Google was having a hard time selling its IPO to investors. Worse, a preliminary ruling concerning the Overture lawsuit was due, and Google expected extensive saber rattling from Yahoo if it did not settle. All these factors combined with market conditions could have derailed Google’s IPO or could have caused Google’s shares to tumble after they started trading. Clearly, some people at Google were excessively worried about the value of their options; greed was certainly at work.

Interestingly, Google had no need to go through an IPO. The company was profitable and doing quite well. The pressure to do an IPO was coming from the venture capitalists and employees. The VCs wanted to show impressive gains on their Google investment, and the employees wanted to cash out their stock options. The company could have made a better deal had it stayed private. Under this scenario, there would be no incentive to rush the deal, and Google could have held out for better terms.

Sadly, selfish interests have pushed Google to the brink of disaster. As a public corporation, Google was obligated to inform its shareholders of all significant risks to its business. Google not only failed to disclose the risks, but it intentionally tried to mislead its shareholders and potential investors. Google is certainly asking for an SEC investigation into its business practices and might have to pay fines. Worse, if ever Google’s stock takes a nosedive, Google can expect shareholder lawsuit hell. Shareholders are going to claim that Google intentionally hid potential risks from them, and they are going to be right about that.

Google is a great company but it needs new management. Managers who misrepresent information and manipulate investor expectations with an intent to profit from such actions are certainly not fit to run Google. Google’s shareholders must demand the complete dissolution of the current ineffective board of directors. (Some of the directors might be complicit in the cover-up.) New, responsible directors should be brought in to rid the company of unscrupulous managers and the culture of greed which is threatening to destroy the company.


by Usman Latif  [Apr 29, 2005]


LASTE Updated: May 07, 2005

Part 2 – Google’s Bid-for-placement Patent Settlement Cover-up

Google always had excellent search engine indexing technology, but Google’s search technology by itself never generated profits for the company. Google’s profitability comes from its search technology combined with text ads and an ad placement mechanism that allows advertisers to bid for the placement of their ads (bid-for-placement mechanism). From a profitability perspective, the bid-for-placement mechanism is as valuable as Google’s indexing technology. In the absence of the bid-for-placement mechanism, ad pricing can at best be inefficient. The bid-for-placement mechanism frees up extensive resources that would otherwise be required to set ad prices, and it allows Google to charge ad sponsors in proportion to the value Google is delivering to the sponsors.

The bid-for-placement mechanism was pioneered by Overture, a paid search specialist company. In July 2001, the US patent office issued Overture a patent covering the mechanism. Patent 6,269,361 also known as the ‘361 patent was bad news for Google: it threatened Google’s core business model. It was imperative for Google to have access to the ‘361 patent, but Google never managed to negotiate a satisfactory licensing agreement with Overture. Consequently, in April 2002 Overture sued Google over patent infringement.

Google greatly miscalculated the threat posed by Overture. Google likely believed Overture was totally dependent on the ‘361 patent and could not afford any risk of having the patent invalidated in a court. Google calculated that it will win outright, if the courts dismiss Overture’s lawsuit early, but if things don’t go its way, it will still manage to cut a palatable deal. Unfortunately for Google, Overture did happen to have a few other options.

Overture started life as a paid listing search engine. The company was known as GoTo.com when it first began operations. Advertisers placed ads with GoTo.com and in response to queries the GoTo.com website produced a listing of ads ranked by the ad sponsors’ bids for the search keywords. The GoTo.com website never made it big, and the company was forced to pitch the idea of bid-for-placement to outside search engines. The company even changed its name to Overture to better reflect its new business model.

Overture had a bit more success with this new business model. Under this new business model Overture signed up affiliates and managed their ad sales via its bid-for-placement system. Overture recognized the ad sales of its affiliates as revenue on its books. The portion of the ad sales that went to its affiliates was recorded as traffic acquisition costs.

The paid-search market took off in 2001, and Overture prospered in the rapidly growing market. Overture did well in 2001 and 2002, but then a disturbing trend became apparent: Overture’s traffic acquisition costs started growing rapidly. Overture’s traffic acquisition costs grew from 53 percent of revenue to over 62 percent of revenue in just the last three quarters of 2002.

Even after handing over 62 percent to affiliates, a 38 percent cut of ad sales for providing access to the bid-for-placement mechanism is hardly unimpressive, and reflects the revenue generating power of the bid-for-placement mechanism. The trouble was the 38 percent number was just an average, and not every affiliate was paying the same amount to Overture. Overture was increasingly growing dependent on fewer and fewer affiliates for more and more of its revenue. Overture’s smaller customers were in decline as advertisers were focusing primarily on highly trafficked websites. Overture’s 2002 annual report reveals that Microsoft and Yahoo were responsible for 60 percent of Overture’s revenue that year. This dependency allowed the big affiliates to claim bigger and bigger chunks of ad sales as their share of the pie.

Overture certainly had a very valuable patent in the ‘361 patent, but the patent had not been tested in court. Overture knew if either Yahoo or Microsoft walked, the company could expect major layoffs and a severely weakened bargaining position with the rest of its affiliates. Worse, Overture could lose it all, if the company suffered a setback in court.

Overture’s management realized that the ‘361 patent was much more valuable to a company not so severely dependent on a single source of revenue. Such a company could bargain better, and could handle setbacks in courts. Yahoo made a lot of sense as Yahoo handled more web-traffic than anyone else. If the paid-search market grew as anticipated, Yahoo was going to generate most of its revenue from paid-search. Essentially, Yahoo would become what Overture wanted to be, but without the handicap of having to hand over vast chunks of revenue as traffic acquisition costs to affiliates. Consequently, it was reasonable to expect that over the long term Yahoo’s stock will perform as well as or better than Overture’s stock.

Yahoo too realized the value of Overture. Yahoo had always outsourced its search functionality to outside search engines. With Google becoming all powerful and no new contenders emerging in the search arena, Yahoo was in a tight spot. Yahoo realized that to be competitive in the search engine space, it needed access to Google’s intellectual property. Overture’s claim on the core of Google’s business model was just the bargaining chip Yahoo needed.

In July 2003 Yahoo acquired Overture for $1.63 billion. This was an expensive deal as Yahoo’s stock was not flying very high at the time. (Yahoo’s stock has more than doubled since then.) Also, Overture did not have anything valuable apart from the ‘361 patent. Yahoo and Microsoft counted for the bulk of Overture’s revenue, and were it not for the ‘361 patent, Microsoft would have certainly walked.

The Yahoo Overture deal meant that Google could no longer expect to cut a deal whenever things turned against it. Yahoo was now in a position to cut a very tough bargain with Google.

Google and Yahoo settled the ‘361 patent dispute in August 2004. Google disclosed the settlement in an SEC filing some time before its IPO. The relevant excerpt from Google’s SEC filing reads:

… Overture will dismiss its patent lawsuit against us and has granted us a fully-paid, perpetual license to the patent that was the subject of the lawsuit and several related patent applications held by Overture. The parties also mutually released any claims against each other concerning the warrant dispute. In connection with the settlement of these two disputes, we issued to Yahoo 2,700,000 shares of Class A common stock.

At the time of the patent settlement disclosure, 2.7 million shares of Google represented roughly 1 percent of the company. Google estimated that the shares were worth somewhere between $260 and $290 million. (The estimate was based on Google’s proposed initial public offering price range of $108 to $135.) Yahoo had spent billions to corner Google so why would the company settle for such a paltry sum?

Actually, Google exaggerated the value of the shares it issued to Yahoo. Only a few days after the disclosure of the patent settlement, Google lowered its proposed initial public offering price range to between $85 and $95. Moreover, Google attempted to muddle up the math even further by jumbling together the numbers of the patent licensing settlement with the settlement of a separate second dispute with Yahoo.

In the second dispute, Yahoo claimed that a warrant it held, in connection with a June 2000 services agreement between the two companies, entitled it to 3.7 million shares of Google. Google disputed that claim and argued that it compensated Yahoo fully on the warrant account by issuing 1.2 million shares in June 2003.

Google’s 2004 annual report has the settlement value pinned down to $229.5, and it sheds some light on how much was paid for what. The section about the Yahoo settlement reads:

In the year ended December 31, 2004, the Company [Google] recognized the $201.0 million non-recurring charge related to the settlement of the warrant dispute [with Yahoo] and other items. The non-cash charge associated with these shares was required because the shares were issued after the warrant was converted. The Company realized a related income tax benefit of $82.0 million. The Company also capitalized $28.5 million related to certain intangible assets obtained in this settlement.

Again there is an attempt to fudge, but it is clear that out of $229.5 million settlement $201.0 million were for the settlement of the warrant dispute. This because the $201.0 million amount was expensed all at once, whereas patents have a life and are expensed over that life. Google recorded the rest $28.5 million as intangible assets on its books. Patents are recorded as intangibles, so the intangible assets mentioned above have to be the patent licenses Google acquired.

Why did Yahoo charge only $28.5 million for patents it acquired for $1.63 billion? The $28.5 million seems to be compensation for Yahoo’s legal expenses and likely does not represent any monetary compensation for patent licenses. Google must have compensated Yahoo in some other way, and the company is not being forthright about the settlement terms.

Yahoo certainly craved Google’s intellectual property, but the terms of the settlement make no mention of Yahoo licensing any of Google’s patents. Obviously, Google could not expect to get away with hiding an IP licensing agreement with its biggest competitor; therefore, it is reasonable to assume that Yahoo did not get such an agreement. But, Yahoo had an indirect way of achieving access to Google’s intellectual property.

Google’s SEC filings mention a “fully-paid, perpetual license,” but they omit the word non-revocable. Nowhere is there any mention of the patent license being, “fully-paid, non-revocable, and perpetual.” It is unreasonable to expect Google to inadvertently miss the word non-revocable, so Google’s license to the ‘361 patent has to be revocable. (Non-revocable patent licensing deals are nothing exotic.)

Now, there remains the question of the terms which if violated cause Google’s patent license to become revocable. Yahoo was in a very strong bargaining position as its patents covered the core of Google’s business model, so it must have asked for something big. The only obvious thing that makes sense was for Yahoo to have conditioned the revocability of Google’s patent license on Google’s not litigating against Yahoo. Such a condition puts Google on a leash, and effectively grants Yahoo the authority to use Google’s IP with complete immunity. Of course, there are other possibilities, but again there is no reason for Yahoo to let Google off the hook. Whatever Google is hiding is more than a little embarrassing.

The disclosure of any embarrassing patent licensing deal could have derailed Google’s IPO, so Google chose to cover it up. But why did Google settle for such poor terms? Why not go for an IPO without settling with Yahoo?

At the time the deal was being negotiated, Google was having a hard time selling its IPO to investors. Worse, a preliminary ruling concerning the Overture lawsuit was due, and Google expected extensive sabre rattling from Yahoo if it did not settle. All these factors combined with market conditions could have derailed Google’s IPO or could have caused Google’s shares to tumble after they started trading. Clearly, some people at Google were excessively worried about the value of their options; greed was certainly at work.

Interestingly, Google had no need to go through an IPO. The company was profitable and doing quite well. The pressure to do an IPO was coming from the venture capitalists and employees. The VCs wanted to show impressive gains on their Google investment, and the employees wanted to cash out their stock options. The company could have made a better deal had it stayed private. Under this scenario, there would be no incentive to rush the deal, and Google could have held out for better terms.

Sadly, selfish interests have pushed Google to the brink of disaster. As a public corporation, Google was obligated to inform its shareholders of all significant risks to its business. Google not only failed to disclose the risks, but it intentionally tried to mislead its shareholders and potential investors. Google is certainly asking for an SEC investigation into its business practices and might have to pay fines. Worse, if ever Google’s stock takes a nosedive, Google can expect shareholder lawsuit hell. Shareholders are going to claim that Google intentionally hid potential risks from them, and they are going to be right about that.

Google is a great company but it needs new management. Managers who misrepresent information and manipulate investor expectations with an intent to profit from such actions are certainly not fit to run Google. Google’s shareholders must demand the complete dissolution of the current ineffective board of directors. New, responsible directors should be brought in to rid the company of unscrupulous managers and the culture of greed which is threatening to destroy the company.

LAST UPDATED by Usman Latif  [May 31, 2005]

Google’s Bid-for-placement Patent Settlement Cover-up

When Google acquired blogging startup Blogger in February 2003, nobody foresaw that two years later Google will earn the ignominious distinction of being one of the first corporations to fire an employee for blogging. In January of this year, Mark Jen a new Google hire was sent packing for a blogging offense. Mark Jen writes in his blog:

“I was terminated from google. either directly or indirectly, my blog was the reason. this came as a great shock to me because two days ago we had looked at my blog and removed all inappropriate content – the comments on financial performance and future products. for my next entries, I was very cognizant of my blogging content, making sure to stay away from these topics. I mean, as much as I like to be open and honest about communicating to users and customers, i’m not insubordinate. if I was told to shut down this blog, I would have.”

Mark Jen is justly irate about the harsh treatment. Google did not have a blogging policy at the time, and Mark did his best to comply with the wishes of his superiors. Furthermore, Mark Jen was not a high ranking Google manager with important information to disclose; he had only just started work at Google. If Google’s management wanted to avoid making an incidence of the Mark Jen case, it had plenty of options, but Google’s management willingly embraced bad PR and proceeded with a response disproportionate to the offense.

(Update: July 10, 2008
In retrospect, I feel I exercised bad judgement in connecting Mark Jen’s firing to Google’s cover-up of its patent settlement with Yahoo. In fact, this article as originally published in April 2005 did not contain any reference to Mark Jens’ firing, but I subsequently added the material. Mark Jen’s firing was suspicious and made no sense, and it may have been prompted by motives I outlined in this article; however, I have no evidence that the firing was linked to Google’s cover-up of its Yahoo deal.)

It turns out there was a very good reason for management’s bizarre behavior. At the time of Mark Jen’s dismissal, Google’s management was playing a game of deceit to mask an ugly but not so well-kept secret. Management wanted employees to keep their mouths shut and firing Mark Jen, a new hire, was a convenient way of conveying the message without ruffling many feathers.

Google’s ugly secret has quite a bit of history. The trouble started with Google’s push for profitability. Google owned excellent search engine indexing technology all along, but it lacked a business model to extract profits from search traffic. Google’s solution to the problem was the AdWords business model. The AdWords business model combines Google’s search technology, text ads, and an ad-placement mechanism that allows advertisers to bid for the placement of ads (the bid-for-placement mechanism).

The bid-for-placement mechanism is a very important component of the AdWords business model. In the absence of the bid-for-placement mechanism, ad pricing can at best be inefficient. The bid-for-placement mechanism frees up extensive resources that would otherwise be required to set ad prices, and it allows Google to charge ad sponsors in proportion to the value Google is delivering to the sponsors. Thus, from a profitability perspective, the bid-for-placement mechanism is as valuable as Google’s indexing technology. Unfortunately, Google did not own the rights to this important mechanism when it adopted the AdWords business model.

Overture Inc. a paid search specialist pioneered the bid-for-placement mechanism. In July 2001, the US patent office issued Overture a patent for the mechanism. Patent 6,269,361; also known as the ‘361 patent, essentially covered all AdWords like business models. Google needed access to the ‘361 patent in order to use AdWords, but it never managed to negotiate a licensing agreement with Overture. Consequently, in April 2002 Overture sued Google over patent infringement. Google was now operating with a loaded gun pointed to its head.

Overture started life as a paid listing search engine. The company was known as GoTo.com when it first began operations. Advertisers placed ads with GoTo.com and in response to queries the GoTo.com website produced a listing of ads ranked by the ad sponsors’ bids for the search keywords. The GoTo.com website never made it big, and the company was forced to pitch the idea of bid-for-placement to outside search engines. The company even changed its name to Overture to better reflect its new business model.

Overture had a bit more success with this new business model. Under this new business model Overture signed up affiliates and managed their ad sales via its bid-for-placement system. Overture recognized the ad sales of its affiliates as revenue on its books and recorded the portion of ad sales that went to its affiliates as traffic acquisition costs.

The paid-search market took off in 2001, and Overture prospered in the rapidly growing market for a while. Overture earned $73 million on revenues of $667 million in 2002, but then a disturbing trend became apparent: Overture’s traffic acquisition costs started spiraling out of control. Overture’s traffic acquisition costs grew from 53 percent of revenue to over 62 percent of revenue in just the last three quarters of 2002.

Even after handing over 62 percent to affiliates, a 38 percent cut of ad sales for providing access to the bid-for-placement mechanism is hardly unimpressive, and reflects the revenue generating power of the bid-for-placement mechanism. Unfortunately, the 38 percent number was just an average, and not every affiliate was paying the same amount to Overture. Overture was increasingly growing dependent on fewer and fewer affiliates for more and more of its revenue. Overture’s smaller customers were in decline as advertisers were concentrating primarily on highly trafficked websites. Overture’s 2002 annual report reveals that Microsoft and Yahoo were responsible for 60 percent of Overture’s revenue that year. This dependency allowed the big affiliates to claim bigger and bigger chunks of ad sales as their share of the pie.

The ‘361 patent was certainly very valuable, but the patent had not been tested in court. If either Yahoo or Microsoft walked, Overture could expect major layoffs and a severely weakened bargaining position with the rest of its affiliates. Additionally, one big setback in court could end Overture’s game for good. Such risks implied a vulnerable market position, and pushed Overture’s management towards seeking a buyer for the company. A company with well-diversified sources of revenue was Overture’s best option. Such a company could handle setbacks in courts and bargain better with affiliates on account of its stronger finances.

Microsoft made a lot of sense as Microsoft was well-diversified and loaded with cash. There was some enthusiasm for acquiring Overture inside of Microsoft as well. According to Fortune Magazine, Chris Payne, a Microsoft manager, tried to convince Bill Gates to proceed with an Overture acquisition; however, Bill Gates did not see sufficient value in Overture and rejected the acquisition. Overture’s value lay in its patent portfolio, and Bill Gates must have miscalculated the strategic importance of Overture’s patent portfolio.

Yahoo was Overture’s backup choice and it too made a lot of sense. Yahoo handled more web-traffic than anyone else, and with Overture’s patent portfolio Yahoo could become what Overture wanted to be, but without the handicap of having to hand over vast chunks of revenue to affiliates.

Unlike Bill Gates, Yahoo was aware of Overture’s potential. Yahoo had always outsourced its search functionality to outside search engines. With Google gaining prominence and no new search contenders on the horizon, Yahoo was in a tight spot. Yahoo realized that to be competitive in the search engine arena, it needed access to Google’s intellectual property. Overture’s claim on the core of Google’s business model was just the bargaining chip Yahoo needed.

In July 2003 Yahoo acquired Overture in a mostly stock deal valued at $1.63 billion. This was an expensive deal as Yahoo’s stock was not flying very high at the time. Also, Overture did not have anything valuable apart from the ‘361 patent. Yahoo and Microsoft counted for the bulk of Overture’s revenue, and were it not for the ‘361 patent, Microsoft would have certainly walked. The deal was terrible news for Google as Yahoo was now in a position to cut a very tough bargain with Google.

Google and Yahoo settled the ‘361 patent dispute in August 2004. Google disclosed the settlement in an SEC filing just before its initial public offering (IPO). The relevant excerpt from Google’s SEC filing states:

Overture will dismiss its patent lawsuit against us and has granted us a fully-paid, perpetual license to the patent that was the subject of the lawsuit and several related patent applications held by Overture. The parties also mutually released any claims against each other concerning the warrant dispute. In connection with the settlement of these two disputes, we issued to Yahoo 2,700,000 shares of Class A common stock.

At the time of the patent settlement disclosure, 2.7 million shares of Google represented roughly 1 percent of the company. Google estimated that the shares were worth somewhere between $260 and $290 million. This estimate was based on Google’s proposed IPO price range of $108 to $135 a share, which was subsequently lowered to $85 a share. Interestingly, even the $290 million number does not represent any sort of adequate return on Yahoo’s $1.63 billion Overture investment. Yahoo was licensing critical patents to Google at a critical time for less than one fifth of what it paid to acquire them.

Interestingly, Google never paid anything even remotely close to $290 million for the patents. Google quite successfully managed to muddle up the math by jumbling together the numbers of the patent licensing settlement with the settlement of a separate second dispute with Yahoo.

In the second dispute, Yahoo claimed a warrant it held in connection with a June 2000 services agreement between the two companies entitled it to 3.7 million Google shares. Google had issued 1.2 million shares in June 2003 to settle Yahoo’s claims, but Yahoo wanted an additional 2.5 million shares.

Google’s 2004 annual report has the settlement value pinned down to $229.5 million, and it sheds some light on how much was paid for what. The section about the Yahoo settlement states:

In the year ended December 31, 2004, the Company [Google] recognized the $201.0 million non-recurring charge related to the settlement of the warrant dispute [with Yahoo] and other items. The non-cash charge associated with these shares was required because the shares were issued after the warrant was converted. The Company realized a related income tax benefit of $82.0 million. The Company also capitalized $28.5 million related to certain intangible assets obtained in this settlement.

In the IPO filing the focus was on the patent dispute, but here the emphasis is clearly on the non-recurring charge related to the warrant dispute and “other items”. Interestingly, the “other items” mentioned above do not include patent licenses (explanation follows), so out of the $228.5 million settlement $201 million had absolutely nothing to do with the patent settlement.

US corporations are required by law to follow generally accepted accounting principles (GAAP) put forth by the Financial Accounting Standards Board (FASB), and compliance to this requirement is assured by independent auditors. One fundamental component of GAAP is the matching principle. This principle requires revenues to be matched with the expenses it took to generate them, i.e., expenses relevant to the revenues reported are recognized in the same period as the revenues. Often matching expenses with revenue is not straightforward, and then the matching principle reduces to recognizing expenses when resources are utilized as opposed to when resources are paid-for or acquired.

For instance, if a corporation pays for a one year insurance contract at the start of its fiscal year, the corporation can not record the total cost of the insurance contract as an expense in its first quarter. The corporation will be availing of the insurance company’s services over one year, so the insurance expense has to be spread over that time frame. To accomplish this the corporation capitalizes the insurance contract, i.e., the corporation records the insurance contract as an asset, prepaid insurance, on its books. Afterwards, the corporation records insurance expense in four subsequent quarters to write-off the prepaid insurance asset.

A patent license is conceptually somewhat similar to an insurance contract. A typical patent license grants the licensee the right to use a patent for a number of years, and if the licensee prepays for the license then it makes sense to capitalize the cost of the license. In fact, GAAP requires prepaid patent licenses to be capitalized. Google too was required to capitalize the patent license it acquired from Yahoo and expense it over time. However, the $201 million one time charge was something that was not capitalized; therefore, it had no relevance to the patent licenses Google acquired.

A $201 million payment for the warrant dispute makes perfect sense as this amount represents 2.36 million Google shares which is fairly close to the 2.5 million shares Yahoo wanted. Yahoo had zero incentive to compromise on its demands with Google’s IPO so close, and Google’s financial statements clearly indicate that Yahoo got what it wanted. Google recorded the rest $28.5 million as intangible assets on its books. Patents are capitalized as intangible assets; therefore, $28.5 million is all Google paid for the patent licenses it acquired from Yahoo.

Yahoo was also a participant in the patent licensing transaction, and it too was required to report the transaction. Yahoo needed to recognize revenue and under GAAP revenue is reported in accordance with the revenue recognition principle. This principle requires revenue attributed to a product/service to be recognized in the same period in which the product/service was sold/rendered, regardless of the exchange of cash. Under this principle, prepayments for services are recorded as liabilities and revenue is recognized against these liabilities when the relevant services are rendered.

The patent licenses Google acquired counted as service agreements under GAAP. Google prepaid for the patent licenses, so Yahoo was required to report the prepayment as deferred revenue, a liability. Yahoo’s 2004 annual report states:

“The Company [Yahoo] agreed to dismiss the 361 patent lawsuits and has granted to Google a fully-paid license to the 361 patent as well as several related patent applications held by Overture. The Company allocated the 2.7 million shares between the two disputes, based on the relative fair values of the two disputes, including consideration of a valuation performed by a third party. A portion of the shares allocated to the patent dispute has been recorded as an adjustment to goodwill for the period that the patents were in effect prior to Overture’s acquisition by the Company. The portion of the shares received for the settlement of the patent dispute which has been allocated to future periods has been recorded in deferred revenue on the consolidated balance sheets and will be recognized as fees revenues over the remaining life of the patent, approximately 12 years.”

Yahoo divided the proceeds of patent licensing agreements amongst the goodwill and deferred revenue accounts. The twelve year remaining life of the patent mentioned in the annual report implies Yahoo allocated most of the proceeds from the licensing agreement to its deferred revenue account. Consequently, Yahoo’s deferred revenue account ought to show a big spike if the proceeds from the patent agreement were substantial.

Yahoo reported deferred revenue of $224 million, an increase of $11 million over the prior quarter, in its Q3 2004 (3rd quarter of fiscal 2004). This was the quarter in which the settlement took place, and a substantial increase in deferred revenue was anticipated. Yahoo’s deferred revenue was $192 million, $201 million, and $213 million in Q4 2003, Q1 2004, and Q2 2004, respectively. Considering these numbers, the $11 million increase in the Q3 2004 in no way represents a break from the obvious trend in Yahoo’s deferred revenue account. The patent settlement value has to be insubstantial as there is no spike in Yahoo’s deferred revenue.

Both Google and Yahoo’s financial statements indicate that the patent licensing portion of the $229.5 settlement was no bigger than $28.5 million. Spread over the life of the patent the licensing fees are paltry, and it is simply inconceivable how Yahoo agreed to license its billion dollar patents for such a ridiculous sum.

Patent litigation is extremely expensive and Yahoo must have spent a good chunk of $28.5 million on legal expenses by the time of the settlement. This suggests the $28.5 million covers Yahoo’s legal expenses associated with the patent litigation and does not represent any payment for patent licenses. Google compensated Yahoo for the patent licenses in some other way, and the company is not being forthright about the settlement terms.

Yahoo craved Google’s intellectual property, but the terms of the settlement make no mention of Yahoo licensing any of Google’s patents. Google could not expect to get away with hiding an IP licensing agreement with its biggest competitor; therefore, it is reasonable to assume that Yahoo did not get such an agreement. However, Yahoo had an indirect way of achieving access to Google’s intellectual property.

Google’s SEC filings mention a “fully-paid, perpetual license,” but they omit the word non-revocable. Patent license terms often include non-revocable in addition to perpetual. Perpetual seems to indicate non-revocability but it really does not. It is unreasonable to expect Google’s high-powered lawyers to miss the word non-revocable, so Google’s license to the ‘361 patent is revocable.

Now all that remains is the question of the terms which if violated cause Google’s patent license to become revocable. The only obvious term that makes sense was for Yahoo to have conditioned the revocability of Google’s patent license on Google’s not litigating against Yahoo. Such a condition puts Google on a leash, and effectively grants Yahoo the authority to use Google’s patents with complete immunity. Of course, there are other possibilities, but again there was no reason for Yahoo to settle for anything less than unhindered access to Google’s patent portfolio.

Settling on onerous terms with a competitor is not a crime, and Google did nothing wrong by settling with Yahoo. If Google had reported the transaction honestly there would be no problem at all, but Google didn’t do that.

US regulations require corporations to disclose all significant business risks to investors. Before the settlement, Overture’s patent litigation posed a threat to Google’s business and Google did disclose that. The settlement removed the risk of litigation but by giving away the company’s technical superiority it created a new risk. Google essentially traded one type of risk for another, and was bound to disclose the nature of the new risk. Google never did that, and in doing so ran afoul of US regulations.

Google’s patent settlement disclosure on August 9 2004 was immediately followed by an IPO auction on August 13. Typically, investors don’t care too much about pending patent litigation when investing in IPOs, but the timing of Google’s patent settlement suggests Google’s IPO was the motivation for the settlement and the ensuing cover-up.

US regulations prohibit companies from aggressively marketing their IPOs and compliance is assured by forcing companies into a quiet period. In the quiet period a company can disclose information about its business activities only via an SEC filing known as the company’s prospectus. The goal being to prevent unsound businesses from pitching themselves to investors. This results in a substantial handicap. Additionally, new businesses have the problem of being restricted to raising funds from retail investors (individuals who buy/sell securities for their personal account) on account of their limited contacts. Retail investors constitute a small portion of the trading activity in the stock market, and are only willing to invest small amounts in speculative ventures. To raise a billion dollar or more a company might need to sell its IPO shares to tens if not hundreds of thousands of retail investors which is impractical at best.

Fortunately, companies going for IPOs don’t have to worry about such problems; instead, they hire investment banks as underwriters and relegate these problems to investment bankers. Most of the big money in the stock market is controlled by institutional investors such as mutual funds, endowment funds, pension funds and hedge funds. Investment bankers deal with large institutional investors on an everyday basis and are able to market IPOs to such investors. In addition, they help companies price their IPOs by assessing demand and obtaining commitments from investors.

Investment bankers typically collect seven percent of IPO proceeds for their effort. Even better, investment bankers get to allocate IPO shares. As IPO shares are almost always underpriced and often experience a pop of 15 percent on the first day of trading, IPO allocation privileges are very valuable and enable investment bankers to handsomely reward their favored clients. This means the direct and indirect payoff to investment bankers totals around 20 percent of the value of IPO proceeds.

Google’s IPO was atypical. Instead of letting its investment bankers control its IPO, Google tried to price and allocate IPO shares by an auction. Additionally, Google’s investment bankers received only 2.8 percent of IPO proceeds instead of the usual 7 percent. Google’s IPO was a very bad deal for investment bankers, but that was not the end of it. Google’s IPO was the IPO event of 2004; the success of Google’s IPO auction would have turned the IPO underwriting business upside down. It would have caused a flood of companies to embrace IPO auctions, and in the process threatened the very livelihood of IPO underwriters. Consequently, Google’s investment bankers had powerful incentives to do their utmost best to sabotage Google’s IPO auction.

Google’s IPO auction turned out to be a complete rout. Towards the end of its IPO auction Google simply gave up and allowed its IPO underwriters to run the show like a traditional IPO and price its IPO offering. The company not only sold its IPO shares at $85 a share, well below the $108-$135 pricing range, but also reduced the number of shares it was planning to offer to 19.6 million from 25.7 million.

Google’s investment bankers had to play a balancing game; they needed to sabotage Google’s IPO auction but they also needed to preserve their credibility. This suggests, Google’s IPO underwriters discreetly high-lighted risks stated in Google’s prospectus to institutional investors. Google’s Overture troubles were mentioned in Google’s preliminary prospectus and constituted convenient ammunition. It must have been a straightforward job to hype Google’s patent litigation risks to institutional investors, and scare them into seeking assurances as to an amicable settlement with Yahoo and steep discounts on Google’s IPO auction pricing range.

Once the institutional investors became wary of Google’s patent troubles, Google had to settle and cover-up in order to proceed with its IPO. There was no other option. The investment bankers couldn’t have helped, as they couldn’t simply tell their clients that they exaggerated Google’s troubles and investing in Google was safe. Disclosure of the unfavorable terms was not a choice either, as that would have scared institutional investors even more.

Google’s choice of proceeding with its IPO was suicidal. The patent settlement cover-up was always going to be an open secret. Mark Jen’s wrongful termination suggests the details of the cover-up were widely known inside of Google, and Yahoo’s managers knew of the cover-up from the start.

Google’s future prospects are not good. By hiding risks from investors Google has exposed itself to shareholder lawsuits. Google can only hope that its stock doesn’t take a nosedive; otherwise, Google can expect shareholder lawsuit hell. Shareholders are going to claim Google intentionally hid potential risks, and they are going to be right about that.

US securities regulators are a much graver threat to Google. Google flouted US securities regulations right under the noses of SEC lawyers, and SEC lawyers are not going to find that amusing in the least. Google’s top management and its directors obviously knew about the cover-up, and regulators are not likely to want to miss a chance to fry some really big fish.

Interestingly, Google had no need to go through an IPO. Even before its IPO Google was profitable, and had access to all the capital it needed to expand operations. By simply staying private Google could have avoided the whole mess. Under this scenario, Google would have no obligation to disclose settlement terms. Even better it would have no incentive to rush the settlement in the first place. Google could have simply held-out for better terms. The pressure for an IPO was coming from the venture capitalists and employees. The VCs wanted to show impressive gains on their Google investments, and the employees wanted to cash out their stock options. Petty greed combined with extreme naivete and hubris seems to have led Google’s management on the path to ruin.


LAST UPDATED by Usman Latif  [May 31, 2005]


Updates:
9 March 2006: Edited for clarity
31 May 2005: This document has gone through a number of revisions, and the original version as well as an older revision to the original are also accessible.

What Microsoft Wants from Yahoo

Microsoft’s now retracted acquisition bid for Yahoo has been endlessly analysed in the media and myriad explanations have been offered for Microsoft’s interest in Yahoo. Analysts have generally assumed the obvious that Microsoft made the bid in order to acquire Yahoo. Yahoo CEO Jerry Yang, though, does not agree with this assumption. He has publicly claimed that Microsoft did not intend to acquire Yahoo. Some analysts have dismissed Jerry Yang’s claim as self-serving, but this can not be so. As part of ongoing shareholder litigation, Jerry Yang may need to defend this claim in a court of law and could not have made the claim lightly. Besides, there is plenty of evidence that supports Jerry Yang’s claim.

Microsoft’s $31 a share bid came just one day after Yahoo’s share price hit a four-and-a-half year low of $19.05. Yahoo shares closed at $31.36 on November 05, 2007; less than three months before the Microsoft bid. Yahoo had not reported a major downturn in its operations in the months leading to decline of the stock from above $30, and Yahoo’s share price exhibits considerable volatility. Share prices on a particular day are meaningless anyway as prices can move drastically even on the basis of insignificant trading volume. This suggests that Yahoo’s $19.18 a share closing price on January 31, 2008 was uncharacteristically low, and did not reflect a realistic valuation of Yahoo. Considering this the much touted 60 percent premium Microsoft offered over Yahoo’s closing price on January 31st appears to be mostly a well-planned and cleverly executed public relations coup.

The Microsoft bid never made sense from a business perspective either. Yahoo has always had stale search offerings, second rate search technology, and a mediocre unmotivated workforce. Yahoo derives its value primarily from the massive web-traffic the company controls, but the cost of controlling this web-traffic is likely to be prohibitive for Microsoft. In case of an acquisition, Microsoft needs to assimilate Yahoo’s 10,000 employee workforce into its corporate culture, and it is questionable if the “synergies” Microsoft is after outweigh the inefficiencies that will result from having to take on an army of sub-par and unmotivated employees.

Interestingly, Microsoft’s Yahoo bid does not make sense even after assuming a strong business case for a Microsoft Yahoo merger. Had Yahoo’s board accepted Microsoft’s $31 a share bid, the deal still could not have gone through because of antitrust issues. A Microsoft Yahoo merger is not a problem for anyone in search but it is a massive threat to Adobe Systems and its dominant rich internet application (RIA) development platform: Flash. Microsoft’s RIA platform, Microsoft Silverlight, competes directly against Flash and in the case Microsoft acquires Yahoo, Silverlight will get a hold over some of the web’s most popular websites and more than 50 percent of the webmail market. Consequently, any Microsoft attempt to acquire Yahoo is bound to be challenged by Adobe on antitrust grounds. More than that, because of Adobe’s legitimate concerns, Microsoft has to accommodate Adobe, which it can not do without divesting most of Yahoo’s valuable web-assets.

Evidently, Microsoft’s bid for Yahoo had no realistic chance of being successful. But why did Microsoft bid at all? Microsoft’s bid has to be an unconventional tactic for putting pressure on Yahoo, but to what end?

The precise nature of Microsoft’s interest in Yahoo has been clear for years. Even before Microsoft’s acquisition bid for Yahoo the two companies had been talking for years. Back in 2006, in an interview with New Yorker journalist Ken Auletta (video of interview no longer available), then Yahoo CEO Terry Semel publicly acknowledged that Microsoft and Yahoo were talking. In the interview Terry Semel said, “Microsoft taking over Yahoo — that conversation has never come up,” Terry continued, “[We discussed] search, and Microsoft co-owning some of our search. I will not sell a piece of search. It is like selling your right arm while keeping your left — it does not make any sense.”

The idea of a competitor co-owning Yahoo’s core business is extremely odd if not outright nonsensical, and Terry Semel rightly spurned Microsoft. However, Terry Semel did not stop at not wanting to sell anything to Microsoft, but made an astonishing remark that received extensive media coverage at that time. He said, “My impartial advice to Microsoft is that you have no chance. The search business has been formed.”

The above remark was all the more surprising as Yahoo was not even close to being dominant in the search market at the time. In fact, in the very same interview Terry Semel admits to Yahoo not even being in the search business just 2-3 years back. Terry Semel further credits Yahoo’s June 2003 acquisition of paid search specialist Overture Services in getting it into search. He says of the Overture Services acquisition, “Turns out we earned over $200 million that year (from Overture) and that was the start of search”

Yahoo’s Overture Services acquisition is the crucial link between Yahoo and Microsoft and is the key to understanding Terry Semel’s Microsoft comments. Yahoo’s Overture acquisition was not about search technology even though Overture owned the AltaVista and AllTheWeb search engines. By the time of the Overture acquisition, Yahoo already owned search technology via its acquisition of the Inktomi search engine in December 2002. Overture Services specialized in paid search. Overture Services’ 2002 annual report reads:

Overture Services, Inc. is the global leader in Pay-For-Performance search (also known as paid search) on the Internet. Overture’s search service is comprised of advertiser’s listings, which are screened for relevance and accessed by consumers and businesses through Overture’s affiliates, a network of Web properties that have integrated Overture’s search service into their sites or that direct user traffic to Overture’s own site. In some cases, consumers and businesses access our search listings directly at our site. The search listings are ranked by the advertisers’ bid; the higher the bid, the higher the ranking. Advertisers pay Overture the bid price for clicks on the advertisers’ search listing (also known as a paid introduction, click-through or a paid click).

The above reads like a description of the Google AdSense and AdWords paid-search solutions and it is. Google is generally credited with pioneering the paid-search text ad, but it was Overture that originally came up with the idea. Overture not only pioneered the paid search business model but it also managed to patent the core ideas behind paid-search. In July 2001, the US patent office granted Overture US patent number 6,269,361. Also known as the ‘361 patent, it covered the basic paid-search bid-for-placement advertising model. The ‘361 patent effectively granted Overture the right to monopolize the lucrative US paid-search market and subsequently dictated the evolution of the global paid-search market.

After getting the ‘361 patent, Overture wasted no time going after its paid-search competitors and quickly mobilized its lawyers to corner the paid-search market. Initially, Microsoft, Yahoo, and most of the other paid search players chose to license the ‘361 patent. However, Google and Miva (a small paid search operator formerly known as FindWhat) chose to challenge the patent in court, and subsequently got sued by Overture for patent infringement.

Google and Miva accused Overture of inequitable conduct in the filing of ‘361 patent. Inequitable conduct is a legal term that implies noncompliance with patent filing procedures or misdirection in the filing of the patent application. Google and Miva contended that Overture filed the patent application for ‘361 patent more than a year after releasing a product based on the ideas for which it was seeking patent protection. Under US patent law, companies have one year to file patent applications after they disclose their inventions, and Overture missed the one year window. Google and Miva effectively claimed that Overture lied to the US patent office in order to obtain the ‘361 patent.

Challenging the ‘361 patent made a lot of sense as Overture was demanding hefty fees for the patent’s use. Overture’s financial statements from 2002 indicate that the company on average retained 38 percent of ad revenue generated by its paid search ads at customer websites. Microsoft and Yahoo were two of Overture’s biggest customers in 2002 and were responsible for 60 percent of Overture’s revenue that year.

Overture reported sizable growth in 2002, but also started looking for a buyer. By that time, Google was taking over paid search via its in-house paid search solutions and Overture must have anticipated Yahoo and Microsoft bypassing it and rolling out paid-search solutions of their own. Overture did own ‘361 patent and could use it to block Yahoo and Microsoft but the patent was being challenged in court. Any setback in court would have devastated Overture, and on account of the questions surrounding the ‘361 patent filing, the company had good reasons to expect a less than favorable outcome in court.

Overture initially tried to market itself to Microsoft. There was enthusiasm for acquiring Overture inside of Microsoft too. According to Fortune magazine, Chris Payne, a Microsoft manager, wanted Bill Gates to go ahead with an Overture acquisition. The deal made sense for both companies. Overture needed to hedge against setbacks in court and Microsoft needed to get into paid-search. Surprisingly, Bill Gates rejected the deal. Bill Gates must have seen the Overture deal primarily as a patent play and realized that Microsoft could not afford to enforce the ‘361 patent because of antitrust issues. However, Microsoft needed the patent for itself too and this is where Bill Gates miscalculated.

With Microsoft out, Overture was practically driven into Yahoo’s arms. Terry Semel at Yahoo had no qualms about enforcing patents, so he was able to justify a higher valuation for Overture than Bill Gates. In July 2003 Yahoo acquired Overture in a mostly stock deal valued at $1.63 billion. The purchase price amounted to more than 8.5 percent of Yahoo’s valuation at that time, but from Terry Semel’s perspective this was a sweet bargain for a chance to control the paid search market and make Yahoo relevant again.

Yahoo’s Overture acquisition completely cornered Microsoft in the paid-search market. Microsoft had been licensing the ‘361 patent and in doing so it had admitted to the patent being valid. Microsoft could still challenge the patent in court but it did not have a good case. Yahoo now effectively dictated what Microsoft could and could not do in the paid-search market. Microsoft’s only hope was the challenge to the ‘361 patent by Google and Miva.

The subsequent resolution of the ‘361 patent litigation did not lead to a favorable change in Microsoft’s position. In a very shady deal just before its IPO (more on this ahead), Google chose to settle with Yahoo. Google claimed to have won royalty free use of Yahoo’s paid search patents for around $300 million in stock, but Google’s own financial statements contradict this and suggest that of the $300 million settlement only around $30 million was for settling Yahoo’s patent claims. The rest of the money was compensation for settling a completely separate dispute.

Miva, the other ‘361 patent litigant, persisted with the litigation, and the case eventually went to trial in April 2005. In May 2005, the case resulted in a mistrial after a jury failed to reach a verdict. The marketing portal ClickZ reports on the Yahoo Miva trial:

A jury was unable to reach a verdict after deliberating since Friday in the U.S. District Court trial in California. The jury did find that FindWhat [Miva] infringed on 18 claims within the patent held by Yahoo Search Marketing (formerly Overture). They also found that FindWhat had proven six of those claims invalid during the trial, but they were unable to reach a decision on the 12 other claims.

The ClickZ report continues:

Judge Cormac J. Carney has scheduled a hearing for June 24, 2005. At that time, he may rule on the issue of inequitable conduct, and he could potentially enter the parts of the jury’s verdict where a decision was reached into the record, though he is not required to do so. The judge will not have a set time limit to make his decision, but he has generally tried to keep the trial moving quickly.

The jury found invalid two of the main claims in the patent describing the overall method of selling bid-for-placement search ads. If the judge accepts the jury’s decisions on those two claims, FindWhat, and the industry in general, will score a big win. Most of the remaining issues relate to lesser issues that could force FindWhat and others to alter the way they do business without changing it completely.

The above seems to be a middling outcome for both sides, but this is not so. Judge Cormac was in a position to dismiss the major claims of ‘361 patent, and there were reports in the media that he was inclined to do just that. Additionally, Judge Cormac was reported as favoring Miva in the inequitable conduct dispute about the ‘361 patent. Yahoo could have appealed an unfavorable ruling but such a ruling would have been a massive setback nevertheless.

Inexplicably, Yahoo pushed the Miva patent litigation to trial and was itself responsible for the ensuing fiasco. After settling with Google, Yahoo had patent licensing deals with all significant paid-search operators, and there was no obvious rationale for going after, at best, a marginal player.

Unsurprisingly, with the threat of a potentially devastating setback just around the corner, Yahoo relented and settled with Miva. Miva ended up paying Yahoo $8 million in cash and undisclosed royalties. Again, inexplicably, Yahoo insisted on getting royalties from a marginal player when it allowed Google to use the ‘361 patent royalty free.

The Miva settlement is all the more interesting because it directly contradicts Google’s claims of having settled with Yahoo for $300 million in stock with no ongoing royalties. (The settlement amounted to roughly $30 million according to Google’s own financial statements.) Both companies’ business models depended on having access to ‘361 patent, but Miva’s was less dependent on account of its having acquired paid-search assets overseas where the ‘361 patent could not be enforced. More importantly, Miva settled from a position of strength as Yahoo could not afford to have major claims of the ‘361 patent dismissed, but Google settled from a position of weakness as its IPO was at stake when it settled. Even ignoring the massive competitive threat Google posed to Yahoo, Google’s settlement should have been at least two orders of magnitude bigger than Miva’s on account of Google’s size, growth potential, and weak bargaining position.

The Miva settlement consolidated Yahoo’s control over Microsoft’s paid-search business. Microsoft’s subsequent experience with its adCenter paid-search product outlines just how extreme and damaging Yahoo’s control over paid-search became. An article published in May 2006 by The Register says:

Microsoft is taking on the great Google Money Machine with an inhouse answer to Google Adwords.

Step forward Microsoft adCenter, launched yesterday to pump out all-paid search traffic on MSN and other Microsoft online properties in the US. Microsoft’s adCenter replaces Yahoo!’s Overture as the paid-for search engine on MSN. The only surprise is how long it took Microsoft to make the switcheroo — predicted ever since Yahoo! bought Overture in 2003 — and confirmed this time last year by Microsoft at its annual MSN Strategic Account Summit.

Microsoft adCenter is already running in Singapore and France and will trial in the UK with select advertisers in June.

The Register reporter is unclear as to the reason for Microsoft adCenter roll out delay, but there was an obvious reason for the delay: Yahoo refused to license the ‘361 patent to Microsoft on terms agreeable to Microsoft. Typically US companies launch their products first in the US and then elsewhere but due to Yahoo’s persistent stalling Microsoft launched adCenter outside the US first where the ‘361 patent is not enforceable.

Microsoft’s launch of adCenter in the US does not imply that it managed to buy its way out of Yahoo’s influence. In fact, it did not. Terry Semel’s “I will not sell a piece of search” and “My impartial advice to Microsoft is that you have no chance” comments came less than ten days after Microsoft adCenter’s US debut. The timing of Terry Semel’s comments unequivocally conveys the information that the Microsoft Yahoo negotiations Terry Semel talked about in the New Yorker interview concerned Microsoft adCenter. The comments further suggest that Microsoft failed to win any concessions in the negotiations and Microsoft’s paid search operations are severely handicapped because of Yahoo imposed restrictions and licensing fees.

The paid search situation for Microsoft has not changed since then. Microsoft is still chafing under Yahoo’s influence and is desperate for unfettered access to the ‘361 patent. It is quite possible that the size of the royalties Microsoft is paying to Yahoo are forcing Microsoft to neglect its paid search operations in order to minimize payments to Yahoo, and to minimize the size of an eventual settlement with Yahoo.

Microsoft has kept on negotiating with Yahoo since Terry Semel’s interview back in 2006, and details of talks between the two companies have emerged from time to time. The peculiar thing about Microsoft Yahoo negotiations is Microsoft’s insistence on owning/co-owning Yahoo’s paid-search assets. This is odd as Microsoft only needs access to Yahoo’s paid-search patent portfolio. Microsoft has not stopped eyeing Yahoo’s paid-search assets even after the withdrawl of its acquisition bid for Yahoo, and recently Microsoft proposed yet again to buy Yahoo’s paid-search assets. The latest proposed deal asked for Yahoo to accept a $8 billion stock investment and $1 billion in cash from Microsoft in exchange for Yahoo’s paid-search assets.

As previously discussed the idea of Yahoo letting a competitor control Yahoo’s paid-search operations is nonsensical. In the latest proposal, Microsoft is spinning the idea as being beneficial to Yahoo. Microsoft is claiming that Yahoo can improve returns for its search operations by selling and out-sourcing all its paid-search operations to Microsoft. This rationale holds absolutely no water. It is like Toyota wanting to takeover General Motors’s manufacturing in order to improve margins for General Motors. Microsoft’s rationale is even more incomprehensible as Toyota can be relied on to deliver gains for General Motors, but Microsoft is itself struggling in paid-search and can not possibly deliver any long term gains to Yahoo. The whole gains argument is a recent invention anyway, and can not explain why Microsoft has been trying to co-own Yahoo’s paid-search operations since even before its in-house paid-search infrastructure was operational.

Yahoo had little choice but to reject Microsoft’s terms. The money Microsoft offered may have been good in the short-term but it was questionable if Yahoo could have stayed independent after losing its search assets. Additionally, Yahoo managed to arrange a better deal with Google that does not involve Google co-owning/buying Yahoo’s paid-search assets and actually improves Yahoo’s paid-search monetization. Surprisingly, Microsoft instead of proposing a more competitive deal has been busy trying to subvert the Yahoo Google deal by raising antitrust concerns, and even seems to have succeeded at getting the US Department of Justice to investigate the Yahoo Google ad deal.

Microsoft is completely aware of the ludicrousness of its attempts to buy Yahoo’s paid-search assets and Microsoft’s earlier acquisition bid seems to have been an attempt to soften up Yahoo’s opposition to a paid-search asset acquisition. Microsoft’s publicized $31 a share bid was preceded by an unpublicized $40 a share bid in January of 2007. The initial $40 bid seems to have conveyed the message that if Yahoo refuses to let Microsoft co-own paid search, it can expect a public bid from Microsoft at some inopportune time, and consequent trouble from share-holders and corporate raiders. Yahoo responded to the threat by handing over the CEO job to Jerry Yang, a founder and a major shareholder, and attempting to shore up its share price by cost-cutting. Of course, this did not help and eventually Yahoo’s share price fell to a point where Microsoft felt that it could make a safe play for Yahoo’s paid search assets.

Microsoft’s $31 a share bid could not have gone through, so it was no surprise it got retracted. However, this was not the end of Microsoft’s play for Yahoo’s paid search assets. Microsoft seems to have waited for corporate raider Carl Icahn to buy into Yahoo before retracting its bid. Additionally, Microsoft disclosed the terms of its made-to-be-rejected $9 billion offer for Yahoo’s paid-search assets after Yahoo declined the offer. This particular action can only be interpreted as a thinly veiled invitation for corporate raiders to target Yahoo. The idea at Microsoft seems to be to dangle carrots in front of corporate raiders to get them to buy large chunks of Yahoo and force changes in the composition of Yahoo’s management. Microsoft is hoping that eventually the composition of Yahoo’s management will change enough that Microsoft will be able to push the paid-search asset acquisition deal it wants.

Interestingly, Yahoo’s refusal to sell paid-search assets is perfectly understandable, but its persistent refusal to license its paid-search portfolio to Microsoft is not. Microsoft is a distant third in the paid-search market, and it is not obvious why Yahoo doesn’t get Microsoft and the Microsoft sponsored corporate raiders off its back by licensing Microsoft the paid-search patents it needs. After all Microsoft only needs Yahoo’s paid-search portfolio and can do without Yahoo’s paid-search assets. Also, didn’t Yahoo grant its biggest competitor Google a royalty free license to its paid-search patents for a mere pittance?

Actually, Yahoo has gotten into a situation where it would be happy to license Microsoft the patents it wants, but it can not do that. Relenting on patent licensing terms is not an option for Yahoo. Remember Miva? Miva was the niche paid-search operator who almost prevailed in court but even then Yahoo insisted on collecting patent royalties from it. The reason for Yahoo’s obstinance with licensing terms is obvious: it can not give anyone a good patent licensing deal in isolation. Giving someone a good deal will trigger an escape clause in a lucrative patent licensing agreement with a big fish third-party, allowing that patent licensee the better terms as well.

Microsoft is perfectly aware of Yahoo’s constraints. Microsoft knows who the big fish is and understands that it cannot compensate Yahoo for the loss that would result from Yahoo giving Microsoft good paid-search patent licensing terms, so it is not even proposing to license Yahoo’s patents. Instead, it is proposing to buy/co-own Yahoo’s paid-search operations. The motivation behind such a convoluted structuring of what could have been a simple patent licensing agreement is to subvert escape clauses in Yahoo’s lucrative patent licensing with its big fish patent licensee.

There is another reason for structuring the deal that way. Microsoft believes that by being clever about the deal terms Microsoft can practically get Yahoo’s big fish patent licensee to fully reimburse Microsoft for whatever Microsoft pays for Yahoo’s paid search assets.

So, who is Yahoo’s big fish patent licensee, and why would it compensate Microsoft for buying Yahoo’s assets? There are only three major paid-search players: Yahoo, Microsoft, and Google. Clearly, the big fish is one of them, and it is not Yahoo. It can not be Microsoft either. By simple elimination it has to be Google. Google though claims to have obtained a perpetual royalty-free license to Yahoo’s paid-search patents for a one-time payment. Surely, Google can not be the big fish?

Google may have done a lot of things right but its paid-search patent settlement with Yahoo is not one of them. Too many fingers point to Google hiding a monster of a bad deal from the public. First, Google was evasive about its paid search patent settlement with Yahoo and tried to pass-off the impression that it paid $300 million in stock for licensing Yahoo’s paid-search patents; however, Google’s own financial statements contradict this and prove that Google only paid an unbelievable $30 million or so in stock for Yahoo’s patents. Second, Yahoo’s financial statements confirm that Google only paid around $30 million as payment for patent licenses. Third, even the $300 million Google says it paid for patent-licensing was too low in the context of the Miva settlement. Fourth, the control Yahoo exercised over Microsoft’s paid-search operations and Terry Semel’s comments clearly indicate that Yahoo did not grant a good patent licensing deal to anyone. Fifth, Microsoft is willing to spend billions to secure access to the very same patents that Google says it got for next to nothing.

Google does have a perpetual royalty-free license, but $30 million was not all Google paid for the license. Evidently, Google is hiding some material terms of its patent settlement with Yahoo from the general public and its investors. Google had a legal obligation to disclose anything material that was likely to influence its future business operations, but Google’s management subverted that obligation. It seems that the terms Google did not disclose were covered by a “grace period” of several years during which Google got free use of the patent, and escape clauses; so back in 2004, Google’s management under pressure of an imminent IPO convinced itself that the terms covered by the grace period were unlikely if ever to come into play even after the grace period ended, and so were immaterial and did not need to be disclosed. The thinking at Google must have been that either the ‘361 patent is not going to hold-up in court, or Miva is going to extract a good settlement out of Yahoo triggering escape clauses in the agreement, or somebody else is going to get a good deal out of Yahoo.

Google may have been helped in its decision not to disclose all the terms of the deal by Yahoo. Yahoo was patent trolling, and must have wanted to avoid scrutiny of its patent licensing deals. Additionally, Yahoo may have wanted to avoid disclosure of the escape clauses and the grace period in order to keep this information secret from its other patent licensees.

Yahoo may have had legitimate reasons for keeping the settlement terms secret but Google didn’t. Google’s rationale was just self-serving thinking: Yahoo had too much to lose in case the escape clauses triggered and it structured all its paid-search licensing deals to ensure that the escape clauses did not trigger.

The actual terms of the Google Yahoo patent settlement are largely academic now. Yahoo understands that there is no simple way for it to enforce the original terms of the agreement. If Yahoo tries to publicize or enforce the original terms of the agreement in order to get a share price boost, Google will flatly dispute owing anything to Yahoo. Admitting to all the material terms of the patent settlement with Yahoo is not an option for Google as this is tantamount to admitting to fraud. Instead, Google will prefer to assert that Yahoo is misinterpreting the terms of the patent settlement because of greed and desperation. Consequently, a flood of negative PR will get unleashed at Yahoo and the media will be happy to denounce Yahoo as a patent troll (which happens to be true). Additionally, this may result in renewed legal challenges to the ‘361 patent or a review of the patent by the patent office. Under such a scenario Yahoo will be forced to go to court in order to resolve the dispute, but courts move slowly and are unpredictable. Yahoo may manage to destroy Google by going to court but it is uncertain if it will recover anything substantial anytime soon.

The other option for Yahoo, which it seems to be pursuing, is to ask Google to quietly compensate it. Google can structure the compensation as an investment similar to the $9 billion deal Microsoft offered for Yahoo’s paid search assets. Basically, Google can acquire 20-30 percent of Yahoo at an inflated price as a “strategic investment”, and decline all voting rights to the stock in order to avoid antitrust scrutiny. Of course, after getting a generous enough compensation Yahoo will forget all about the problem licensing terms.

The problem for both Yahoo and Google is that Microsoft wants to force itself into the deal. Microsoft knows that it can block any compensation deal between Yahoo and Google by beating up the antitrust drum. Microsoft is using this as leverage to goad Yahoo into selling its paid-search assets. Essentially, Microsoft has conveyed to Yahoo that there can be no payday for Yahoo if Microsoft does not get Yahoo’s paid-search assets, but if Yahoo acts sensibly Microsoft can conveniently look the other way when Google comes knocking with the ransom.

Not only that, but Microsoft is actively positioning itself to benefit from any cash infusion from Google in Yahoo. The $9 billion deal Microsoft offered to Yahoo asked for Microsoft buying $8 billion in Yahoo stock at $35 a share. This amounts to 228.5 million shares of Yahoo. The $35 a share may seem like a good premium for Yahoo shares but it really isn’t for three reasons. First, the premium should be calculated with reference to the average price of Yahoo shares in the months preceding the Microsoft bid, and not the $19.18 closing price on January 31st, and this would raise the reference price to around $25. Second, after a cash infusion from Microsoft at $35 a share, Yahoo shares will become more valuable. Third, Yahoo’s share price does not reflect the value of Yahoo’s Google deal, but once Yahoo accepts Microsoft’s terms, the Google deal will become a certainty.

Now, if Microsoft manages to get such a deal from Yahoo, Yahoo’s share price will immediately pop to around $30. If Microsoft were to sell its Yahoo stake at this point (which won’t be possible) Microsoft will have effectively paid less than $2.5 billion for Yahoo’s search assets. This is an awesomely sweet deal by itself, but this is not all Microsoft will be getting.

The Microsoft deal will be quickly followed by rumors of Google acquiring a stake in Yahoo, and this will likely result in further upwards momentum for Yahoo’s stock. Eventually, once Yahoo’s stock has hit a suitable high of around $36-40, Google will come through with its “strategic investment” to “counterbalance” Microsoft’s influence at Yahoo. Google will likely buy around 25 percent of Yahoo at more than $50 a share, and this will raise Yahoo’s share price above $40. If at that point Microsoft chooses to exit Yahoo, it will realize a gain in excess of $1.14 billion on its 228.5 million Yahoo shares. So even if Microsoft in addition to its $8 billion stock investment pays Yahoo $1 billion for Yahoo’s search assets, overall it will come out ahead with a gain in excess of $140 million.

Microsoft is unlikely to exit Yahoo in order to make its gains concrete, but the above sequence of events is central to Microsoft’s strategy. Microsoft is counting on the huge gap that exists between Yahoo’s current valuation and Yahoo’s potential valuation if it sells its paid-search assets to Microsoft. Microsoft is hoping that by quietly advertising the gap in Yahoo’s valuation and the associated loot to be had, it can get corporate raiders and short-term profit seekers to flock to Yahoo and force Yahoo management’s hand.

So far Microsoft has only had limited success getting corporate raiders do its bidding. The problem for Microsoft is that it has to be discreet; otherwise, some of its executives might end-up behind bars for conspiracy to manipulate Yahoo’s stock. Fortunately for Microsoft, Yahoo does not have many options and Microsoft has plenty of time to get its act right.

Yahoo has mostly itself to blame for the situation. The terms Microsoft is offering for Yahoo’s paid-search assets are incredibly harsh and humiliating, but Microsoft is unlikely to sweeten them. The original deal Microsoft proposed in 2006 involved Microsoft “co-owning” and not outright owning Yahoo’s paid-search assets but Terry Semel dismissed it. Since then Yahoo’s management has proven completely hostile to Microsoft overtures and has been unwilling to make concessions. Consequently, Microsoft does not owe Yahoo any favors and it has “earned” the right to own Yahoo’s paid-search assets. Besides, if Microsoft offers a decent deal it may lead to a rise in Yahoo’s share-price and release pressure on Yahoo’s management.

Some rumors suggest Yahoo maybe seeking a merger with a third-party but it is unclear how such a merger will help Yahoo collect the “dues” from Google. Such an attempt can only help Microsoft as it can join the deal by offering to unlock Yahoo’s hidden value in exchange for Yahoo’s paid-search assets. The only thing Yahoo can do is to get Google to bid for its non-search assets, and collect its “dues” that way. Effectively, Yahoo can try to sacrifice some of its non-core assets to save paid-search; however, Microsoft may get antitrust regulators to block that exit as well.

Currently, there seems to be stalemate. Yahoo is under siege but it is not yielding, and Microsoft can’t afford to back off. Time though is on Microsoft’s side. If Yahoo fails to extricate itself from the siege quickly, it may only be a question of how much Microsoft gets paid for taking over Yahoo’s paid-search assets.

LAST UPDATED by Usman Latif  [Jul 07, 2008]

Random Permutation Generation

Many problems require the use of randomly generated permutations, kpermutations, and ksubsets. Simple and efficient algorithms exist for producing these combinatorial objects; however, these algorithms are somewhat subtle and most people do not intuitively understand them. The algorithm for generating random permutations is the most significant of these algorithms, and the other algorithms are simple variants of this algorithm.

Permutations are rearrangements of items where the order the items are listed in is important. For example, the rearrangements {1,3,2} and {1,2,3} are two different permutations of the numbers [1,2,3]. A permutation has the property that it contains all the items in a given set without any repetitions. For instance, the lists {1,2}, and {1,2,2,3} are not permutations of [1,2,3].

It is helpful to visualize permutations as paths in a permutation tree. A permutation tree models the choices available when forming a permutation. The colored edges of the permutation tree denote the items that are being permuted. A particular item corresponds to edges of a specific color. Following an edge in the permutation tree is equivalent to selecting an item, and permutations are formed by following paths from the root to the leaves. Figure 1, shows a permutation tree for four items.

4 permtree

A permutation tree of n items is generated recursively; at the root all n items are available for selection, so the root has n edges of different colors leading to n subtrees. At the subtrees of the root only n-1 items are available (one item has been selected), so the subtrees of the root each have n-1 edges leading to n-1 subtrees. The color of the missing edge corresponds to the the item that has been selected, and each subtree itself is a permutation tree for n-1 items (see Figure 1). This process is repeated with the subtrees of the subtrees, and so on, until the leaves are reached. At that point all items have been selected; therefore, no edges are available to continue the process.

This construction of the permutation tree assures that all permutations of n items correspond to paths in the tree. Also, no two distinct paths correspond to the same permutation of items. This implies that there are as many permutations of n items as there are paths in a permutation tree for n items.

Counting the number of paths in a permutation tree is fairly straightforward; at the root n paths emerge, and each of these n paths splits into n-1 paths at the subtrees of the root. This gives n*(n-1) paths, and each one of these splits into a further n-2 paths at the next level in the permutation tree. The paths terminate at the leaves by which time the original n paths have split into n! (n factorial) paths, and that equals:

n! = n*(n-1)*(n-2)*..*1

An algorithm for selecting random permutations must assure that it has equal chance of picking any one of the n! possible permutations. Figure 2, suggests a simple algorithm for generating random permutations; to get a permutation of n items, simply pick a random path from the root to the leaves in a permutation tree for n items.

A random path can be picked by starting at the root, randomly picking one of the available edges to get to a subtree, and repeating that process to get to successive subtrees until a leaf is reached. The animation below illustrates this process.

permtree anim
Figure 2, Picking a random path in a permutation tree

Intuitively, this process works because the random path picking algorithm is not biased towards any particular path; therefore, all n! paths have equal chance of being picked. More formally, the probability of picking a path consisting of edges {E1,E2,..,En} is the probability of picking the path leading to edge En, and multiplying by the probability of picking edge En given that the path leading to En has been picked.

P({E1,E2,..,En}) = P({E1,E2,..,En-1}) * P(En|{E1,E2,..,En-1})

The above equation can be recursively expanded by substituting in the value for P({E1,E2,..,Ek}) in the right hand side of the equation.

P({E1,E2,..,En}) = P({E1,E2,..,En-1}) * P(En|{E1,E2,..,En-1}) 
                 = P({E1,E2,..,En-2}) * P(En-1|{E1,E2,..,En-2}) * P(En|{E1,E2,..,En-1}) 
                 ..
                 = P(E1|{}) * P(E2|{E1}) * .. * P(En|{E1,E2,..,En-1})
                 = P(E1)    * P(E2|{E1}) * .. * P(En|{E1,E2,..,En-1})

The probability P(Ek+1|{E1,E2,..,Ek}) can be computed readily. At the point where edge Ek+1 is in consideration k edges have been selected, and edges Ek+1 to En are available. Edge Ek+1 is randomly chosen from all the available choices, so there are n-k choices for it. Therefore, the probability of picking edge Ek+1 is 1 / (n-k). The following expression gives the probability of picking a particular path in a permutation tree for n items:

P({E1,E2..En}) = P(E1) * P(E2|{E1}) * .. * P(En|{E1,E2,..En-1})
               = 1/n   * 1/(n-1)    * .. * 1
               = 1/n!

The above expression implies that the path picking algorithm described above picks any path with probability 1/n!; therefore, the algorithm picks all permutations with equal probability.

The random path picking algorithm needs to be efficient; generating the whole permutation tree in advance is not practical. Fortunately, a complete permutation tree is not needed. Each iteration of the algorithm only needs to keep track of the path chosen so far, and the possible selection of choices for the next edge of the path; Figure 2, illustrates this idea.

There are a maximum of n outgoing edges at any node of the permutation tree, and these can be modeled as numbers 1 to n in an array. Selecting an item frees up space for one edge, and a clever algorithm can use that space to store the path. An algorithm that implements these ideas is given below:

for (i=1 to n) ar[i] = i;
for (i=1 to n) swap(ar[i], ar[Random(i,n)]);

The above algorithm partitions the elements of an array of n elements into two parts, positions 1 to i-1 and positions i to n. Positions 1 to i-1 contain the segment of the path chosen so far, and positions i to n contain the unselected edges. The first loop of the algorithm is just initialization of the array elements, and the second loop does all the work in this algorithm.

Each iteration of the second loop extends the path selected so far by one edge. This is done by picking a random edge from the unselected edges and moving it to position i. Of course, the edge at position i has to go somewhere; therefore, the edge at position i is moved to the former position of the selected edge. The table below illustrates this algorithm for picking a 5 item permutation.

IterationSelected EdgeSelected PathUnselected Edges
?1 2 3 4 5
1442 3 1 5
214 13 2 5
334 1 32 5
454 1 3 52
524 1 3 5 2

In the above table, array ar is the concatenation of the last two columns. At each iteration one edge is chosen and moved to position i. Unlike the permutation tree, unselected edges in array ar are in no specific order. This is all right as the algorithm chooses amongst the unselected edges randomly, and any specific ordering of the edges is irrelevant.

Another thing to notice is that after the second last iteration of the algorithm the contents of the array ar do not change. At the last iteration only one edge is left unselected; going through the iteration simply swaps it with itself and results in no change. This suggests that the last iteration of the algorithm is redundant and the algorithm can be changed to the following:

for (i=1 to n) ar[i] = i;
for (i=1 to (n-1)) swap(ar[i], ar[Random(i,n)]);

Multiple random permutations can be generated by repeating the second loop. As long as array ar contains a permutation of numbers 1 to n, it can be reused without reinitialization.

Choosing a random k-permutation turns out to be fairly simple as well. A kpermutation is a rearrangement of k items selected from a total of n items. Kpermutations are a generalization of permutations. A permutation is simply a rearrangement of all n items; a permutation is equivalent to a kpermutation when k equals n.

5 3 permtree
Figure 3, a kpermutation tree for n = 5 and k = 3

Kpermutations too can be visualized as paths in a permutation tree. Kpermutations correspond to paths of length k from the root of the tree. A kpermutation tree can be generated by pruning the permutation tree so that all possible paths from the root to the leaves are of length k. Figure 3, shows such a tree for n = 5 and k = 3. This correspondence with paths suggests the following algorithm for generating kpermutations:

for (i=1 to n) ar[i] = i;
for (i=1 to k) swap(ar[i], ar[Random(i,n)]);

The algorithm above is a modified version of the algorithm for generating permutations; that algorithm yields a path of length n while this one yields a path of length k. Also, it is not possible to optimize this algorithm to cut down one iteration as was the case with the previous algorithm. The optimization in the algorithm for random permutations was possible only because at the last iteration only one edge was left unselected. However, with n less than k more than 1 edges will be available for selection at the kth iteration, and randomly choosing one of the edges will make a difference.

The final algorithm is for choosing a random ksubset. A ksubset is simply a kpermutation where the order of items is unimportant. The lists {1,3}, {3,1} are different kpermutations of the items [1,2,3], but they are the same ksubset. While the lists {1,2} and {1,3} are different ksubsets as well as different kpermutations of [1,2,3].

Paths from the root to the leaves in the kpermutation tree enumerate all possible kpermutations; therefore, they also enumerate all possible ksubsets of n items. The only problem is that each ksubset is enumerated multiple times. This is not an issue as each ksubset appears exactly k! times; once for each permutation of the items in the ksubset. As each ksubset occurs as often as any other all ksubsets have the same probability of being picked at random.

Some problems require ksubsets that are arranged in a sorted order. For example, {1,2,3} and {1,3,2} are the same ksubset but {1,2,3} is in ascending order while {1,3,2} is unsorted. For small k, sorting is inexpensive; therefore the best approach to generating ordered ksubsets in such cases is to simply sort the elements of the ksubset after generating it.

The algorithms for random kpermutations and ksubsets use O(n) space and O(n) time to generate objects of size O(k). O(n) space is needed for the array to store the chosen path and the unselected edges, and O(n) time is needed for initializing the elements of this array. The time and space complexity is not a problem when n and k do not differ by too much, but if n is sufficiently large and k is small, the algorithms can be very inefficient. Part II of this article will discuss techniques to bring the space and time complexity of these algorithms to O(k).

LAST UPDATED by Usman Latif  [Feb 13, 2004]

Part II – Random KPermutations and KSubsets in O(k) space

The first part of this article described some algorithms for the generation of random permutations, kpermutations and ksubsets. The algorithm for generating permutations is very efficient; however, the algorithms for generating kpermutations and ksubsets can be improved significantly.

The following algorithm was given for the generation of random kpermutations, and ksubsets:

for (i=1 to n) ar[i] = i;
for (i=1 to k) swap(ar[i], ar[Random(i,n)]);

The big problem with this algorithm is that it uses space inefficiently. Selecting a Kpermutation of size k from n elements can require O(n) space. For instance, when selecting a KPermutation of 1000 items from a collection of 100,000 items, the algorithm requires an array of 100,000 elements. This is clearly suboptimal and undesirable. The algorithm also initializes the array of 100,000 elements, and this results in O(n) time as well.

Ideally the algorithm should use only O(k) space and O(k) time. This turns out to be possible but requires a few changes to the basic algorithm. The crucial insight is that while generating a Kpermutation, the algorithm disturbs at most O(k) elements in the array containing the item numbers. This allows using a sparse array of O(k) space to represent an array of n elements.

The sparse array contains the numbers 1 to n arranged in ascending order, and it is modeled by a hashtable. Elements that get disturbed by the kpermutation algorithm are placed in the hashtable, and all other elements are assumed to be in their correct places in the array. For example, an array with the following contents:

1 2 3 4  6 5  7 8 9

can be represented by a hashtable containing the following key-value pairs:

(5 6)
(6 5)

An array reference for element i is satisfied by first searching for the key i in the hashtable, and returning the associated value if it is in there; if the key is not in the hashtable the value i is returned. For example, an array reference for 3 in the above array returns 3, but an array reference for 6 returns 5 (the key 6 is in the hashtable).

The kpermutation algorithm only uses a swap operation, so by implementing a swap operation for the sparse array representation the original algorithm can be used without any significant modifications. The algorithm will change to the following pseudo code:

ar = init_htable();
for (i=1 to k) swap_keys(ar, i, Random(i,n));

The swap_keys operation is fairly straightforward to implement, but as the keys might or might not be present in the hashtable, there are four cases to be handled. The four cases and the actions required to swap the keys are summarized below:

Key i     Key j      Action

present   present    swap values, i.e., (i,n)(j,m) -> (i,m)(j,n) 
present   absent     add (j,j) to htable and swap values 
absent    present    add (i,i) to htable and swap values
absent    absent     add (i,i) (j,j) to htable and swap values 

The algorithm takes O(k) space and O(k) time as the swap_keys function executes k times, and on each invocation it uses O(1) time and can allocate at most O(1) additional space. After the algorithm has finished executing, the kpermutation can be retrieved from the hashtable by retrieving keys 1 to k in order. Actually, the algorithm can be easily optimized so that the kpermutation is directly stored in a separate array. Other optimizations are also possible but they don’t improve the O(k) time and space behavior of the algorithm beyond constant factors.

Unlike the original random kpermutation algorithm, this algorithm requires a reinitialization of the hashtable whenever a new kpermutation needs to be generated. Generating many kpermutations without reinitialization of the hashtable will lead the algorithm to use more than O(k) space; this behavior being the result of the swap_keys operation executing more than k times.

As discussed in part I of this article, the random ksubset generation algorithm is the same as the kpermutation algorithm, so the above algorithm can generate ksubsets using O(k) space and O(k) time. However, if the ksubset is required to be in sorted order then a bucket sort can be used to sort the output of this algorithm. A bucket sort will work very well in this scenario as the items of the random ksubset will be distributed randomly over the interval 1 to n. Combining bucket sort with this algorithm will allow sorted ksubsets to be generated in O(k) time.


LAST UPDATED by Usman Latif  [Mar 26, 2004]