Archive for the ‘Guest columns’ Category

MSFT vs. Google: Zoho’s Perspective

May 29, 2009

This is from Zoho CEO Sridhar Vembu:

Inevitable comparisons are made between the hugely enthusiastic developer response (including from us at Zoho) to Google Wave yesterday and the relatively tepid response to Microsoft’s new search engine Bing. The really interesting contrast to us, as independent software developers, is the way developers responded to Silverlight as opposed to the reaction yesterday to Google Wave. Both Silverlight and Wave are aimed at taking the internet experience to the next level. To be perfectly honest, Silverlight is a great piece of technology. Google Wave, as yet, is not much more than a concept and an announcement.

It is easy to dismiss all this with “Oh, the press just loves to hype everything Google, and loves to hate Microsoft,” but that cannot explain why even competitors like us are willing to embrace Google’s innovations, yet stay away from perfectly good innovations from Microsoft, such as Silverlight.

Read the rest of it on the Zoho blog.


Why We Don’t Have Good Local Business Content

January 13, 2009

The following is a guest post from Long Hill Consulting’s Marty Himmelstein: 

Local search’s most significant failure is its inability to provide an accurate stratum of content about neighborhood businesses. The necessity for this base layer arises from the defining characteristic of local search, which is that it is model-based. Local search’s first job is to create an accurate depiction of places in the real world. Being found trumps being reviewed. Being found also trumps search engine optimization. When local search is running on all cylinders it will not make qualitative decisions; if there is a shop on Main Street people will find it. There will be no jousting for position, because the demarcation between fact and advertising will be clear.

To address the failure, a number of Internet companies have either been formed or have started initiatives to aggregate content about brick and mortar stores and services, either as their core service or to improve their core service (e.g., user reviews). These initiatives solicit content directly from businesses, and often, following a wiki-type model, from individuals who have no direct relationship with the businesses for which they create content. In the latter case, the contributors might receive a small financial incentive if the information they submit can be verified, usually by being ‘claimed’ by a submitted business, or when a claimed business buys additional services from the aggregator. To create an initial layer of content, most companies purchase some form of Internet Yellow Pages content from one of several compiled-list vendors. The main content sources for these lists still derive from phone directories, which the list vendors improve through varying degrees of quality control and enhancement.

Because these efforts proceed from one or more incorrect assumptions about the nature of local search, it is unlikely they will be successful. Most adhere to an erroneous ‘walled garden’ view that business content gathered on the Internet is a defensible asset. But information flows freely on the Internet, and since these services don’t control the information sources they require to assemble and maintain a data asset, no data they aggregate can be defended. (The information sources are businesses themselves, and ultimately they control their own information.) It will also be hard for any one of these initiatives to gather a critical mass of content. That local content is both valuable and not defensible is an apparent, not a real, contradiction. Local content is hard to create, but once created is a common data resource: it is best to think of the information about a business as nothing more than a structured web page. Another problem is that Google has already created the technical infrastructure to aggregate and distribute structured business content, and other initiatives have nothing to offer that improves on Google’s technology. Lastly, these initiatives assume the Internet can be used to short circuit real-world notions of community, but it can’t. Unmediated user contributed content, so successful for expressing creativity and points of view, is the low-hanging fruit of local search. It is not the organizing principle upon which local search will be built.

From the perspective of a service that requires better business information than that which is available to them, the justification for a walled garden seems simple: “We know people are willing to contribute content. We’ll create tools to make it easy for businesses, or, following a wiki model, anybody, to supply us with business information. These tools have a development cost and it takes effort to solicit and verify content, but having done so, the content we gather will be much better than standard listings data. This content has value, and there is no reason for us to give it away to others. Further, our users are precisely the ones these businesses want to reach. We’ll try a freemium model, and charge businesses for enhanced representation.”

Too many of these services are vying for businesses’ and users’ attention for any of their individual efforts to succeed. Businesses won’t contribute and maintain the same content at multiple services and pay for redundant capabilities at each. Moreover, once a business creates its digital profile, the marginal effort to distribute it to multiple services is (or could be) small. The ‘business content is a defensible asset’ model erroneously conflates business content with the value-added services that rely on that content to be successful. Business content is indeed valuable, at least as much to the services that need it to build compelling sites and capture advertising revenues as to anybody else. The demand for this content will drive the price to the businesses that supply it to zero. It’s not even hard to imagine scenarios in which businesses derive revenue from syndicating their content to downstream services. One way to ensure that Google doesn’t become the sole depository of business content is to give businesses the incentive to distribute theirs widely.

From an Internet ecosystem and data modeling perspective, multiple walled gardens of duplicated and separately maintained business content make no sense at all. Popular services might get a continual stream of updates, but new or struggling services won’t, making it even harder for them to gain traction. This ‘each to his own’ approach will perpetuate a morass of inconsistent and obsolete content, much as we have now, to the continuing dismay of consumers.

The adherents to the flawed walled-garden analysis are either unfamiliar with a basic data modeling tenet, or think it doesn’t apply on the Internet. Data can be distributed, copied, and duplicated, but each occurrence must be traceable to a known provenance that has a unique identity. For the purposes of data modeling, the Internet is nothing more than a very big disk drive. The storage medium has changed; the requirement for sound data engineering has not.

Unique identity is not a new concept on the web. Web pages and blog posts and comments have at least an informal notion of identity, and second generation content syndication formats support stronger notions still. These formats also support structured content, a requirement for business information on the web. Google Base, a notable example, specifies Atom and RSS 2.0 formats to allow data providers to specify and upload structured content to, well, the Google Universe. Google also provides a query language API so developers can retrieve content from the Google database. Google’s walls are permeable: their interests are served by good content, not its ownership.
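To make the idea of structured business content concrete, here is a minimal sketch of the kind of Atom entry a business (or its proxy) might publish and a directory might ingest. Only the Atom namespace is standard; the extension element names (address, phone, latitude, longitude) are illustrative assumptions, not the actual Google Base vocabulary.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"  # standard Atom 1.0 namespace

def business_entry(name, entry_id, address, phone, lat, lng):
    """Build a structured Atom entry describing one business.

    The non-Atom elements below are placeholder names; a real feed would
    use whatever vocabulary the aggregator it targets actually specifies.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = name
    ET.SubElement(entry, "address").text = address
    ET.SubElement(entry, "phone").text = phone
    ET.SubElement(entry, "latitude").text = str(lat)
    ET.SubElement(entry, "longitude").text = str(lng)
    return ET.tostring(entry, encoding="unicode")

print(business_entry("Main Street Books", "tag:example.com,2009:biz/42",
                     "123 Main St, Springfield, MA 01101",
                     "(413) 555-0100", 42.10, -72.59))
```

Because the entry is just a web-addressable document with a stable id, any directory – not only Google – can fetch, index, and redistribute it.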

Unfortunately, the quality of local business content lags well behind the Internet’s technical capabilities to create, aggregate and distribute it. An important reason for this quality deficiency is that we have relied almost exclusively on the technology that enables the next generation of local search, while underestimating the need to create online representations of the real neighborhoods and relationships within which businesses exist. As I noted in a previous post:

The fundamental role of a community in local search is to establish an environment of trust so that users can rely on the information they obtain from the system. Businesses exist in a network of customers, suppliers, municipal agencies, local media, hobbyists, and others with either a professional or avocational interest in establishing the trustworthiness of local information.

Businesses are responsible for their physical storefronts, and, ultimately, their digital storefronts. But businesses don’t exist in a vacuum, either physically or online. They require the services of the community to which they belong – when online, especially in the formative stages of local search. To create accurate digital storefronts, then, we need to enable the participation of the various constituencies that are part of a community. It is within this framework that a reliable stratum of local content will be created and maintained.

Individuals who contribute content because of a small financial incentive, who are most of the time trustworthy and altruistic but will occasionally be neither, and who have no intrinsic connection with the neighborhoods in which the businesses they describe reside, do not constitute a community. It’s not that their contributions aren’t valuable or even necessary, it’s that they are not sufficient for ensuring an accurate depiction of the local environment. Pick your own war story about how local search failed you in a time of need (everybody has one), assume your need was urgent, and then consider the assurances you would require from the system to trust the information you get from it.

The only way local search can meet these assurances is to build them into its basic fabric. The basic fabric of local search is the community, because the community provides the means to establish the network of trust that is essential to local search. The purely user-contributed content model that works so well for YouTube has shortcomings when applied to local search. The preeminent virtue for YouTube is creativity, for local search veracity. YouTube is whimsical, local search mundane. People use YouTube to pass time, local search to save time.

In an immeasurably weightier circumstance, Winston Churchill remarked “You can always count on Americans to do the right thing – after they’ve tried everything else.” And so it is with local search. Its eventual shape, though tortuously arrived at, seems to me easy to discern. Each business will have its own digital identity and a core of factual information, kept in a standardized format, which it or its designees will maintain. These designees will aggregate content at the community level, as defined above. Designees will include entities, some new, but certainly some that already exist, which are trusted by both consumers and merchants. This core content will feed downstream services. To provide subjective or more detailed information, the basic content will be augmented at various points with user contributed and third party sources of information. Revenue models built around helping businesses and their designees create, maintain, verify, augment and distribute their content make sense; those built around cordoning it off do not.
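As a rough illustration of that ‘core of factual information’, the sketch below shows one way a business record with a stable identity and an explicit maintainer might look. The field names are hypothetical, not a proposed standard.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class BusinessRecord:
    """A hypothetical core record for one business.

    A single identified record, maintained by the business or its
    designee, that downstream services syndicate and augment rather
    than re-create behind their own walls.
    """
    identity: str                      # globally unique, stable identifier
    maintainer: str                    # the business or its community-level designee
    name: str
    address: str
    phone: str
    categories: List[str] = field(default_factory=list)

record = BusinessRecord(
    identity="tag:example.com,2009:biz/42",
    maintainer="Main Street Merchants Association",   # hypothetical designee
    name="Main Street Books",
    address="123 Main St, Springfield, MA 01101",
    phone="(413) 555-0100",
    categories=["bookstore"],
)
```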

_____

Marty Himmelstein is the principal of Long Hill Consulting, which he founded in 1989. Marty’s interests include databases, Internet search, and web-based information systems.

For the last eleven years, Marty has been active in location-based searching on the web, a field often called Local Search. Marty was an early member of the Vicinity engineering team. Vicinity was a premium provider of Internet Yellow Pages (Vicinity provided Yahoo!’s IYP service from 1996 to 1998), business locators, and mapping and geocoding services.

Microhoo vs. Google: The Battle for Audience and Keystrokes

February 15, 2008

The following is a guest post from Tim Cohn, marketing consultant and author. It has not been edited and reflects the views of its author exclusively.

The Redmond giant has sprung to its feet from its long and comfortable slumber. Much like the browser business before it, Microsoft has realized it had better get into the search advertising business before it’s too late.

I think we all know who won the browser war. We also know how they did it.

Even with its proposed acquisition of Yahoo!, Microsoft may have already overslept and thus lost this battle. On the surface this acquisition looks like a grab for a piece of the search advertising business. However, just below the surface lie its real targets: the Internet audience and their keystrokes.

Internet Audience?

Keystrokes?

Microsoft has controlled, or has had control of, both beachheads nearly since their inception: keystrokes via the personal computer desktop, and the Internet audience via the browser – not from birth, but before the web’s infancy ended.

Both have been as important to Microsoft’s franchise in the past as they will be going forward – if not more so. Audience begets keystrokes and vice versa. However, it’s hard to control one if you don’t control the other.

Microsoft’s $44 billion offer to acquire Yahoo and its audience is an admission by Microsoft that if they aren’t able to augment their present audience now with an acquisition the size of Yahoo, they won’t ever be able to stem the audience gains being made by Google and their control of the most valuable part of the internet audience — the search audience.

At this point, not getting control of Yahoo’s audience is the single greatest risk facing Microsoft’s business – hence their offer price and the need to get the deal done. Maybe not today or tomorrow, but left unabated, Microsoft faces continued losses in both audience and keystrokes.

It’s not a market position Microsoft is familiar or comfortable with.

Why search is the most valuable audience on the Internet.

There are two types of audiences on the Internet. The first is the old and familiar audience type, the one served and supported by display advertising.

Advertisers buy ads to reach an audience based on what content a publisher assembles to attract a particular audience. Ads are then priced and sold based on the desirability advertisers have in reaching that particular audience.

At any one time, a large percentage of the publisher’s audience is inactive – not interested in what the advertiser is selling.

Advertisers still have to pay to reach the publisher’s entire audience regardless of how many people may or may not be interested in the advertiser’s ads or products. Because display advertising is inefficient – i.e., it reaches more disinterested viewers than potential buyers – it sells for less and thus generates less income for publishers.

The other type of audience available to advertisers on the web is search advertising.

Unlike display advertising, search advertising reaches only an active audience – people who have explicitly requested advertisers’ information about their products or services by clicking on their ads.

Search advertisers only incur costs to reach their audience when consumers click on their ads. Thus search advertising is significantly more efficient in delivering advertising messages to the exclusively active segment of the Internet audience – people who are actively searching for information.

By definition, search advertising only delivers advertisements to people actively seeking what the advertiser is advertising and selling. Because of this efficiency in targeting and delivery, search advertisers are able to reach more qualified prospects for less than through traditional media.

In turn, search advertising providers like Google are able to charge advertisers commensurate with the value the advertisers receive from reaching an efficiently targeted and active audience.

The result?

By my calculations, Google’s annualized gross advertising revenue per visitor is roughly twice Yahoo’s and nearly four times Microsoft’s (gross advertising revenues divided by web property visits).
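For readers who want the arithmetic spelled out, here is the revenue-per-visitor calculation as a tiny sketch. The figures are round placeholders for illustration only, not the actual company financials behind the author’s estimate.

```python
# Placeholder figures for illustration only -- not real financial data.
gross_ad_revenue = {"search-led property": 16_000_000_000, "display-led property": 7_000_000_000}
annual_visits = {"search-led property": 4_000_000_000, "display-led property": 3_500_000_000}

for prop in gross_ad_revenue:
    # revenue per visitor = gross advertising revenue / web property visits
    rpv = gross_ad_revenue[prop] / annual_visits[prop]
    print(f"{prop}: ${rpv:.2f} per visit")
```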

At a minimum, a search-driven visit is worth twice – and up to four times – as much as a non-search-driven visit.

This is why Microsoft desperately needs Yahoo’s audience.

Although there are wide discrepancies over what percentage of search each company gets, Google receives between four and twenty times more search traffic than Microsoft and three to five times more than Yahoo. Combined, and assuming no market disruption, the two companies would still generate only one fourth to one half the search business of Google.

This acquisition also assumes Yahoo’s ad platform can continue to harvest one half the value Google does whether through Yahoo! or Microsoft’s search product without cultural distraction or interruption from the merger.

Even with their proposed clean-room assembly, Microsoft’s acquisition of Yahoo! does not answer how they will make up the difference (search volume plus gross revenue per visitor).

Even by doubling their performance (revenue per visit) post-merger to meet Google’s present level of performance, a MicroHoo search advertising business’s gross revenue per visitor would still be only half of Google’s.

In order for Microsoft to retain Yahoo’s audience, publishers, and advertisers, the combined company will also need to produce:

Highly relevant search results for its audience; a functional ad platform for its advertisers; profitable ad distribution for its publishing partners; and, most importantly, a greater return on its advertisers’ investments.

Without these, any new ad platform and search product may grab the attention of a larger audience and gain its keystrokes, only to lose both if it is unable to deliver what the Internet audience has already come to expect, find, and get from Google.

Of course, this also assumes Microsoft is somehow precluded from using its expanded platform and footprint to reroute ancillary chunks of audience to its new web properties acquired through the proposed acquisition along with their accompanying keystrokes.

In the absence thereof, there may be no stopping Google’s march.

The Battlefield Defined: Local Audience and Mobile Keystrokes

Let’s look at two areas where search will play a role in winning new audience and their keystrokes: Local and Mobile search.

Here is how search on the Microsoft, Yahoo, and Google web properties performs today.

It’s been reported that nearly 50% of searches are local in nature. Let’s see how Microsoft’s Live handles a local brand search for Verizon Wireless in New York, NY.

Live is able to locate Verizon Wireless stores in New York and provides five viewing options: Road, Aerial, Hybrid, Bird’s Eye and Traffic. Are the results relevant? Yes. Could we make our way to a Verizon Wireless store or reach one by phone with the information Live provides? Yes.

With the Bird’s Eye view we may even be able to see what our destination looks like. Pretty cool.

Microsoft (US) Brand Search: Verizon Wireless New York, NY

[Screenshots: Microsoft Live results for Verizon Wireless in New York – Road, Aerial, Hybrid, Bird’s Eye, and Traffic map views.]

How does Live perform outside the United States? A search for HSBC in London yields similar results. This particular brand search result is for a location near Trafalgar Square. If you aren’t going to be able to stop by a bank branch in London today, you can still take in the sights.

Microsoft (UK) Brand Search: HSBC London, England

[Screenshot: Microsoft Live Bird’s Eye view of HSBC near Trafalgar Square, London.]

Now let’s try the same searches in Yahoo. Yahoo offers similar results. The look and feel isn’t much different from what we received from Microsoft.

Initially though, I had difficulty locating the results I was hoping to find. Eventually I did find them – must have been my error.

Our options for connecting with one of the stores include: Getting directions, Save for later, Send to phone and Write a review.

Yahoo (US) Brand Search: Verizon Wireless, New York, NY

[Screenshots: Yahoo “Find a Business” results and map for Verizon Wireless in New York.]

I can get the same type of UK map results from Yahoo; however, I have to pull up Yahoo’s UK web property to get London results, whereas with Microsoft I was able to get results from their US site.

Yahoo (UK) Brand Search: HSBC London, England

[Screenshots: Yahoo results for HSBC London, from both the US and UK search properties.]

Now, let’s run searches for the same terms in Google. Like Microsoft, Google returns five types of views, albeit under different button labels: Map, Street, Traffic, Satellite and Terrain.

The look and feel of Google’s views seems more visually pleasing than both the Microsoft and Yahoo products; however, my appraisal is subjective.

Microsoft’s search and map features seem to be evenly matched with Google’s and beyond those of Yahoo. Microsoft’s “Bird’s Eye” view does appear to be ahead of its counterpart – Google’s satellite view.

Google (US) Brand Search: Verizon Wireless New York, NY

[Screenshots: Google results for Verizon Wireless in New York – Map, Street, Traffic, Satellite, and Terrain views.]

Google offers only three viewing options in the UK at this time compared to Microsoft’s five, yet I can fetch the results from Google’s US property unlike with Yahoo.

Google (UK) Brand Search: HSBC, London England

[Screenshot: Google US Map view result for HSBC in London.]

Overall, Microsoft, Google, and Yahoo each offer their own version of rich business and brand search results.

From what I can tell, businesses and brands have yet to scratch the surface so to speak when it comes to reaching their potential customers in this new geographically rich and fertile target marketing environment.

Mobile Search and Reverse Business Telephone Number Lookup, a Visual 411

As local information requests are being keyed in from mobile devices, 411 and driving directions are becoming more visually rich and available via search.

Case in point: The business telephone number reverse lookup.

How does Microsoft’s Live render a reverse lookup for Microsoft’s own telephone number? Microsoft delivers the correct result along with the five previously mentioned view options: Road, Aerial, Hybrid, Bird’s Eye and Traffic. The map view does, however, default to Chicago, IL, even though Microsoft is located in Redmond, WA.

I can find the Microsoft campus on the map after scrolling over a couple thousand miles. I ran several more queries, each defaulting to the same Chicago starting point. I am not logged into a Microsoft account, so I wouldn’t think it was based on my computer’s cookies or IP address, which, by the way, is still several hundred miles south.

Evidently, Microsoft, Yahoo and Google all seem to generate their map results based on your past location-specific searches.

Microsoft’s reverse lookup offers: 1 Click Directions, Add to collection, Send to Email, Mobile and GPS, and Reviews. The send-to-GPS feature requires an MSN Direct-compatible navigation system.

Reverse Business Lookup – 425-882-8080 Microsoft


A reverse lookup for Yahoo’s telephone number in Yahoo produces two results, both of which are Yahoo locations. The map provides the same functionality found in their standard searches: Get Directions, Save for later, Send to phone and Write a review. If a web address is associated with the location, it will be displayed too.

Reverse Business Lookup – 408-349-3300 Yahoo


A search for Google’s telephone number yields the same five view options – Map, Street, Traffic, Satellite and Terrain – as the brand and business category searches before. Additionally, Google provides a dialogue box with more options.

Searchers’ options are: Get directions, Search nearby, Street view, Save to My Maps, Send to phone and Edit. More information about the business and reviews are also one click away.

With “Search nearby” a searcher can locate additional businesses and services like finding Chinese takeout from their hotel.

Where Microsoft’s Bird’s Eye view appears to have bested Google’s satellite view, Google’s “street view” takes visualization to the next level.

With Google’s street view, Google provides eye-level images of locations. It’s not available in every area yet. Coordination with volunteer picture-geotagging projects may eventually speed the population of its street-level image library.

Google’s new Edit feature lets anybody correct the location of a business. It also prompts business owners to “claim” their business in Google’s Local Business Center. These two options should eventually help them improve their data.

Reverse Business Lookup – 650-253-0000 Google

[Screenshots: Google reverse lookup result for 650-253-0000, Street View, and the map Edit feature.]

By pushing more information out to users’ third screen (mobile devices), Google, Microsoft and Yahoo, regardless of their corporate status, have greater potential to attract ever-larger audiences and their keystrokes – a situation where all consumers ultimately win.

Tim Cohn
Search Marketing Communications

Tim Cohn is a marketing consultant, a recommended marketing resource for IBM Business Partners, a Google Advertising Professional, and author of the upcoming John Wiley and Sons book For Sale By Google.

Guest Post: Google Should Power the Local Web

December 12, 2007

The following is a guest post from Daniel Bower, who is part of welovelocal.com, a local search site for the UK. It is presented verbatim without editing and represents his opinion and perspective exclusively:

Only in the last few months did the UK get introduced to the wonder that is the Google Maps GeoCoder: send Google the name of a location and it will suggest a co-ordinate in return. It sounds like a relatively simple exchange, but it’s remarkably useful and a great time saver. This got me thinking about Google’s long-term play within the local space, and whether its goal shouldn’t be to create a local portal, but instead to power the local web.
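For readers who haven’t tried that exchange, here is a minimal sketch using Google’s present-day Geocoding web service. Note this is today’s HTTP endpoint rather than the 2007-era Maps API the post describes, and it requires your own API key.

```python
import json
import urllib.parse
import urllib.request

def geocode(place, api_key):
    """Send Google a place name; get back a (lat, lng) suggestion."""
    url = ("https://maps.googleapis.com/maps/api/geocode/json?"
           + urllib.parse.urlencode({"address": place, "key": api_key}))
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    if not data.get("results"):
        return None
    location = data["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]

# Example (requires a valid key):
# print(geocode("Trafalgar Square, London", api_key="YOUR_KEY"))
```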

The first point worth discussing is whether or not you believe the zenith of local search is a portal, be it by Yahoo!, Google, or a startup with its own twist. For me, it’s definitely not. Portal sites can in no way reflect the diversity of local communities – the range of cultures, languages, and needs – and thus the user experience is often lacking: reviews feel unappreciated, and discussions can feel hollow. In the time I have spent working with community sites in London, the common theme among the most active has been shared offline experiences, where the Internet is an extension of their real-world lives that allows neighbours, friends and family to carry on their conversations in remote locations. These sites are typically small, community-led, and either extend, or help forge, a common identity among members.

Google can’t operate on such a micro level, but what it can do, and in many ways is already doing, is power these smaller local sites, providing search technology, mapping, and, at some point in the future, both business data and an advertising platform.

It’s no secret that Google is amassing its own business directory; its business referral program is evidence of that. Google is also collecting straight from the source via its Local Business Centre, and, being Google, there is always the possibility that it could acquire one or two of the more tech-savvy data collectors as well. What if Google’s next step were to open this directory to the public, directly via its current Maps API, and allow any website to republish it?

Using Google, a local site could now deploy a fully functioning business lookup feature, complete with a world-class search engine and mapping functionality, ready for any small community to build upon with all the usual user-generated trimmings. Of course, what Google also provides is the sponsored ads, essentially a more sophisticated version of AdSense that’s highly relevant, geo-targeted, and built specifically for the SME market. A small business owner could then market directly to a web site of its choosing, much like traditional brands can do using the content network, targeted to a specific area and a specific set of keywords. What is more, the publisher gets a unique method of monetising their site, an area that any startup working within the local space will tell you is a particularly long uphill struggle. Freeing up the business data would also spur on the sort of creativity that Google looks to encourage via its iGoogle platform and the new Maplets feature, not to mention the potential for local data mashups.

Google’s local offering needs to look to the company’s roots, and this move would be firmly in line with some of its core ambitions: to further organise the sea of data and to continue to provide highly relevant ads. By abandoning its current centralised local strategy in favour of such a decentralised model, it could firm up its position in the space for some time to come.

Himmelstein on G’s Local Biz Referral Program

September 10, 2007

Guest columnist Marty Himmelstein is a local search expert who founded Long Hill Consulting. He was with Vicinity Corp. (acquired by Microsoft) and wrote one of the original “local search” patents before that term existed. The sentiments in the article are entirely his own. I have not contributed to or edited the piece.

Google’s recently announced Business Referral Program, where it pays individuals to submit information about local businesses, is important less for what it is than what it will be. It is a signpost not only of Google’s intent, but of their understanding of how the Internet will develop. For while Google doesn’t make trends, they do have a keen eye for discerning them. Their patient execution of a plan based on their reading of the road ahead is nowhere more apparent than in local search. These trends have been remarked upon before, and at least some of Google’s advantage is that while others watch Google, Google’s attention is straight ahead. These trends include:

  • Decentralized collection of business content, from the edges in: The infrastructure that gathers blog posts from the far reaches of the web is as well-suited to aggregate content from businesses, wherever they are physically located. Teenagers create YouTube videos because they have free time. Businesses will create YouTube videos because of competitive necessity. Undoubtedly they’ll have teenagers create videos for them, melding free time with usefulness and profit, a prospect which should cause YP Publishers to reach for the smelling salts. The centralized collection of business information was an artifact of the organization of the telephone network. This was fine for YP Publishers but less than ideal for either businesses or consumers. By creating the communication channels that enable businesses to directly control their symbolic representations, the Internet has made the contrivance of centralized content collection unnecessary. In the not too distant future the idea that businesses are responsible for both their digital and physical storefronts will seem entirely unremarkable.
  • The importance of community and neighborhood to local search: The fundamental role of a community in local search is to establish an environment of trust so that users can rely on the information they obtain from the system. Businesses exist in a network of customers, suppliers, municipal agencies, local media, hobbyists, and others with either a professional or avocational interest in establishing the trustworthiness of local information. These community members can contribute unique perspectives to create a rich and accurate depiction of the businesses with which they are involved. The group targeted by Google’s new program, college-aged students who want to earn extra spending money, hardly comprise a community as described. But it is a start. One must assume the current program is a precursor to a more disciplined and organized initiative where Google works with organizations that have more substantial relationships of trust in the local community.

  • Rich and structured content: The program announcement said nothing about structured content. It didn’t have to. The information Google gathers is headed right for Google Base. The initial content Google is requesting is basic, but the sky is the limit for what is to come. For example, Google Base already supports a product type, and there are several ways Google could make it easy to associate information about products and the stores that carry them. And, YouTube as a local search interface sounds pretty intuitive.

  • Completeness is key: One of the fundamental tenets of local search is that, for it to be useful, it must be complete – if there is a shop on Main Street it will be in the database. Completeness is necessary to gain the trust of the two most important local search constituencies – consumers and local businesses. Google states it simply: “Google wants local businesses to be easily discovered by people using our products. And we want their information to be accurate and complete.” Google has built its dominance by layering advertising on top of the best natural search results in the business. They will tenaciously adhere to the same philosophy in local search.

Some Google competitors might take comfort in the apparently haphazard and unfinished feel of various Google offerings. A more appropriate response would be alarm. Google’s fledgling projects are part of an encompassing architecture measured not in a year or two but five or more. (Consider that the results of a typical Internet Yellow Pages search have barely improved in the last ten years.) It is inevitable that the Internet will displace other mediums as the starting point of practically all local advertising – including advertising destined for print, television and radio. It will also take time for Google and others to demonstrate the value of local search in a way that makes sense to Small and Medium Businesses (SMBs), and other actors in the local search community. There’s still a lot of spadework to be done, and combined with the sheer size of the local search market, the extended adolescence of Google Base, Google Coop, the Business Referral Program, and other projects is closer to necessity than profligacy. The current value of the content in Google Base is of no consequence. Its function is to help Google build the next generation of Google Base, when content will matter.

Basic business content doesn’t belong in a walled garden. (Bill Burnham has a great series of posts on the problem with walled gardens, and on Google Base.) The ‘owners’ of business content are business proprietors themselves, and they can do what they wish with their information. They can provide it to Google, Yahoo, their local newspaper, whomever. Nor is Google Base incompatible with open content. The Google Base data specification (based on RSS/Atom) could even serve as an open or de facto standard for specifying business content – there are just so many ways to express business information in XML. Much of the technology, in the form of RSS, is already in place to enable local search directories (not just Google) to aggregate content directly from businesses or proxies who create content on their behalf. (The college students who participate in the Business Referral Program are a simple form of proxies.)
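A rough sketch of what that aggregation might look like from the directory side, assuming each business (or its proxy) publishes a listing feed. The feed URLs here are hypothetical, and the feedparser library is just one convenient choice for reading RSS/Atom.

```python
import feedparser  # third-party: pip install feedparser

# Hypothetical per-business listing feeds.
BUSINESS_FEEDS = [
    "https://example-bakery.test/listing.atom",
    "https://example-hardware.test/listing.atom",
]

def aggregate(feed_urls):
    """Pull listing entries from per-business feeds into one directory."""
    directory = []
    for url in feed_urls:
        feed = feedparser.parse(url)
        for entry in feed.entries:
            directory.append({
                "source": url,                      # provenance of the record
                "title": entry.get("title", ""),
                "link": entry.get("link", ""),
                "summary": entry.get("summary", ""),
            })
    return directory
```

The same feeds can be consumed by Google, Yahoo, a local newspaper, or anyone else – which is the point: the content, not the aggregator, is the common resource.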

The danger is that Google, aided by the inaction of the rest of the industry, patiently accumulates a data asset of basic business information. If Google makes it easy enough for SMBs to contribute their content, Google could have its walled garden, by default. It could be that by the time the bulk of SMBs understand the relevance of the web, Google will be their preemptive first and only choice for interacting with it. In this scenario, the criterion for success for the Referral Program is soberingly low. Google needs only enough early adopters to bring along the bulk of more conservative businesses.

The data providers who vigorously defend the current value of their business directories have the most to lose, because if business content remains in a walled garden, it won’t be theirs. If their sales channels are to have future value, it is not in building the next-generation data asset, but in providing a migration path onto the Internet for their customers. The value is the relationship, not the data. (But the relationship is just a start; it guarantees nothing.)

Google will build a high quality directory of local business information. This directory and its integration with Google’s other services will give Google a competitive advantage in local search, but the fortress it builds won’t be impregnable. This is because Google’s unassailable strength in corpus-based search is not of primary importance in local search. Whereas the web continues to grow and is beyond the wherewithal of all but Google and a few others to manage, local search models a small bit of the physical world, and the bit it models is modest in size and constrained in its rate of growth. Further, local search depends less on algorithmic richness and more on collaborative content creation and social computing, areas in which Google doesn’t have a preemptive advantage.

Any chance at a level playing field in local search disappears if the field isn’t built on open content. The easier it is for businesses and their designees to create, maintain and distribute their content, the harder it will be for any one player, whether Google, Yahoo or Microsoft, to construct a walled garden from that content. Google will be a prominent distribution point of local data, but still just one of many. But if there is going to be any game at all, its basic rule is if Google doesn’t get to build a walled garden, nobody does. Local content is not a defensible asset.

The predicament for Google’s competitors is that the cost of failure of the Business Referral Program for Google is low, but the cost of its success for others is high. There is virtually no scenario in which Google doesn’t play a key role in defining the local search ecosystem – Google Maps alone ensures that. Yet, Google’s ability to monetize local search doesn’t require they “own” business data and keep it behind a walled garden. A proprietary data asset constructed from commodity content is incompatible with the participatory nature of the web. Google isn’t going to waste its time doing what can’t be done. Further, Google could easily decide the effort to build a direct relationship with businesses is more of a burden than an opportunity. They could leave that job to others, opting rather to provide tools and incentives that ensure the road for content between businesses and Google is easy to traverse. On the other hand, Google wants business content as much as anybody else and they will do what is necessary to get it. If in the process of collecting business content they create barriers that make it harder for others to compete, so be it. Google will use the fecklessness of their competitors to their advantage – they will exploit opportunities to be opportunistic. Competitors inadvertently accommodate Google by their failure to provide an infrastructure to specify, collect, and share business content. The infrastructure won’t work if its intent is to be an alternative to the developing Google ecosystem. Rather, it is necessary to ensure that Google doesn’t get to make all the rules.

Guest Columns: Bring It!

August 1, 2007

So far I’ve had two:

I’m not going to make this necessarily a structured thing, but would love those who have strong views and thoughts about the range of topics covered on this blog to submit columns and opinion pieces. I’d like to make this more of a forum for discussion of industry issues and not just about my opinions.

Be provocative; say what you wouldn’t say at a conference, rant and rave — and so on.

Guest Column: Local Search: ‘Burgs’ vs. ‘Burbs’

July 30, 2007

Guest columnist Bob Chandra is CEO at Grayboxx, a local-search startup. The views expressed in the article are entirely his own. He can be reached at bobchandra AT grayboxx.com.

The technorati and technical press are predominantly in big cities — largely the Bay Area and New York. So we focus by instinct on coastal towns, to the point we sometimes forget there’s a whole country in between. While we dwell on the impact of technology in burgs (old English for “cities”), consideration of the ‘Burbs (suburbs) is often an afterthought. So it’s worth asking the question: What is local search like in the burgs versus the burbs, and is there a substantial difference between the two?

One way to divide up the American public is to look at whether they reside in the larger metro areas or outside of them. For sake of discussion, why don’t we label those who live in the top 25 metropolitan statistical areas (MSA’s) as living in the “burgs” and those outside of these regions as inhabiting the “burbs.” The folks at American Demographics might shudder at the simplification of this divide, but let’s stick with this for now.

Burgs

Approximately 124,398,448 people live in America’s top 25 metro areas. That’s 41% of the population. Through sheer coincidence, Citysearch showcases 41 cities on its site. Once you remove the towns that overlap in the same MSA (e.g., New York and Brooklyn) and excise featured towns that have little editorial or user-generated content outside of the ubiquitous ‘restaurant’ category (e.g., Indianapolis), you are left with about 20-25 cities. Yelp showcases 27 cities, but practicing the same elimination process, you end up with about 15 cities with meaningful information. The verdict? In short: “burgs” have it good. Their residents can expect reviews from locals and write-ups by editors. Local search is adequate – at least for key categories such as restaurants, nightlife, and other arts & entertainment categories. There isn’t a one-to-one comparison between the aforementioned cities that local search sites serve and the top 25 MSAs, but they have much in common. It is safe to say that local search goes above and beyond standard yellow pages listings in these areas.

Burbs

For starters, those who live outside the top 25 metros aren’t exactly living in the past, nor do they have substantially lower technical sophistication. Kansas tested second nationally among states with the highest average bandwidth. As far as usage, Utah and Alaska are among the top five states when it comes to the percentage of residents having computer and Internet access. About 60% of the US population lives in the “burbs” (using our definition). How well do local search sites with user-generated content fare here? Not as well as in the burgs. The local search site of a large portal for “dentists” in Provo, Utah yielded only one recommendation across dozens of available dentists. A search on an up-and-coming local search site for “barbers” in Syracuse, NY yields zero reviews. Local search sites are only starting to address the “burbs” and provide them with helpful information. InsiderPages is an exception in addressing medium-size and small towns. However, since InsiderPages compensated users for their reviews (e.g., through gift cards), the reviews are highly generic and short on relevant information.

Who Addresses the Burbs?

One of the difficulties local search companies face in reaching the “burbs” is their sheer number and distribution throughout the country. They are not pockets of densely concentrated populations that can be easily targeted with traditional marketing campaigns. And it is expensive to roll out in one suburb after another. Doing so in the hundreds of towns may be prohibitive as far as cost. But there are sites which address these areas, largely in terms of providing local content. Placeblogger, a project of the Center for Citizen Media, illustrates that there are countless blogs at the hyperlocal level (there are at least 8 significant blogs that cover Nashville news alone). While Backfence closed recently, sites like Topix.net continue to provide news links to cities large and small. And Craigslist continues to expand its reach in smaller towns. There are over 150,000 listings for items to sell and over 250,000 service provider listings on Craigslist Denver. Local content sites have taken a stab at regions where local search is largely absent.

What Next?

We may have certain preconceptions of those outside the major metro areas, but the facts suggest they are computer-literate and avid Internet users. Much of this area is made up of traditional suburbs, growing exurbs, and emerging micropolitans as opposed to rural areas. It may not be long before local search sites follow the lead of local content sites into the broad expanse of territory that makes up the country. As Sam Walton once said, “There’s a lot more business out there in small town America than I ever dreamed of.”

Marty Himmelstein Disputes the Local.com Patent

July 24, 2007

Guest columnist Marty Himmelstein is a local search expert who founded Long Hill Consulting. He was with Vicinity Corp. (acquired by Microsoft) and wrote one of the original “local search” patents before that term existed. He asked to write this column based on the flurry of recent local patent activity and related coverage surrounding the Local.com patents in particular. The sentiments in the article are entirely his own. I have not contributed to or edited the piece.

More Matter, With Less (Prior) Art

Local.com was recently awarded a patent for geographic search on the web. This patent will encounter rough sailing ahead as it is neither original nor inventive. The bulk of this post will explore the patent’s deficiencies, and refute CEO Heath Clark’s claim that “the methods covered have subsequently become the de-facto standard for information retrieval in the local search industry.”

But first an aside. Patents without merit, either because of obviousness or prior art, are unfortunately common in the software industry. Most of them sulk in the obscurity that is their rightful place. The “yet another one” part of this Local.com episode is not its most disturbing. We can find fault with the patent, with the current state of the patent system, and with Local.com’s predictable posturing, but the company is playing by the rules, broken as they may be. The worth of its IP will be evaluated in the light of day. What is most disturbing is that the facile musings of a financial blogger of no special esteem, and without the technical wherewithal to judge the merits of the IP he so effusively lauds, are not likewise ignored. Rather, they have occasioned an untethered credulity that has caused, at its peak, a tripling of Local.com’s (LOCM) stock price, and a three order of magnitude increase in its volume.

To review, on June 25, Local.com issued a press release announcing its patent, which was granted on June 12. Nobody noticed. On June 28 the stock (LOCM) closed at 3.94 on 29,932 shares. The next day on the Seeking Alpha financial blog, John Gilliam started his daydreaming. According to Eric Savitz of Barron’s the post was picked up by Yahoo Finance, and the attendant chattering sent the stock price to 6.92 on 8,406,829 shares.

Gilliam asserts the patent “is a very broad patent that seems to encompass what the major players in search are already doing with their local search applications.” By the end of the post, it’s a good bet that Google or Yahoo! will buy Local.com because “why would Google or Yahoo sign up to pay $10 – $20 million or so per year or pay royalty fees per transaction that could push it to three or four times that level when they might be able to buy the company outright for $100 million or so?” Mr. Gilliam has a position in the stock.

It apparently hasn’t occurred to the commentator that the major players are already doing what they’re doing because they have prior art, and plenty of it. And that an overworked and under qualified patent examiner missed and misunderstood that prior art. Most telling is that nowhere in the original or in a subsequent post does Mr. Gilliam mention anything about the quality of the service Local.com provides. Does he use their service? Does he find it compelling? The trading frenzy, though, is “a very positive development,” because “one of the largest items on the expense side of Local.com’s income statement is its cost of traffic acquisition,” and the exposure will bring lots of people to its site. How silly to suppose that rash speculation will build the user base that several years of the company’s own efforts haven’t. The fundamentals apply; this company’s value will be determined by the quality of its service. But unless they get a pass from the very companies they expect to curtsy to them, they will have licensing fees of their own to deal with.

Prior Art

To start, while at Vicinity Corporation, I co-authored patent 6,701,307, A Method and Apparatus for Expanding Web Searching Capabilities, filed in October 1998 and granted in March 2004. The patent is now owned by Microsoft (with whom I have no affiliation). The Local.com patent was filed provisionally in May 2004. The Microsoft patent covers the essentials of geographic searching on the web in a manner that is more general and more thorough than the Local.com patent. The concepts described in the patent were implemented and publicly available as a joint project between Vicinity and Northern Light between April 2000 and January 2002.

In June 2002, Google, not unaware of the potential of geographic search (or of Vicinity’s work), awarded its first annual programming prize to Daniel Egnor’s geographic search project, and followed this with their own version, which has been available as part of Google Local since September 2003. Yahoo!, too, has had a similar capability, which from this press release, may have been released as early as April 2003. And MetaCarta’s Geographic Text Search is an interesting project that also predates Local.com’s work. This list is not exhaustive.

Local.com did cite two Microsoft/Vicinity patents, both peripheral to their application, misidentifying the patent number of one with the title of another. They didn’t cite the patent that directly pertains to their work. The first patent cited by the patent examiner was the right Microsoft patent, but he apparently couldn’t discern the similarities between it and the work in front of him, similarities that would be apparent to a person of ordinary skill in the art.

Here’s the bulk of the Microsoft abstract:

“At index time, a Web page is spidered and the text and metatags returned to a processor. The processor extracts spatial information from the text and metatags. A geocode is generated for the spatial information. The geocode is then indexed along with the remaining contents of the page. A subsequent query during query time can search for entries based on proximity to a known location using the indexed geocode.”

And portions of the Local.com abstract:

“A local search engine geographically indexes information for searching by identifying a geocoded web page of a web site and identifying at least one geocodable web page of the web site. [….] The system indexes content of the geocoded web page and content of the geocodable web page. The indexing including associating the geocode contained within content of the geocoded web page to the indexed content of the geocoded web page and the geocodable web page to allow geographical searching of the content of the web pages.”

A Tale of Two Patents

The fundamental building blocks of geographic search on the web are:

· Parsing text from web pages and other kinds of unstructured documents to find location information.

· Verifying that the candidate text does represent a location.

· Transforming the parsed textual description of a location into geographic coordinates, usually latitude and longitude, that correspond to a point on the earth’s surface, a process called geocoding.

· Indexing the geocoded location. This and the next step require the use of spatial access methods (SAMs) so that two-dimensional coordinates can be transformed into one-dimensional coordinates. SAMs are a practical rather than an absolute requirement, since without them spatial searching is costly. If you choose the right encoding method, a search engine can index and search the transformed coordinate, called a spatial key in the Microsoft patent, in the same way the other terms on the page are indexed.

· Performing spatial proximity queries at search time.

At the start of the Geosearch project at Vicinity, we thought that for the types of public and commerce-oriented location-based searching that would be popular on the Internet, most of the content in which we were interested would include well-formed addresses or telephone numbers. Our hunch was right: between fifteen and twenty percent of the web pages we examined had either a US or Canadian address or telephone number, a percentage that is consistent with other estimates. We also tried hard to find complete addresses, rather than settling for simpler-to-find postal codes. The Microsoft patent discusses some of the methods we used to find and confirm addresses in unstructured content. Our decision to do fine-grained address detection and geocoding anticipated the ever-improving mapping and routing services that Google, Yahoo!, Microsoft and others have made available. Today, a local search service that doesn’t include accurate map placement and door-to-door driving directions is at a marked disadvantage.
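To give a flavor of the parsing and verification steps listed above, here is a deliberately simplified sketch that scans page text for US-style addresses and phone numbers. The patterns are illustrative only and nothing like the full address-confirmation logic the Vicinity/Microsoft patent describes.

```python
import re

# Simplified, illustrative patterns -- real address detection needs far
# more than regular expressions (street-network data, validation, etc.).
US_PHONE = re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b")
US_ADDRESS = re.compile(
    r"\b\d{1,5}\s+[A-Z][A-Za-z]*(?:\s[A-Z][A-Za-z]*)*\s"
    r"(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd|Drive|Dr|Lane|Ln)\.?,?\s+"
    r"[A-Z][A-Za-z]+(?:\s[A-Z][A-Za-z]+)*,?\s+[A-Z]{2}\s+\d{5}(?:-\d{4})?\b"
)

def find_location_candidates(text):
    """Return candidate addresses and phone numbers found in page text."""
    return {"addresses": US_ADDRESS.findall(text),
            "phones": US_PHONE.findall(text)}

sample = "Visit us at 123 Main Street, Springfield, MA 01101 or call (413) 555-0100."
print(find_location_candidates(sample))
```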

The Local.com patent is vague on the procedure it follows to find and verify addresses. In fact, the author misuses the term ‘geocode’ to mean textual information that represents an address: “the street address has to be present to be considered a valid geocode in one configuration.” How do you know when you have a street address? I don’t even know all the streets in my little town. The patent neither describes how it makes such a determination, nor cites prior art. But I might be missing something, because “The Geocoder as disclosed herein is able find locations (i.e., geocode information such as an address, phone number, etc.) in a similar way as do human beings.” I’ll jump on the bandwagon, too, if Local.com has IP that substantiates this assertion: there is none in the patent.

You know you have a valid street address by using sophisticated geocoding databases, such as TeleAtlas’s MultiNet. (Google Maps and other services work with these databases on your behalf when you map a location or get driving directions.) These databases are dynamic because they need to maintain an accurate model of the street networks they represent. They are capable of resolving an address to within several meters. The Local.com patent makes no provision for working with these services, contenting itself to use “a look up table containing all of the US town, state, zip code, latitude and longitude” values, which can do no better than zip-centroid resolution. Of course, a software interface to a third-party software service is not an innovation. Nevertheless, the patent’s glib treatment of an important function is indicative of its general inadequacy for solving the problem it purports to address.
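To see why a town/state/zip lookup table caps you at zip-centroid resolution, consider the toy sketch below. The centroid coordinates are rough placeholders, not values from any real product.

```python
import math

# Rough placeholder centroids for two Manhattan ZIP codes (illustrative only).
ZIP_CENTROIDS = {"10001": (40.750, -73.997), "10003": (40.732, -73.989)}

def geocode_zip(zip_code):
    """Zip-centroid 'geocoding': every address in the ZIP maps to one point."""
    return ZIP_CENTROIDS[zip_code]

def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

# Two shops in adjacent ZIPs collapse to points a couple of kilometers apart
# at best; a parcel-level geocoder resolves them to within a few meters.
print(round(distance_km(geocode_zip("10001"), geocode_zip("10003")), 1), "km")
```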

The SAM described in the Microsoft patent is based on quadtrees. The basic idea is to hierarchically decompose space into successively smaller regions and assign a unique name to each region. To create a unique name, each time you decompose an area into smaller regions, you append a new piece onto the end of the name of the parent region, a different piece for each of the smaller regions. (The ‘you’ here is a piece of software.) The idea is the same as creating new subfolders under a parent folder. (In fact, you (dear reader) can think of a region as a folder – it might help for the comparison with the methods used in the Local.com patent.) Region 1342 is contained in region of 134, which is contained in region 13, and so forth. A point (a location) is directly in only one region (and indirectly in parent regions). At search time, based on the user’s search center and radius, you figure out the names of the regions that fall wholly or partially within the distance to be searched. Then, quietly, you add these names as additional search terms to the user’s query. The ideal situation is that you only have to add the name of one region, but this doesn’t happen often.
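Here is a toy sketch of the quadtree-style ‘spatial key’ idea described above: each point gets a region name whose prefixes name its parent regions, and at query time the names of the regions covering the search circle are quietly appended to the user’s query. This is only an illustration of the technique, not code from either patent.

```python
import math

def spatial_key(lat, lng, depth=12):
    """Name the quadtree region containing a point.

    Each level splits the current box into four quadrants and appends one
    digit, so a region's name is a prefix of all its children's names --
    the folder-within-folder idea, but with regularly shaped regions.
    """
    key, south, north, west, east = [], -90.0, 90.0, -180.0, 180.0
    for _ in range(depth):
        mid_lat, mid_lng = (south + north) / 2, (west + east) / 2
        digit = 0
        if lat >= mid_lat:
            south = mid_lat
        else:
            digit += 2
            north = mid_lat
        if lng >= mid_lng:
            digit += 1
            west = mid_lng
        else:
            east = mid_lng
        key.append(str(digit))
    return "".join(key)

def covering_keys(lat, lng, radius_km, depth=12):
    """Roughly which regions cover a search circle (toy version).

    A real implementation walks the tree; here we just sample the circle's
    bounding box and collect the distinct keys to add as query terms.
    """
    dlat = radius_km / 111.0
    dlng = radius_km / (111.0 * max(0.1, math.cos(math.radians(lat))))
    return {spatial_key(la, ln, depth)
            for la in (lat - dlat, lat, lat + dlat)
            for ln in (lng - dlng, lng, lng + dlng)}

print(spatial_key(40.75, -73.99))          # a midtown Manhattan point
print(covering_keys(40.75, -73.99, 2.0))   # region names to append to the query
```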

Other spatial access methods can be, and are, used by web-based geographic search implementations. The problem with the Local.com patent is that it doesn’t really use any spatial indexing method at all, so efficient proximity searching on large datasets is not practical. The method employed in the Local.com patent is to assign a web page with an address to a folder that represents a geographic region. At search time, however, only one folder is searched (from claim 21):

“receiving a user query … using the location to identify a folder in which to search content of web pages indexed with that folder, the folder selected from a plurality of folders…”

To solve the boundary problem of “businesses located in nearby regions,” the patent provides that “each folder includes overlapping content of web pages from web sites associated with entities located within a certain overlapping distance into the other folder.” This solution doesn’t work, of course, because there are no boundaries in local search. You can’t know beforehand how far a user is willing to search, because that depends on many factors, including what the user is searching for (restaurants or balloon rides), mode of transportation, urgency, and so forth.
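As I read the claims, the scheme amounts to something like the following Python sketch. The folder names, their center points, the distance function, and the fixed 10 km overlap are my own illustrative choices, not anything in the patent; the fixed overlap is precisely the quantity that cannot be chosen correctly in advance.

```python
from math import radians, cos, sin, asin, sqrt

def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

# Folders keyed by region name, each with a representative center point (illustrative).
FOLDER_CENTERS = {
    'nh-hanover': (43.70, -72.29),
    'vt-norwich': (43.72, -72.31),
}
OVERLAP_KM = 10.0

def assign_folders(page_location):
    """A page goes into its own folder and is duplicated into any neighboring
    folder within the fixed overlap distance.  Search then consults only one
    folder, so anything beyond the hard-coded overlap is simply never found."""
    return [name for name, center in FOLDER_CENTERS.items()
            if distance_km(page_location, center) <= OVERLAP_KM]

# A business near Hanover, NH lands in both folders because Norwich, VT is close by.
print(assign_folders((43.705, -72.289)))
```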

To solve the scalability problem, the patent suggests creating different search engines:

“One problem with conventional search engines is that they perform searches on content from web pages collected from all over the world. In contrast, configurations described herein can divide the web into different countries and provide a search engine for each, and each search engine can index content locally for each country. Using this approach, the local search engine disclosed herein deals with data in a certain country and the resources needed to process searches are greatly reduced.”

Apparently, folders only contain pages that are “geocoded” or “geocodable.” Geocodable pages are those deemed to be related to pages on which “major geocodes” are found; the example given requires, for starters, that the pages share the same domain. Geocodable pages inherit the location of the page with the major geocode. (See discussion below.)

Unlike the regularly shaped regions associated with quadtrees and related structures, which make the math easy, Local.com folders are based on political criteria, such as zip codes and state and country names. Instead of doing math, “each folder can have a list of zip codes associated with that folder and zip code near a state border for example might be included in two folders.” Zip codes have irregular boundaries, and people don’t care about them other than to pick up their mail. Political subdivisions either don’t work at all or lack sufficient resolution in many parts of the world, a matter of practical concern since GPS devices and mapping services like Google’s MyMaps let people chart every move they make, even on the high seas. (And chart they will.) Besides, lots of systems use political boundaries to approximate true proximity searching, so there is plenty of prior artlessness, if you will.

Assigning addresses to pages without any

The Local.com patent devotes considerable space to describing how to associate a ‘major’ address on one page of a website with other pages on the website that don’t contain one of their own. The patent’s Claim 6 describes how it defines a major address:

“The method of claim 3 wherein: the geocoded web page is at least one of: i) a home page of the web site; ii) a contact page of the web site; iii) a direction page of the web site; iv) an about page of the web site; v) a help page of the web site; and vi) a page of the web site that is no deeper than a predetermined number of links below the home page of the web site; and wherein the geocode contains a complete physical address of the entity associated with the web site.”

There are at least three problems with these claims. The first is that it is conventional practice for web sites, especially business-oriented sites, to include a contact page (etc.) that contains the business’s address. The whole point of pages with obvious names and titles is that they perform the obvious functions denoted by their names. People understand that these pages apply to an entire website. It is not original to apply a conventional practice as it is meant to be applied and call it an invention. (I suppose one could argue the process of automation here is itself innovative. Sigh.)

The second problem is that there are many cases in which the technique doesn’t work or is irrelevant. Travel agencies describe vacation spots far away; the mailing addresses of web hosting providers are mostly irrelevant; suppliers want you to go to the stores that sell their products, not to their headquarters. People are a lot better than computers at disambiguating certain types of information. A geographic search that gets a user to a nearby business’s home or contact page is good enough. Any attempt to do more is as likely to hinder as to help.

The third problem relates to both prior art and industry trends. The lack of structured content standards has been the bane of local search. An early attempt to define a standard was the Small and Medium Business Metadata Initiative (SMBMeta), described by Dan Bricklin in 2003, when he was working for Interland. SMBMeta defines “an XML file stored at the top level of a domain that contains machine readable information about the business the web site is connected to.” Location information is part of the file, and the file applies to the entire website. A structured file that explicitly describes facts and relationships is not the same thing as a process that tries to infer them. It is better. While the SMBMeta initiative didn’t catch on, equivalent mechanisms are catching on, and will: RSS, Atom, structured blogging, Google Base. The imperfect heuristics we have had to use to gather facts about local businesses from unstructured content will become increasingly marginal.

Link Analysis based on Proximity

The basic idea of what the Local.com patent calls ‘georanking’ is that a page’s georank is increased when the pages that link to it have addresses within the boundaries of the same folder. Claim 11:

“The method of claim 10 wherein performing georanking of content of the web pages comprises: identifying links in content of web pages associated with the folder; and for each identified link, adjusting a georank of a web page referenced by that identified link if the web page identified by that link has a geocode associated with the same folder associated with the web page from which the link was identified.”

As noted, the Local.com concept of geographic folders is flawed, and the flaws are inherited by methods that use them. However, we could generalize the concept of georank and base it purely on distance, so that all pages within a certain distance of a page affect that page’s georank. Such a technique could have merit; a rough sketch appears below. I see two possible problems, one related to usefulness, the other to originality. For the former, pages relevant to local search will tend to be referenced by other pages relevant to local search. A town’s Chamber of Commerce site will link to the sites of member businesses. Georanking might not do much more than reiterate the information implicit in the more general link analysis upon which it is based. For some types of content, travel-related content say, proximity doesn’t have much value anyway. If I am considering a trip to a particular exotic locale, I want the impressions of other people with tastes similar to mine, wherever they live. In terms of originality, link analysis is well covered. I assume there is prior art that evaluates various attributes of, and relationships between, pages on both sides of a link. To pass an originality test, one would have to demonstrate that a distance attribute is different enough from the other kinds of attributes used to characterize a link.
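For concreteness, here is a rough Python sketch of a purely distance-based georank adjustment. The planar coordinates, exponential decay, and scale are arbitrary choices of mine, not anything from the patent; a real implementation would use great-circle distance on latitude and longitude.

```python
from math import exp, hypot

def georank_boost(link_graph, locations, scale=25.0):
    """Adjust a per-page 'georank' by the geographic closeness of linking pages.
    link_graph: {page: set of pages it links to}
    locations:  {page: (x, y)} -- planar coordinates here for brevity.
    Each link contributes exp(-distance / scale), so nearby links count more
    and distant links count hardly at all."""
    boost = {page: 0.0 for page in locations}
    for src, targets in link_graph.items():
        if src not in locations:
            continue
        for dst in targets:
            if dst in locations:
                d = hypot(locations[src][0] - locations[dst][0],
                          locations[src][1] - locations[dst][1])
                boost[dst] += exp(-d / scale)
    return boost

# Example: a Chamber of Commerce page linking to two member businesses.
links = {'chamber': {'cafe', 'bookstore'}}
coords = {'chamber': (0.0, 0.0), 'cafe': (1.0, 0.0), 'bookstore': (40.0, 0.0)}
print(georank_boost(links, coords))  # the nearby cafe gets the larger boost
```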

Conclusion

The viability of Local.com’s patent on geographic search is highly suspect. The essential pieces of geographic search on the web have been covered in practice and in patent: parsing content to find location information; geocoding; indexing and searching. Further, Local.com would have an insurmountable task demonstrating that its solutions are even functionally equivalent to, let alone improvements over, existing ones. The patent makes some subsidiary claims related to associating addresses on web pages with related web pages, and to modifications of standard link analysis techniques to accommodate location. The originality and usefulness of these techniques are doubtful at best.