Google's obsession with data has defined a company that in eight short years has become one of the most successful businesses in modern times.
But take privacy into account, and Google's stated mission--as the dominant Internet search company, to make all the world's information accessible and usable--can have unsettling overtones. Google's success stems from its growing ability to fine-tune advertising pitches to users, thanks to its intimate knowledge of their online behavior. And for some that sounds scary.
In an age of growing concern over privacy, people worry about the government looking at their phone records. They fear having their medical records fall into public view or their library borrowing habits reviewed by federal agents.
Then there's the privacy showstopper from earlier this year: AOL imprudently released the search histories of 657,426 users onto the Internet. The gush from AOL was the biggest warning yet that just a few clicks of a computer mouse are enough to dump vast amounts of private data into the public square.
"This was the first time users really realized, 'Wow, people are really logging everything I do,' " said Eric Jensen, who researches Internet search behavior at the Illinois Institute of Technology. "That was the big shock."
Google's competitors have taken note. Microsoft and Yahoo are beginning to focus on privacy because it's an issue consumers care about. The rivals are refining their products and sharpening marketing messages to gain ground on Google, which commands nearly a 50 percent market share in search, according to Nielsen/NetRatings.
"This is going to play out in competition on the Internet," said Andrew Sherman, a lawyer who has written a soon-to-be-published book on Internet privacy and the law. "People might say, 'I'm going to use Yahoo as opposed to Google because they protect my data better.' "
Microsoft has adopted a stringent privacy review policy for all products in development. The company is touting its privacy awareness on its new search engine, Windows Live.
Google is hustling to stay on top of the privacy issue too. The company, known for storing records of virtually all the activity on its many Web services, says the trove of data helps its products get better over time. Google's spelling function and even the quality of its search results depend on its use of user data. It also helps the company track fraud in the form of heavy clicking that distorts Google's search results and advertising.
Still, Google is refining its product development programs in reaction to the privacy debate. "The key thing we're trying to do now is to be aware of the privacy implications of everything we do," said Nicole Wong, Google's general counsel for products.
Even so, Google is finding that the privacy issue can sting it in unexpected ways.
Take the AOL leak. AOL took the heat. But it turns out another company actually conducted the searches. That was Google, which also set the clock to store users' search records through at least 2038. All searches on AOL are powered by Google.
Building the honey pot
When job seeker Todd Malicoat first posted his resume online a few years ago--complete with his name, address and phone number--he freely traded his privacy for a chance at a better job.
"I gave up on privacy a long time ago," said Malicoat, who now works in New York as a consultant, advising companies on Internet advertising.
People trade privacy for utility all the time. A consumer gives up personal information to get a credit card. Grocery shoppers swap their anonymity for "club card" discounts.
Yet, even now, people have their limits. The same user who posts intimate details on his MySpace page might not want his college grades posted. A rap fan in a Google discussion group might get creeped out when a Ludacris ad suddenly pops up. A worker searching for cancer treatments on WebMD might not want the boss to know.
Despite user concerns, the search companies keep pushing the limits on what they do with user data for one reason: profit potential.
Google and the other search giants deliver users to advertisers, which pay billions to get the most targeted marketing platforms possible. And Google makes no apologies for trying to use its data.
"This whole personalization thing is in its infancy. I believe this is the future of where we're going," said Alan Eustace, Google's senior vice president of engineering and research. "The more information you have about the user, the better."
At Google, the emphasis is on finding new ways to tailor advertising even more specifically, thanks to what it learns about users from their behavior on Google's growing array of sites.
Google's Gmail service combs through a person's e-mail messages to present targeted ads on the user's computer screen. Its discussion group function displays ads that are pegged to keywords that pop up during discussions held on the site.
Google's Desktop program, a virtual file management system, scans all of the user's data--text files, tax returns, electronic wills and such. It indexes and stores the data on enormous server "farms" all around the world. Desktop even automatically stores copies of deleted files.
A user has to sign up for such services and OK the privacy trade-offs involved. But even when a user deploys Google's simplest service, basic search, the company tracks the clicks. It inserts "cookies" that help Google identify any Web page visited directly via the search engine: any video viewed, any product priced, any document downloaded.
The data is linked to the computer's address on the Internet. And the Google cookie makes all of that computer's Google search activity available to the company through at least the year 2038, Google's standard expiration date.
For users like Malicoat, the New York search consultant, Google has gone too far. Google's Desktop service, in particular, is too nosy for him. "I wouldn't touch it with a 10-foot pole," he said. "It isn't a cool enough feature to justify it indexing all the data on my hard drive."
Privacy advocates believe Google is inviting trouble by retaining so much data. Earlier this year, Google successfully resisted a Justice Department subpoena for all search data for a given time period, part of a wide-ranging search for information about child pornography on the Web.
But the privacy advocates say more government snooping, perhaps related to national security, is inevitable.
Because Google is so big, it is a rich target beyond law enforcement. It is attracting hackers, phishers and other scammers too, privacy experts say.
"Google is making itself a honey pot for law enforcement and hacking," said Beth Givens, director of the Privacy Rights Clearinghouse advocacy group. "I don't see any reason--I can't conceive of any reason--why keeping so much data for so long is necessary."
Suing for secrets
The potential downside of such massive indexing could be seen at a conference of lawyers in Chicago this fall focused on one of the boom areas of litigation: e-discovery, the use of electronic records as evidence.
Google and the other search giants promise to protect users' privacy. But when lawyers start seeking user data, those promises may not always hold up, legal experts warn.
"Now people record their thoughts and transactions on their computers, not on paper at all," George Socha, a legal consultant, told some 50 lawyers who attended the Chicago conference.
In a celebrated North Carolina case dubbed "The Google Murder," a man was convicted of first-degree murder after a police search of his computer hard drive turned up records of Google searches for such words as "neck," "snap" and "break" in the days before his wife was murdered.
Google's Wong said the legal departments of all the big search firms are under pressure from a flood of subpoenas from private litigants and law enforcement. Google informs the affected user of any requests for information, but ultimately its policy is to provide the data requested in a valid subpoena.
"It has been an issue for everyone," Wong said, noting that Google alone receives "less than a dozen" subpoenas each month.
Growing power of privacy
It's not just lawyers, hackers and cops who have their sights set on Google. Competitors also are noticing that the vast stores of user information Google controls may be a competitive vulnerability.
Although Microsoft recently has revamped its search capability and launched it as Windows Live, the company runs a distant third in search, with only an 8.2 percent market share. Google dominates with 49.5 percent, and Yahoo takes second place at 24.3 percent, according to Nielsen/NetRatings.
That may help explain why Microsoft is taking a calculated risk: pitching its privacy-protection program as a competitive advantage over Google's.
The move is risky in part because Microsoft has a couple of significant blemishes. The company's Windows XP operating software drew criticism after it was launched because of a feature that automatically sent a signal to Microsoft once a day to authenticate the software's copyright.
A Microsoft Internet service called Hailstorm also drew criticism. It enabled users to log onto multiple Web sites with a single identification from Microsoft. But privacy advocates complained that Hailstorm could enable Microsoft to follow a user's tracks across the Web. That feature was changed.
Despite its past missteps, Microsoft today thinks it can carve out an advantage over Google on the privacy front.
"We don't believe that you can simply have a mantra--in Google's case, 'Don't Be Evil.' You have to have a program in place," said Peter Cullen, Microsoft's chief privacy strategist.
When users log onto the Windows Live search engine, it automatically scrubs the user's identity from search activity. The search request is given an internal identifier--called an "anid," for "anonymous i.d."
The system allows Microsoft to use the search data for research purposes, enabling it to refine Windows Live much as Google improves its services. The data it uses is real, but without the personally identifiable information.
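The scrubbing step described above can be pictured with a short sketch. This is an illustration only, not Microsoft's actual implementation: the class name `Anonymizer`, the record fields `user_id` and `query`, and the use of a random token as the "anid" are all assumptions made for the example.

```python
import secrets


class Anonymizer:
    """Hypothetical sketch: replace an account identity with a random anonymous id."""

    def __init__(self):
        # Private lookup from real identity to anonymous id (assumed design).
        # It is never copied into the scrubbed records.
        self._anid_by_user = {}

    def anid_for(self, user_id):
        # Issue a random token the first time a user is seen, then reuse it
        # so that one user's searches can still be grouped for research.
        if user_id not in self._anid_by_user:
            self._anid_by_user[user_id] = secrets.token_hex(8)
        return self._anid_by_user[user_id]

    def scrub(self, record):
        # Keep the query text; drop the identity and attach the anid instead.
        return {"anid": self.anid_for(record["user_id"]), "query": record["query"]}


anonymizer = Anonymizer()
scrubbed = anonymizer.scrub({"user_id": "alice@example.com", "query": "chicago weather"})
print(sorted(scrubbed))  # the scrubbed record carries only 'anid' and 'query'
```

A scheme like this removes the account identity from the research data, but it does not address whether the text of the searches themselves could point to a person.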
When Microsoft develops new search products, it uses a heavily structured approach to ensure that privacy safeguards are built in. A multicolored flowchart grades the privacy risks of each innovation, a "privacy advocate" reviews the design at regular intervals, and the product must undergo a formal "privacy review" before it launches.
The rigor of the review process was on display recently when Kim Cameron, Microsoft's chief identity architect, stepped into a windowless office on Microsoft's corporate campus. He came to review refinements to a feature that lets users of Windows Live share their list of contacts with others, and even post some information to Web sites.
It's sensitive territory. And as Koji Kato, program manager, laid out the features of the new service, Cameron methodically detailed the limits of his privacy tolerance.
Users could share certain, widely available data about people. But as Kato pushed several other ideas, Cameron's back stiffened.
There was a proposal to share information about whether a user's contacts were chatting on the Web at any given time. Another suggested posting contact information to blogs. In one instance, users would be allowed to share how many contacts they had, but not the identities of people on their list.
"A lot of the answers to these questions are going to be, 'No, no, no, no and no,' " Cameron said.
An off-the-record response
Google is energetically responding to the thrusts from Microsoft and others.
One example: Google Talk, the company's chat feature. Gmail is Google's biggest user-to-user communication tool, but Talk, with its instantaneous communication, is growing.
Google automatically stores Gmail messages. But product manager Keith Coleman wrestled with the question of whether Google should store chat messages.
Some users might not want a permanent record of the shorthand argot of instantaneous online communication.
The solution: An "off-the-record" feature that works if both parties agree to disable Google's automatic storage function.
"We want users to feel when they use our product that they don't just have as much control as with other products," Coleman said. "We want them to feel they have more control."
Despite Google's efforts and promises, some privacy advocates express concern about a slippery slope related to all the data Google manages.
"It's easy to play nice now," said Daniel Solove, author of The Digital Person, a landmark book on Internet privacy. "But what if they get into financial trouble? All those priorities could change."
The global scope of Google's reach has already presented the Internet giant with headaches. In capitulating to censors in China, Google faced a significant public relations problem. And despite making concessions in exchange for market access, Google still is being trounced by a local Chinese competitor, Baidu Inc.
Google has not made any obvious, AOL-caliber missteps in the privacy arena. But even the company's chief executive, Eric Schmidt, acknowledges that mistakes are possible.
"Never say never," Schmidt said at a conference a few days after the AOL leak.
Google might benefit from some sincere self-evaluation, author Solove said. "Google thinks, 'As long as we're not evil, we're a good player in the privacy arena,' " he said. "But the real world has a way of putting pressure even on a non-evil-doer."