Welcome to Online Insider ...
... the editorial blog by Marydee Ojala, Editor of ONLINE: Exploring Technology & Resources for Information Professionals. ONLINE Insider intends to extend the reach of the print publication, presenting a more timely commentary on the products, people, and events that shape today's online world. It explores new technologies as they impact the working lives of information professionals, explains resources for specific topic areas, and expounds on information management tools and techniques.

ILI2005 2nd Day Keynote

Marydee @ 10:07 am

It’s the second day of Internet Librarian International, and the keynote speech, by Stephen Arnold, is on relevance, Google, and why it matters. The actual title is Relevance: What? For Whom? When? How?

Steve says he doesn’t really have an answer to the last question.

Situational Relevance: Googleplex, patenting relevance, SEO, the GUI fix, and observations.

The SEO conferences are larger than many librarian conferences. He just asked if people think Google is a search engine. Only one person raised their hand. Everyone else agreed Google was more than a search engine. Then he read a sentence from an article in today’s International Herald Tribune that suggested Google should run the Internet. There was a combination of gasps and groans from the audience.

Search is computationally intensive. The Googleplex (not the physical building in Mountain View, but a conceptual representation of Google) has Linux in the center, surrounded by Gmail, News, Ad system, and Search. In the circle around these applications are data centers (165,000 servers at data centers around the world). Outputs go in four directions: XML/HTML, WSDL, POP3/SMTP, and Atom (Gmail, Blogger).

Relevance is now a fuzzy black box. Now the fire alarm is going off. Steve says, “I’m so hot.” Precision and recall are no longer operative in the search world.

Now the hotel is evacuating us. The fire engines showed up, said someone was smoking a cigar in his room and set off the fire alarm. We stood outside for about 15 minutes, but now we’re back and Steve’s back to talking about relevance.

With individualized, personalized search engines, you may get results relevant to you, but not relevant to someone else. Steve says that Google has lots of patents about potential personalizations, plus it has products already personalized. I can see where this wouldn’t be terribly helpful at a library reference desk which is, by its very nature, not personal. Google is a mathematics-based entity. Sergey and Larry are algorithmic. Google is watching you, and Steve says he’s got a software solution to keep Google from watching you.
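To make the precision and recall point concrete, here is a minimal sketch (mine, not from Steve's slides) that scores one result list against two different searchers' relevance judgments. The document IDs and the judgments are made up for illustration; the formulas are the standard ones: precision is relevant hits divided by results retrieved, and recall is relevant hits divided by all relevant items.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one result list and one set of relevant items."""
    retrieved = list(retrieved)
    relevant = set(relevant)
    # A "hit" is a retrieved item that this searcher judged relevant.
    hits = sum(1 for doc in retrieved if doc in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list returned for a single query.
results = ["d1", "d2", "d3", "d4"]

# Hypothetical relevance judgments from two different searchers.
relevant_for_searcher_a = {"d1", "d2", "d5"}
relevant_for_searcher_b = {"d4"}

score_a = precision_recall(results, relevant_for_searcher_a)
score_b = precision_recall(results, relevant_for_searcher_b)

print("Searcher A:", score_a)
print("Searcher B:", score_b)
```

The same four results score differently for each searcher, which is the difficulty with personalized results: the numbers only mean something once you say whose relevance judgments you are using.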

How do metrics affect relevance? The new industry is SEO. Aviadian Technologies fell off the Google watch list and disappeared from the Google index: it was put into the Google Sandbox after it changed its code. Look at Google’s rules for webmasters. You must have inbound links from high-traffic sites, fresh semantically tight content, a site map that points to what you want indexed, well-formed pages, and appropriate metatags. If you’re a library, point to other libraries and make sure they point to you. Microsoft and Yahoo follow in Google’s footprints.
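For the "site map that points to what you want indexed" item, one possible reading is a machine-readable XML sitemap. The sketch below is my own illustration, assuming the Sitemaps protocol XML format (a `urlset` of `url` entries with `loc` and `lastmod`); the talk did not specify a format, and the library URLs and dates are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Sitemaps protocol (an assumption about the format).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap XML string from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages a library might want indexed.
pages = [
    ("https://library.example.org/", "2005-10-01"),
    ("https://library.example.org/catalog", "2005-10-05"),
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Listing only the pages you want crawled is the point of the exercise; anything left out is simply not advertised to the crawler.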

Five cheats: steal text, metatag spamming, blog seeding, link yourself in link farms, doorway cheat. The full ten are in the sidebar to his article in the September/October 2005 issue of ONLINE.

Everybody is not driving the same car. Google is driving a Jaguar; everyone else has a Mustang.

What is the boundary between SEO and “real indexing”? The answer is the interface. We’re not in an “ss=” world. Endeca powers eToys.com. It lets you drill down through choices rather than typing a search query. This gets lots more usage. Boolean implementation that’s been chosen by Library of Congress. Now he’s showing a hybrid search (facets, hard coding, synonym expansion) from Mondosoft powering the Pope’s Web site. That’s the exact opposite of typing a search query into Dialog.
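The drill-down idea can be sketched in a few lines. This is my own toy illustration, not how Endeca or Mondosoft actually work: the catalog records, field names, and synonym table are all hypothetical. The searcher picks facet values, the system shows counts for the remaining choices, and a small synonym map normalizes a term before it is matched.

```python
from collections import Counter

# Hypothetical catalog records; fields and values are made up for illustration.
CATALOG = [
    {"title": "Wooden Train Set", "category": "trains", "age": "3-5", "brand": "Acme"},
    {"title": "Electric Train", "category": "trains", "age": "6-8", "brand": "Bolt"},
    {"title": "Plush Bear", "category": "plush", "age": "0-2", "brand": "Acme"},
    {"title": "Robot Kit", "category": "kits", "age": "6-8", "brand": "Bolt"},
]

# Hypothetical synonym table mapping a searcher's word to a catalog category.
SYNONYMS = {"teddy": "plush", "locomotive": "trains"}


def normalize(term):
    """Lowercase a term and replace it with its catalog synonym, if one exists."""
    term = term.lower()
    return SYNONYMS.get(term, term)


def drill_down(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items)


# Step 1: the searcher clicks the "6-8" age facet instead of typing a query.
for_age = drill_down(CATALOG, age="6-8")
# Step 2: show the remaining category choices with counts.
remaining = facet_counts(for_age, "category")
# Synonym expansion: "Teddy" is mapped to the "plush" category before matching.
teddy_hits = drill_down(CATALOG, category=normalize("Teddy"))

print([item["title"] for item in for_age])
print(dict(remaining))
print([item["title"] for item in teddy_hits])
```

Each click narrows the set and recomputes the counts, so the searcher never has to know the query syntax.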

Who defines relevance? The searcher? Advertiser? Client of search engine?

Many ways to deliver relevance, more subtle ones coming, understand context of content. Assess the basics: provenance, accuracy, currency, selective depth. Google is redefining search, relevance, content, and many other sectors as the driver of our business. Google is even changing patent searching, since they don’t list themselves as patent assignee. Traditional online services have not kept up. Have LexisNexis, Dialog, Derwent changed our world? No. Google has.

New job for librarians. You are the only ones who can assess those basics.

SEO is legitimate indexing model.

We need more work on situational relevance. This is one of most exciting times in information science. Old model is laptop. New model is handheld.

Q&A:

Is next generation being taught this? Two schools talk about it, Liz Liddy at Syracuse & Michael Koenig at Long Island University. I think he should ask the audience who’s teaching it.

Do you approve or disapprove? He disapproves of SEO as indexing. Only question from search optimizers is how do I become number 1?

Why are you offended? It’s intellectually dishonest. The way to index is honestly, not just to cut costs. Need people as indexers. If medical term is misused, could affect person’s life, since appropriate article/item is not found.

What about Yahoo? What about portals? Is concept of portals coming back? If your behavior suggests to Google that you want a portal, you’ll see one when you log on. You can opt out. Less invasive than Microsoft’s approach. Google’s algorithms are far more subtle. Google puts cookies on your machine that expire in 2025 from Google Scholar. Google can block you from certain documents even if you’ve changed machines. Yahoo is box of straws in tissue paper. Structural weakness of Yahoo.

Google is revolutionizing more than librarianship. Google WiFi is coming; it’s the next AT&T. And it’s unregulated.

©2005-2008, Information Today, Inc.