July 26, 2013
The NewsBreak I wrote about ProQuest giving Dialog a makeover was published earlier this week (23rd of July, to be precise). I wish I could have covered all the technological aspects of PQD (ProQuest Dialog), but that probably would have bored everybody to death. When I was first introduced to Dialog, which was in the Dark Ages of 1976 or 1977, I spent a day and a half in Palo Alto for the initial training sessions. I can’t imagine information professionals — or anyone else, for that matter — giving up that much time to learn a search engine.
Today’s expectations are that search happens quickly, without much thought or effort. What PQD offers is a somewhat different twist on those expectations. Yes, you can do the proverbial “quick and dirty” search. It will work. Careful information professionals, however, will benefit from studying how to effectively use all the bells and whistles PQD has baked into the new system.
Technology is great, but without sufficient content, it’s an empty vessel. Dialog has told us it won’t contain the corporate directory files, the market research databases (no great loss, as they were mostly closed files), and trademarks. It doesn’t mention EconLit, TableBase, LISA, and other files that don’t seem to have made the cut. If your favorite database is not present in PQD, please let me know. Since ProQuest, so far, declines to provide us with a comprehensive list of databases not transferring to PQD from legacy Dialog, it’s up to the users of the system to fill this gap.
Additionally, let’s tell ProQuest what other content should be in the “reinvented” Dialog. What information sources did they never have that they could now add?
I look forward to your suggestions on legacy databases not in PQD and on new databases PQD could add.
July 10, 2013
I’ve been doing more searching than usual in Google Scholar in the course of editing a manuscript for the September/October 2013 issue of Online Searcher. I know that both librarians and students have mixed emotions about GS, and there have been many articles, blog posts, conference presentations, and tweets about it. I was delighted, last November, to hear about developments from its creator, Anurag Acharya, Google Distinguished Engineer, at the Charleston Conference.
Here are just a few of the things that delighted/depressed me in my recent search experience. I searched for Marydee Ojala. OK, that’s me. And Google figured out that the M Ojala in the author field was that same Marydee. Not Markus, who’s doing a PhD at Helsinki University. Not Matti, who’s in the Department of Agricultural Sciences at the same institution. Not Mace, the librarian who organizes the unconference Cycling for Libraries. I have a mental image of a Google algorithm that, in extremely unscientific terms, sees Marydee Ojala, notes that I write about online searching, and eliminates social science research, agricultural research, and cycling to present me with completely relevant results. Well done, Google Scholar!
Then I got to the sort options. That’s when depression set in. On my name, I had 696 hits. When I un-ticked the box for Citations, the hit count dropped to 308. When I sort by date, that goes down to 1. OK, it’s a book review I wrote in 2013 and the sort does say that it only does articles posted (not published, mind you) in the last year. So I hit the back arrow and suddenly my results, now sorted by relevance, went down to 545. When I un-ticked the Citations box again, the hit count fell to 196.
Several other author searches performed similarly, so it’s not just me.
Going back to that date sorting: It’s important to remember that it’s not by publication date. One search, on “digital libraries”, retrieved 3,540 hits when limited to Since 2013. The sort by date winnowed it down to 606. Shouldn’t the number of articles in the last year be the same as since 2013? Looking at the first 10 hits in the sorted results list, they are, indeed, sorted by date posted. But the publication dates vary. Again, this is very unscientific, as I looked only at the first 10, but although most were published during the May-June 2013 time frame, they were not sorted by the publication date and one was published in 2006.
I can think of a number of reasons why Google Scholar has trouble with its sorting. Publisher embargoes and the methodology of web scraping to populate Scholar undoubtedly play a role. Still, it’s an interesting exercise to unravel how and why Google Scholar does what it does.
June 27, 2013
Yippy, Inc. announced on June 26, 2013 that it had acquired Gigablast and Web Research Properties.
You may remember Gigablast from early presentations at WebSearch University and from mentions in Greg Notess’ columns in ONLINE (now Online Searcher). It was a good alternative to Google at one time, but fell off the radar as its database aged and remained rather small. Gigablast was pretty much a one-man show and that man, Matt Wells, did a splendid job. But in today’s world of very large companies (Google and Microsoft) dominating the search space, it’s tough to compete.
Yippy intends to integrate Gigablast’s and Web Research Properties’ technology with that of MuseGlobal, a company it acquired in June 2012. No mention of Clusty, a product it acquired from Vivisimo in May 2010 before Vivisimo was sold to IBM.
Although I find the name Yippy a bit strange, rhyming as it does with both hippie and skippy, it’s becoming a technology company to keep an eye on.
June 3, 2013
Massive Open Online Courses (MOOCs) are gathering lots of attention. Steve Arnold’s cover story (Gadzooks, It’s MOOCs) in the January/February 2013 issue of Online Searcher talked about some of the underlying open source technology, but most of what I read about MOOCs is more on the philosophical/logistics side of things. There’s a good blog post by Steve Dale explaining MOOCs for the information professional. There have also been discussions in multiple places about where libraries and librarians fit into the MOOC mix. Although the word “free” is not in the MOOC acronym, the “open” piece of it implies free. A criticism from students taking MOOCs is that required readings are not necessarily either open or free. That may be changing.
Coursera decided to experiment with offering free textbooks from Cengage Learning, Macmillan Higher Education, Oxford University Press, SAGE, and Wiley. These publishers, via a partnership with Chegg, will offer versions of their e-textbooks to Coursera students, but only while they are taking the class. Students will have the “opportunity” (that’s the phrasing from the publishers) to purchase the full versions of the textbooks (e or otherwise, I’d assume) if they want to continue their learning experience after the class is over.
And how do librarians fit into the MOOCs environment? For one thing, they should be happy that the students aren’t bombarding them with requests to add the e-textbooks to library collections (not that they probably would). More importantly, as pointed out in a recent article in the Minnesota Daily, librarians are helping faculty produce the courses. From the article:
University librarian Nancy Sims and other team members have shifted job responsibilities to support MOOCs. Some on the team have devoted most of their time to working on the online courses since the University agreed to do them in February, she said.
“Sometimes people think of the library of just being buildings with books in them, but I think this has shown that this is not what we are,” she said. “The way the library folks have been able to just dive right in in a short time frame has shown a very high level of amazing talent.”
This involvement by the library team enhances their visibility and professionalism with university faculty, something that’s always a plus.
May 3, 2013
If you’re in New York City today (I’m not), you’ll notice the Empire State Building is showing red colors. This is in honor of McGraw-Hill changing its name to McGraw Hill Financial, differentiating it from the education piece, which is now a separate company, McGraw-Hill Education. The Empire State Building often changes colors and you can find out what the colors represent here or follow the unofficial building lights calendar on Twitter. On May 1st the building went peach for The Financial Times. Hmmm, May Day and the FT, now there’s an interesting juxtaposition.
McGraw Hill Financial includes brands familiar to business researchers, such as Standard & Poor’s Capital IQ, Dow Jones Indices, Platts, McGraw Hill Construction, Aviation Week, and J.D. Power & Associates. While losing the hyphen between the McGraw and the Hill, the company gained a new ticker symbol (MHFI), a new logo (a Mobius strip with a triangle in the middle), and a new tag line (Essential Intelligence).
Online Insider welcomes McGraw Hill Financial.
May 2, 2013
In a recent blog post “Forget Searching For Content – Content Is About To Start Searching For You”, Brian Proffitt, who, among other things, is an adjunct instructor at the Mendoza College of Business at the University of Notre Dame, proclaimed that content is about to start searching for us rather than us searching for content. Well, content may be searching for me, but somehow it’s not finding me. Or, to be clearer, content is finding me but it’s not the content I was looking for.
Proffitt was talking about efforts on the part of search engines to contextualize search based on geography, relevance, push, security and privacy. One of his examples was Notre Dame. If he’s on campus, he gets results about his university. Should he venture across the Atlantic to France, he gets results about the cathedral in Paris.
That’s great, I suppose, if your only concern is that you grab a few facts about something close by. It’s not terribly useful if you’re a student in South Bend (home of Proffitt’s university) doing a research project on French cathedrals. In fact, that’s the great fallacy in contextual search. Figuring out context based on geolocation works well only if the intent of the search is personal shopping. For information professionals, it’s largely been a dud.
And the notion that a revolutionary new development in search technology is pushing data to the user is just plain wrong. Stephen E. Arnold and Eric S. Arnold wrote about push and pull 16 years ago! (“Push technology: Driving traditional online into a corner,” Database; Aug/Sep 1997; pg. 36+). Pushing information to the user is not a new endeavor.
Proffitt has it right about search engines’ desire to “knowledgize” search, however. Both Google and Bing are incorporating Knowledge Graphs into search results. Greg Notess talks about this in his upcoming article in the July/August 2013 issue of Online Searcher. The problem is that too many mistakes crop up in those knowledge boxes. Information is only as good as the source from which it is derived.
Changes in search technology, whether contextual, semantic, personalized, or something else, are a topic of great interest to information professionals and will be exhaustively discussed at WebSearch University in September in Washington DC.
I know some of the librarians at Notre Dame. I’m hoping they can find Proffitt and educate him about the needs of professional researchers.
April 26, 2013
It’s been an exciting few months, as we transition ONLINE and Searcher into a new publication, Online Searcher. We’ve had some superb articles published in the first two issues and the third one (July/August 2013) will cover such topics as alternative metrics, social media for public company disclosure, the EPUB standard for ebooks, and open source everywhere.
March 16, 2012
Driving to work this morning, I heard a news report on NPR (National Public Radio) about Russia joining the World Trade Organization (WTO) and the negative effect that the Jackson-Vanik Amendment to the Trade Act of 1974 will have on US economic interests if not repealed. The Jackson-Vanik Amendment was enacted into law on January 3, 1975, when the business climate in both the Soviet Union and the United States was considerably different than it is now. Of course, the Soviet Union of 1975 no longer exists, but the legislation, now about Russia, persists.
A simple Google (or Bing, Blekko, DuckDuckGo, or your search engine of preference) search will retrieve numerous documents about Jackson-Vanik and its potential for repeal. However, the story caught my attention for a more personal reason. It validated my decision to make Anne-Marie Libmann’s article about finding information in support of doing business in Russia the cover story for ONLINE’s March/April 2012 issue.
The cover, a stunning photo of downtown Moscow, conveys an image of a modern city. Whether you agree with repealing Jackson-Vanik or not, Libmann’s article will give you lots of resources to check if you want to find information or help someone else find information about Russian companies with which to do business.
March 15, 2012
This morning’s Wall Street Journal reports on Google’s gradual incorporation of semantic search techniques, in an article titled “Google Gives Search a Refresh” by Amir Efrati. Since WSJ is a subscriber-site, that link may or may not work for you. The article also appears in the print version on page B1. And, of course, you can find it on Factiva.
From the article: “Amit Singhal, a top Google search executive, said in a recent interview that the search engine will better match search queries with a database containing hundreds of millions of “entities”—people, places and things—which the company has quietly amassed in the past two years. Semantic search can help associate different words with one another, such as a company (Google) with its founders (Sergey Brin and Larry Page).”
I’m sure we’ll hear more about this at WebSearch University in September, because by that time we’ll probably have real world examples of how Google’s “refresh” is actually working.
March 13, 2012
March is supposed to come in like a lion and go out like a lamb. So far, the weather in my part of the world has been more lamb-like than liony.
In mergers & acquisitions news, the climate swings toward the lion side of the scale. Twitter is the new proud owner of Posterous, according to its blog.
There’s a rumor out from SXSW that CNN will buy Mashable, something Mashable is denying. I’m not linking to any of the news stories about the rumor. If it turns into a true story, with nobody lion about it, then I’ll amend the post.
Gowalla was bought by Facebook back in December. Its website now says “Thank you for going out with Gowalla. It was a pleasure to journey with you around the world. Download your check-ins, photos and lists here soon.”
I’m sure there’s more to come. March is far from over.