<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Deep Web Technologies Blog &#187; Federated Search</title>
	<atom:link href="http://deepwebtechblog.com/category/federated-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://deepwebtechblog.com</link>
	<description>covering federated search and how to get the best from the Deep Web.</description>
	<lastBuildDate>Tue, 25 Oct 2011 16:35:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>Reducing Research Time and Costs in the Corporate Environment</title>
		<link>http://deepwebtechblog.com/reducing-research-time-and-costs-in-the-corporate-environment/</link>
		<comments>http://deepwebtechblog.com/reducing-research-time-and-costs-in-the-corporate-environment/#comments</comments>
		<pubDate>Thu, 28 Jul 2011 20:58:53 +0000</pubDate>
		<dc:creator>Brian Despain</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1702</guid>
		<description><![CDATA[One of the many unseen costs in corporations is the actual time spent doing research for the company. Researchers, product managers and scientists spend a significant amount of time researching a wide variety of sources to bring a new product to market. There are multiple costs in research. First, there is the cost of the research materials [...]]]></description>
			<content:encoded><![CDATA[<p>One of the many unseen costs in corporations is the actual time spent doing research for the company. Researchers, product managers and<img class="alignright size-medium wp-image-1710" title="researcher" src="http://deepwebtechblog.com/wp-content/uploads/2011/07/researcher-300x200.jpg" alt="" width="300" height="200" /> scientists spend a significant amount of time researching a wide variety of sources to bring a new product to market. There are multiple costs in research. First, there is the cost of the research materials themselves, namely the journals and subscription content that every technology-driven organization uses to keep ahead of developments in the field. Maximizing utilization of subscription content is important to an enterprise. No one wants to buy content and then have it sit in the equivalent of a digital closet because the interface is too hard to use or there are too many interfaces to search for multiple sources. Enterprises want to get the maximum value for their subscription dollar.</p>
<p>The cost of the content isn&#8217;t the only cost to be considered either. Research scientists are paid on average <a href="http://www.indeed.com/salary/Research-Scientist.html">$84,000</a> annually. A principal research scientist earns on average <a href="http://www.indeed.com/salary/Research-Scientist.html">$120,000</a>. In organizations that are primarily research-driven, such as aerospace, semiconductors, chemical manufacturing, law and engineering, time spent in research is time not spent developing a product, improving a process or inventing the next great widget. Research needs to get done, and quickly, without sacrificing breadth or depth.</p>
<p>Federated search addresses these two cost areas of research while providing a third benefit to organizations. By providing a single point for searching hundreds of sources, researchers can issue a single search request through a simple Google-like interface and get thousands of results sorted by <a href="http://deepwebtechblog.com/clusters-that-think/">semantically related concepts</a>. Compare this with issuing multiple, individual search requests, then collating the results across multiple applications, de-duping the results and then getting the full text; federated search speeds this process up with every source that a researcher needs. Furthermore, federated search decreases the likelihood of missing an important document. By extending the search across multiple sources, you give researchers more time to go in depth into the results. This significantly cuts the time it takes a researcher to get results, begin analysis and get back to the real job he or she is paid to do &#8211; build that next-generation product, improve that process or design that industrial process.</p>
<p>Deep Web Technologies Explorit is used by some of the world&#8217;s leading research organizations to speed research. Our corporate customers are world leaders such as Boeing in aerospace, Intel in semiconductors and BASF in chemicals. These organizations have chosen Explorit to provide a competitive advantage in today&#8217;s world. Isn&#8217;t it time your firm looked at improving its bottom line and competitiveness? If you are interested in reducing research time and research costs and improving the effectiveness of your knowledge enterprise, contact Brian Despain, VP of Sales, toll free: 866-388-1407 x235 or visit <a href="http://www.deepwebtech.com">Deep Web Technologies</a> and click on the chat link during business hours.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/reducing-research-time-and-costs-in-the-corporate-environment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Age of Discovery</title>
		<link>http://deepwebtechblog.com/the-age-of-discovery/</link>
		<comments>http://deepwebtechblog.com/the-age-of-discovery/#comments</comments>
		<pubDate>Fri, 24 Jun 2011 16:06:22 +0000</pubDate>
		<dc:creator>Darcy Pedersen</dc:creator>
				<category><![CDATA[Features]]></category>
		<category><![CDATA[Federated Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1696</guid>
		<description><![CDATA[Abe Lederman is heading to the ALA Annual Conference this weekend in New Orleans to take part in a fascinating panel discussion: The Age of Discovery: Understanding Discovery Services, Federated Search and Web Scale.   Here&#8217;s a brief description: Findability, discovery services, federated search, web scale—ways to discover content are increasing all the time, but [...]]]></description>
			<content:encoded><![CDATA[<p>Abe Lederman is heading to the ALA Annual Conference this weekend in New Orleans to take part in a fascinating panel discussion: <a href="http://connect.ala.org/node/145429">The Age of Discovery: Understanding Discovery Services, Federated Search and Web Scale</a>.   Here&#8217;s a brief description:<img class="alignright size-full wp-image-1697" title="download" src="http://deepwebtechblog.com/wp-content/uploads/2011/06/download1.png" alt="" width="223" height="165" /></p>
<blockquote><p>Findability, discovery services, federated search, web scale—ways to discover content are increasing all the time, but how do we discover which discovery mechanism is appropriate? Join us to learn more about the discovery landscape. When is it appropriate to use federated search over a discovery service? How does this differ by type of researcher? What kinds of resources should be included in discovery tools? Learn discovery implementation from two librarians in the trenches; learn about “web scale” and how federated search and discovery are evolving from the experts; and how the rest of us can sort out this tangle of access methods!</p></blockquote>
<p>Join Abe and the other panelists Sunday, June 26, 2011, 4:00-5:30 pm, at the Hilton Riverside – Grand Salon C.</p>
<p>Abe&#8217;s presentation is available here: <a href="http://www.deepwebtech.com/ala2011.ppt">http://www.deepwebtech.com/ala2011.ppt</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/the-age-of-discovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WorldWideScience receives warm welcome at the UN</title>
		<link>http://deepwebtechblog.com/worldwidescience-receives-warm-welcome-at-the-un/</link>
		<comments>http://deepwebtechblog.com/worldwidescience-receives-warm-welcome-at-the-un/#comments</comments>
		<pubDate>Wed, 15 Jun 2011 15:05:10 +0000</pubDate>
		<dc:creator>Sol</dc:creator>
				<category><![CDATA[Clients]]></category>
		<category><![CDATA[Features]]></category>
		<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Multilingual Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1680</guid>
		<description><![CDATA[WorldWideScience is a global science gateway that combines national and international scientific databases into a search engine. From a single search form, a scientist, researcher, or curious citizen can search over fifty databases in English and now 22 multilingual sources (with translation to the searcher&#8217;s native language) and seven multimedia sources. WorldWideScience is the brainchild [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://worldwidescience.org">WorldWideScience</a> is a global science gateway that combines national and international scientific databases into a search engine.<a href="http://www.worldwidescience.org"><img class="alignright size-medium wp-image-1691" title="WorldWideScience now includes multilingual and multimedia sources!" src="http://deepwebtechblog.com/wp-content/uploads/2011/06/download-300x141.png" alt="" width="300" height="141" /></a> From a single search form, a scientist, researcher, or curious citizen can search over fifty databases in English and now 22 multilingual sources (with translation to the searcher&#8217;s native language) and seven multimedia sources. WorldWideScience is the brainchild of the director of the DOE Office of Scientific and Technical Information (OSTI), Dr. Walt Warnick. The gateway is maintained and hosted by OSTI and governed by the <a href="http://worldwidescience.org/alliance.html">WorldWideScience Alliance</a>.</p>
<p><a href="http://deepwebtech.com">Deep Web Technologies</a> is proud to have developed the federated search technology behind WorldWideScience. And, with the cooperation of the Microsoft Translation services team, Deep Web Technologies also implemented the multilingual technology. It was a major undertaking but a worthwhile one for the science community, whose members can now greatly expand their reach to scientific papers in languages beyond their own.</p>
<p>Dr. Warnick was invited to deliver a <a href="http://www.osti.gov/speeches/fy2011/warnick/UNC2011/index.shtml">presentation</a> at the 14th session of the United Nations&#8217; Commission on Science and Technology for Development (CSTD). In a post at the <a href="http://www.osti.gov/ostiblog/worldwidescience-opens-international-doors">OSTI Blog</a>, Dr. Warnick shares the warm reception that WorldWideScience received.</p>
<blockquote><p>I wish more of my OSTI colleagues could have been in Geneva to share the warm response from the attendees.   Several country representatives offered up new sources for WorldWideScience (WWS).  Another member of the audience searched mobile WWS for his own name and remarked that he found many of his papers.  I received enthusiastic comments, so many that I couldn’t address all of them because of time constraints.  Significantly, the Chair of CSTD volunteered to pay the costs of becoming a member of the WorldWideScience Alliance.  There was great excitement about the possibilities for its use within the home countries of the attendees and how WWS advances the goals of CSTD.</p></blockquote>
<p>The paper &#8220;<a href="http://iospress.metapress.com/content/f767t1076251xu84/">Breaking down language barriers through multilingual federated search</a>,&#8221; co-authored by Abe Lederman (founder and president of Deep Web Technologies) and by Dr. Warnick, Brian Hitson, and Lorrie Johnson of OSTI, explains the importance of the gateway:</p>
<blockquote><p>&#8220;WorldWideScience.org (WWS) is a global science gateway developed by the US Department of Energy Office of Scientific and Technical Information (OSTI) in partnership with federated search vendor Deep Web Technologies. WWS provides a simultaneous live search of 69 databases from government and government-sanctioned organizations from 66 participating nations. The WWS portal plays a leading role in bringing together the world&#8217;s scientists to accelerate the discoveries needed to solve the planet&#8217;s most pressing problems. In this paper we present a brief history of the development of WWS and discuss how a new technology, multilingual federated search, greatly increases WWS&#8217; ability to facilitate the advancement of science.&#8221;</p></blockquote>
<p>Deep Web Technologies is delighted to be working with OSTI and other organizations to push the envelope of search technology and to make the world a smaller place.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/worldwidescience-receives-warm-welcome-at-the-un/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preparing for ALA Panel and Federated Search Neutrality</title>
		<link>http://deepwebtechblog.com/preparing-for-ala-panel-and-federated-search-neutrality/</link>
		<comments>http://deepwebtechblog.com/preparing-for-ala-panel-and-federated-search-neutrality/#comments</comments>
		<pubDate>Tue, 31 May 2011 21:29:15 +0000</pubDate>
		<dc:creator>Abe</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1645</guid>
		<description><![CDATA[As is usual for me I’m up early this morning after the three day Memorial Day weekend, going through my Biznar Alerts, and I run into this interesting blog post: Net Neutrality and Federated Searching Jake, a librarian in the D.C. area and beer aficionado (he’s the beerbrarian), writes on how the neutrality of federated [...]]]></description>
			<content:encoded><![CDATA[<p>As is usual for me I’m up early this morning after the three day Memorial Day weekend, going through my <a href="http://biznar.com">Biznar</a> Alerts, and I run into this interesting blog post:</p>
<p><a href="http://beerbrarian.blogspot.com/2010/12/net-neutrality-and-federated-searching.html">Net Neutrality and Federated Searching</a></p>
<p>Jake, a librarian in the D.C. area and beer aficionado (he’s the beerbrarian), writes on how the neutrality of federated search solutions is often overlooked and that it is most disconcerting that librarians and users of federated search solutions are not even aware of the bias in federated search results. This bias is the result of the ulterior motives of some federated search / discovery services vendors whose primary business is selling content.</p>
<p>Jake writes:</p>
<blockquote><p>I am credentialed at an institution that uses EHIS. I searched for dozens of terms, and the results weren’t pleasant for EHIS. It’s a crude test, but EHIS failed it.</p>
<p>EHIS consistently promoted EBSCO resources, favoring Academic Search Premier, an interdisciplinary EBSCO database, over product from other vendors that are more specialized.</p></blockquote>
<p><img src="http://deepwebtechblog.com/wp-content/uploads/2010/12/evil_google.jpg" alt="" width="200" height="180" align="left" />This subject is one that I have addressed in several blog posts before, including this post last December, <a href="http://deepwebtechblog.com/if-google-might-be-doing-it%e2%80%a6/">If Google might be Doing it …</a>, but I welcome reinforcement of this concern. In the last 6-12 months, following the initial craze with Discovery Services, I have seen significantly more questioning in the library community of Discovery Services such as ProQuest’s Summon and EBSCO’s EDS.</p>
<p>Here at Deep Web Technologies we have put lots of emphasis on our relevance ranking algorithms, assigning a rank to each result returned (we bring back hundreds to several thousand results for some of the broader queries) based on how closely the title, author and snippets match the user’s query, and not on which source returned the result.</p>
<p><img src="http://deepwebtechblog.com/wp-content/uploads/2011/05/Picture-46.png" alt="" align="right" />On Sunday, June 26th at the <a href="http://www.alaannual.org/">ALA Summer National Conference</a> in New Orleans I’ll be speaking on a panel on <a href="http://connect.ala.org/node/136968">The Age of Discovery: Understanding Discovery Services, Federated Search, and Web scale</a>. You can be assured that this is one topic that I will be discussing in my presentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/preparing-for-ala-panel-and-federated-search-neutrality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Federated search: the challenges of incremental results</title>
		<link>http://deepwebtechblog.com/federated-search-the-challenges-of-incremental-results/</link>
		<comments>http://deepwebtechblog.com/federated-search-the-challenges-of-incremental-results/#comments</comments>
		<pubDate>Fri, 13 May 2011 16:07:55 +0000</pubDate>
		<dc:creator>Sol</dc:creator>
				<category><![CDATA[Features]]></category>
		<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1628</guid>
		<description><![CDATA[Welcome to the second edition of &#8220;Best of the Federated Search Blog.&#8221; In this series I pull articles out of the Federated Search Blog archive and comment on them for the benefit of those considering Deep Web Technologies&#8217; offerings. In March 2008 I explored the &#8220;incremental results&#8221; feature which Deep Web Technologies makes available in [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to the second edition of &#8220;Best of the Federated Search Blog.&#8221; In this series I pull articles out of the <a href="http://federatedsearchblog.com">Federated Search Blog</a> archive and<a href="http://www.federatedsearchblog.com"><img class="alignright size-medium wp-image-1608" title="bestof" src="http://deepwebtechblog.com/wp-content/uploads/2011/05/bestof-300x76.png" alt="" width="300" height="76" /></a> comment on them for the benefit of those considering <a href="http://www.deepwebtech.com">Deep Web Technologies</a>&#8217; offerings.</p>
<p>In March 2008 I explored the &#8220;<a href="http://federatedsearchblog.com/2008/03/28/federated-search-the-challenges-of-incremental-results/">incremental results</a>&#8221; feature, which Deep Web Technologies makes available in all its federated search applications. As a consultant to Deep Web Technologies I may be somewhat biased, but I do believe that this feature is a huge differentiator for the company.</p>
<p>What are incremental results?</p>
<blockquote><p>The idea is simple: display results in chunks as they are received from the sources being searched. <a href="http://www.science.gov">Science.gov</a>, <a href="http://WorldWideScience.org">WorldWideScience.org</a>, and <a href="http://scitopia.org">Scitopia.org</a> are three applications that display incremental results.</p></blockquote>
<p>Why is it a big deal to provide incremental results? It&#8217;s because we live in the age of Google speed. Users don&#8217;t want to wait the 30 seconds it could take a content source to provide its results. The Achilles&#8217; heel of federated search is the fact that we have no control over how quickly sources respond with their results. If a federated search application is searching 30 sources at once and 29 of them return results quickly but one is slow to respond, then the traditional approach to displaying search results makes users wait until the last source returns its results. This is bad news for the impatient user.</p>
<p>Deep Web Technologies&#8217; approach is to wait just a few seconds, long enough to get a variety of documents from a number of sources. It then relevance ranks those documents and displays those results quickly to users. While users are inspecting those first results, Explorit (Deep Web Technologies&#8217; federated search engine) is gathering results from the other sources to display when the user is ready.</p>
<p>Explorit is polite to users. It doesn&#8217;t simply overwrite the first set of search results with a later batch. It instead informs the user that a newer set is available and asks whether the user wants that set. The user can take the offer, turn it down or defer it (waiting until later to refresh the results).</p>
<p>Incremental results are a nice way to balance the unpredictable response times of federated sources against the user demand for speed. We think the feature works well. You can judge for yourself at <a href="http://www.science.gov">Science.gov</a>, <a href="http://WorldWideScience.org">WorldWideScience.org</a>, and <a href="http://scitopia.org">Scitopia.org</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/federated-search-the-challenges-of-incremental-results/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Diagnosing Federated Search Source Problems: It&#8217;s Harder Than You Think</title>
		<link>http://deepwebtechblog.com/diagnosing-federated-search-source-problems-its-harder-than-you-think/</link>
		<comments>http://deepwebtechblog.com/diagnosing-federated-search-source-problems-its-harder-than-you-think/#comments</comments>
		<pubDate>Tue, 03 May 2011 22:23:55 +0000</pubDate>
		<dc:creator>Sol</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1607</guid>
		<description><![CDATA[Welcome to &#8220;The Best of the Federated Search Blog.&#8221; In this ongoing series I will be commenting on classic articles that I have authored for the Federated Search Blog. I aim to focus on the relevance of the article to current and prospective customers of Deep Web Technologies. In this first &#8220;Best of&#8221; article, Diagnosing [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to &#8220;The Best of the Federated Search Blog.&#8221; In this ongoing series I will be commenting on classic articles that I have authored for the<a href="http://deepwebtechblog.com/wp-content/uploads/2011/05/bestof.png"><img hspace="10" vspace="10" class="alignright size-medium wp-image-1608" src="http://deepwebtechblog.com/wp-content/uploads/2011/05/bestof-300x76.png" alt="" width="300" height="76" /></a> <a href="http://federatedsearchblog.com">Federated Search Blog</a>. I aim to focus on the relevance of the article to current and prospective customers of Deep Web Technologies.</p>
<p>In this first &#8220;Best of&#8221; article, <a href="http://federatedsearchblog.com/2008/06/23/diagnosing-federated-search-source-problems-its-harder-than-you-think/">Diagnosing federated search source problems: it&#8217;s harder than you think</a>, I introduced the complex nature of isolating and debugging source access problems. The gist of the challenge is that there are a number of potential points of failure and it&#8217;s not always obvious what is failing. Without knowing what is failing it&#8217;s not possible to correct the problem or to know who to notify to get the problem corrected. Plus, in order to be able to take action on a source access problem one needs to be aware of the problem. This requires a monitoring system that regularly probes all sources and alerts the appropriate persons of a problem.</p>
<p>Many users of federated search don&#8217;t realize that maintaining connectors (the software components that access sources) is a substantial amount of work that requires a substantial investment. Prospective customers of federated search often focus too much on the bells and whistles of a particular implementation but don&#8217;t ask enough questions about whether their important sources can be searched and about what happens when a connector stops working.</p>
<p>Deep Web Technologies appreciates the difficulty of connector development and management. The company has a large catalog of connectors it has developed, so its dedicated connector team has extensive experience with connector issues and is quick to identify and correct problems within its control. For those problems beyond its control, Deep Web Technologies has a publisher relations staff that can work with content providers to get those corrected. And, to ensure that problems are quickly discovered, Deep Web Technologies has developed custom software that frequently probes every single source (except, of course, for those behind firewalls). If a source is intermittently down then an alarm is raised and the publisher, if appropriate, is notified. If a source is &#8220;down hard&#8221; then the connector team swings into action and determines who owns the problem. If the problem is one that will result in a significant outage for a particular source then, upon a customer&#8217;s request, the source can be taken offline so that it is not searched at all during the outage period.</p>
<p>Connectors are the foundation of federated search. If the content you want isn&#8217;t available then nothing else matters. That&#8217;s why Deep Web Technologies works diligently to minimize the amount of time that a source isn&#8217;t available.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/diagnosing-federated-search-source-problems-its-harder-than-you-think/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brain Food&#8211; Are You Eating Healthy?</title>
		<link>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/</link>
		<comments>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/#comments</comments>
		<pubDate>Wed, 23 Feb 2011 18:53:57 +0000</pubDate>
		<dc:creator>Nicholas</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[The Deep Web]]></category>
		<category><![CDATA[Deep]]></category>
		<category><![CDATA[deep web analysis]]></category>
		<category><![CDATA[federated knowledge]]></category>
		<category><![CDATA[Federated Search  Blog]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[results]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[strategic advantage]]></category>
		<category><![CDATA[top 10]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1568</guid>
		<description><![CDATA[Our minds like to be fed with the best information possible! In the case of researchers (academic, medical, business, or others), their work depends on it.  Here is a great blog post by Sol Lederman I would like to share. It&#8217;s on the topic of information quality vs. information coverage. It really illustrates the importance of search [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2011/02/brain-food.jpg"><img class="alignleft size-medium wp-image-1573" style="margin: 10px; border: 10px solid black;" title="brain-food" src="http://deepwebtechblog.com/wp-content/uploads/2011/02/brain-food-300x187.jpg" alt="" width="300" height="187" /></a>Our minds like to be fed with the best information possible! In the case of researchers (academic, medical, business, or others), their work depends on it.  Here is a great blog post by Sol Lederman I would like to share. It&#8217;s on the topic of information quality vs. information coverage. It really illustrates the importance of search tools. I think use of &#8220;quantity&#8221; is important to note here. Quantity of information is not typically a bad thing&#8211; unless it&#8217;s a large amount of irrelevant results. If you would like to eat some brain food, Deep Web Tech is serving up a hot plate of your favorite info! Here are some sites you can feast on for a lifetime; bon appetit! <a href="http://biznar.com/biznar/" target="_blank">Business</a>, <a href="http://mednar.com/mednar/" target="_blank">Medical</a>, <a href="http://www.science.gov/" target="_blank">Science</a>, and <a href="http://worldwidescience.org/" target="_blank">World Wide Science</a>.</p>
<p>Sol Lederman wrote:</p>
<p>I recently discovered an article, <a href="http://www.brisbanegrammar.com/blogs/library/?p=870">5 Reasons Not to Use Google First</a>, that sings my song. The article addresses this question:</p>
<blockquote><p>Google is fast, clean and returns more results than any other search engine, but does it really find the information students need for quality academic research? The answer is often ‘no’. “While simply typing words into Google will work for many tasks, academic research demands more.” (<a href="http://manipulating-media.co.uk/2010/08/27/searching-for-and-finding-new-information-desk-research/">Searching for and finding new information – tools, strategies and techniques</a>)</p></blockquote>
<p>The next paragraph gave me a chuckle.</p>
<blockquote><p>As far back as 2004, James Morris, Dean of the School of Computer Science at Carnegie Mellon University, coined the term “infobesity,” to describe “the outcome of Google-izing research: a junk-information diet, consisting of overwhelming amounts of low-quality material that is hard to digest and leads to research papers of equally low quality.” (<a href="http://www.soi.city.ac.uk/~dbawden/bawden%20and%20brophy%20ap.pdf">Is Google enough? Comparison of an internet search engine with academic library resources</a>.)</p></blockquote>
<p>The article continues with its list of five good reasons to not use Google first.</p>
<p>Note that the recommendation isn’t to skip Google altogether. There’s a balance that’s needed to get the best value when performing research. The findings in the “Is Google enough?” article summarize this point really well:</p>
<blockquote><p>Google is superior for coverage and accessibility. Library systems are superior for quality of results. Precision is similar for both systems. Good coverage requires use of both, as both have many unique items. Improving the skills of the searcher is likely to give better results from the library systems, but not from Google.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Deep Web Tech in the News: WorldWideScience.org</title>
		<link>http://deepwebtechblog.com/httpwp-mepzhks-on/</link>
		<comments>http://deepwebtechblog.com/httpwp-mepzhks-on/#comments</comments>
		<pubDate>Fri, 11 Feb 2011 16:13:30 +0000</pubDate>
		<dc:creator>Nicholas</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[Reviews]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1537</guid>
		<description><![CDATA[Sometimes when I lie in bed with the dream of federating 1,000,000,000 sources dancing around in my mind, I wonder, &#8220;What&#8217;s the best search engine?&#8221; I suppose that depends on who (or what) you seek. The Internet is the largest collection of information that has ever been amassed, leading us to need better search [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2011/02/Science-joke-blog-photo.jpg"><img class="alignleft size-medium wp-image-1541" style="margin: 10px; border: 10px solid black;" title="Science joke blog photo" src="http://deepwebtechblog.com/wp-content/uploads/2011/02/Science-joke-blog-photo-300x218.jpg" alt="" width="300" height="218" /></a>Sometimes when I lie in bed with the dream of federating 1,000,000,000 sources dancing around in my mind, I wonder, &#8220;What&#8217;s the best search engine?&#8221; I suppose that depends on who (or what) you seek. The Internet is the largest collection of information that has ever been amassed, leading us to need better search engines.</p>
<p>I use Google a lot when I want the answer to a simple question, a tasty recipe, or a good <a href="http://www.youtube.com/watch?v=0Bmhjf0rKe8" target="_blank">laugh</a> (it gets me every time). I also find that using a wide variety of search tools is essential if you&#8217;re serious about getting the best possible search results. For everyone out there who understands the need for specialized search engines, this next part will be a gem.</p>
<p>WorldWideScience.org, a premier scientific search engine, has received some recent attention. <a href="http://www.osti.gov/bios/warnick.html" target="_blank">Walter L. Warnick, Ph.D.</a>, Director of the Office of Scientific and Technical Information (OSTI) for the U.S. Department of Energy, demonstrated <a href="http://worldwidescience.org/" target="_blank">WorldWideScience.org</a> in front of multiple observers, including members of the scientific community. According to openbiomed.info, &#8220;WorldWideScience.org is federated full-text, public, database of scientific and technical research information published from at least 70 cooperative countries&#8221;. With only about 4% duplication with general search engines (that means Google), WorldWideScience.org provides access to millions of deep web documents using a single, multi-language search query. One user wrote:</p>
<blockquote><p>I decided to assess the biomedical research coverage by trying to find information on Methicillin-resistant Staphylococcus aureus (MRSA) and procedures to isolate patients with this infection. I searched the keyword phrase “<strong>MRSA isolation protocol</strong>”: 892 documents were located, and when the <a href="http://worldwidescience.org/multilingual/result-list/fullRecord:MRSA+isolation+protocol/preferredLanguage:en/#ResultList=0%7C0%7C_%7CDATE%7C0" target="_blank">result was sorted by date</a>, open access resources such as the <a href="http://www.doaj.org/doaj?func=searchArticles" target="_blank">Directory of Open Access Journals (DOAJ)</a>, <a href="http://ukpmc.ac.uk/" target="_blank">UK PubMed Central</a>, and <a href="http://www.lenus.ie/hse/" target="_blank">LENUS (Irish Health Repository)</a> present very recent research. In comparison, the <strong><a href="http://openbiomed.info/2011/01/worldwidescience-deep-web/scirus.com" target="_blank">SCIRUS</a></strong> database searched for MRSA isolation protocol finds quantitatively more research, but when <a href="http://www.scirus.com/srsapp/search?q=MRSA+isolation+protocol&amp;t=all&amp;drill=yes&amp;sort=1&amp;p=0" target="_blank">sorted by date</a>, all of the recent literature identified is available in Science Direct via subscription-only access or for purchase US $ 31.50.</p></blockquote>
<p>If you haven&#8217;t tried out WorldWideScience.org, and you like having a mind-blowing amount of <em>relevant</em> information from all over the world at your fingertips, I suggest you give it a spin; you won&#8217;t be sorry you did. We understand the need to collect information that goes beyond answering simple questions or finding cute cats (he is really cute though, check the laugh link above).</p>
<p>Still need convincing? Here is a <a href="http://www.youtube.com/watch?v=OVdEbimZy_0&amp;feature=player_embedded#" target="_blank">simple demonstration</a>. A shout-out to whoever made a pretty good video about a search engine: even with the &#8217;80s cartoon-like sounds and &#8220;rad&#8221; guitar background music, I find it quite entertaining. There&#8217;s a special place in my heart for you.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/httpwp-mepzhks-on/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deep Web Tech in the News: Image Search</title>
		<link>http://deepwebtechblog.com/deep-web-tech-in-the-news-image-search/</link>
		<comments>http://deepwebtechblog.com/deep-web-tech-in-the-news-image-search/#comments</comments>
		<pubDate>Mon, 31 Jan 2011 18:13:32 +0000</pubDate>
		<dc:creator>Nicholas</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[Partnerships]]></category>
		<category><![CDATA[Product Development]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1510</guid>
		<description><![CDATA[One small step for Science.gov, one giant leap for Federated Search. &#8220;Science.gov is a gateway to more than 42 scientific databases and 200 million pages of science information with just one query, and is a gateway to more than 2,000 scientific websites from 18 organizations within 14 federal science agencies. These agencies represent 97% of [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2011/01/astronaut_banjo.jpg"><img class="alignleft size-medium wp-image-1534" style="margin: 10px; border: 10px solid black;" title="astronaut_banjo" src="http://deepwebtechblog.com/wp-content/uploads/2011/01/astronaut_banjo-300x225.jpg" alt="" width="300" height="225" /></a>One small step for Science.gov, one giant leap for Federated Search.</p>
<p>&#8220;Science.gov is a gateway to more than 42 scientific databases and 200 million pages of science information with just one query, and is a gateway to more than 2,000 scientific websites from 18 organizations within 14 federal science agencies. These agencies represent 97% of the federal R&amp;D budget. Science.gov is the USA.gov portal to science and the U.S. contribution to WorldWideScience.org. Science.gov is hosted by the <a href="http://www.osti.gov/" target="_blank"><strong>Department of Energy Office of Scientific and Technical Information</strong></a>, within the Office of Science, and is supported by <a href="http://www.cendi.gov/" target="_blank"><strong>CENDI</strong></a>, an interagency working group of senior scientific and technical information managers.&#8221;</p>
<p>Science.gov received a pretty large upgrade in December: an image search, located under &#8220;special collections&#8221;, that works just like Science.gov except that the results have thumbnails (<a href="http://www.science.gov/scigovimage/" target="_blank">www.science.gov/scigovimage/</a>). A search query now quickly pulls back related images from multiple sources as thumbnail-size results. This is one of very few publicly available science image search portals. Cheryl LaGuardia, an industry critic, wrote:</p>
<blockquote><p>For a free service this works mighty well: my test search for “tornedo” got the reply, “Did you mean “tornado”? with 151 results for the corrected spelling (a test, mind you, or perhaps I’m easing back into work slowly and may have inadvertently misspelled… no matter! The system works!). The resultant images are terrific, compelling enough to send Dorothy pedaling madly down the road away from them on her bicycle, with Toto in tow.</p></blockquote>
<p>Deep Web Technologies powers the entire website, and we look forward to using this innovation on other projects in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/deep-web-tech-in-the-news-image-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Search Strategies in Federated Search: Refine Search</title>
		<link>http://deepwebtechblog.com/search-strategies-in-federated-search-refine-search/</link>
		<comments>http://deepwebtechblog.com/search-strategies-in-federated-search-refine-search/#comments</comments>
		<pubDate>Mon, 20 Dec 2010 19:58:17 +0000</pubDate>
		<dc:creator>Brian Despain</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[The Deep Web]]></category>
		<category><![CDATA[Deep]]></category>
		<category><![CDATA[deep web analysis]]></category>
		<category><![CDATA[deep web technologies]]></category>
		<category><![CDATA[federated knowledge]]></category>
		<category><![CDATA[Federated Search  Blog]]></category>
		<category><![CDATA[queries]]></category>
		<category><![CDATA[querying the deep web]]></category>
		<category><![CDATA[results]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1317</guid>
		<description><![CDATA[Filtering results is as much an art as a science. Refining or clarifying an initial set of search results is a fairly common practice among search engine users that don&#8217;t know exactly what they are looking for. Yet in federated searching, refining a group of results is often not the best strategy for a user to find [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2010/12/filtering-water.jpg"><img class="alignright size-medium wp-image-1370" title="filtering water" src="http://deepwebtechblog.com/wp-content/uploads/2010/12/filtering-water-300x199.jpg" alt="" width="300" height="199" /></a></p>
<p>Filtering results is as much an art as a science.</p>
<p>Refining or clarifying an initial set of search results is a fairly common practice among search engine users who don&#8217;t know exactly what they are looking for. Yet in federated searching, refining a group of results is often not the best way for a user to find the results they are looking for. Let&#8217;s do a quick walk-through of how a user might use a federated search application. Let&#8217;s say a user is interested in nanotechnology and is using our Explorit federated search application. When the user inputs the query, the search brings back roughly 2500 results (this is a rough number, and it can vary depending on the sources). After sorting through some of the results, the user decides that their interest really lies in medical nanotechnology, so they limit the search to just &#8220;medical nanotechnology&#8221;.</p>
<p>To get the <em>best</em> possible results from a federated search application, it makes sense to re-run the search with both terms. The initial group of results came from a search on the term &#8220;nanotechnology&#8221;, so it may contain <em>some</em> results with the term &#8220;medical&#8221; in them, but since the original search strategy was to search for the nanotechnology industry as a whole, the odds that it contains the best results for &#8220;medical nanotechnology&#8221; are slim to none. It&#8217;s far better for a user of federated search to type in the new query to get results from the various sources that match the new search strategy.</p>
<p>Now that the user has the information, how do they sort through it? Most search engines have their own relevancy ranking tuned to their own content. This means a two-term search strategy might be handled differently than a broader search and likely more efficiently.</p>
<p>The real goal, of course, is to get the maximum number of results that match the user&#8217;s search strategy, and to give them the tools for sorting through them. This strategy also holds true for discovery services, since many discovery services (such as Summon) only return 1,000 results in response to the query. Those 1,000 results are tuned to the initial query, and refining such a small result set further wouldn&#8217;t serve the needs of their users. The user would be better off, as we mentioned before, re-initiating the search.</p>
<p>Just as a side note, this 1,000-result limit is pretty common in large-scale indexes. Google, for example, will only bring back 1,000 results for any query even though Google indicates that thousands or millions of results are available. Google brings back the 1,000 most relevant to the initial query, in an index that is filled with a huge amount of spam. So it&#8217;s a better idea to re-initiate your query with the refined search strategy to get access to more information. This 1,000-result limit makes less sense in a product like Summon, since presumably the index doesn&#8217;t have any spam in it.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/search-strategies-in-federated-search-refine-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

