<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Deep Web Technologies Blog &#187; Federated Search</title>
	<atom:link href="http://deepwebtechblog.com/category/federated-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://deepwebtechblog.com</link>
	<description>covering federated search and how to get the best from the Deep Web.</description>
	<lastBuildDate>Mon, 16 Apr 2012 16:31:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>The Charleston Advisor gives Deep Web Technologies high marks</title>
		<link>http://deepwebtechblog.com/the-charleston-advisor-gives-deep-web-technologies-high-marks/</link>
		<comments>http://deepwebtechblog.com/the-charleston-advisor-gives-deep-web-technologies-high-marks/#comments</comments>
		<pubDate>Tue, 28 Feb 2012 03:53:56 +0000</pubDate>
		<dc:creator>Sol</dc:creator>
				<category><![CDATA[Clients]]></category>
		<category><![CDATA[Federated Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1744</guid>
		<description><![CDATA[The highly regarded Charleston Advisor, known for its &#8220;Critical reviews of Web products for Information Professionals,&#8221; has given Deep Web Technologies 4 3/8 of 5 possible stars for its Explorit federated search product. The individual scores forming the composite were: Content: 4 1/2 stars User Interface/Searchability: 4 1/2 stars Pricing: 4 1/2 stars Contract Options: 4 [...]]]></description>
			<content:encoded><![CDATA[<p>The highly regarded <a href="http://www.charlestonco.com/" target="_blank">Charleston Advisor</a>, known for its &#8220;Critical reviews of Web products for Information Professionals,&#8221; has given Deep Web Technologies 4 3/8 of 5 possible stars for its Explorit federated search product. The individual scores forming the composite were:</p>
<div>
<ul>
<li>Content: 4 1/2 stars</li>
<li>User Interface/Searchability: 4 1/2 stars</li>
<li>Pricing: 4 1/2 stars</li>
<li>Contract Options: 4 stars</li>
</ul>
<div>
<p>The scores were assigned by two reviewers who played a key role in bringing Explorit to Stanford University:</p>
</div>
<ul>
<li>Grace Baysinger, Head Librarian and Bibliographer at the Swain Chemistry and Chemical Engineering Library at Stanford University</li>
<li>Tom Cramer, Chief Technology Strategist at Stanford University Libraries and Academic Information Resources</li>
</ul>
<p>The review upon which the scores are based is available at <a href="http://searchworks.stanford.edu/view/9388228" target="_blank">Stanford</a>. (Click on the <a href="http://purl.stanford.edu/" target="_blank">purl.stanford.edu</a> link for access to the full text.) At just six pages, the review makes for a quick read. The first four pages describe the Explorit features, infrastructure and support, and make the case for the partnership between Deep Web Technologies and Stanford University that led to the development of the locally branded xSearch federated search product. Pages five and six provide the reviewers&#8217; critical evaluation of Explorit, references, and their bios.</p>
</div>
<div>
<p>Key points from the critical evaluation include:</p>
<ol>
<li>&#8220;Compared to other federated search products, Stanford found that DWT offered the most compelling package of performance, features, and design.&#8221;</li>
<li>&#8220;While federated search engines&#8217; performance is inherently limited by the performance of its target sites, DWT&#8217;s progressive delivery of results gives researchers near real-time response with the first set of results while the application assembles a complete set of hits from all sources.&#8221; More information about how near real-time response works is available at the <a href="http://federatedsearchblog.com/2008/03/28/federated-search-the-challenges-of-incremental-results/" target="_blank">Federated Search Blog</a>.</li>
<li>Explorit was &#8220;the only service that included alerts, and the only service that allowed us to create customized &#8216;search engines&#8217; locally.&#8221;</li>
<li>&#8220;DWT&#8217;s performance, good relevance ranking, and faceting capabilities are very helpful to users.&#8221;</li>
<li>&#8220;Because Abstracting and Indexing tools contain controlled vocabulary terms, when a user is searching xSearch, there are more discovery points than if they were searching Google Scholar or a publisher&#8217;s site.&#8221;</li>
</ol>
<div>
<p>More observations are available in the <a href="http://searchworks.stanford.edu/view/9388228" target="_blank">review</a>. More information about <a href="http://lib.stanford.edu/xsearch" target="_blank">xSearch</a> is available at Stanford. Our own press release about the review is available on <a href="http://www.deepwebtech.com/2011/11/deep-web-technologies%E2%80%99-explorit-featured-in-charleston-advisor/" target="_blank">our website</a>. An Explorit overview is also available on <a href="http://www.deepwebtech.com/products/explorit-overview/" target="_blank">our website</a>.</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/the-charleston-advisor-gives-deep-web-technologies-high-marks/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>NFAIS on discovery services</title>
		<link>http://deepwebtechblog.com/nfais-on-discovery-services/</link>
		<comments>http://deepwebtechblog.com/nfais-on-discovery-services/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 17:48:21 +0000</pubDate>
		<dc:creator>Abe</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1728</guid>
		<description><![CDATA[There&#8217;s an NFAIS draft Discovery Service Code of Practice that&#8217;s up for review. The National Federation of Advanced Information Services (NFAIS™) is releasing a draft Discovery Service Code of Practice for review and comment by March 16, 2012. NFAIS believes that discovery services have the potential to provide ease of information discovery, access, and use, [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s an <a href="http://nfais.org/MiBlog/Blogs/view/code-of-practice-discovery-services-review-of-draft">NFAIS draft Discovery Service Code of Practice</a> that&#8217;s up for review.<a href="http://info.nfais.org/info/codedraftintroduction.pdf"><img class="alignright size-medium wp-image-1740" title="nfais" src="http://deepwebtechblog.com/wp-content/uploads/2012/02/nfais-280x300.png" alt="" width="224" height="240" /></a></p>
<blockquote><p><strong>The National Federation of Advanced Information Services (NFAIS™) is releasing a draft Discovery Service Code of Practice for review and comment by March 16, 2012.</strong> NFAIS believes that discovery services have the potential to provide ease of information discovery, access, and use, benefiting not only its member organizations, but also the global community of information seekers. However, the relative newness of these services has generated questions and concerns among information providers and librarians as to how these services meet expectations with regard to issues related to traditional search and retrieval services; e.g. usage reports, ranking algorithms, content coverage, updates, product identification, etc. Accordingly, the NFAIS Code Development Task Force has developed this draft document to assist those who choose to use this new distribution channel through the provision of guidelines that will help avoid the disruption of the delicate balance of interests involved.</p></blockquote>
<p>I recently got an email raising concerns about the draft paper and asking why services such as my company&#8217;s Explorit Federated Search were being excluded by NFAIS. I don&#8217;t want to go further into the specific concerns of the email, but I do want to comment on some things related to Discovery Services.</p>
<p>First, I want to say that, despite all the marketing, the concept of a Discovery Service is not new. Two or three years ago OCLC coined the term Web Scale Discovery to refer to their WorldCat index. Soon thereafter ProQuest started referring to Summon as a Discovery Service and put a lot of marketing muscle behind that. As a consequence, the term Discovery Service has now become synonymous with a large centralized index, even though federated search applications such as my company&#8217;s Explorit can certainly be considered Discovery Services. In fact, I was describing Explorit as a Discovery tool long before OCLC, Summon and EBSCO adopted the Discovery Service label for just what they do.</p>
<p>Large centralized indexes, mostly of metadata, have been around for decades now. Consider Web of Science, Scopus, Infotrieve, Ingenta and many others. So OCLC and ProQuest didn’t invent something new; they just developed an improved version of something that’s been around for a long time, stuck a new label on it, and through substantial marketing efforts have gotten the term Discovery Service associated exclusively with large centralized indexes.</p>
<p>On the very positive side, given what we do and the issues I have raised about Discovery Services <a href="http://deepwebtechblog.com/discovery-services-over-hyped-and-under-performed/">in the past</a>, this NFAIS effort at developing a Code of Practice for Discovery Services (i.e. WorldCat, Summon, EDS, and Primo) raises a fairly substantial set of issues with these Services that Deep Web Technologies can address. I see this paper as useful in highlighting those issues. Three big ones that come to mind (I can think of others) are:</p>
<ol>
<li>Muddiness about who owns content and inadvertent access to content (including metadata). &#8220;Lack of user authentication/verification and/or the inability to reliably identify an institution&#8217;s holdings&#8221; is cited by the NFAIS paper as a concern.</li>
<li>The ranking algorithm. Whose content is shown first in Discovery Service search results? There are conflicts of interest among Discovery Service providers that make me uneasy.</li>
<li>Coverage. What sources are available to users of Discovery Services? Who gets to decide that? What about access to the very specialized sources that Discovery Services are less likely to have access to?</li>
</ol>
<p>Take a look at the <a href="http://nfais.org/MiBlog/Blogs/view/code-of-practice-discovery-services-review-of-draft">NFAIS draft Code of Practice</a> and raise your concerns.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/nfais-on-discovery-services/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Reducing Research Time and Costs in the Corporate Environment</title>
		<link>http://deepwebtechblog.com/reducing-research-time-and-costs-in-the-corporate-environment/</link>
		<comments>http://deepwebtechblog.com/reducing-research-time-and-costs-in-the-corporate-environment/#comments</comments>
		<pubDate>Thu, 28 Jul 2011 20:58:53 +0000</pubDate>
		<dc:creator>Brian Despain</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1702</guid>
		<description><![CDATA[One of the many unseen costs in corporations is the actual time spent doing research for the company. Researchers, product managers and scientists use a significant time researching a wide variety of sources to bring a new product to market. There are multiple costs in research. First, there is the cost of the research materials [...]]]></description>
			<content:encoded><![CDATA[<p>One of the many unseen costs in corporations is the actual time spent doing research for the company. Researchers, product managers and<img class="alignright size-medium wp-image-1710" title="researcher" src="http://deepwebtechblog.com/wp-content/uploads/2011/07/researcher-300x200.jpg" alt="" width="300" height="200" /> scientists spend significant time researching a wide variety of sources to bring a new product to market. There are multiple costs in research. First, there is the cost of the research materials themselves, namely the journals and subscription content that every technology-driven organization uses to keep ahead of developments in the field. Maximizing utilization of subscription content is important to an enterprise. No one wants to buy content and then have it sit in the equivalent of a digital closet because the interface is too hard to use or there are too many interfaces to search for multiple sources. Enterprises want to get the maximum value for their subscription dollar.</p>
<p>The cost of the content isn&#8217;t the only cost to be considered either. Research scientists are paid on average <a href="http://www.indeed.com/salary/Research-Scientist.html">$84,000</a> annually. A principal research scientist earns on average <a href="http://www.indeed.com/salary/Research-Scientist.html">$120,000</a>. In organizations that are primarily research-driven, such as aerospace, semiconductors, chemical manufacturing, law and engineering, time spent in research is time not spent developing a product, improving a process or inventing the next great widget. Research needs to get done, and quickly, without sacrificing breadth or depth.</p>
<p>Federated search addresses these two cost areas of research while providing a third benefit to organizations. By providing a single point for searching hundreds of sources, federated search lets researchers issue a single search request through a simple Google-like interface and get thousands of results sorted by <a href="http://deepwebtechblog.com/clusters-that-think/">semantically related concepts</a>. Compare this with issuing multiple, individual search requests, then collating the results across multiple applications, de-duping them and then getting the full text; federated search speeds this process up for every source that a researcher needs. Furthermore, federated search decreases the likelihood of missing an important document. By extending the search across multiple sources, you give researchers more time to go in depth into the results. This significantly cuts the time a researcher needs to get results, begin analysis and do the real job he or she is paid to do &#8211; build that next-generation product, improve that process or design that industrial process.</p>
<p>Deep Web Technologies&#8217; Explorit is used by some of the world&#8217;s leading research organizations to speed research. Our corporate customers are world leaders such as Boeing in aerospace, Intel in semiconductors and BASF in chemicals. These organizations have chosen Explorit to provide a competitive advantage in today&#8217;s world. Isn&#8217;t it time your firm looked at improving its bottom line and competitiveness? If you are interested in reducing research time and research costs and improving the effectiveness of your knowledge enterprise, contact Brian Despain, VP of Sales, toll free: 866-388-1407 x235 or visit <a href="http://www.deepwebtech.com">Deep Web Technologies</a> and click on the chat link during business hours.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/reducing-research-time-and-costs-in-the-corporate-environment/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Age of Discovery</title>
		<link>http://deepwebtechblog.com/the-age-of-discovery/</link>
		<comments>http://deepwebtechblog.com/the-age-of-discovery/#comments</comments>
		<pubDate>Fri, 24 Jun 2011 16:06:22 +0000</pubDate>
		<dc:creator>Darcy Pedersen</dc:creator>
				<category><![CDATA[Features]]></category>
		<category><![CDATA[Federated Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1696</guid>
		<description><![CDATA[Abe Lederman is heading to the ALA Annual Conference this weekend in New Orleans to take part in a fascinating panel discussion: The Age of Discovery: Understanding Discovery Services, Federated Search and Web Scale.   Here&#8217;s a brief description: Findability, discovery services, federated search, web scale—ways to discover content are increasing all the time, but [...]]]></description>
			<content:encoded><![CDATA[<p>Abe Lederman is heading to the ALA Annual Conference this weekend in New Orleans to take part in a fascinating panel discussion: <a href="http://connect.ala.org/node/145429">The Age of Discovery: Understanding Discovery Services, Federated Search and Web Scale</a>.   Here&#8217;s a brief description:<img class="alignright size-full wp-image-1697" title="download" src="http://deepwebtechblog.com/wp-content/uploads/2011/06/download1.png" alt="" width="223" height="165" /></p>
<blockquote><p>Findability, discovery services, federated search, web scale—ways to discover content are increasing all the time, but how do we discover which discovery mechanism is appropriate? Join us to learn more about the discovery landscape. When is it appropriate to use federated search over a discovery service? How does this differ by type of researcher? What kinds of resources should be included in discovery tools? Learn discovery implementation from two librarians in the trenches; learn about “web scale” and how federated search and discovery are evolving from the experts; and how the rest of us can sort out this tangle of access methods!</p></blockquote>
<p>Join Abe and the other panelists Sunday, June 26, 2011, 4:00-5:30 pm, at the Hilton Riverside – Grand Salon C.</p>
<p>Abe&#8217;s presentation is available here: <a href="http://www.deepwebtech.com/ala2011.ppt">http://www.deepwebtech.com/ala2011.ppt</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/the-age-of-discovery/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WorldWideScience receives warm welcome at the UN</title>
		<link>http://deepwebtechblog.com/worldwidescience-receives-warm-welcome-at-the-un/</link>
		<comments>http://deepwebtechblog.com/worldwidescience-receives-warm-welcome-at-the-un/#comments</comments>
		<pubDate>Wed, 15 Jun 2011 15:05:10 +0000</pubDate>
		<dc:creator>Sol</dc:creator>
				<category><![CDATA[Clients]]></category>
		<category><![CDATA[Features]]></category>
		<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Multilingual Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1680</guid>
		<description><![CDATA[WorldWideScience is a global science gateway that combines national and international scientific databases into a search engine. From a single search form, a scientist, researcher, or curious citizen can search over fifty databases in English and now 22 multilingual sources (with translation to the searcher&#8217;s native language) and seven multimedia sources. WorldWideScience is the brainchild [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://worldwidescience.org">WorldWideScience</a> is a global science gateway that combines national and international scientific databases into a search engine.<a href="http://www.worldwidescience.org"><img class="alignright size-medium wp-image-1691" title="WorldWideScience now includes multilingual and multimedia sources!" src="http://deepwebtechblog.com/wp-content/uploads/2011/06/download-300x141.png" alt="" width="300" height="141" /></a> From a single search form, a scientist, researcher, or curious citizen can search over fifty databases in English and now 22 multilingual sources (with translation to the searcher&#8217;s native language) and seven multimedia sources. WorldWideScience is the brainchild of the director of the DOE Office of Scientific and Technical Information (OSTI), Dr. Walt Warnick. The gateway is maintained and hosted by OSTI and governed by the <a href="http://worldwidescience.org/alliance.html">WorldWideScience Alliance</a>.</p>
<p><a href="http://deepwebtech.com">Deep Web Technologies</a> is proud to have developed the federated search technology behind WorldWideScience. And, with the cooperation of the Microsoft Translation services team, Deep Web Technologies also implemented the multilingual technology. It was a major undertaking but a worthwhile one for the science community, whose members can now greatly expand their reach to scientific papers in languages beyond their own.</p>
<p>Dr. Warnick was invited to deliver a <a href="http://www.osti.gov/speeches/fy2011/warnick/UNC2011/index.shtml">presentation</a> at the 14th session of the United Nations&#8217; Commission on Science and Technology for Development (CSTD). In a post at the <a href="http://www.osti.gov/ostiblog/worldwidescience-opens-international-doors">OSTI Blog</a>, Dr. Warnick shares the warm reception that WorldWideScience received.</p>
<blockquote><p>I wish more of my OSTI colleagues could have been in Geneva to share the warm response from the attendees.   Several country representatives offered up new sources for WorldWideScience (WWS).  Another member of the audience searched mobile WWS for his own name and remarked that he found many of his papers.  I received enthusiastic comments, so many that I couldn’t address all of them because of time constraints.  Significantly, the Chair of CSTD volunteered to pay the costs of becoming a member of the WorldWideScience Alliance.  There was great excitement about the possibilities for its use within the home countries of the attendees and how WWS advances the goals of CSTD.</p></blockquote>
<p>The paper &#8220;<a href="http://iospress.metapress.com/content/f767t1076251xu84/">Breaking down language barriers through multilingual federated search</a>,&#8221; co-authored by Abe Lederman (founder and president of Deep Web Technologies) and by Dr. Warnick, Brian Hitson, and Lorrie Johnson from OSTI, explains the importance of the gateway:</p>
<blockquote><p>&#8220;WorldWideScience.org (WWS) is a global science gateway developed by the US Department of Energy Office of Scientific and Technical Information (OSTI) in partnership with federated search vendor Deep Web Technologies. WWS provides a simultaneous live search of 69 databases from government and government-sanctioned organizations from 66 participating nations. The WWS portal plays a leading role in bringing together the world&#8217;s scientists to accelerate the discoveries needed to solve the planet&#8217;s most pressing problems. In this paper we present a brief history of the development of WWS and discuss how a new technology, multilingual federated search, greatly increases WWS&#8217; ability to facilitate the advancement of science.&#8221;</p></blockquote>
<p>Deep Web Technologies is delighted to be working with OSTI and other organizations to push the envelope of search technology and to make the world a smaller place.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/worldwidescience-receives-warm-welcome-at-the-un/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Preparing for ALA Panel and Federated Search Neutrality</title>
		<link>http://deepwebtechblog.com/preparing-for-ala-panel-and-federated-search-neutrality/</link>
		<comments>http://deepwebtechblog.com/preparing-for-ala-panel-and-federated-search-neutrality/#comments</comments>
		<pubDate>Tue, 31 May 2011 21:29:15 +0000</pubDate>
		<dc:creator>Abe</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1645</guid>
		<description><![CDATA[As is usual for me I’m up early this morning after the three day Memorial Day weekend, going through my Biznar Alerts, and I run into this interesting blog post: Net Neutrality and Federating Searching Jake, a librarian in the D.C. area and beer aficionado (he’s the beerbrarian), writes on how the neutrality of federated [...]]]></description>
			<content:encoded><![CDATA[<p>As is usual for me, I’m up early this morning after the three-day Memorial Day weekend, going through my <a href="http://biznar.com">Biznar</a> Alerts, and I run into this interesting blog post:</p>
<p><a href="http://beerbrarian.blogspot.com/2010/12/net-neutrality-and-federated-searching.html">Net Neutrality and Federated Searching</a></p>
<p>Jake, a librarian in the D.C. area and beer aficionado (he’s the beerbrarian), writes on how the neutrality of federated search solutions is often overlooked and that it is most disconcerting that librarians and users of federated search solutions are not even aware of the bias in federated search results. This bias is the result of the ulterior motives of some federated search / discovery services vendors whose primary business is selling content.</p>
<p>Jake writes:</p>
<blockquote><p>I am credentialed at an institution that uses EHIS. I searched for dozens of terms, and the results weren’t pleasant for EHIS. It’s a crude test, but EHIS failed it.</p>
<p>EHIS consistently promoted EBSCO resources, favoring Academic Search Premier, an interdisciplinary EBSCO database, over product from other vendors that are more specialized.</p></blockquote>
<p><img src="http://deepwebtechblog.com/wp-content/uploads/2010/12/evil_google.jpg" alt="" width="200" height="180" align="left" />This subject is one that I have addressed in several blog posts before, including this post last December, <a href="http://deepwebtechblog.com/if-google-might-be-doing-it%e2%80%a6/">If Google might be Doing it …</a>, but I welcome reinforcement of this concern. In the last 6-12 months, following the initial craze with Discovery Services, I have seen significantly more questioning in the library community of Discovery Services such as ProQuest’s Summon and EBSCO’s EDS.</p>
<p>Here at Deep Web Technologies we have put lots of emphasis on our relevance ranking algorithms. We assign a rank to each result returned (we bring back hundreds to several thousand results for some of the broader queries) based on how closely the title, author and snippets match the user’s query, not on which source returned the result.</p>
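<p>To make the idea of source-neutral ranking concrete, here is a minimal sketch in Python. It is not our production algorithm; the field names and the simple term-overlap score are assumptions chosen only for illustration. The point it shows is that the score looks at the title, author and snippet and never at the source.</p>

```python
def score(result, query):
    """Fraction of query terms found in the title, author, or snippet.

    Hypothetical scoring rule for illustration only. The `source` field is
    deliberately ignored, so no provider's content is favored.
    """
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    text = " ".join([result["title"], result["author"], result["snippet"]])
    words = set(text.lower().split())
    return len(terms.intersection(words)) / len(terms)


def rank(results, query):
    """Sort results from all sources by descending score (ties keep their order)."""
    return sorted(results, key=lambda r: score(r, query), reverse=True)
```

<p>Because <code>score</code> never reads the source, relabeling every result as coming from a different provider leaves the ranking unchanged.</p>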
<p><img src="http://deepwebtechblog.com/wp-content/uploads/2011/05/Picture-46.png" alt="" align="right" />On Sunday, June 26th at the <a href="http://www.alaannual.org/">ALA Annual Conference</a> in New Orleans I’ll be speaking on a panel on <a href="http://connect.ala.org/node/136968">The Age of Discovery: Understanding Discovery Services, Federated Search, and Web scale</a>. You can be assured that this is one topic that I will be discussing in my presentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/preparing-for-ala-panel-and-federated-search-neutrality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Federated search: the challenges of incremental results</title>
		<link>http://deepwebtechblog.com/federated-search-the-challenges-of-incremental-results/</link>
		<comments>http://deepwebtechblog.com/federated-search-the-challenges-of-incremental-results/#comments</comments>
		<pubDate>Fri, 13 May 2011 16:07:55 +0000</pubDate>
		<dc:creator>Sol</dc:creator>
				<category><![CDATA[Features]]></category>
		<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1628</guid>
		<description><![CDATA[Welcome to the second edition of &#8220;Best of the Federated Search Blog.&#8221; In this series I pull articles out of the Federated Search Blog archive and comment on them for the benefit of those considering Deep Web Technologies&#8217; offerings. In March, 2008 I explored the &#8220;incremental results&#8221; feature which Deep Web Technologies makes available in [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to the second edition of &#8220;Best of the Federated Search Blog.&#8221; In this series I pull articles out of the <a href="http://federatedsearchblog.com">Federated Search Blog</a> archive and<a href="http://www.federatedsearchblog.com"><img class="alignright size-medium wp-image-1608" title="bestof" src="http://deepwebtechblog.com/wp-content/uploads/2011/05/bestof-300x76.png" alt="" width="300" height="76" /></a> comment on them for the benefit of those considering <a href="http://www.deepwebtech.com">Deep Web Technologies</a>&#8217; offerings.</p>
<p>In March 2008 I explored the &#8220;<a href="http://federatedsearchblog.com/2008/03/28/federated-search-the-challenges-of-incremental-results/">incremental results</a>&#8221; feature that Deep Web Technologies makes available in all its federated search applications. As a consultant to Deep Web Technologies, I may be somewhat biased, but I do believe that this feature is a huge differentiator for the company.</p>
<p>What are incremental results?</p>
<blockquote><p>The idea is simple: display results in chunks as they are received from the sources being searched. <a href="http://www.science.gov">Science.gov</a>, <a href="http://WorldWideScience.org">WorldWideScience.org</a>, and <a href="http://scitopia.org">Scitopia.org</a> are three applications that display incremental results.</p></blockquote>
<p>Why is it a big deal to provide incremental results? It&#8217;s because we live in the age of Google speed. Users don&#8217;t want to wait the 30 seconds it could take a content source to provide its results. The Achilles&#8217; heel of federated search is the fact that we have no control over how quickly sources respond with their results. If a federated search application is searching 30 sources at once and 29 of them return results quickly but one is slow to respond, then the traditional approach to displaying search results has users wait until the last source returns its results. This is bad news for the impatient user.</p>
<p>Deep Web Technologies&#8217; approach is to wait just a few seconds, long enough to get a variety of documents from a number of sources. It then relevance ranks those documents and displays those results quickly to users. While users are inspecting those first results, Explorit (Deep Web Technologies&#8217; federated search engine) is gathering results from the other sources to display when the user is ready.</p>
<p>Explorit is polite to users. It doesn&#8217;t simply overwrite the first set of search results with a later batch. Instead, it informs users that a newer set is available and asks whether they want it. The user can take the offer, turn it down or defer it (waiting until later to refresh the results).</p>
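<p>For readers who like to see the idea in code, here is a minimal sketch of incremental results in Python. It is an illustration under stated assumptions, not Explorit&#8217;s implementation: the <code>fetch</code> function merely simulates a remote source with a delay, and the function and parameter names are hypothetical.</p>

```python
import asyncio


async def fetch(source, delay, docs):
    """Stand-in for querying one remote source; `delay` simulates its latency."""
    await asyncio.sleep(delay)
    return [(source, doc) for doc in docs]


async def incremental_search(sources, first_wait):
    """Return (first_batch, late_batch).

    first_batch: results from sources that answered within `first_wait` seconds.
    late_batch: results from the slower sources, gathered afterwards.
    """
    tasks = [asyncio.create_task(fetch(name, delay, docs))
             for name, delay, docs in sources]
    done, pending = await asyncio.wait(tasks, timeout=first_wait)
    first_batch = [hit for task in done for hit in task.result()]
    # A real application would rank and display first_batch at this point,
    # while the remaining sources keep working in the background.
    late_batch = []
    if pending:
        finished, _ = await asyncio.wait(pending)
        late_batch = [hit for task in finished for hit in task.result()]
    return first_batch, late_batch


def merge_if_accepted(shown, newer, user_accepts):
    """Never overwrite silently: merge the newer set only if the user says yes."""
    return shown + newer if user_accepts else shown
```

<p>Deferring the offer corresponds to calling <code>merge_if_accepted</code> later rather than right away; until then the user keeps seeing the first set undisturbed.</p>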
<p>Incremental results are a nice way to reconcile the unpredictable response times of federated sources with users&#8217; demand for speed. We think the feature works well. You can judge for yourself at <a href="http://www.science.gov">Science.gov</a>, <a href="http://WorldWideScience.org">WorldWideScience.org</a>, and <a href="http://scitopia.org">Scitopia.org</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/federated-search-the-challenges-of-incremental-results/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Diagnosing Federated Search Source Problems: It&#8217;s Harder Than You Think</title>
		<link>http://deepwebtechblog.com/diagnosing-federated-search-source-problems-its-harder-than-you-think/</link>
		<comments>http://deepwebtechblog.com/diagnosing-federated-search-source-problems-its-harder-than-you-think/#comments</comments>
		<pubDate>Tue, 03 May 2011 22:23:55 +0000</pubDate>
		<dc:creator>Sol</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1607</guid>
		<description><![CDATA[Welcome to &#8220;The Best of the Federated Search Blog.&#8221; In this ongoing series I will be commenting on classic articles that I have authored for the Federated Search Blog. I aim to focus on the relevance of the article to current and prospective customers of Deep Web Technologies. In this first &#8220;Best of&#8221; article, Diagnosing [...]]]></description>
			<content:encoded><![CDATA[<p>Welcome to &#8220;The Best of the Federated Search Blog.&#8221; In this ongoing series I will be commenting on classic articles that I have authored for the <a href="http://deepwebtechblog.com/wp-content/uploads/2011/05/bestof.png"><img hspace="10" vspace="10" class="alignright size-medium wp-image-1608" src="http://deepwebtechblog.com/wp-content/uploads/2011/05/bestof-300x76.png" alt="" width="300" height="76" /></a> <a href="http://federatedsearchblog.com">Federated Search Blog</a>. I aim to focus on the relevance of the article to current and prospective customers of Deep Web Technologies.</p>
<p>In this first &#8220;Best of&#8221; article, <a href="http://federatedsearchblog.com/2008/06/23/diagnosing-federated-search-source-problems-its-harder-than-you-think/">Diagnosing federated search source problems: it&#8217;s harder than you think</a>, I introduced the complex nature of isolating and debugging source access problems. The gist of the challenge is that there are a number of potential points of failure and it&#8217;s not always obvious what is failing. Without knowing what is failing, it&#8217;s not possible to correct the problem or to know whom to notify to get it corrected. Plus, to act on a source access problem, one first needs to be aware of it. This requires a monitoring system that regularly probes all sources and alerts the appropriate people to a problem.</p>
<p>Many users of federated search don&#8217;t realize that maintaining connectors (the software components that access sources) is a substantial amount of work that requires a substantial investment. Prospective customers of federated search often focus too much on the bells and whistles of a particular implementation and don&#8217;t ask enough questions about whether their important sources can be searched and about what happens when a connector stops working.</p>
<p>Deep Web Technologies appreciates the difficulty of connector development and management. The company has developed a large catalog of connectors, so its dedicated connector team has extensive experience with connector issues and is quick to identify and correct problems within its control. For problems beyond its control, Deep Web Technologies has a publisher relations staff that can work with content providers to get them corrected. And, to ensure that problems are quickly discovered, Deep Web Technologies has developed custom software that frequently probes every single source (except, of course, for those behind firewalls). If a source is intermittently down, an alarm is raised and the publisher, if appropriate, is notified. If a source is &#8220;down hard,&#8221; the connector team swings into action and determines who owns the problem. If the problem will result in a significant outage for a particular source then, upon a customer&#8217;s request, the source can be taken offline so that it is not searched at all during the outage period.</p>
<p>Connectors are the foundation of federated search. If the content you want isn&#8217;t available then nothing else matters. That&#8217;s why Deep Web Technologies works diligently to minimize the amount of time that a source isn&#8217;t available.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/diagnosing-federated-search-source-problems-its-harder-than-you-think/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brain Food&#8211; Are You Eating Healthy?</title>
		<link>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/</link>
		<comments>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/#comments</comments>
		<pubDate>Wed, 23 Feb 2011 18:53:57 +0000</pubDate>
		<dc:creator>Nicholas</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[The Deep Web]]></category>
		<category><![CDATA[Deep]]></category>
		<category><![CDATA[deep web analysis]]></category>
		<category><![CDATA[federated knowledge]]></category>
		<category><![CDATA[Federated Search  Blog]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[results]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[strategic advantage]]></category>
		<category><![CDATA[top 10]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1568</guid>
		<description><![CDATA[Our minds like to be fed with the best information possible! In the case of researchers (academic, medical, business, or others), their work depends on it. Here is a great blog post by Sol Lederman I would like to share. It&#8217;s on the topic of information quality vs. information coverage. It really illustrates the importance of search [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2011/02/brain-food.jpg"><img class="alignleft size-medium wp-image-1573" style="margin: 10px; border: 10px solid black;" title="brain-food" src="http://deepwebtechblog.com/wp-content/uploads/2011/02/brain-food-300x187.jpg" alt="" width="300" height="187" /></a>Our minds like to be fed with the best information possible! In the case of researchers (academic, medical, business, or others), their work depends on it. Here is a great blog post by Sol Lederman I would like to share. It&#8217;s on the topic of information quality vs. information coverage. It really illustrates the importance of search tools. I think the word &#8220;quantity&#8221; is important to note here. Quantity of information is not typically a bad thing, unless it&#8217;s a large number of irrelevant results. If you would like to eat some brain food, Deep Web Tech is serving up a hot plate of your favorite info! Here are some sites you can feast on for a lifetime; bon appetit! <a href="http://biznar.com/biznar/" target="_blank">Business</a>, <a href="http://mednar.com/mednar/" target="_blank">Medical</a>, <a href="http://www.science.gov/" target="_blank">Science</a>, and <a href="http://worldwidescience.org/" target="_blank">World Wide Science</a>.</p>
<p>Sol Lederman wrote:</p>
<p>I recently discovered an article, <a href="http://www.brisbanegrammar.com/blogs/library/?p=870">5 Reasons Not to Use Google First</a>, that sings my song. The article addresses this question:</p>
<blockquote><p>Google is fast, clean and returns more results than any other search engine, but does it really find the information students need for quality academic research? The answer is often ‘no’. “While simply typing words into Google will work for many tasks, academic research demands more.” (<a href="http://manipulating-media.co.uk/2010/08/27/searching-for-and-finding-new-information-desk-research/">Searching for and finding new information – tools, strategies and techniques</a>)</p></blockquote>
<p>The next paragraph gave me a chuckle.</p>
<blockquote><p>As far back as 2004, James Morris, Dean of the School of Computer Science at Carnegie Mellon University, coined the term “infobesity,” to describe “the outcome of Google-izing research: a junk-information diet, consisting of overwhelming amounts of low-quality material that is hard to digest and leads to research papers of equally low quality.” (<a href="http://www.soi.city.ac.uk/~dbawden/bawden%20and%20brophy%20ap.pdf">Is Google enough? Comparison of an internet search engine with academic library resources</a>.)</p></blockquote>
<p>The article continues with its list of five good reasons to not use Google first.</p>
<p>Note that the recommendation isn’t to skip Google altogether. There’s a balance that’s needed to get the best value when performing research. The findings in the “Is Google enough?” article summarize this point really well:</p>
<blockquote><p>Google is superior for coverage and accessibility. Library systems are superior for quality of results. Precision is similar for both systems. Good coverage requires use of both, as both have many unique items. Improving the skills of the searcher is likely to give better results from the library systems, but not from Google.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Deep Web Tech in the News: WorldWideScience.org</title>
		<link>http://deepwebtechblog.com/httpwp-mepzhks-on/</link>
		<comments>http://deepwebtechblog.com/httpwp-mepzhks-on/#comments</comments>
		<pubDate>Fri, 11 Feb 2011 16:13:30 +0000</pubDate>
		<dc:creator>Nicholas</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[Reviews]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1537</guid>
		<description><![CDATA[Sometimes, when I lie in bed with the dream of federating 1,000,000,000 sources dancing around in my mind, I wonder, &#8220;What&#8217;s the best search engine?&#8221; I suppose that depends on who (or what) you seek. The Internet is the largest collection of information that has ever been amassed, leading us to need better search [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2011/02/Science-joke-blog-photo.jpg"><img class="alignleft size-medium wp-image-1541" style="margin: 10px; border: 10px solid black;" title="Science joke blog photo" src="http://deepwebtechblog.com/wp-content/uploads/2011/02/Science-joke-blog-photo-300x218.jpg" alt="" width="300" height="218" /></a>Sometimes, when I lie in bed with the dream of federating 1,000,000,000 sources dancing around in my mind, I wonder, &#8220;What&#8217;s the best search engine?&#8221; I suppose that depends on who (or what) you seek. The Internet is the largest collection of information that has ever been amassed, which is why we need better search engines.</p>
<p>I use Google a lot when I want the answer to a simple question, a tasty recipe, or a good <a href="http://www.youtube.com/watch?v=0Bmhjf0rKe8" target="_blank">laugh</a> (it gets me every time). I also find that using a wide variety of search tools is essential if you&#8217;re serious about getting the best possible search results. For all of the individuals out there who understand the need for specialized search engines, this next part will be a gem.</p>
<p>WorldWideScience.org, a premier scientific search engine, has received some recent attention. <a href="http://www.osti.gov/bios/warnick.html" target="_blank">Walter L. Warnick, Ph.D.</a>, Director of the Office of Scientific and Technical Information (OSTI) for the U.S. Department of Energy, demonstrated <a href="http://worldwidescience.org/" target="_blank">WorldWideScience.org</a> in front of multiple observers, including members of the scientific community. According to openbiomed.info, &#8220;WorldWideScience.org is federated full-text, public, database of scientific and technical research information published from at least 70 cooperative countries&#8221;. With only about 4% duplication with general search engines (that means Google), WorldWideScience.org provides access to millions of deep web documents using a single, multi-language search query. One user wrote:</p>
<p>I decided to assess the biomedical research coverage by trying to find information on Methicillin-resistant Staphylococcus aureus (MRSA) and procedures to isolate patients with this infection. I searched the keyword phrase “<strong>MRSA isolation protocol</strong>”: 892 documents were located, and when the <a href="http://worldwidescience.org/multilingual/result-list/fullRecord:MRSA+isolation+protocol/preferredLanguage:en/#ResultList=0%7C0%7C_%7CDATE%7C0" target="_blank">result was sorted by date</a>, open access resources such as the <a href="http://www.doaj.org/doaj?func=searchArticles" target="_blank">Directory of Open Access Journals (DOAJ)</a>, <a href="http://ukpmc.ac.uk/" target="_blank">UK PubMed Central</a>, and <a href="http://www.lenus.ie/hse/" target="_blank">LENUS (Irish Health Repository)</a> present very recent research. In comparison, the <strong><a href="http://openbiomed.info/2011/01/worldwidescience-deep-web/scirus.com" target="_blank">SCIRUS</a> </strong>database searched for MRSA isolation protocol finds quantitatively more research, but when <a href="http://www.scirus.com/srsapp/search?q=MRSA+isolation+protocol&amp;t=all&amp;drill=yes&amp;sort=1&amp;p=0" target="_blank">sorted by date</a>, all of the recent literature identified is available in Science Direct via subscription-only access or for purchase US $ 31.50.</p>
<p>If you haven&#8217;t tried out WorldWideScience.org, and you like having a mind-blowing amount of <em>relevant</em> information from all over the world at your fingertips, I suggest you give it a spin; you won&#8217;t be sorry you did. We understand the need to collect information that goes beyond answering simple questions or finding cute cats (he is really cute, though; check the laugh link above).</p>
<p>Still need convincing? Here is a <a href="http://www.youtube.com/watch?v=OVdEbimZy_0&amp;feature=player_embedded#" target="_blank">simple demonstration</a>. And a shout-out to whoever made a pretty good video about a search engine: even with the &#8217;80s cartoon-like sounds and &#8220;rad&#8221; guitar background music, I find it quite entertaining. There&#8217;s a special place in my heart for you.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/httpwp-mepzhks-on/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

