<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Deep Web Technologies Blog &#187; The Deep Web</title>
	<atom:link href="http://deepwebtechblog.com/category/deep-web/feed/" rel="self" type="application/rss+xml" />
	<link>http://deepwebtechblog.com</link>
	<description>covering federated search and how to get the best from the Deep Web.</description>
	<lastBuildDate>Tue, 25 Oct 2011 16:35:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>Brain Food&#8211; Are You Eating Healthy?</title>
		<link>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/</link>
		<comments>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/#comments</comments>
		<pubDate>Wed, 23 Feb 2011 18:53:57 +0000</pubDate>
		<dc:creator>Nicholas</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[The Deep Web]]></category>
		<category><![CDATA[Deep]]></category>
		<category><![CDATA[deep web analysis]]></category>
		<category><![CDATA[federated knowledge]]></category>
		<category><![CDATA[Federated Search  Blog]]></category>
		<category><![CDATA[research]]></category>
		<category><![CDATA[results]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[strategic advantage]]></category>
		<category><![CDATA[top 10]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1568</guid>
		<description><![CDATA[Our minds like to be fed with the best information possible! In the case of researchers (academic, medical, business, or others), their works depend on it.  Here is a great blog post by Sol Lederman I would like to share. It&#8217;s on the topic of information quality vs. information coverage. It really illustrates the importance of search [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2011/02/brain-food.jpg"><img class="alignleft size-medium wp-image-1573" style="margin: 10px; border: 10px solid black;" title="brain-food" src="http://deepwebtechblog.com/wp-content/uploads/2011/02/brain-food-300x187.jpg" alt="" width="300" height="187" /></a>Our minds like to be fed with the best information possible! In the case of researchers (academic, medical, business, or others), their works depend on it.  Here is a great blog post by Sol Lederman I would like to share. It&#8217;s on the topic of information quality vs. information coverage. It really illustrates the importance of search tools. I think use of &#8220;quantity&#8221; is important to note here. Quantity of information is not typically a bad thing&#8211; unless it&#8217;s a large amount of irrelevant results. If you would like to eat some brain food, Deep Web Tech is serving up a hot plate of your favorite info! Here are some sites you can feast on for a lifetime; bon appetit! <a href="http://biznar.com/biznar/" target="_blank">Business</a>, <a href="http://mednar.com/mednar/" target="_blank">Medical</a>, <a href="http://www.science.gov/" target="_blank">Science</a>, and <a href="http://worldwidescience.org/" target="_blank">World Wide Science</a>.</p>
<p>Sol Lederman wrote:</p>
<p>I recently discovered an article, <a href="http://www.brisbanegrammar.com/blogs/library/?p=870">5 Reasons Not to Use Google First</a>, that sings my song. The article addresses this question:</p>
<blockquote><p>Google is fast, clean and returns more results than any other search engine, but does it really find the information students need for quality academic research? The answer is often ‘no’. “While simply typing words into Google will work for many tasks, academic research demands more.” (<a href="http://manipulating-media.co.uk/2010/08/27/searching-for-and-finding-new-information-desk-research/">Searching for and finding new information – tools, strategies and techniques</a>)</p></blockquote>
<p>The next paragraph gave me a chuckle.</p>
<blockquote><p>As far back as 2004, James Morris, Dean of the School of Computer Science at Carnegie Mellon University, coined the term “infobesity,” to describe “the outcome of Google-izing research: a junk-information diet, consisting of overwhelming amounts of low-quality material that is hard to digest and leads to research papers of equally low quality.” (<a href="http://www.soi.city.ac.uk/~dbawden/bawden%20and%20brophy%20ap.pdf">Is Google enough? Comparison of an internet search engine with academic library resources</a>.)</p></blockquote>
<p>The article continues with its list of five good reasons to not use Google first.</p>
<p>Note that the recommendation isn’t to skip Google altogether. There’s a balance that’s needed to get the best value when performing research. The findings in the “Is Google enough?” article summarize this point really well:</p>
<blockquote><p>Google is superior for coverage and accessibility. Library systems are superior for quality of results. Precision is similar for both systems. Good coverage requires use of both, as both have many unique items. Improving the skills of the searcher is likely to give better results from the library systems, but not from Google.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/brain-food-are-you-eating-healthy/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Search Strategies in Federated Search: Refine Search</title>
		<link>http://deepwebtechblog.com/search-strategies-in-federated-search-refine-search/</link>
		<comments>http://deepwebtechblog.com/search-strategies-in-federated-search-refine-search/#comments</comments>
		<pubDate>Mon, 20 Dec 2010 19:58:17 +0000</pubDate>
		<dc:creator>Brian Despain</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[The Deep Web]]></category>
		<category><![CDATA[Deep]]></category>
		<category><![CDATA[deep web analysis]]></category>
		<category><![CDATA[deep web technologies]]></category>
		<category><![CDATA[federated knowledge]]></category>
		<category><![CDATA[Federated Search  Blog]]></category>
		<category><![CDATA[queries]]></category>
		<category><![CDATA[querying the deep web]]></category>
		<category><![CDATA[results]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=1317</guid>
		<description><![CDATA[Filtering results is as much an art as a science. Refining or clarifying an initial set of search results is a fairly common practice among search engine users that don&#8217;t know exactly what they are looking for. Yet in federated searching, refining a group of results is often not the best strategy for a user to find [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtechblog.com/wp-content/uploads/2010/12/filtering-water.jpg"><img class="alignright size-medium wp-image-1370" title="filtering water" src="http://deepwebtechblog.com/wp-content/uploads/2010/12/filtering-water-300x199.jpg" alt="" width="300" height="199" /></a></p>
<p>Filtering results is as much an art as a science.</p>
<p>Refining or clarifying an initial set of search results is a fairly common practice among search engine users that don&#8217;t know exactly what they are looking for. Yet in federated searching, refining a group of results is often not the best strategy for a user to find the results they are looking for. Let&#8217;s do a quick walk-through of how a user might use a federated search application. Let&#8217;s say a user is interested in nanotechnology and is using our Explorit federated search application. When the user inputs the query, the search brings back roughly 2,500 results (the exact number can vary depending on the sources). After sorting through some of the results, the user decides that their interest really lies in medical nanotechnology, so they limit the search to just &#8220;medical nanotechnology&#8221;.</p>
<p>To get the <em>best</em> possible results from a federated search application, it makes sense to re-run the search with those two terms. The initial group of results was run with the term &#8220;nanotechnology&#8221;, so it may contain <em>some</em> results with the term &#8220;medical&#8221; in them. But since the original search strategy targeted the nanotechnology industry as a whole, the odds that it contains the best results for &#8220;medical nanotechnology&#8221; are slim to none. It&#8217;s far better for a user of federated search to type in the new query to get results from the various sources that match the new search strategy.</p>
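<p>As a rough illustration of why re-running beats refining, here is a minimal Python sketch. It assumes a hypothetical source that returns only its top few matches for a query; the <code>source_search</code> function, the document list, and the limit of 3 are all made up for the example and are not part of Explorit.</p>
<pre>
```python
def source_search(documents, query, limit):
    """Hypothetical source: return up to `limit` documents containing every query term."""
    terms = query.lower().split()
    matches = [d for d in documents if all(t in d.lower() for t in terms)]
    return matches[:limit]


# Made-up collection held by a single source.
documents = [
    "nanotechnology in solar cells",
    "nanotechnology market outlook",
    "nanotechnology coatings for steel",
    "medical nanotechnology for drug delivery",
    "medical nanotechnology imaging agents",
]

# Strategy 1: run the broad query, then refine the capped result set locally.
broad = source_search(documents, "nanotechnology", limit=3)
refined = [d for d in broad if "medical" in d.lower()]

# Strategy 2: send the new two-term query back to the source.
rerun = source_search(documents, "medical nanotechnology", limit=3)

print(len(refined), len(rerun))
```
</pre>
<p>In this toy setup, filtering the capped broad result set finds nothing on the medical side, while sending the two-term query back to the source surfaces both medical documents.</p>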
<p>Now that the user has the information, how do they sort through it? Most search engines have their own relevancy ranking tuned to their own content. This means a two-term search strategy might be handled differently than a broader search and likely more efficiently.</p>
<p>The real goal, of course, is to get the maximum number of results that match the user&#8217;s search strategy, and to give them the tools for sorting through them. This strategy also holds true for discovery services, since many discovery services (such as Summon) only return 1,000 results in response to a query. Those 1,000 results are tuned to the initial query, and refining such a small result set further wouldn&#8217;t serve the needs of their users. The user would be better off, as we mentioned before, re-initiating the search.</p>
<p>Just as a side note, this 1,000-result limit is pretty common in large-scale indexes. Google, for example, will only bring back 1,000 results for any query even though Google indicates that thousands or millions of results are available. Google brings back the 1,000 most relevant to the initial query, in an index that is filled with a huge amount of spam. So it&#8217;s a better idea to re-initiate your query with the refined search strategy to get access to more information.  This 1,000-result limitation makes less sense in a product like Summon since presumably the index doesn&#8217;t have any spam in it.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/search-strategies-in-federated-search-refine-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Crawling the Deep Web</title>
		<link>http://deepwebtechblog.com/crawling-the-deep-web/</link>
		<comments>http://deepwebtechblog.com/crawling-the-deep-web/#comments</comments>
		<pubDate>Tue, 09 Mar 2010 20:15:16 +0000</pubDate>
		<dc:creator>Darcy Pedersen</dc:creator>
				<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[The Deep Web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=494</guid>
		<description><![CDATA[Nimish Sawant from LiveMint.com recently published a post on the Deep Web, and some of the services that search it.   He points to the differences between Google and other search appliances such as federated search.  Nimish raises the most popular search question of our time, &#8220;If Google can’t find the data, where exactly is it [...]]]></description>
			<content:encoded><![CDATA[<p>Nimish Sawant from LiveMint.com recently published a post on the Deep Web, and some of the services that search it.   He points to the differences between Google and other search appliances such as federated search. <a href="http://www.livemint.com/2010/03/09211503/Crawling-the-deep-web.html?h=C"><img class="alignright size-medium wp-image-495" title="Deep Web Portals" src="http://deepwebtechblog.com/wp-content/uploads/2010/03/Deep-Web-Portals-300x95.jpg" alt="Deep Web Portals" width="300" height="95" /></a> Nimish raises the most popular search question of our time, &#8220;<em>If Google can’t find the data, where exactly is it and why can’t it be  crawled</em>?&#8221;  He came at this question from a slightly different perspective:</p>
<blockquote><p>Let’s try to decode the deep Web by virtue of content. A database  contains information stored in tables that are created by programs such  as Access, SQL or Oracle. This data can only be retrieved by posting a  query. The query, when executed, searches the database to come up with  the result that has been specified. This is very different from  searching static Web pages that can be accessed directly by crawlers.</p></blockquote>
<p>Deep Web Technologies made the list of four companies that utilize federated search for the deep web.  It&#8217;s always nice to see articles that recognize our web portals such as <a href="http://www.biznar.com" target="_blank">Biznar</a> and <a href="http://www.mednar.com" target="_blank">Mednar</a> for both their Deep Web search capabilities and the federated search technology that powers them.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/crawling-the-deep-web/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Calling all web designers &#8230; RFQ for theme development issued</title>
		<link>http://deepwebtechblog.com/calling-all-web-designers-rfq-for-theme-development-issued/</link>
		<comments>http://deepwebtechblog.com/calling-all-web-designers-rfq-for-theme-development-issued/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 18:33:31 +0000</pubDate>
		<dc:creator>Larry Donahue</dc:creator>
				<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[The Deep Web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=294</guid>
		<description><![CDATA[We have been inspired by the CSS Zen Garden (www.csszengarden.com), and have asked ourselves, &#8220;Could this apply in a way to make deep web research more productive and interesting?&#8221; We are therefore excited to announce that we are seeking the services of four (4) web designers (firms or individuals) to construct the CSS and graphic [...]]]></description>
			<content:encoded><![CDATA[<p>We have been inspired by the CSS Zen Garden (<a href="http://www.csszengarden.com/" title="Visit the CSS Zen Garden!">www.csszengarden.com</a>), and have asked ourselves, &#8220;Could this apply in a way to make deep web research more productive and interesting?&#8221;</p>
<p>We are therefore excited to announce that we are seeking the services of four (4) web designers (firms or individuals) to construct the CSS and graphic files for a theme for our upcoming Software-as-a-Service based <a href="http://www.altsearchengines.com/2009/01/11/federated-search-finds-content-that-google-cant-reach-part-i-of-iii/" title="Read more about federated search!">federated search</a> product.</p>
<p>To this effort, we have issued a Request-for-Quote (RFQ), which is <a href="http://deepwebtechblog.com/wp-content/uploads/2009/09/Theme-Development-RFQ-2009.09.8.pdf" title="Download our RFQ, in PDF format!">available here</a>.</p>
<p>The response deadline is noon (MST), September 15th, 2009, and we will pick four from the available group of responses by September 16th.  We want the project to begin as soon as possible, with a deadline for completion of the project by October 16th, 2009.</p>
<p>It&#8217;s very important to us to have four (4) great-looking themes by October 16th.  We have included a contest within our RFQ, where we will evaluate the themes submitted and award the first-place theme a $2,000 bonus, and the second-place theme a $1,000 bonus.  <i>Note:  This bonus is only available to those four (4) web designers we have selected from the responses we&#8217;ve received to this RFQ.</i></p>
<p>If you are an outstanding web designer, and have seen some of our websites at <a href="http://www.scitopia.org" title="Visit Scitopia!">www.scitopia.org</a>, <a href="http://www.worldwidescience.org" title="Visit WorldWideScience!">www.worldwidescience.org</a>, <a href="http://www.science.gov" title="Visit Science.gov!">www.science.gov</a>, <a href="http://www.scienceresearch.com" title="Visit ScienceResearch.com!">www.scienceresearch.com</a>, <a href="http://www.biznar.com" title="Visit Biznar!">www.biznar.com</a> or <a href="http://www.mednar.com" title="Visit Mednar!">www.mednar.com</a>, and think you can do better, this is your chance to prove it and get paid for it!  <a href="http://deepwebtechblog.com/wp-content/uploads/2009/09/Theme-Development-RFQ-2009.09.8.pdf" title="Download our RFQ, in PDF format!">Download our RFQ</a>, send us a proposal by noon (MST), September 15th, and get a chance to join us in some really fun design efforts!</p>
<p>Our hope is this project will help cement a few relationships for the long-term, so we have at least a couple of outstanding web designers we can send further theme requests to.</p>
<p>Please <a href="http://www.deepwebtech.com/contact/contact.php" title="Contact us!">contact us</a> for more information, or feel free to submit your questions in the form of a comment to this blog article.  We intend to funnel all questions and answers here, so all respondents can benefit from the answers we provide to the questions we receive.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/calling-all-web-designers-rfq-for-theme-development-issued/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Leave your hat and pick behind&#8230;</title>
		<link>http://deepwebtechblog.com/leave-your-hat-and-pick-behind/</link>
		<comments>http://deepwebtechblog.com/leave-your-hat-and-pick-behind/#comments</comments>
		<pubDate>Wed, 19 Aug 2009 14:37:56 +0000</pubDate>
		<dc:creator>Darcy Pedersen</dc:creator>
				<category><![CDATA[Marketing Announcements]]></category>
		<category><![CDATA[The Deep Web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=270</guid>
		<description><![CDATA[&#8230;It’s now easier to mine information in the deep web. In the most recent article in the National Impact Series on the Department of Energy’s Office of Science website, Abe Lederman, President and CTO of Deep Web Technologies reveals some little known company history, like how, exactly, he came up with the idea to develop [...]]]></description>
			<content:encoded><![CDATA[<p><em> &#8230;It’s now easier to mine information in the deep web.  </em> </p>
<p>In the most recent article in the National Impact Series on the Department of Energy’s Office of Science <a href="http://www.sc.doe.gov/">website</a>, Abe Lederman, President and CTO of Deep Web Technologies reveals some little known company history, like how, exactly, he came up with the idea to develop software to mine databases.  Authored by Stacey Kish, the article, <a href="http://www.sc.doe.gov/News_Information/News_Room/2009/Aug%2013_InternetSearches.html">“Scientific Internet Searches Benefit from New Technology” </a>ties Deep Web Technologies’ past SBIR grants to our current next-generation federated search technology.</p>
<p>&#8220;Deep Web Technologies is a great SBIR success story,” said Lederman. &#8220;We develop powerful search solutions that can then be used in products such as <a href="http://www.worldwidescience.org">WorldWideScience.org</a>. As the benefit of these technologies has been realized, we’ve grown. We started with 2-1/3 employees, the 1/3 being my brother, and grew to 23 employees.”</p>
<p>The article plays on the previous National Impact article <a href="http://www.sc.doe.gov/News_Information/News_Room/2009/Aug%204.html">“Surfing the Internet Gets Deep”</a> featuring WorldWideScience.org as a one-stop search engine to mine scientific information from the deep web.  </p>
<p>“Deep web search engines allow scientists, researchers, educators, and engineers to easily share and transfer knowledge that can lead to cross pollination of new ideas from different fields of study. This sharing of knowledge may lead to breakthroughs and innovation in new and unique ways.”</p>
<p>The article speaks to the future of search and the obstacles in tackling the vast amount of data on the <div id="attachment_273" class="wp-caption alignright" style="width: 310px"><img src="http://deepwebtechblog.com/wp-content/uploads/2009/08/Scalability-300x183.jpg" alt="The Divide and Conquer Approach" title="Scalability" width="300" height="183" class="size-medium wp-image-273" /><p class="wp-caption-text">The Divide and Conquer Approach</p></div>Internet today.  Deep Web Technologies uses a &#8220;divide-and-conquer approach&#8221;.    “Where a 1,000-source search engine may struggle,” said Lederman, “by combining 10 to 20 federated search engines that each search 50 to 100 sources will allow each search engine to perform a manageable amount of work.”</p>
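<p>The arithmetic behind the divide-and-conquer approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <code>partition</code>, <code>search_group</code> and <code>tiered_search</code> names and the placeholder sources are hypothetical and do not describe how Deep Web Technologies actually implements it.</p>
<pre>
```python
def partition(sources, group_size):
    """Split the source list into consecutive groups of at most group_size."""
    return [sources[i:i + group_size] for i in range(0, len(sources), group_size)]


def search_group(query, group):
    # Hypothetical mid-tier engine: a real one would query each source over the network.
    return [f"{source}: result for {query}" for source in group]


def tiered_search(query, sources, group_size):
    """Top tier: hand each group to its own mid-tier engine and merge the results."""
    merged = []
    for group in partition(sources, group_size):
        merged.extend(search_group(query, group))
    return merged


# 1,000 placeholder sources, as in the quote above.
sources = [f"source-{n}" for n in range(1000)]

print(len(partition(sources, 100)))  # mid-tier engines needed at 100 sources each
print(len(partition(sources, 50)))   # mid-tier engines needed at 50 sources each
```
</pre>
<p>With 1,000 sources, groups of 100 need 10 mid-tier engines and groups of 50 need 20, which lines up with the 10 to 20 engines mentioned in the quote.</p>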
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/leave-your-hat-and-pick-behind/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Search Engines of the Future?</title>
		<link>http://deepwebtechblog.com/search-engines-of-the-future/</link>
		<comments>http://deepwebtechblog.com/search-engines-of-the-future/#comments</comments>
		<pubDate>Thu, 16 Jul 2009 21:59:38 +0000</pubDate>
		<dc:creator>Larry Donahue</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[The Deep Web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=176</guid>
		<description><![CDATA[Ian Hardy wrote an interesting article entitled, the search engines of the future, where he said that the search engines of the future will incorporate advanced semantic search, voice search and make better &#8220;connections,&#8221; as in connecting-the-dots, seeing patterns and recognizing context in all its forms. It&#8217;s a quick read, so Mr. Hardy doesn&#8217;t have [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://botropolis.com/wp-content/uploads/hal_9000b.jpg" style="float: right; padding-left: 5px; padding-bottom: 5px;" height=150 width=150>Ian Hardy wrote an interesting article entitled, <a href="http://news.bbc.co.uk/2/hi/programmes/click_online/8144765.stm">the search engines of the future</a>, where he said that the search engines of the future will incorporate advanced semantic search, voice search and make better &#8220;connections,&#8221; as in connecting-the-dots, seeing patterns and recognizing context in all its forms.</p>
<p>It&#8217;s a quick read, so Mr. Hardy doesn&#8217;t have the space to devote to further clarifying what the future holds for search engines.  I think, though, the future is pretty easy to foresee.  A harder question is, what are the precursor technologies that will be essential for the future of search?</p>
<p>The computer &#8212; your computer &#8212; is a tool that consolidates your access to the information universe.  I see that computer evolving to become your &#8220;partner&#8221; or &#8220;colleague,&#8221; who knows how to communicate with you, and knows all about you, your interests, your issues and your needs.  It understands the context of you and what you&#8217;re after.  If you&#8217;re looking for flights to Chicago, it knows the difference between needing flights for tomorrow (preference on availability) versus needing flights 6 months from now (preference on price or convenience).  It knows what airline you prefer, what airports you prefer and your frequent flier id.</p>
<p>If you do online research for work, your computer automatically knows that your request for political information is personal in nature (preference on summaries) versus research regarding your professional interests (preference on new, detailed, relevant material).  Your computer &#8212; your partner &#8212; would also know when you might want to be notified of some new information or event that you didn&#8217;t otherwise proactively request.</p>
<p>In essence, your computer &#8212; using search technologies &#8212; will eventually become your second pair of eyes and ears.</p>
<p>What technologies will make this possible?  Obviously, smarter computer intelligence, especially as it relates to decoding speech (in all flavors of dialects and languages), understanding the nuances of contextual communications with humans, better relevance and ranking between disparate sets of information, and complex pattern recognition.</p>
<p>Search engines, for their part, will get smarter with greater access to information.  One can imagine Bing, Google or Yahoo possessing petabytes of information in their indexes.</p>
<p>I think federated search technology &#8212; and the deep web &#8212; will become center stage in the future, as no one search engine will be able to possess all the world&#8217;s data, knowledge, information, wisdom, transactions, products and services, no matter how public or private, secure or insecure, timeless or transitory.</p>
<p>Federated search technology is the glue that enables the smarter computers of tomorrow to create the search engines of the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/search-engines-of-the-future/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Westlaw Without Dialog:  What&#8217;s An Attorney To Do?</title>
		<link>http://deepwebtechblog.com/westlaw-without-dialog-whats-an-attorney-to-do/</link>
		<comments>http://deepwebtechblog.com/westlaw-without-dialog-whats-an-attorney-to-do/#comments</comments>
		<pubDate>Wed, 20 May 2009 19:56:55 +0000</pubDate>
		<dc:creator>Larry Donahue</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[The Deep Web]]></category>
		<category><![CDATA[attorney]]></category>
		<category><![CDATA[dialog]]></category>
		<category><![CDATA[intellectual property]]></category>
		<category><![CDATA[legal]]></category>
		<category><![CDATA[westlaw]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=89</guid>
		<description><![CDATA[Until 2008, Dialog was part of Thomson Scientific, itself a unit of financial information giant Thomson Reuters. Westlaw, a part of West Publishing, itself another unit of Thomson Reuters, had therefore been able to provide Dialog&#8217;s powerful database of sources (which include 900 databases of intellectual property), to its large group of professional subscribers. [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://statcont.westlaw.com/images/westlaw-logo-creditcard.gif" style="float: right; padding-left: 10px; padding-bottom: 10px; padding-top: 20px;">Until 2008, Dialog was part of Thomson Scientific, itself a unit of financial information giant Thomson Reuters.  Westlaw, a part of West Publishing, itself another unit of Thomson Reuters, had therefore been able to provide Dialog&#8217;s powerful database of sources (which include 900 databases of intellectual property), to its large group of professional subscribers.</p>
<p>In 2008, Thomson Reuters sold Dialog to ProQuest Information and Learning, a provider of information to researchers and libraries.  As far as I can tell, there is very little press on the impact of the Dialog sale to Westlaw and its subscribers.</p>
<p><img src="http://www.dialog.com/images/logos/dialog_logo_color150x40.gif" style="float: left; padding-left: 10px; padding-bottom: 10px; padding-top: 20px;">However, I&#8217;ve begun to notice an interesting development occurring for Deep Web Technologies, in that we have received an inordinate number of inquiries and expressions of interest from the legal world over the past few months.  Larger law firms and practices (especially those involving intellectual property and legal research) are beginning to take notice of federated search technology, and many in the legal world are beginning to see how federated search can provide a strategic advantage over mere access to Westlaw or similar organizations.</p>
<p>It makes me wonder whether there is a connection to the timing of this increased interest, with the loss of Dialog for Westlaw&#8217;s users.</p>
<p><img src="http://www.proquest.com/images/core/pqlogo.jpg" style="float: right; padding-left: 10px; padding-bottom: 10px; padding-top: 20px;">Given that I am an intellectual property attorney, I can pick on intellectual property attorneys.  We&#8217;re a diverse group &#8212; more diverse than appears at first blush.  First, there are four main branches of intellectual property:  Copyrights, Trademarks, Patents and Trade Secrets.  There are specialty areas within each of them, such as trade dress under Trademarks and international under all branches.  Let&#8217;s look at Patents for a minute.  Two patent attorneys care about very different information if one patent attorney specializes in biomedical devices, and the other specializes in electronics.  That&#8217;s an easy one to understand.</p>
<p>There can be strong differences, however, for patent attorneys in the same field.  Consider two patent attorneys specializing in electronics, where one specializes in computer hardware and the other specializes in semiconductors.  Interestingly, even two patent attorneys in the same field of semiconductors could differ in their research needs if one specializes in patent drafting and the other specializes in litigation.</p>
<p>I could go on.</p>
<p>The point is, each and every attorney has unique research needs.  No one information source has everything that every attorney needs.  Not only that, many attorneys in large corporations, law firms or government offices have internal databases that also become a necessary part of their research.</p>
<p>Only federated search technology has the breadth to serve the particular needs of every attorney, because federated search conducts searches against all the sources an attorney feels are important to their career:  internal sources, subscriber-based sources (e.g. Westlaw) and favorite &#8212; subject matter specific &#8212; sources (e.g. IEEE, Wall Street Journal, etc.).</p>
<p>In today&#8217;s world, where attorneys are pressured to keep fees low, yet expected to be on top of their game (both legally and in their respective subject matter), simple, affordable and comprehensive access to information is critical to success.</p>
<p>If you&#8217;re a Westlaw subscriber, I&#8217;d love to hear your thoughts on the loss of the Dialog database.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/westlaw-without-dialog-whats-an-attorney-to-do/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Opportunities with Search for Newspapers</title>
		<link>http://deepwebtechblog.com/opportunities-with-search-for-newspapers/</link>
		<comments>http://deepwebtechblog.com/opportunities-with-search-for-newspapers/#comments</comments>
		<pubDate>Fri, 20 Feb 2009 19:33:13 +0000</pubDate>
		<dc:creator>Larry Donahue</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[The Deep Web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=74</guid>
		<description><![CDATA[Peter Krasilovsky of the Newspaper Association of America just posted an article regarding the value of web-based search to newspaper organizations (read the article). In the article, he very briefly discusses federated search and the value to newspapers.  He says, &#8220;A near relative of enterprise search is federated search. It is an effort to collect [...]]]></description>
			<content:encoded><![CDATA[<p>Peter Krasilovsky of the Newspaper Association of America just posted an article regarding the value of web-based search to newspaper organizations (<a href="http://www.naa.org/Resources/Articles/Digital-Media-Adv-Sec2-Search/Digital-Media-Adv-Sec2-Search.aspx">read the article</a>).</p>
<p>In the article, he very briefly discusses federated search and the value to newspapers.  He says, <i>&#8220;A near relative of enterprise search is federated search. It is an effort to collect all forms of content –articles, archives, advertising and classifieds &#8212; and organize them so they are all displayed in response to a user’s query (i.e. searching for &#8220;pizza&#8221; yields results that include reviews, ratings advertising for pizza restaurants and even job listings for pizza delivery. Seattle Times.com, Boston.com and StarTribune.com are all using federated search.)&#8221;</i></p>
<p>He goes on to say, <i>&#8220;Federated search is especially helpful for newspapers that have multiple websites.&#8221;</i>  With all due respect to Mr. Krasilovsky, he is promoting a common misconception about federated search.  Federated search makes it possible to search other search engines, aggregating all the results, de-duplicating them and producing a single ranked list from multiple search engines.</p>
<p>This is an important distinction, because by adding true federated search to their websites, online newspapers have a tremendous opportunity to increase the value of the search results to their readers.  It&#8217;s no longer about just the information located within the news websites themselves, but about the topics of interest to their readers.  For example, a news story about breast cancer can turn up articles about the topic from the NIH, governmental sources, etc.</p>
<p>This makes for a richer experience for the user, giving them incentives to come back and stay longer, ultimately increasing ad revenues for the online newspaper.</p>
<p>Thank you, Mr. Krasilovsky, for bringing up the concept of federated search.  If you truly harness the power of federated search for newspapers, the opportunities are boundless.</p>
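<p>To make the search, aggregate, de-duplicate and rank flow described above a little more concrete, here is a minimal Python sketch. It is an illustration only, not how any particular product works: the two engine functions, their URLs and their scores are hypothetical stand-ins, and it assumes scores from different engines can be compared directly, which real systems cannot take for granted.</p>

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote search engines. Each returns a list of
# (url, title, score) tuples; a real connector would query the engine itself.
def news_site_engine(query):
    return [("http://example.com/a", "Local story on " + query, 0.9),
            ("http://example.com/b", "Archive piece on " + query, 0.4)]

def health_agency_engine(query):
    return [("http://example.org/x", "Agency overview of " + query, 0.8),
            ("http://example.com/a", "Local story on " + query, 0.7)]

def federated_search(query, engines):
    # Send the same query to every engine at the same time.
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    # Aggregate and de-duplicate by URL, keeping the best score seen.
    best = {}
    for results in result_lists:
        for url, title, score in results:
            if url not in best or score > best[url][1]:
                best[url] = (title, score)

    # Produce one ranked list, highest score first.
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    return [(url, title, score) for url, (title, score) in ranked]

results = federated_search("breast cancer",
                           [news_site_engine, health_agency_engine])
```

<p>In this sketch the story returned by both engines appears only once in <code>results</code>, and the agency page from the second engine is ranked alongside the newspaper content.</p>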
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/opportunities-with-search-for-newspapers/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Deep Web Search &#8211; How to optimize queries?</title>
		<link>http://deepwebtechblog.com/deep-web-search-how-to-optimize-queries/</link>
		<comments>http://deepwebtechblog.com/deep-web-search-how-to-optimize-queries/#comments</comments>
		<pubDate>Tue, 03 Feb 2009 18:22:59 +0000</pubDate>
		<dc:creator>Brian Despain</dc:creator>
				<category><![CDATA[The Deep Web]]></category>
		<category><![CDATA[queries]]></category>
		<category><![CDATA[querying the deep web]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=48</guid>
		<description><![CDATA[When conducting a Deep Web search, one of the common problems is understanding and optimizing queries.  Most deep web search engines are federated which means the query is passed simultaneously to multiple engines. If the federated search provider has done their job, then the query should seamlessly match to the logic of the native interface. [...]]]></description>
			<content:encoded><![CDATA[<p>When conducting a Deep Web search, one of the common problems is understanding and optimizing queries.  Most deep web search engines are federated, which means the query is passed simultaneously to multiple engines. If the federated search provider has done their job, then the query should seamlessly match the logic of the native interface. At <a href="http://www.deepwebtech.com">Deep Web</a> Technologies we develop our connectors to mimic the capabilities of the native interface. However, connectors cannot exceed the capacity of the native interface in a federated solution. Very often, because we have made the federated search so seamless to the user, they make assumptions about the capabilities of the sources being searched. So here are some simple rules to remember when doing a federated search.</p>
<p>1. Not all sources are created equal. Your complex query might run well on one source, but 15 ORs strung together with 4 NOTs isn&#8217;t going to run well on most search interfaces.  (That&#8217;s a real-life example, btw.) So start the query simply and then use a refine search mechanism or our own smart clustering engine to really narrow your query.<br />
2. Start queries simply. Our software provides a robust clustering engine which will enable you to sort through the results and categorize them.<br />
3. Remember this is the deep web, which means the sources are more technical and have far more detail. You are far less likely to encounter keyword spam in the deep web (unlike the surface web which is infested with it). This means highly technical uncommon keywords do well and the sources provide far better information. Here&#8217;s an example search <a href="http://www.google.com/search?hl=en&amp;safe=off&amp;rlz=1B3GGGL_enUS228US228&amp;ei=4IeISYboCpKWsQOl_4SYBg&amp;sa=X&amp;oi=spell&amp;resnum=1&amp;ct=result&amp;cd=1&amp;q=fibromyalgia&amp;spell=1">fibromyalgia</a> on Google. Conduct the same search at <a href="http://www.mednar.com">Mednar</a>. Here&#8217;s a link to <a href="http://mednar.com/mednar/resultList.html?ssid=71033a90%3A11f05989022%3A-4992&amp;searchUrl=search.html%3Fget%3Dtrue%26fullRecord%3Dfibromyalgia%26c%3DMED-FDAD%26c%3DTOXNETTOXLINE%26c%3DMED-NLIBM%26c%3DMED-NIDCR%26c%3DMED-ADA%26c%3DCDER%26c%3DMED-NIDCD%26c%3DMED-NHGRI%26c%3DHEL-NSTAN%26c%3DMED-NCCAM%26c%3DMEDLINEPLUS-2%26c%3DMED-JEFIC%26c%3DNCI%26c%3DMED-ODIS%26c%3DHEL-WHO%26c%3DMEDNEJM%26c%3DHEL-COCLIB%26c%3DMED-NIMH%26c%3DNGC%26c%3DHEL-NHELST%26c%3DHEL-PILOT%26c%3DHEL-CNTRW%26c%3DMDCONSULT%26c%3DMED-DHHS%26c%3DMED-SAMHS%26c%3DPESTICIDES%26c%3DTOXNETHSDB%26c%3DMED-NEI%26c%3DMED-NINDS%26c%3DMED-NIDDKD%26c%3DMED-NIEHS%26c%3DMED-NHLBI%26c%3DHEL-WCTDIS%26c%3DMED-NIAA%26c%3DCANCER-THERAPY%26c%3DPUBMED%26c%3DMED-NIBIB%26c%3DMED-NCRR%26c%3DMED-EKNIH%26c%3DAAAS-EUREKALERT%26c%3DMED-NINR%26c%3DHEL-AAG%26c%3DMED-NIAID%26c%3DCLINICALTRIALS%26c%3DHEL-IMPRO%26c%3DMED-NIGMS%26c%3DHEL-AIDS%26c%3DGOOGLESCHOLAR&amp;numberMarked=0&amp;resultPane=0">fibromyalgia</a> on Mednar. See the depth of difference in a Deep Web search?<br />
4. Fielded search is your friend. Deep web sources almost always support fielded search. This is a key advantage of the deep web over the surface web. This means you can search multiple fields and expect relevant, reasonable responses.</p>
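<p>As a rough illustration of the points above about connectors and fielded search, here is a minimal Python sketch of how one fielded query might be translated for several sources. Everything in it is an assumption made for the example: the source names, field codes and query templates are hypothetical and do not describe our connectors or any real interface.</p>

```python
# Hypothetical source descriptions: which fields each native interface
# supports and how it spells a fielded term. None of these are real APIs.
SOURCES = {
    "source_a": {"fields": {"title": "ti", "author": "au"},
                 "template": "{field}:{value}"},
    "source_b": {"fields": {"title": "TITLE"},
                 "template": "{value}[{field}]"},
}

def translate(query, source):
    """Turn a {field: value} query into one source's native query string.

    Fields the source does not support fall back to plain keywords, because
    a connector cannot do more than the native interface allows.
    """
    parts = []
    for field, value in query.items():
        native = source["fields"].get(field)
        if native is None:
            parts.append(value)  # unsupported field: plain keyword search
        else:
            parts.append(source["template"].format(field=native, value=value))
    return " ".join(parts)

query = {"title": "fibromyalgia", "author": "smith"}
native_queries = {name: translate(query, src) for name, src in SOURCES.items()}
```

<p>Here <code>source_a</code> receives both fielded terms, while <code>source_b</code>, which has no author field in this made-up description, gets the author as a plain keyword. That is one reason the same query can behave differently from source to source.</p>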
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/deep-web-search-how-to-optimize-queries/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

