<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Deep Web Technologies Blog &#187; Multilingual Search</title>
	<atom:link href="http://deepwebtechblog.com/category/features/multilingual-search/feed/" rel="self" type="application/rss+xml" />
	<link>http://deepwebtechblog.com</link>
	<description>covering federated search and how to get the best from the Deep Web.</description>
	<lastBuildDate>Tue, 31 Aug 2010 22:31:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Reminiscing on a 12-Year Partnership with OSTI</title>
		<link>http://deepwebtechblog.com/reminiscing-on-a-12-year-partnership-with-osti/</link>
		<comments>http://deepwebtechblog.com/reminiscing-on-a-12-year-partnership-with-osti/#comments</comments>
		<pubDate>Wed, 25 Aug 2010 04:17:10 +0000</pubDate>
		<dc:creator>Abe</dc:creator>
				<category><![CDATA[Features]]></category>
		<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Multilingual Search]]></category>
		<category><![CDATA[View from Inside]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=901</guid>
		<description><![CDATA[This afternoon, I put aside an hour from yet another hectic day to read Dr. Walter Warnick’s article, “Federated Search as a Transformational Technology Enabling Knowledge Discovery: the Role of WorldWideScience.org.” This article by Dr. Warnick&#8211;or Walt to me&#8211;presents a wonderful overview of OSTI’s mission dating all the way back to 1947. OSTI (Department of Energy [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone size-medium wp-image-903" title="Blog Post Pic" src="http://deepwebtechblog.com/wp-content/uploads/2010/08/Blog-Post-Pic-300x139.jpg" alt="" width="300" height="139" /></p>
<p>This afternoon, I put aside an hour from yet another hectic day to read Dr. Walter Warnick’s article, “<a href="http://www.osti.gov/ILDS_38_2Warnick2010.pdf">Federated Search as a Transformational Technology Enabling Knowledge Discovery: the Role of WorldWideScience.org</a>.” This article by Dr. Warnick&#8211;or Walt to me&#8211;presents a wonderful overview of OSTI’s mission dating all the way back to 1947. OSTI (Department of Energy Office of Scientific and Technical Information), originally known as the Technical Information Division, was tasked with collecting and disseminating the wealth of non-classified research from the Manhattan Project.  Having lived in Los Alamos the past 15 years, where development of the atomic bomb took place, I’m very familiar with the history of the Manhattan Project and the reasons behind the creation of OSTI. Nevertheless, I found Walt’s article to be an informative and insightful read that provided a unique insider’s perspective.</p>
<p>Dr. Warnick talks quite a bit about the OSTI corollary, which asserts that accelerating the diffusion of scientific knowledge will accelerate the advancement of science.  In the 12 years that I have known him, it has been Dr. Warnick’s singular goal to do everything in his power to increase the speed of scientific discovery.  I know Walt to be a trailblazer, highly respected among federal government employees for his dedication and leadership at OSTI.  He has made major strides towards making science more accessible to “science-attentive” citizens, researchers and students.</p>
<p>The article focuses on the major role played by OSTI in championing, supporting and adopting federated search, which is the enabling technology for WorldWideScience.org, Science.gov, DOE Science Accelerator and other sites developed and maintained by OSTI. Deep Web Technologies has benefitted greatly from our 12-year partnership with OSTI, which has supported the development of the Explorit federated search technology, motivated us to keep pushing the boundaries of federated search capabilities and been an eager early adopter of our products.</p>
<p>In my next blog article, I will highlight a few of the many accomplishments achieved through our partnership with OSTI, so please stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/reminiscing-on-a-12-year-partnership-with-osti/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Breaking Down the Language Barriers</title>
		<link>http://deepwebtechblog.com/breaking-down-the-language-barriers/</link>
		<comments>http://deepwebtechblog.com/breaking-down-the-language-barriers/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 17:15:34 +0000</pubDate>
		<dc:creator>Abe</dc:creator>
				<category><![CDATA[Features]]></category>
		<category><![CDATA[Multilingual Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=860</guid>
		<description><![CDATA[It was an honor to attend and for my company to have played a key role in the launch of multilingual WorldWideScience.org in Helsinki this past June 11th. Beginning more than three years ago, the R&#38;D effort that ultimately resulted in the launch of our ground-breaking multilingual federated search capability involved plenty of hard work [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_862" class="wp-caption alignnone" style="width: 310px"><a href="http://deepwebtechblog.com/wp-content/uploads/2010/06/MWWS-photo11.jpg"><img class="size-medium wp-image-862  " title="Deep Web Partners in WWS Multilingual Launch" src="http://deepwebtechblog.com/wp-content/uploads/2010/06/MWWS-photo11-300x200.jpg" alt="" width="300" height="200" /></a><p class="wp-caption-text">Photo credit: Jakke Nikkarinen/STT Info Kuva Pictured, from left, Dr. Walter Warnick, U.S. Department of Energy Office of Scientific and Technical Information (OSTI) Director; Yuri Arskiy, All-Russian Institute of Scientific and Technical Information (VINITI) Director; Tony Hey, Microsoft Research Corporate Vice-President; Richard Boulderstone of the British Library and the WorldWideScience Alliance Chairman; and Wu Yishan, Institute of Scientific and Technical Information of China (ISTIC) Chief Engineer.</p></div>
<p>It was an honor to attend and for my company to have played a key role in the launch of multilingual WorldWideScience.org in Helsinki this past June 11<sup>th</sup>. Beginning more than three years ago, the R&amp;D effort that ultimately resulted in the launch of our ground-breaking multilingual federated search capability involved plenty of hard work by lots of folks at Deep Web Technologies. It certainly could not have been accomplished without our invaluable partnerships with the Department of Energy Office of Scientific and Technical Information (OSTI), the WorldWideScience Alliance, and Microsoft Research.<span id="more-860"></span></p>
<p>Multilingual WorldWideScience was launched at a special ceremony culminating the International Council for Scientific and Technical Information’s (ICSTI) Annual Conference titled <em><a href="http://www.vtt.fi/sites/icsti2010/icsti2010_welcome.jsp?lang=en">From Information to Innovation</a></em>, which was attended by several hundred people.</p>
<p>Dr. Walt Warnick, Director of OSTI, gave the <a href="http://worldwidescience.org/speeches/June2010/warnick_multi.html">keynote</a> presentation at the launch. As the driving force behind the creation of Science.gov and WorldWideScience.org, Dr. Warnick is someone with whom I have had the pleasure of working closely in the past decade, and he has continually pushed my company to advance to the next level in state-of-the-art Federated Search.</p>
<p>Dr. Warnick started his presentation with a quote from Sir Isaac Newton (written on Feb 15, 1676 to fellow British Researcher Robert Hooke):</p>
<p style="text-align: center;">“If I have seen further it is only by standing on the shoulders of giants.”</p>
<p>Presenting his vision of accelerating access to worldwide scientific information in order to advance scientific discovery, Dr. Warnick talked about the significant role that multilingual search can play in providing both non-English speakers translated access to research in languages other than their own and English speakers with access to the ever-increasing body of non-English scientific content.</p>
<p>After Dr. Warnick’s opening remarks, I had the opportunity to demo and explain how multilingual WorldWideScience works to the Conference attendees. Rather than go into detail about my demonstration, I’d like to point you to a wonderful review of multilingual WorldWideScience written by <a href="http://intellogist.wordpress.com/2010/06/14/conduct-a-global-literature-search-in-seconds/">Kristin Whitman – Conduct a global literature search in seconds!</a></p>
<p>Next up was Richard Boulderstone, Chairman of the WorldWideScience Alliance and Director of eStrategy at the British Library. He spoke on the growth and significance of WorldWideScience:</p>
<p style="text-align: center;"><em>“Since its launch in 2007 WorldWideScience.org has grown at an absolutely phenomenal rate, providing researchers with easy access to the publicly funded research output of 65 different countries from around the world. Fast becoming a key resource for researchers around the world, these new search and translation tools are absolutely essential to opening up research and enabling the global scientific community to share knowledge in the pursuit of progress.”</em></p>
<p>Tony Hey, Corporate Vice-President of External Research for Microsoft, next talked about the significance of Microsoft’s partnership with the WorldWideScience.org Alliance:</p>
<p style="text-align: center;"><em>“The launch of multilingual WorldWideScience.org adds yet another resource that we can all leverage in support of collaborative relationships. Those relationships, in turn, expedite our ability to drive research that has the power to improve lives around the world. All of us at Microsoft Research look forward to more meaningful contributions to multilingual WorldWideScience.org to make the world’s scientific and technical information globally accessible. It has been an honor to be involved in this groundbreaking project.”</em></p>
<p>Following Tony Hey, Wu Yishan, Chief Engineer of the Institute of Scientific and Technical Information of China (ISTIC), addressed the audience and commented on the volume of scientific literature being published in Chinese and the growing need for multilingual searching:</p>
<p style="text-align: center;"><em>“In 2008, while Chinese scholars published 110,000 papers on international journals recorded by SCI, they also published 470,000 papers on domestic Chinese journals. Without accessing these 470,000 papers, it is impossible to obtain a realistic feeling about the thrust of scientific and technological advancement in China. Therefore, the need for mutual translation between English and Chinese and for cross-language retrieval is increasingly urgent.”</em></p>
<p>The final remarks at this event were made by Yuri Arskiy, Director of the All-Russian Institute of Scientific and Technical Information, who spoke through a translator. The unavailability of Russian scientific research in English was a major impetus for the development of multilingual WorldWideScience. Dr. Arskiy addressed the many challenges in searching multiple databases, such as different specifications and software platforms and various classification systems across scientific disciplines. WorldWideScience.org&#8217;s federated search technology overcomes these challenges and will make Russian science results more accessible than ever before.</p>
<p>Finally, Wu Yishan returned to the stage to announce that ISTIC will be hosting the next ICSTI Annual Conference in Beijing in June 2011. If I’m going to have a reason to attend the ICSTI Conference in Beijing next year, my team at DWT, in collaboration with our partners at OSTI and the WorldWideScience Alliance, had better get busy planning and implementing the next round of enhancements to WorldWideScience.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/breaking-down-the-language-barriers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>No Alternative to Federated Search</title>
		<link>http://deepwebtechblog.com/no-alternative-to-federated-search/</link>
		<comments>http://deepwebtechblog.com/no-alternative-to-federated-search/#comments</comments>
		<pubDate>Thu, 03 Jun 2010 22:59:08 +0000</pubDate>
		<dc:creator>Darcy Pedersen</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Multilingual Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=768</guid>
		<description><![CDATA[[ Editor's note: This article first appeared in the OSTI Blog and then in the Federated Search Blog. Dr. Walt Warnick, Director of the Office of Scientific and Technical Information, part of DOE, and Sol Lederman co-authored the article. For some important search applications there is no alternative to federated.] Discovery services have begun to [...]]]></description>
			<content:encoded><![CDATA[<p>[ Editor's note: This article first appeared in the <a onclick="javascript:urchinTracker('/outbound/www.osti.gov/ostiblog/home/entry/no_alternative_to_federated_search1?ref=/');" href="http://www.osti.gov/ostiblog/home/entry/no_alternative_to_federated_search1">OSTI  Blog</a> and then in the <a href="http://www.federatedsearchblog.com">Federated Search Blog</a>. Dr. Walt Warnick, Director of the <a onclick="javascript:urchinTracker('/outbound/www.osti.gov?ref=/');" href="http://www.osti.gov/">Office  of Scientific and Technical Information</a>, part of DOE, and Sol Lederman  co-authored the article. For some important search applications there is  no alternative to federated.]</p>
<p><img src="http://federatedsearchblog.com/images/OneWay.png" alt="" width="225" height="82" align="right" />Discovery services have begun to  appear in the search landscape.  Discovery services provide access to  documents from publishers with which they have relationships by indexing  the publishers’ metadata and/or full text. Discovery services are  marketed to libraries where patrons appreciate near-instantaneous search  results and where library staff is willing to restrict access to  sources available from the service (and optionally the library’s own  holdings).  While these services tout themselves as improvements to  federated search, the reality is that there is no alternative to  federated search for a number of important applications.</p>
<p><a onclick="javascript:urchinTracker('/outbound/worldwidescience.org?ref=/');" href="http://worldwidescience.org/">WorldWideScience.org</a> is a global gateway to science. The federated search application was  conceived and developed at OSTI and hosted by us. The portal performs  live federated search of 70 databases from 66 countries. Participating  members provide access to their national research databases. For a  number of reasons this important gateway to millions of research  documents does not lend itself to the discovery service model.</p>
<p>WorldWideScience.org content is free to  the public.  Several difficult technical hurdles make it highly  impractical to index content from member databases. The first challenge  is that most databases will not provide a harvesting mechanism such as  OAI-PMH. Without such a mechanism there is no method of predictably  harvesting the entire contents of a database. From OSTI’s perspective,  it is not acceptable to provide access to only a subset of a scientific  collection. Federated search completely avoids this problem by having  the source’s search engine query the entirety of its contents.</p>
<p>The second major challenge is that metadata does not exist for  documents in many of the databases in WorldWideScience.org. Discovery  services rely upon metadata to “homogenize” information about documents  that they place in their unified indexes.</p>
<p>A third challenge is that WorldWideScience.org will soon be  multilingual. While discovery services could pre-translate contents,  doing that would be impractical as the volumes are so huge and  constantly expanding.</p>
<p>A fourth challenge to indexing all of the content from  WorldWideScience.org is that the science portal federates portals which  themselves are federated search applications. These challenges make  indexing and packaging the contents of WorldWideScience.org so  expensive, difficult, and time consuming that no organization is likely  to do it.</p>
<p>The onerous technical hurdles that would need to be overcome to make  content such as that in WorldWideScience.org searchable by a discovery  service illuminate the case for federated search. In the federated  search model, content providers need only provide a search interface to  their database, which they are already providing to their users.  Ideally, the search interface is one that lends itself to machine search  and retrieval. But even if it is not, in most cases, if a human can  search it, a federated search application can be programmed to search it  also. Also, federated search does not expect metadata.  WorldWideScience.org serves its content owners by eliminating all  barriers to participation. Even language translation is not a burden to  the database owners. If the member nations sanction a particular  database then the burden of inclusion of that database is taken on  solely by the vendor that developed and maintains the federated search  engine, <a onclick="javascript:urchinTracker('/outbound/deepwebtech.com?ref=/');" href="http://deepwebtech.com/">Deep  Web Technologies</a>.</p>
<p>Another advantage of federated search is that applications can be  easily integrated with other applications.  For example, <a onclick="javascript:urchinTracker('/outbound/scienceresearch.com?ref=/');" href="http://scienceresearch.com/">ScienceResearch.com</a> provides access to a mix of proprietary and open content, such as  WorldWideScience. Through our federated search approach, the  WorldWideScience.org Alliance maintains autonomy while extending the  reach of its materials. Best of all, we do all of this without burdening  anyone. In this way we advance our mission of accelerating science.</p>
<p>But don’t take us wrong.  We at OSTI would welcome a discovery  service which seeks to make DOE material more accessible.  OSTI systems  are already set up to facilitate such a collaboration.  However, the  technology of discovery services is less suitable for certain important  purposes, like WorldWideScience.org, now fulfilled by federated search.</p>
<p>Walt Warnick<br />
OSTI Director</p>
<p>Sol Lederman<br />
OSTI Consultant</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/no-alternative-to-federated-search/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>NFAIS on Multilingual Search</title>
		<link>http://deepwebtechblog.com/nfais-on-multilingual-search/</link>
		<comments>http://deepwebtechblog.com/nfais-on-multilingual-search/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 09:00:15 +0000</pubDate>
		<dc:creator>Darcy Pedersen</dc:creator>
				<category><![CDATA[Features]]></category>
		<category><![CDATA[Multilingual Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=485</guid>
		<description><![CDATA[Abe Lederman, our company President, presented on Multilingual Search at the NFAIS conference yesterday.  His talk, entitled &#8220;Federated Search: Breaking Down the Language Barrier,&#8221; addressed the development of translation tools on the web and how Deep Web Technologies is helping organizations find and deliver the most relevant results, regardless of language. The presentation [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deepwebtech.com/talks/NFAIS.pdf"><img class="alignright size-medium wp-image-489" title="Breaking Down the  Language Barrier" src="http://deepwebtechblog.com/wp-content/uploads/2010/03/FireShot-Pro-capture-044-NFAIS_pdf-application_pdf-Object-deepwebtech_com_talks_NFAIS_pdf-300x224.png" alt="Breaking Down the Language Barrier" width="300" height="224" /></a>Abe Lederman, our company President, presented on Multilingual Search at the <a href="http://www.nfais.org/">NFAIS</a> conference yesterday.  His talk, entitled &#8220;<a href="http://deepwebtech.com/talks/NFAIS.pdf" target="_blank">Federated Search: Breaking Down the Language Barrier</a>,&#8221; addressed the development of translation tools on the web and how Deep Web Technologies is helping organizations find and deliver the most relevant results, regardless of language. The presentation highlights the WorldWideScience.org translation environment currently in our engineering department; our translation application will be integrated with the current version of <a href="http://www.worldwidescience.org" target="_blank">WorldWideScience.org</a> in June.</p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/nfais-on-multilingual-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Multilingual Search that Google Imagines</title>
		<link>http://deepwebtechblog.com/the-multilingual-search-that-google-imagines/</link>
		<comments>http://deepwebtechblog.com/the-multilingual-search-that-google-imagines/#comments</comments>
		<pubDate>Mon, 28 Dec 2009 16:48:00 +0000</pubDate>
		<dc:creator>Darcy Pedersen</dc:creator>
				<category><![CDATA[Federated Search]]></category>
		<category><![CDATA[Multilingual Search]]></category>

		<guid isPermaLink="false">http://deepwebtechblog.com/?p=392</guid>
		<description><![CDATA[Earlier this year, Deep Web Technologies announced the development of a multilingual translation capability.  Since then, we&#8217;ve been talking with a number of people regarding this groundbreaking service that will be released in 2010.  Don DePalma, Chief Research Officer at Common Sense Advisory, wrote a post on the 17th of December about this feature.  Please [...]]]></description>
			<content:encoded><![CDATA[<p><em> Earlier this year, Deep Web Technologies announced the development of a multilingual translation capability.  Since then, we&#8217;ve been talking with a number of people regarding this groundbreaking service that will be released in 2010.  <a href="http://www.commonsenseadvisory.com/AboutUs/ExecutiveTeam/DonaldADePalma/tabid/124/Default.aspx" target="_blank">Don DePalma</a>, Chief Research Officer at <a href="http://www.commonsenseadvisory.com/" target="_blank">Common Sense Advisory</a>, wrote a post on the 17th of December about this feature.  Please visit his company blog to read the <a href="http://www.globalwatchtower.com/2009/12/17/multilingual-search-deepweb-google/">entire article</a>.<br />
</em></p>
<blockquote><p>Clicking on the Advanced Search tab, we thought of the <a onclick="javascript:urchinTracker ('/outgoing/en.cop15.dk/');" href="http://en.cop15.dk/" target="_blank">U.N. Climate Change Conference</a> as we looked for “global warming.” We limited our search to sources in the Czech Academy of Science, just two of the dozens of scientific sites from around the world in languages ranging from Chinese to Japanese to Russian. The first hit was an English-language paper on “Modeling mortality risks due to heat stress in East Asia” from the Czech Academy,  while the second was “Budeme žít v globálním Somálsku? : O klimatickém konci civilizace a strachu z katastrofy” (”will we live in a global Somalia? The climatic end of civilization and fear of disaster”). Being in a <a onclick="javascript:urchinTracker ('/outgoing/www.imdb.com/title/tt1190080/');" href="http://www.imdb.com/title/tt1190080/" target="_blank">2012</a> on-<a onclick="javascript:urchinTracker ('/outgoing/www.imdb.com/title/tt0898367/');" href="http://www.imdb.com/title/tt0898367/" target="_blank">the-road</a> kind of mood, we also looked for “pandemic” in a broader pool of sites, yielding French, Chinese, and Spanish articles (<a onclick="javascript:urchinTracker ('/outgoing/www.globalwatchtower.com/images/multi-mingual_2009-12-17/Multi-Lingual_2009-12-17.png');" href="http://www.globalwatchtower.com/images/multi-mingual_2009-12-17/Multi-Lingual_2009-12-17.png" target="_blank">click here for a screenshot</a> of our query results).</p>
<p>What happened behind the scenes? Deep Web translated the search terms into the languages of the sources, searched them, and returned some translated details.</p></blockquote>
<p><em><a href="http://www.globalwatchtower.com/2009/12/17/multilingual-search-deepweb-google/" target="_blank">Read the rest&#8230;</a></em></p>
]]></content:encoded>
			<wfw:commentRss>http://deepwebtechblog.com/the-multilingual-search-that-google-imagines/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
