The International Nuclear Information System (INIS), operated by the International Atomic Energy Agency (IAEA) in Vienna, Austria, houses the INIS Collection, which offers access to over 3.6 million bibliographic records and over 350,000 full-text documents on topics such as Radioactive Waste Management, Nuclear Safety and the Fukushima and Chernobyl accidents. This set of non-conventional literature is available directly through the INIS online search, and is also included in the WorldWideScience.org federated search portal.
The WorldWideScience.org portal, governed by the WorldWideScience Alliance, is a global gateway to scientific databases and portals. The portal accelerates information discovery through a one-stop search of databases around the world, including the INIS Collection.
Cooperation between INIS and WorldWideScience.org (WWS) dates back to June 2009, when the first contacts were made during the Summer Public Conference, Managing Data for Science, organized by the International Council for Scientific and Technical Information (ICSTI) in Ottawa, Canada. Following the conference, a WWS Alliance meeting was organized where inclusion of the INIS database as a new information resource in WWS was discussed.
In 2012, WorldWideScience.org was the top referrer to the INIS Collection, with search numbers predicted to grow.
Current statistics of searches coming to the INIS Collection Search from WWS are impressive. There were almost 70 000 unique searches during the first 7 months of 2015 and this number is constantly increasing. The first 7 months of 2015 alone generated as many searches as the whole of 2014. By the end of 2015, the number of searches coming from WWS to INIS is expected to reach 100 000.
Deep Web Technologies proudly provides the federated search technology that powers the WorldWideScience.org portal. In June, DWT upgraded the WorldWideScience.org portal to include a mobile, responsive design and localization of page text. Other small improvements ensure that WorldWideScience will continue to advance information discovery for its researchers and Alliance members.
We are now a Microsoft Translator Alliance Partner. DWT was invited to join the brand new Microsoft Translator Partner program which is designed to create opportunities for businesses to deliver best-in-class translation solutions.
DWT uses the Microsoft Translator API to create multilingual search solutions for global customers. These solutions tap into the available Microsoft Translator languages and automatic translation capability so researchers can query multilingual databases in their own language, and then translate results from the sources back into their query language. Behind the scenes, DWT and Microsoft Translator shift the query from the user’s language into the language of the information sources, search the sources for the most relevant results, retrieve, aggregate and rank the results, and then translate the results back into the user’s selected language. It’s a seamless, powerful process that melts walls between researchers of different languages by allowing information discovery and flow without the need for personal translators.
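The flow described above (translate the query, search each source, aggregate and rank, translate the results back) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `translate` function is a toy phrase table standing in for a machine translation service, and `Source`, `Result`, `multilingual_search` and the sample data are hypothetical names invented for this sketch, not part of Explorit Everywhere! or the Microsoft Translator API.

```python
from typing import Callable, List, NamedTuple

FRENCH_TITLE = "Progrès de l'énergie solaire"

# Toy phrase table standing in for a machine translation service (an assumption
# for this sketch; a real system would call a translation API instead).
PHRASES = {
    ("en", "fr", "solar energy"): "énergie solaire",
    ("fr", "en", FRENCH_TITLE): "Advances in solar energy",
}


class Result(NamedTuple):
    title: str
    language: str
    score: float


class Source(NamedTuple):
    name: str
    language: str
    search: Callable[[str], List[Result]]


def translate(text: str, source_lang: str, target_lang: str) -> str:
    """Look the text up in the toy phrase table; fall back to the input."""
    if source_lang == target_lang:
        return text
    return PHRASES.get((source_lang, target_lang, text), text)


def multilingual_search(query: str, user_lang: str, sources: List[Source]) -> List[Result]:
    merged: List[Result] = []
    for source in sources:
        # 1. Shift the query into the language of the source.
        native_query = translate(query, user_lang, source.language)
        # 2. Search the source and collect whatever it returns.
        merged.extend(source.search(native_query))
    # 3. Aggregate and rank the results from all sources together.
    merged.sort(key=lambda r: r.score, reverse=True)
    # 4. Translate the results back into the user's language.
    return [
        Result(translate(r.title, r.language, user_lang), user_lang, r.score)
        for r in merged
    ]


# Two hypothetical sources, one English and one French.
def english_source(q: str) -> List[Result]:
    return [Result("Solar energy storage review", "en", 0.7)] if q == "solar energy" else []


def french_source(q: str) -> List[Result]:
    return [Result(FRENCH_TITLE, "fr", 0.9)] if q == "énergie solaire" else []


SOURCES = [Source("EN", "en", english_source), Source("FR", "fr", french_source)]
results = multilingual_search("solar energy", "en", SOURCES)
for r in results:
    print(f"{r.score:.1f}  {r.title}")
```

In this toy run the French source only matches because the query was translated first, and its result is shown to the user in English alongside the English-language hit.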
This unique, multilingual search feature originally launched with WorldWideScience.org in 2010. WorldWideScience.org is a global science gateway comprised of national and international scientific databases and portals. With the DWT-Microsoft Translator partnership, users can search in ten different languages and translate results from sources in multiple languages back into their selected search language.
In early 2015, Microsoft sponsored a “Customer Story” page entitled “WorldWideScience Alliance and Deep Web Technologies”, revealing the history, technology and benefits of the WorldWideScience.org multilingual solution. In April, DWT followed the customer story with a guest article in Multilingual Magazine, expanding on the nature of multilingual applications, the development and benefits of multilingual searching for government agencies and global companies.
Deep Web Technologies is the technology behind other notable public multilingual search portals such as Science.Ciencia.gov and UNECA’s ASKIA.org. Each of these custom portals uses DWT’s Explorit Everywhere! federated search technology and the Microsoft Translator API to search and translate.
DWT’s partnership with Microsoft Translator is vital to the steady pulse of our multilingual applications. We believe that each multilingual application we build furthers the cross-pollination of global ideas and scientific advancement. Share our vision: In the not-too-distant future, language barriers for science researchers and global business searchers will no longer exist, and informed researchers will access critical information in multilingual databases with one search.
Marcy Phelps recently posted her “Favorite Alternatives to Google” on her blog, Phelps Research. Her list of four top Google alternatives included DuckDuckGo, TinEye, WolframAlpha and Biznar.
Biznar – When I need to cut through the clutter and focus on high-quality sources, I head to this deep web business research portal. Right now they draw results from about 100 authoritative sources, and they’re in the process of expanding that list.
Of course, we were pleasantly surprised to hear about this since Biznar is one of four free public portals created and maintained by DWT.
In a separate post on the Somerville Public Library Blog, “Searching the Deep Web”, the post author, Kevin, relays his surprising results from a broad Biznar search:
Another database described in this article was BizNar, a business research database created by Deep Web Technologies. I did a search for “cookies” (just because) and got hits for (among many other things) dessert blogs, a CNN report on the black market for cookies in Hong Kong (yes, really), a pay-to-view analysis of the world market for cookies, and a bid from the Bureau of Prisons stating that the penitentiary in Marion, IL is looking to purchase 10,000 cookies, 25,532 bagels and/or muffins and/or cinnamon rolls, and 64,440 shelf-stable flour tortillas. There were also (of course) lots of articles on the virtual cookies that marketing companies leave on your computer.
DWT is in the process of adding and updating sources in Biznar. If you have suggestions on how to make the portal more useful, new sources to add, or category changes, please let us know!
Spend Matters’ July 29, 2015 post, Accenture and the Future of Procurement Technology: The Virtual Company Mall, discusses one of five focus areas for advancement in e-procurement technologies mentioned in the Accenture Report: Procurement’s Next Frontier – The Future Will Give Rise to an Organization of One. According to Accenture, the five focus areas are the Virtual category room, the Virtual supplier room, the Supplier network, the Virtual company mall, and Supply analytics, and they are required for future procurement success. These solutions will unite: “cross-functional teams, internal stakeholders, business users, other buying companies, logistics providers, unchartered suppliers, transactional suppliers, strategic suppliers and innovators.”
Accenture describes the Virtual Company Mall like this:
Owned and managed by procurement, the Company Mall will feature a cloud-based set of pre-approved private and public ‘shops’ (i.e., including content from outside the company) from which internal customers can select goods and services, supported with business logic that guides their purchasing based on policies, preferred suppliers and contracts. The mall will include a robust service desk that directs customers to the right shops and provides spot-buy services as required, as well as virtual agents delivering consistent and automated buying support. These services will be enabled by a mix of digital disruptors, including cognitive systems through intelligence augmentation and, where possible, intelligent automation.
What will it take to make the Virtual Company Mall a reality? You guessed it: Federated Search. Spend Matters states: “It will require a strong e-procurement capability inclusive of powerful federated search, including internal catalog, supplier hosted catalogs, traditional web storefronts and dynamic storefronts.”
The existing Explorit Everywhere! federated search capabilities could easily be applied to future procurement technologies and the Virtual Company Mall vision. Integrating federated search technologies like Explorit Everywhere! will smooth the transition from current procurement practices to the single search, Virtual Mall. Consider:
- Searching disparate sources, such as internal catalogs, supplier catalogs, and multiple storefronts in real-time for up-to-date prices, descriptions and codes.
- Aggregating results for top relevant products and services to maximize efficiency.
- Flagging duplicate products and services for easy vendor/product comparison.
- Narrowing results to small selections to save to the supply chain or to purchase products or services outright.
- Retrieving information about products or services from obscure companies or little known vendors ensuring comprehensive source research, comparison and vendor discovery.
- Integrating best practices for search and retrieval of products, as well as e-commerce technologies, thereby creating a seamless experience for the procurement team.
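To make the first few points in the list above concrete, here is a minimal Python sketch of a single search across several catalogs that aggregates the hits and flags the same product offered by more than one vendor. The `Offer` type, the catalog names, the product codes and the prices are all invented for illustration; this is not how Explorit Everywhere! or any particular procurement system is implemented.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Offer(NamedTuple):
    source: str
    product_code: str
    description: str
    price: float


def federated_product_search(term: str, catalogs: Dict[str, List[Offer]]) -> Dict[str, List[Offer]]:
    """Search every catalog and group matching offers by product code."""
    needle = term.lower()
    grouped: Dict[str, List[Offer]] = defaultdict(list)
    for offers in catalogs.values():
        for offer in offers:
            if needle in offer.description.lower():
                grouped[offer.product_code].append(offer)
    # Within each product, list the cheapest offer first for easy comparison.
    return {code: sorted(group, key=lambda o: o.price) for code, group in grouped.items()}


def duplicates(grouped: Dict[str, List[Offer]]) -> Dict[str, List[Offer]]:
    """Keep only products that more than one source offers."""
    return {code: group for code, group in grouped.items() if len(group) > 1}


# Hypothetical sample data: an internal catalog, a supplier catalog, a storefront.
CATALOGS = {
    "internal": [Offer("internal", "P-100", "Nitrile gloves, box of 100", 8.50)],
    "supplier_a": [
        Offer("supplier_a", "P-100", "Nitrile gloves (100 ct)", 7.95),
        Offer("supplier_a", "P-200", "Safety goggles", 12.00),
    ],
    "storefront": [Offer("storefront", "P-300", "Leather work gloves", 15.25)],
}

grouped = federated_product_search("gloves", CATALOGS)
for code, offers in duplicates(grouped).items():
    print(code, [(o.source, o.price) for o in offers])
```

Here the search for "gloves" returns two products, and the one carried by both the internal catalog and the supplier is flagged with its offers ordered by price.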
Federated search is Deep Web Technologies’ specialty. We are the experts in single search technology. Aggressive adopters of new procurement technologies should contact Deep Web Technologies to discuss partnership opportunities.
Explorit Everywhere! supports all kinds of fancy search logic, such as Boolean operators, nested parentheses and advanced search fields. This is a big selling point for organizations with researchers who want to search for a precise bit of information, retrieve a small result set, and then narrow that set of results even further with filters and sorts.
Serious researchers know where their information might be hiding, how to search for their information, and what search string may bump their information out of the source and into their lap. Using Explorit Everywhere! can save these researchers time and effort; one search across all of the resources they need to examine takes just a quick click of the search button.
Researchers familiar with a wide range of resources also know that some database search engines are stuck in the stone age. These engines simply do not process advanced logic such as parentheses, wildcards, many advanced search fields or even Boolean operators. When researchers search these directly, they must use basic search strings to retrieve any results at all. For these engines, searching with broad queries, and iteratively searching and reviewing results, is just part of the package. In contrast, a few modern search engines can handle extremely complex queries, replete with parentheses, quotations, Booleans and wildcards, handing the user a golden platter of relevant results.
Explorit Everywhere! usually includes both types of sources, with search engines supporting the very simple to the most complex queries. When a user submits a search string, Explorit Everywhere! must first evaluate the query. The resource connector, a bit of code that submits the query string from Explorit Everywhere! to the source, is programmed to “know” the source parameters and limitations. The connector acts as a proxy for the researcher by submitting the query to the resource, then retrieving the results for Explorit Everywhere! to rank against results from other sources. In many cases, the query string is submitted to the source exactly as it was entered by the user. In other cases, however, the query must be reshaped to accommodate the source’s idiosyncrasies. And, although Explorit Everywhere! connectors are very good at understanding how a source works and optimizing the query to submit to the source, there is only so much they can do when faced with a complex query and a Neanderthal source. It’s like trying to fit a square peg into a round hole.
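As a rough illustration of what "reshaping" a query might involve, the Python sketch below strips out the syntax a given source cannot handle. It is an assumption-laden toy, not the actual connector logic: the `Capabilities` flags and the `reshape_query` function are hypothetical names, and the flattening is deliberately lossy (for example, it leaves NOT untouched, since dropping it would invert the meaning of the query).

```python
import re
from typing import NamedTuple


class Capabilities(NamedTuple):
    """What a (hypothetical) source search engine can parse."""
    booleans: bool
    parentheses: bool
    wildcards: bool
    phrases: bool


def reshape_query(query: str, caps: Capabilities) -> str:
    """Remove syntax the source does not support; pass the rest through."""
    if not caps.parentheses:
        query = re.sub(r"[()]", " ", query)
    if not caps.wildcards:
        query = re.sub(r"[*?]", "", query)
    if not caps.phrases:
        query = query.replace('"', "")
    if not caps.booleans:
        # Only AND/OR are flattened here; NOT would need special handling.
        query = re.sub(r"\b(AND|OR)\b", " ", query)
    # Collapse any runs of whitespace left behind by the removals.
    return " ".join(query.split())


MODERN = Capabilities(booleans=True, parentheses=True, wildcards=True, phrases=True)
BASIC = Capabilities(booleans=False, parentheses=False, wildcards=False, phrases=False)

user_query = '(solar OR wind) AND "energy storage" AND batter*'
print(reshape_query(user_query, MODERN))
print(reshape_query(user_query, BASIC))
```

A capable source receives the query exactly as entered, while the basic source receives only the bare keywords. The second form loses the grouping and the wildcard, which is why a complex query sent to a primitive source may return few or no results.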
To ensure consistent results retrieval, complex queries may not be the best place to start a search. If a complex search through Explorit Everywhere! yields results from only a few sources, researchers should consider rephrasing a query into simpler terms. To ferret information out of selected sources, including the more primitive sources, start with a simpler search to retrieve a broad results set from most or all of the selected resources.
Once the results are retrieved, researchers can view the whole playing field – all the results, across all of their resources. Starting from this vantage point, the informed researcher can narrow that field with laser precision, using filters, sorts, tabs and clustering, or iterate their search based on their initial findings.
Of course, users can still search with complex Boolean strings, taping together a montage of brackets and parentheses with a patchwork of wildcards and quotations. Explorit Everywhere! will dutifully perform a search and no doubt return results, often with great success. For those elusive results, consider broadening the search query, and narrowing the results set after all sources have returned their results successfully. While this process may seem backward and less efficient initially, it ultimately delves deeper into those entrenched databases containing pertinent information.
Deep Web Technologies’ fearless leader, Abe Lederman, will travel from New Mexico, USA to Coventry, UK to attend the first ever Resource Discovery Tools for Health Libraries meeting on September 11. The growing use of discovery tools in the healthcare sector prompts discussion about suitable technologies and the nature of search for health librarians.
The event is hosted by the University Hospitals of Coventry and Warwickshire NHS Trust. Founded in 1948, the NHS includes four individual systems:
The NHS aims to provide a wide range of free health services in response to the needs and requirements of the population.
This event is free to all health librarians. DWT, as one of the event sponsors, will have an opportunity to present how Explorit Everywhere! applications further search and discovery in the healthcare industry.
University Hospital, Coventry, United Kingdom
Friday, 11 September 2015 from 10:00 to 16:00 (BST)
If you plan on attending the conference and would like to meet with Abe, please let us know as soon as possible. His schedule is quickly filling up!
As much as we like to think that Explorit Everywhere! is simple to use, it still holds the junior heavyweight championship title for feature-rich technologies. Throw on top of that the concepts of “federated search”, “Deep Web”, and “discovery services”, and it’s easy to get lost in a maze of information. Since Deep Web Technologies is all about pulling the needle out of the haystack for you, we thought it was time to create an easy reference post on the world of Explorit Everywhere!. Want to know where you can find out about how we rank results? How about federated search or the Deep Web? Take a gander through some of these posts:
Explorit Everywhere! Features
The Deep Web
You may also enjoy reading this post from the Federated Search Blog enumerating informational posts about federated search and discovery services. A couple of key posts include:
WorldWideScience.org has received a tremendous amount of press so far in 2015. On January 8th, Microsoft published a case study on WorldWideScience.org and Deep Web Technologies:
“WorldWideScience.org is the result of years of research and innovation. Although the underlying technology itself is exciting, Deep Web Technologies and the WorldWideScience Alliance are most interested in what it enables for users. “This solution increases access to worldwide information, which is the biggest benefit,” explains Johnson. “We search approximately 100 repositories that we estimate include more than 500 million pages of science and technology information. So instead of having to go to 100 different sources to find content, WorldWideScience.org using Microsoft Translator offers the ability to search all of them with a single query.”
Then, the April/May issue of Multilingual.com Magazine published an article entitled, “Advancing Science by Overcoming Language Barriers.” The article discussed the rise of WorldWideScience.org and its role in bridging language barriers using Microsoft’s machine translation.
In late June, Deep Web Technologies updated WorldWideScience.org, just in time for the WorldWideScience Alliance meeting in Germany. Responsive design is now an integral part of the application making it much easier to add new features now and in the future. The spotlight enhancements include:
- Mobility: WorldWideScience.org is now mobile and can now be accessed from any device. When a user goes to the application on a mobile device, the interface will automatically adjust to their screen size, making it easier to search and view results.
- Localization: While WorldWideScience.org has been a multilingual application for years, allowing users to translate results into their language of choice, now, when a user chooses English, Spanish, French or Portuguese, WorldWideScience.org will automatically update the interface text to the selected language too.
There are a host of other small improvements to WorldWideScience.org. This upgrade is setting the stage for future enhancements such as MyLibrary, the ability to save results for future reference, and additional language localizations. Take a look from your smartphone or tablet and let us know what you think!
WorldWideScience.org isn’t the only application recently updated. Science.gov received a facelift recently as well.
People tend to think of Google as the authority in search. Increasingly, we hear people use “google” as a verb, as in, “I’ll just google that.” General users, students and even professional researchers are using Google more and more for their queries, both mundane and scholarly, perpetuating the Google myth: If you can’t find it on Google, it probably doesn’t exist. Google’s ease of use, fast response time and simple interface give users exactly what they need…or do they?
Teachers say that 94% of their students equate “Research” with “Google”. (Search Engine Land)
“Another concern is the accuracy and trustworthiness of content that ranks well in Google and other search engines. Only 40 percent of teachers say their students are good at assessing the quality and accuracy of information they find via online research. And as for the teachers themselves, only five percent say ‘all/almost all’ of the information they find via search engines is trustworthy — far less than the 28 percent of all adults who say the same.”
Do teachers have a point here? Is it possible that information found via search engines is less than trustworthy, and if so, where do teachers and other serious researchers need to go to find quality information? Deep Web Technologies did a little research of our own to see just how results on Google vs. popular Explorit Everywhere! search engines differ in the quality of science sources.
How Google Works
Google, and other popular search engines such as Bing and Yahoo, search the surface web for information. The surface web, as opposed to the Deep Web, consists of public websites that are open to crawlers to read the website’s information and store it in a giant database called an index. When a user searches for information, they are actually searching the index of information, not the website itself. The results that are returned are the ones that people seemed to like in the past, or most popular results for the query. That’s right…the most popular…not necessarily the most relevant information or quality resources.
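The idea of searching an index rather than the websites themselves can be shown with a tiny inverted index in Python. This is a deliberately simplified sketch: the page texts, URLs and popularity numbers are invented, and ranking purely by a popularity score is a stand-in for the many signals a real search engine combines.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """'Crawl' each page once and record which pages contain each word."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index: Dict[str, Set[str]], popularity: Dict[str, int], query: str) -> List[str]:
    """Answer a query from the index alone, most popular page first."""
    words = query.lower().split()
    if not words:
        return []
    # A page matches only if it contains every word in the query.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches, key=lambda url: popularity.get(url, 0), reverse=True)


# Hypothetical crawled pages and popularity scores.
PAGES = {
    "a.example": "climate change opinion piece",
    "b.example": "climate change peer reviewed study",
    "c.example": "cookie recipes",
}
POPULARITY = {"a.example": 900, "b.example": 40, "c.example": 300}

INDEX = build_index(PAGES)
print(search(INDEX, POPULARITY, "climate change"))
```

Both climate pages match the query, but the opinion piece is listed ahead of the peer-reviewed study simply because its popularity score is higher.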
We should probably also mention those sneaky ads at the top of the page that look informative, but can be quite deceptive. A JAMA article states this about medical search ads:
“Many of the ads, the researchers noted, are very informational — with ‘graphs, diagrams, statistics and physician testimonials’ — and therefore not identifiable to patients as promotional material.
This kind of ‘incomplete and imbalanced information’ is particularly dangerous, they note, because of its deceptively professional appearance: ‘Although consumers who are bombarded by television commercials may be aware that they are viewing an advertisement, hospital websites often have the appearance of an education portal.'”
Researchers thinking that Google reads their mind and magically returns the right information on the first page of results should think again. The #1 position on a Google results page gets 33% of the traffic, so is a highly sought-after spot on a Google page. Unfortunately, with SEO tricks inflating page-rank on Google and ads vying for top spot, that number one result, or even the top page of results, may not be entirely germane or even contain much scholarly content. But those results rank high because they’ve worked the Google system.
So, a search performed on Google may return educational results, but the source itself may be unreliable, pure opinion or even company marketing as in the example above. For those needing credible information from recognized, authoritative sources, Google results just don’t cut it. For example, searching for the term “Climate Change” and organizing the top 25 results into categories – Opinions, News, Government, Ads, Wiki Sources, Peer Reviewed and Education – we find that the two biggest categories are News and Opinions. This doesn’t support Google as an authoritative source of information for scientific research.
Where are Quality Science Sources?
Scholarly researchers may need some publicly available information, but more often than not they need information that is not publicly available and therefore cannot be reached through Google. Much of what they look for is in password-protected repositories, subscription databases, or part of an organization’s internal collection of information. These sources of information are not available to Google’s crawlers, so they are not available through Google. Databases and sources of information like these are part of what is known as the Deep Web. The Deep Web contains 95% of the information on the Internet, such as scientific reports, medical records, academic information, subscription information and multilingual databases. You can read more about the Deep Web here.
How is a Deep Web Search Better than Google for Scholars?
For scholars needing to go deeper into their research, Deep Web databases often contain key information and current data unavailable through Google.
Deep Web sources must be searched through specialized search engines, like Explorit Everywhere! by Deep Web Technologies. Explorit Everywhere! combines all of the Deep Web resources, making them available to search from a single search box, kind of like Google. But there are no gimmicks, SEO tactics to push results higher up on the page, or sly ranking systems that websites can use to maneuver themselves into the number one position. It’s a simple matter of good sources and good results, aggregated and ranked so the best results are at the top. Don’t worry about wading through ads or junky opinions; if you’re searching through Explorit Everywhere!, you are searching high quality, relevant sources.
Explorit Everywhere! outperforms Google by eliminating the clutter and providing dependable, scholarly sources of current information to the user. Time and again, Explorit Everywhere! has proven itself to find the needle in the haystack for serious researchers.
Do Your Own Comparison – Google vs. Science.gov
Most Deep Web search engines are, well, Deep. They aren’t freely available because the sources themselves are private or only available to registered users. Most academic libraries subscribe to premium sources of information, for example, and those databases are considered part of the Deep Web since they aren’t available to search through Google. And, while some reputable sources of information that once existed only on the Deep Web, such as PubMed and NASA, are now publicly available through Google, these sources tend to get buried amidst other results so they aren’t always easy to find. Many libraries feature these authoritative databases in guides, links or in search portals like Explorit Everywhere! simply to highlight the source rather than forcing users to wade through irrelevant results.
There are a few publicly available search engines where you can test drive a Deep Web search and see the difference for yourself. Science.gov, developed and maintained by the DOE Office of Scientific and Technical Information, uses Explorit Everywhere! to search over 60 databases and over 2200 selected websites from 15 federal agencies. The results are from authoritative, government sources, and extraordinarily relevant. When you perform a search on Science.gov, there is no question about the sources you are searching. Explore the difference!
Whether you are a student or scientist, knowing where to start your science search is very important. In most cases, serious research doesn’t start with Google. A 2014 IDC study shows that only 56% of the time do knowledge workers find the information required to do their jobs. Having the right sources available through an efficient Deep Web search like Explorit Everywhere! is critical to finding significant scientific information and staying ahead of the game.
I recently had a conversation with a VC and he brought up the acronym “SMAC”. SMAC, he explained, stands for Social, Mobile, Analytics and Cloud, and pointed out that these four areas are red-hot with investors now.
In a May 2014 Forbes blog article, Ravi Puri, Senior Vice President, North America Oracle Consulting Services, defined SMAC and wrote: “The convergence of these trends is creating a coming wave of disruption that will let companies drive improved customer satisfaction, sustainable competitive advantage and significant growth in enterprise value—but only if you are ready for it.”
More recently Casey Galligan, Morgan Stanley Wealth Management Market Strategist, advises investors to not shy away from this sector and invest in leading SMAC companies and writes: “We believe that companies levered to these key secular growth areas will continue to be differentiators.”
It is an exciting time to be Deep Web Technologies, as we have been working in a number of these areas for a while now and are poised to make significant contributions to advance the state-of-the-art of all SMAC technology areas directly and through partners in the years ahead. Let me give you some examples:
- Social – At its heart, Explorit Everywhere! connects people to information. That’s one reason that Explorit Everywhere! naturally integrates well with social networking sites. These sites offer rich information to end-users in the form of opinions, rants, new developments, scientific breakthroughs and more. An organization may have a variety of social networks supporting their philosophy and marketing their brand, such as Twitter, Facebook, LinkedIn, Pinterest, and blogs. These social networks are rife with interesting and useful tidbits for marketing folks, researchers, students and other professionals alike. Explorit Everywhere! can search all of these networks for relevant information in five seconds or less. To follow things up, Explorit Everywhere! lets the user share what they’ve found back to their own networks, completing the number one rule of thumb for social networks: share and share alike. Social integration engages users and simplifies searching and posting across multiple networks.
- Mobile – The mobile wave is more than just a fad; it’s the future. As we mentioned in our previous post, Explorit Everywhere! Goes Mobile, when we reach the year 2020 we may see around 50 billion connected devices slinging information around the world. When it comes to mobility, we needed Explorit Everywhere! to be flexible and device-driven, with an ultra-sleek user interface. Advances in mobile technology require that we stay up-to-date, and Explorit Everywhere! accomplishes this through its use of responsive design and vigilance of new devices searching our application.
- Analytics – Explorit Everywhere!’s statistics package has been collecting usage statistics for years now, enabling our clients to maximize the ROI of the content that they license. Deep Web Technologies is an expert at gathering information from multiple sources, aggregating the results and categorizing them into concepts that expand the breadth of a researcher’s information. But even beyond that, Explorit Everywhere! can feed collected, pinpoint information it retrieves into best-of-breed analytical tools and software for further filtering and sifting. Explorit Everywhere! complements big data dashboards by funneling a broad swath of relevant material down the pipe for further analysis. On the front end, Explorit Everywhere! can also enhance what the user sees in the dashboard with complementary information drawn from a variety of sources, both internal and external to an organization.
- Cloud – Enterprise search is moving toward the cloud, and with that comes silos of information lost in the cloud. Explorit Everywhere! performs a real-time search of multiple databases across multiple clouds of information, together with information residing in corporate silos that have not been moved to the cloud. These clouds may be behind a firewall, or outside of the firewall, but often stump indexers due to the nature of the resources. Explorit Everywhere! connects to the databases wherever they are, making the world a much smaller place.
Explorit Everywhere!’s integrated SMAC features create a holistic search experience, ensuring that our clients are at the forefront of technology, and not trailing behind the curve. With the best of this generation and next-generation technology, Explorit Everywhere! clients are part of the changing technology scene. We’re riding not just the mobile wave, but regularly improving connections to social networks, tuning our analytics and simplifying our cloud-based technology. And, the process of finding the most current information will shift as the future unfurls. Explorit Everywhere! will leverage SMAC and other next-generation technologies to embrace new concepts, connect with data wherever it may sit, and engage our users. Explorit Everywhere! is state-of-the-search.