Most search engines use spider/crawler-type programs to find information, following trails of hyperlinks across the ever-expanding Web. But this approach leaves an almost infinite amount of data, lying below the surface, unexplored. Deep Web search start-up companies are trying to develop programs that analyze search terms and then broker each query to the relevant databases.
Google's strategy sends a program to analyze each database, "define" its content, and then hit it with related search terms to develop a predictive model of what the database contains. Another start-up company is attempting to index every public database, hitting each one with automated search terms to "dislodge" the information. The goal is interconnected data... a cross-referencing of pre-analyzed information to best answer a specific query. It's almost as if these Deep Web programs are alive, reasoning, and thinking for themselves.
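To make the mechanics a little more concrete, here is a minimal sketch (in Python) of how the two ideas above might fit together: probe a database with automated search terms to build a rough profile of what it holds, then broker an incoming query to the databases whose profiles match best. The database names, probe terms, and function names are my own illustrative assumptions, not anything described in the article.

```python
# A minimal sketch, not the article's actual system: profile databases with
# probe queries, then broker an incoming query to the best-matching ones.
# All names, probe terms, and fake databases here are illustrative assumptions.

from collections import Counter


def probe_database(run_query, probe_terms):
    """Hit a database with automated probe terms and record how many results
    each term returns, producing a crude predictive model of its contents."""
    profile = Counter()
    for term in probe_terms:
        hits = run_query(term)      # run_query stands in for the database's
        profile[term] = len(hits)   # own search interface
    return profile


def broker(query_terms, profiles):
    """Score each profiled database against the query terms and return the
    databases most likely to hold relevant records, best match first."""
    scores = {
        name: sum(profile.get(term, 0) for term in query_terms)
        for name, profile in profiles.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Fake search interfaces standing in for real public databases.
    flight_db = lambda term: ["row"] * {"fare": 40, "airline": 25}.get(term, 0)
    health_db = lambda term: ["row"] * {"drug": 30, "trial": 20}.get(term, 0)

    probes = ["fare", "airline", "drug", "trial"]
    profiles = {
        "flights": probe_database(flight_db, probes),
        "health": probe_database(health_db, probes),
    }

    # A fare-related query gets brokered to the flight database first.
    print(broker(["fare"], profiles))   # ['flights', 'health']
```

Real systems would obviously deal with messy query interfaces, rate limits, and stale profiles, which is part of why the questions below about accuracy and freshness matter.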
The article mentions that, in the future, Google may have problems implementing such a "change." There is a fear of overcomplicating the search experience and driving away faithful users. I also wonder whether all of this will actually make things more efficient. On paper it sounds very promising, but how does a program sift through an effectively infinite amount of information to find, link, and cross-reference reliable and accurate results? How does it link seemingly unrelated information? How does it stay up to date? If we are going to answer questions about things like flight fares, that information changes minute by minute.
Finally, the article mentions that the long-term implications of something like this are directed more toward businesses than toward individual web surfers. But I think it is also very important for libraries. The article mentions health sites cross-referencing pharmaceutical companies and medical research, or news sites cross-referencing public records in government databases. This could have a large impact on the types of information that libraries can provide.