Indexing is a well-understood and broadly applied approach to data integration. Google search is probably the best-known example. Independent of any underlying data structure, indexing services like Google “ingest” text and make the contents available for very fast search, with links back to the underlying source systems. Scoring systems and relevancy metrics make indexing an ever more powerful tool for returning the most relevant search results within seconds.
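To make the mechanism concrete, here is a minimal sketch of an inverted index in Python. It ingests text from several sources, maps each word to the links of the sources that contain it, and ranks matches with a simple term-count score. The source links, the sample texts, and the function names (`tokenize`, `build_index`, `search`) are hypothetical, and the scoring is a deliberately naive assumption for illustration; production search engines use far more elaborate relevancy metrics.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Ingest documents and map each token to {source_link: occurrence count}."""
    index = defaultdict(dict)
    for link, text in documents.items():
        for token, count in Counter(tokenize(text)).items():
            index[token][link] = count
    return index


def search(index, query):
    """Return source links ranked by how often the query tokens occur in them."""
    scores = Counter()
    for token in tokenize(query):
        for link, count in index.get(token, {}).items():
            scores[link] += count  # naive relevancy score (assumption)
    return [link for link, _ in scores.most_common()]


# Hypothetical source systems and contents, for illustration only.
documents = {
    "crm://accounts/17": "Assay results for compound A: binding assay passed",
    "lims://samples/42": "Compound B failed the solubility assay",
    "wiki://notes/3": "Meeting notes on cloud migration",
}

index = build_index(documents)
results = search(index, "binding assay")
print(results)  # ['crm://accounts/17', 'lims://samples/42']
```

Note that the index stores only tokens and links, not the structure of the source data. Lookups stay fast for that reason, and every value in the index is treated as text.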
However, indexing technologies run afoul of certain fundamental challenges. Most indexing systems underperform or even fail, either because they can’t handle the quantitative precision that attribute search requires or because ‘joining’ is cumbersome to implement and substantially diminishes the benefits of the entire approach in terms of both scalability and effort. Another fundamental challenge is the extensibility of search. As companies seek to transition to cloud computing, this gap becomes more important, as it is typically not cost-effective, or even technically possible, to transition these scientific search capabilities directly into the cloud.