“Just-in-Time” Spatial Data Integration (or, Bridging Data Silos with Visualization)
If you work with spatial data, odds are good that it’s spread across multiple applications or data stores in a way that isn’t easy to integrate. Often, there is clear value in breaking down the barriers between these “silos” (individually useful but isolated mapping solutions), since doing so can enable new insights or capabilities and reduce the cost of maintaining redundant data.
I explored two approaches in a previous post, but a recent webcast (with Directions Media, Iowa DOT, and Google) opened my eyes to a new possibility: when synchronizing or centralizing data is too costly, instead use lightweight visualization tools that can pull data from multiple sources on the fly.
In my earlier post, I proposed two high-level strategies: (Option A) store all your data in one place (e.g., a database) and make all your applications work with that, or (Option B) leave existing applications with their separate data stores, but then implement processes to push their data to a central repository (e.g., a data warehouse) and publish that. Spatial Data Infrastructures are good large-scale examples of Option B, where local governments gather and manage local data and then publish it to a national or multinational repository in a common data model.
I found two interesting aspects to the webcast. First, Iowa DOT is aggressively pursuing Option A. They store the bulk of their data in an Oracle Spatial database and have adapted a wide variety of GIS and CAD applications to work with it. Other applications access the data directly via SQL or indirectly through various web services. In spite of this, due to legacy applications and partnerships with other states, they still have data silos.
Second – and this is where it gets really interesting – they presented an approach where multiple data sources are fused together at the visualization stage. The idea is that it is easier to query multiple data sources (e.g., using web services) from the visualization layer than to keep composite or centralized data stores in sync. Merging data this late in the game likely doesn’t make sense in every case (e.g., if complex analysis or transformation is required), but the basic advantage is clear: users get an integrated display of data in one system where they would previously have had to consult two.
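To make the idea of fusing data at the visualization stage a little more concrete, here is a minimal Python sketch. It is not what Iowa DOT presented; every name in it (`fetch_central_db_roads`, `fetch_partner_closures`, `build_map_view`, the source labels, and the sample features) is a hypothetical stand-in of my own. The two fetch functions return hard-coded GeoJSON-style features where a real system would issue a SQL query or a web service request.

```python
from typing import Any, Callable, Dict, List

Feature = Dict[str, Any]


def fetch_central_db_roads() -> List[Feature]:
    # Hypothetical stand-in for a query against a central spatial database.
    return [
        {"type": "Feature",
         "geometry": {"type": "LineString",
                      "coordinates": [[-93.6, 41.6], [-93.5, 41.7]]},
         "properties": {"name": "Route 1"}},
        {"type": "Feature",
         "geometry": {"type": "LineString",
                      "coordinates": [[-93.5, 41.7], [-93.4, 41.8]]},
         "properties": {"name": "Route 2"}},
    ]


def fetch_partner_closures() -> List[Feature]:
    # Hypothetical stand-in for a web service run by a partner agency.
    return [
        {"type": "Feature",
         "geometry": {"type": "Point", "coordinates": [-93.55, 41.65]},
         "properties": {"reason": "construction"}},
    ]


def build_map_view(sources: Dict[str, Callable[[], List[Feature]]]) -> Dict[str, Any]:
    """Query every source at display time and merge the results.

    Nothing is copied into a central store; each feature is simply
    tagged with the name of the source it came from.
    """
    merged: List[Feature] = []
    for source_name, fetch in sources.items():
        for feature in fetch():
            tagged = dict(feature)  # shallow copy, original left untouched
            properties = dict(feature.get("properties", {}))
            properties["source"] = source_name
            tagged["properties"] = properties
            merged.append(tagged)
    return {"type": "FeatureCollection", "features": merged}


# Source labels are illustrative only.
SOURCES = {
    "central_db": fetch_central_db_roads,
    "partner_service": fetch_partner_closures,
}

map_view = build_map_view(SOURCES)
```

The resulting `map_view` is a single feature collection that a map viewer could draw as one display, with the `source` property available for styling or filtering. The trade-off is the one noted above: each source is queried on every refresh, so this suits simple overlay far better than heavy analysis or transformation.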
Do you have spatial data stored in multiple systems or applications? Do you see value in combining this data for visualization or analysis? If so, how are you planning to proceed? Or, if you’ve already fought these battles, do you have any insights to share?
I’m starting to get a sense of the wide diversity of approaches to breaking down barriers between you and your data. Want to hear some of ours? Tune in to the free live stream of our FME 2011 World Tour event this Friday (March 4) or register to attend in person at one of 25+ cities worldwide.