Reflecting on the ESRI Fed UC – Is It JUST About the Data?
I just returned a short time ago from the ESRI Federal User Conference, which, as always, was a great event (Stu Rich and James Fee have some good recaps). Reflecting on the conference, it is clear that the times they are a-changin' and the future is exciting. There was a lot of buzz around deploying solutions or services into the cloud. ESRI demonstrated its ArcGIS 10 capabilities and the seamless integration between the desktop and the web – ArcGIS 10 is cloud ready and is indeed a very ambitious and exciting release. As any developer knows, the best way to get started with a task is through examples, and for ArcGIS 10 ESRI is working to deliver a lot of examples and templates to help users get started. Crowdsourcing, or Volunteered Geographic Information (VGI) as ESRI called it, was also a hot topic at the show. ESRI demonstrated how its tools are making it easy for every citizen to be a sensor.
John Calkins opened his session with the question “Who cares about the new stuff?” He made the point that with ArcGIS 10, users will find the interface easier to use and will be able to complete tasks faster, and hence be more productive. At the end of the day, that is what matters most to users: everyone cares about performance, ease of use, and productivity. As usual, John hit the nail on the head. Kudos also to the District of North Vancouver, whose website was highlighted for its user interface during the plenary.
At the conference we talked with many organizations, and one theme that came up again and again was building services on top of, or integrating data from, disparate datasets.
Whether at the state level, where they are working to combine data from counties, or at the national level, where they are working to combine data from states, the challenge is the same. In order to create these statewide or nationwide services, or a Common Operating Picture, you must first address the underlying integration issues. Whether you are combining all the contributors' data into a single repository, as the State of Indiana has done, or simply creating a common federated view across a set of distributed servers, the problem is the same: to effectively combine data from multiple sources, you must resolve both “data model integration” and “data integration” issues.
What is the difference between “Data Model Integration” and “Data Integration”?
- Data Model Integration is resolving differences between schemas: for example, this attribute becomes that attribute, this table maps to these tables, this dataset is in projection X and needs to be in projection Y, or this data is stored topologically while that data is in simple polygonal form. The point is that, in general, data model integration concerns the structure of the data, not the health of the data itself.
- Data Integration is what happens when you bring the datasets themselves together. At boundaries where data sources meet, you often find so-called “tile issues” where data is missing, does not line up, or overlaps where it shouldn't. Another common data integration example is redundant data from different sources that must be combined, taking the best each has to offer while avoiding duplicates in the result. (A small sketch after this list illustrates both kinds of issues.)
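To make the distinction concrete, here is a minimal Python sketch – not FME, and not any ESRI API – that combines hypothetical parcel records from two counties. The field names, the EPSG codes, the parcel_id dedup key, and the use of the pyproj library are all assumptions for illustration. The schema and projection mapping in harmonize() stands in for the “data model integration” step, while the duplicate-dropping in merge() stands in for a (very simplified) “data integration” step.

```python
from pyproj import Transformer  # assumed available; any reprojection library would do

# --- Data model integration: map one schema onto another and reproject ---
FIELD_MAP = {"PARCEL_NO": "parcel_id", "OWNER_NM": "owner"}  # hypothetical source -> target names
to_wgs84 = Transformer.from_crs("EPSG:26916", "EPSG:4326", always_xy=True)  # NAD83 / UTM 16N -> WGS84

def harmonize(record: dict) -> dict:
    """Rename attributes to the target schema and reproject the geometry."""
    out = {FIELD_MAP.get(k, k): v for k, v in record.items() if k not in ("x", "y")}
    out["lon"], out["lat"] = to_wgs84.transform(record["x"], record["y"])
    return out

# --- Data integration: merge two contributors, dropping redundant features ---
def merge(county_a: list, county_b: list) -> list:
    """Combine records from two sources, keeping the first copy of each parcel_id."""
    seen, merged = set(), []
    for rec in map(harmonize, county_a + county_b):
        if rec["parcel_id"] not in seen:  # a parcel reported by both counties is kept only once
            seen.add(rec["parcel_id"])
            merged.append(rec)
    return merged

county_a = [{"PARCEL_NO": "A-101", "OWNER_NM": "Smith", "x": 500000.0, "y": 5000000.0}]
county_b = [{"PARCEL_NO": "A-101", "OWNER_NM": "Smith", "x": 500000.0, "y": 5000000.0}]
print(merge(county_a, county_b))  # one record survives; the boundary duplicate is dropped
```

Real-world cases are of course far messier – edge matching, topology repair, one-to-many table mappings – but these two steps are the shape of the problem.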
To effectively construct services that span multiple data jurisdictions, both “data model integration” and “data integration” must be addressed. As we work to take FME forward, we are constantly looking at both of these challenges in an effort to solve these integration headaches.
After All, It’s All About the Data… er, the Data Model!