Tuesday, September 28, 2010

Are we there yet?

Is anybody home?
It's been three weeks since we announced that the Esri Geoportal extension would be released as open source. It may have seemed a bit quiet since, but rest assured, a lot has happened.

In my previous post I indicated we were looking at a Creative Commons-esque license model. Several people pointed out that this license is not recommended for source code. And yes, we did see the FAQ on that topic.

And so started a quest for an appropriate license model that would give everyone what they need: developers, implementers, and (sorry folks!) Esri. If only there were a geek channel on TV, this would have made a great 'America's next top (license) model' show.

Along the way we revisited a great resource collected by the state of Massachusetts IT Division. It’s a bit dated, but the basics still apply.

While Esri has experience with open source from a licensing-in perspective, taking the Geoportal extension open source puts us in a new role for the first time: licensing out Esri software under an open source license.


From the 50+ (!) models discussed in that spreadsheet, it would have been great fun to use the Motosoto or Sleepycat license models, just because of their names. But no, we didn't select either.

Geoportal Extension will be released under the Apache 2.0 license!

Are we there yet?

Almost. We now have the task of moving source code, documentation, and such to a public source repository and finding a proper way to integrate/link with the Esri websites (resource centers and such). But with this big step made, we're getting close.

Tuesday, September 7, 2010

Geoportal Extension to Become Open Source

Despite the code documentation and samples included on the Geoportal Resource Center, implementers of the Geoportal Extension have continued asking for source code access to support integration with content management systems, map viewers, desktop, etc.

Esri listened to these requests and I am happy to be able to announce that:

the Geoportal Extension will enter a next phase in its evolution and become a Free and Open Source solution from Esri.

Seven years ago we created what was then called the GIS Portal Toolkit as a software and services solution, based on code from a number of earlier projects and prototypes for data discovery and map viewing. 

Since starting with GIS Portal Toolkit, those using it to create geoportals and clearinghouses have had access to its source code. That is understandable, as people wanted to create websites that reflected their organization's identity.

At version 9.3 this resulted in Geoportal becoming a fully supported extension with a full maintenance program. We put a lot of effort into making Geoportal Extension configurable to a large extent with respect to authentication, metadata profile support, the index used for discovery, localization, skin development, and more.

The Geoportal Extension will be released under one of the variants of the Creative Commons open source license and will include elements like:
  • Geoportal Web application
  • OGC CSW 2.0.2 catalog service with OGCCORE and ISO Application Profile support
  • INSPIRE compliant discovery service
  • Extensible FGDC, ISO, DC metadata support
  • Configurable search engine including spatial ranking algorithm
  • Federated searches to standards-based (CS-W), Web 2.0 (OpenSearch), or other types of search providers (CMS, Document Management Systems, …)
  • CSW clients (.NET + Java)
  • Data Discovery Widget for Flex
  • Data Discovery Widget for Silverlight
  • Data Discovery Widget for HTML
  • Ontology Service (java webapp)
  • WMC clients (.NET)
  • Publishing client (.NET)
We have some administration to do, but check back in later this month and next as we get the code out there. I'm looking forward to working with a larger community to further develop this product.

Monday, July 26, 2010

Data.gov Adds Geoviewer

Today, Data.gov added a new capability to its growing arsenal of tools that allow for using the data the website makes accessible. The so-called Data.gov GEO Viewer has some interesting capabilities:
  • Data loaded into the viewer in real time through web URLs – the viewer downloads data directly from the authoritative source. An ArcGIS Server Geoprocessing service uncompresses data if needed (.zip, .gz, .tar), transforms data to JSON, and streams this back to the Flex viewer.
  • The GEO Viewer loads data in Web Mercator (if data or service supports it). Otherwise the GEO Viewer changes its basemap projection to Geographic Coordinate System and loads the data.
  • The viewer supports the following data types:
    • Map Services: OGC WMS, ArcGIS
    • Feeds: GeoRSS
    • Files: KML/KMZ, Shapefile
  • The GEO Viewer allows for mashing up multiple datasets, map services, and feeds in one view. It supports basic navigation using the keyboard (without the need to use the "shift+alt+F7+drag the mouse+release alt and mouse button at the same time"-like features...).
  • Set a basic color for the added data layer, set transparency for the layers, and use a swipe/see-through feature.
  • Basic identify operation on the added data.
  • Switching the basemap.
There are some limitations with this viewer, most of which are due to the fact that it downloads data from the source every time someone wants to see it:
  • File size limit of 10MB – Shapefiles and KML files can have large compression ratios. While the registered file in Data.gov may be an under 10MB KMZ file, this can easily expand into a 100MB KML that then is streamed as JSON features to the client. This simply takes time.
  • The information about the files is not enough to make an upfront assessment of whether the file is viewable or not. Almost every file in Data.gov is a .zip file. The GEO Viewer cannot determine whether it's dealing with an Esri Shapefile, OGC KML, Arc/Info Export (e00, remember these?), Microsoft Excel, CSV, or some other format until after it downloads the file. Neither the raw data catalog nor the geodata catalog includes this information in its metadata. As a result, users will sometimes not be notified that the file type is not supported until after the viewer is launched.
  • Registration of content is not readily usable by an application (James Fee found one of these...). There are several registrations of content that link to web pages or web applications, rather than the actual data. In this case, however, the content is also available as an ArcGIS Server Map Service (although that's not in the registration in Data.gov).
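
To make the second limitation a bit more concrete, here is a minimal Python sketch of the kind of check that can only happen once a .zip has been downloaded: look inside the archive, guess the format from the member names, and add up the uncompressed size. This is not how the GEO Viewer itself is implemented. The extension-to-format table, the function names, and the idea of applying the 10MB threshold to the uncompressed size are all assumptions made for illustration.

```python
import io
import zipfile

# Hypothetical mapping from file extension to format label (illustration only).
KNOWN_FORMATS = {
    ".shp": "Esri Shapefile",
    ".kml": "OGC KML",
    ".e00": "Arc/Info Export",
    ".xls": "Microsoft Excel",
    ".csv": "CSV",
}

# Assumption: apply a 10MB threshold to the uncompressed content.
MAX_UNCOMPRESSED_BYTES = 10 * 1024 * 1024


def inspect_archive(data: bytes) -> dict:
    """Report the formats found in a .zip and its total uncompressed size."""
    formats = set()
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            total += info.file_size  # uncompressed size from the zip directory
            name = info.filename.lower()
            for extension, label in KNOWN_FORMATS.items():
                if name.endswith(extension):
                    formats.add(label)
    return {
        "formats": sorted(formats),
        "uncompressed_bytes": total,
        "too_large": total > MAX_UNCOMPRESSED_BYTES,
    }


def make_sample_zip() -> bytes:
    """Build a tiny in-memory archive so the sketch runs without a download."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("smelters.shp", b"\x00" * 100)
        archive.writestr("smelters.dbf", b"\x00" * 50)
        archive.writestr("readme.txt", b"sample")
    return buffer.getvalue()


report = inspect_archive(make_sample_zip())
print(report)
```

If the catalog metadata carried the format and the uncompressed size, a check like this could run before the viewer is launched rather than after.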
Are we there yet?

No. This GEO Viewer is not the end point. It's another step towards allowing users to interact and understand the data discoverable through Data.gov. The viewer illustrates the need to include more map services in Data.gov. Even registering map services alone may not be enough. In this world of service architectures, platforms, and such, we become more and more dependent on each other. Offering a service means signing up to a responsibility to keep this service running and available for an extended period of time.


Oh! And the proof of the pudding is in the eating! Here are some samples:
  1. My most favorite dataset on Data.gov: the locations and characteristics of world copper smelters.
  2. For the patient folks: Active Mines and Mineral Plants in the US
  3. My next house location: Geophysical Surveys of Bear Lake, Utah-Idaho
  4. The dataset James was looking for: USGS Oil and Gas Assessment Database

Give it a whirl and provide your feedback to Data.gov.

Monday, July 5, 2010

Announcing the ArcGIS Editor for OpenStreetMap

For a while, ArcGIS users have been able to use the OpenStreetMap (OSM) content as a basemap in ArcGIS Desktop or in web applications thanks to a republishing of this content through ArcGIS Online. After the Haiti and Chile earthquakes, we received many requests from users of ArcGIS who want to contribute to OSM, but who prefer to use the editing capabilities of ArcGIS Desktop.

For users of ArcGIS 10 this is now possible using the new free add-on ArcGIS Editor for OpenStreetMap.

The ArcGIS Editor for OpenStreetMap is designed to help ArcGIS Desktop users become active members of the growing community of users building an open and freely available database of geographic data.

The provided tools allow the user to download data from the OSM servers and store it locally in a geodatabase. The user can then use the advanced editing environment of ArcGIS Desktop 10 to create, to modify, and to delete data. Once the edits are complete, the edit changes can be posted back to the OSM server and become available to all OSM users.

The interaction with the OSM server is accomplished using a set of geoprocessing tools to download, to manage, and to upload data.

A total of six tools support a workflow that resembles disconnected editing: download data from OSM, edit locally, and upload the result back to OSM.

OSM has a very flexible data model. However, for more focused data capture activities, such as those that occurred after the Haiti and Chile earthquakes, a more focused data model approach is suggested to support some consistency in the created feature types. To use the new ArcGIS 10 template feature, we have mapped the common tags used in OSM to attributes and feature types, created templates for these, and implemented suggested symbols.

Editing is straightforward. After downloading your work area from OSM, you use the normal ArcGIS Desktop editing features. There are some things to keep in mind in this first release:
  • Only simple and single part geometries are supported.
  • You cannot create features with more than 2000 nodes. The OpenStreetMap server has a limit of accepting geometries with up to 2000 nodes.
  • Deleting a point, line, or polygon can have an effect of changing the relation in which the feature participates.
  • Editing of OSM relations or super-relations directly is not supported in this first release.
  • Polygons generated from data downloaded from OSM may be corrupt. To be safe: run the repair geometry tool before starting to edit.
As with any multi-user editing environment, you may run into a situation where multiple users edit the same area. This results in conflicts when trying to upload your edits to OSM. To mitigate the conflict, the ArcGIS Editor for OpenStreetMap offers a simple Conflict Editor to help resolve the situation. Best practice is to edit a relatively small area and to save back to OSM frequently.

We are releasing this first version of the ArcGIS Editor for OpenStreetMap and are looking for feedback. More details on the tool will become available over time, including access to the source code and enhanced documentation.

Enjoy!

Tuesday, June 8, 2010

Can I Have One NSDI with Some Confusion on the Side Please?

In this age of publish first, then filter, and instant gratification, it is easy to lose sight of some of the real questions. The merging of Data.gov and Geodata.gov (yes, that is the plan) raises some questions that have gotten lost in the excitement of the last week.

Here are a couple observations on the subject that could be made by anyone who has been following the two sites over the past year(s):
  1. Geodata.gov harvests most of its content from over 300 other catalogs (visit the Geodata.gov Statistics tab and view the information on Partner Collections). Data.gov does not have this capability. These catalogs represent federal, state, and local government, academia, NGO, and commercial providers of geospatial resources (visit the same tab on Geodata.gov and view the information on Publisher Affiliations). Data.gov on the other hand focuses on content from the Executive Branch of the Federal Government. Where would the remaining content of Geodata.gov go? http://www.otherdata.gov?
  2. Geodata.gov focuses on FGDC+ISO metadata with the industry looking at migrating to the new North American Profile of ISO 191xx metadata. Data.gov has developed its own metadata specification and vocabulary that is quite different from this. Just look at a details page on Data.gov to confirm this. What is the position on this subject of FGDC and other federal agencies who have created standards-based metadata for many years?
  3. Geodata.gov has focused on the GIS analysts and first responders (check the original Statement of Work, I'm sure it's online somewhere). Data.gov seems to focus on a different audience (although honestly it's not entirely clear to me if that audience consists of developers or the general public. It’s a bit of both).
  4. Geodata.gov has supported a number of user communities in two ways:
    • by allowing them to create community pages with resources beyond structured metadata that are of interest to those communities. The content in these pages is managed by the communities themselves. How should Data.gov support these communities of interest?
    • by supporting community-oriented collections that group metadata from multiple source catalogs. Examples are RAMONA (the states’ GIS inventory), the Oceans and Coast Working Group (interested in all content in the US coastal zone), and Data.gov (actually, this is also configured as a collection in geodata.gov). These collections are exposed on the Geodata.gov Search tab and in the CS-W and REST interfaces to the catalog. Where would these collections end up after a merger of Geodata.gov and Data.gov?
    • Geodata.gov has created a Marketplace where those who are looking for data and those who have plans to acquire data can discover each other and collaborate. A dating service of a different kind. While not specifically targeted at the masses, isn't one of the key principles of NSDI to collaborate to reduce redundant investments?
  5. Geodata.gov has created a search widget that has been implemented by several agencies such as the State of Delaware that enables searching geodata.gov directly from the website and thus getting access to state and other geospatial resources covering the area of the state. This widget can mean significant cost savings for agencies as they don't have to create their own clearinghouses. Will Data.gov provide such a role as well?
  6. Through FGDC CAP grants several tools were built that work against the Geodata.gov REST or CSW interfaces. I mentioned some of these capabilities and the links to these tools in my recent blog post. Merging Geodata.gov and data.gov would ideally not break these investments.
It would be nice to see the passion that was expressed over the last week be repeated, but now discussing some of these and other questions that affect the geospatial community at large.

Sunday, June 6, 2010

Building your own ArcGIS.com client

ArcGIS.com provides a great collection of resources and, as Jack explains below, allows other people to discover the work ESRI users are doing.



ArcGIS.com includes a cool website, but as we learned when developing the Geoportal Extension, it also provides a RESTful interface. This meant we could offer users of the Geoportal Extension access to the information others are sharing through ArcGIS.com.

In the Geoportal Extension we allow distributed searches to go to ArcGIS.com. We implemented this early on in our contribution to the Group on Earth Observations.

Realizing that many organizations aren't waiting for yet another portal, we developed a simple mechanism to integrate a search widget into any web page that would allow searching Geoportals. This has resulted in an HTML widget that can be embedded with 2 simple lines of HTML. By default this widget searches the Geoportal it is part of. But hold on, there's more!

The Geoportal can search external catalogs, including ones that implement the Open Geospatial Consortium (OGC) Catalog Service for the Web (CS-W), but since 9.3.1 it can also search... ArcGIS.com! Try it at the GEO Portal by going to the search page and selecting ArcGIS.com from the 'search in' dialog. You'll notice it searches ArcGIS.com with the keywords you give. This means any Geoportal 9.3.1+ is a client to ArcGIS.com.

But back to the widget.

Directing the searches from the widget to ArcGIS.com is possible by adding a parameter that instructs the Geoportal to direct the searches to the identified remote site. And thus here is a widget that searches ArcGIS.com. All it took was a minimal HTML page like this:

<html>
<body>
<p>Search widget for ArcGIS.com </p>
<script type="text/javascript"  
src="http://serverapi.arcgisonline.com/jsapi/arcgis/?v=1.3" ></script>
<script type="text/javascript" 
src="http://geoss.esri.com/geoportal/widgets/searchjs.jsp?rid=ArcGISOnline" >
</script>
</body>
</html> 
 
Using these lines you could embed ArcGIS.com searches in your own web page. Using this approach, you could build your own ArcGIS.com client. Take a look at databasin.org for a more sophisticated example.

To learn more about the options for the widget, visit the Geoportal Extension help pages.

PS: At version 10, the Geoportal will support federating searches to more than one remote catalog and also include catalogs of non-structured metadata or even non-spatial content, such as Wikipedia, Flickr, YouTube, or your document management system. Try out the weekly release of Geoportal 10 at our public sandbox. Let me know if you find any issues. We are wrapping up development, but we're open to your feedback.

Friday, June 4, 2010

Accessing the Data.gov catalog through an open interface

In its first year, Data.gov has grown from 47 datasets to over 270,000 datasets. These datasets aren’t actually hosted at Data.gov. The government agencies making these datasets available host the files (or web services) and share them with the community through Data.gov. But how did these datasets become discoverable at Data.gov?

Actually, the datasets are registered with Geodata.gov, a national catalog of geospatial resources that has been around for some 7 years and that “serves as a public gateway for improving access to geospatial information and data under the Geospatial One-Stop E-Government initiative”.

Geodata.gov provides access to almost 400,000 geospatial resources from over 300 partner collections from federal, state, and local government, as well as academia and commercial providers. Rather than having to sift through as many web sites, users can go to Geodata.gov and perform searches there. Creators of the geospatial resources can register this content with Geodata.gov if they choose to do so.  From its inception Geodata.gov has aimed to be inclusive in the sense that it doesn’t matter what geospatial technology you use to create or consume geospatial data (or web services) in order to use Geodata.gov or its content.

This design principle of being open and interoperable applies not only to the content but to the site itself as well. Since its launch Geodata.gov has provided a search interface following the Open Geospatial Consortium (OGC) Catalog Service for the Web (CS-W) specification. Later geodata.gov added a RESTful interface that returns search results as GeoRSS, KML, HTML, and GeoJSON. These interfaces are intended to support using the content registered with Geodata.gov without using the website.

The RESTful interface has been used by the Carbon Project to develop a desktop widget that allows for content discovery on Geodata.gov directly on your Windows desktop, as well as by developers who have extended tools like NASA’s World Wind. ESRI has developed clients for ESRI’s ArcGIS Desktop and Explorer that use the CS-W interface to provide its users with data discovery capabilities. All these are free tools intended to help bring the content registered in Geodata.gov to the users.

So what does this have to do with Data.gov? Well, when Data.gov was in search for content (pun intended), it was just common sense to reuse the effort already put in a catalog of geospatial content: Geodata.gov. Since June 2009, Data.gov has been using the CS-W interface provided by Geodata.gov.

Federal agencies can mark the content they have registered with Geodata.gov for sharing with Data.gov. It is this subset that is discoverable in the Geodata Catalog on Data.gov. You can search this subset using the interfaces mentioned before, which allows you to build your own discovery clients for the content available in the Geodata Catalog of Data.gov and to include spatial searching, advanced filtering, and other features that are not (yet) available at Data.gov itself.

How? In the RESTful interface, simply adding the parameter isPartOf=data.gov will filter Geodata.gov for content that has been marked for sharing with Data.gov. A request for orthoimagery that is discoverable through the Geodata Catalog in Data.gov thus becomes:

http://geo.data.gov/geoportal/rest/find/document?isPartOf=data.gov&searchText=orthoimagery&f=html
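
If you would rather build that request in code than by hand, a minimal Python sketch could look like the one below. The endpoint and the three parameters (isPartOf, searchText, f) come straight from the URL above; the function name is made up for this example, and any value of f other than html is an assumption you should check against the API documentation.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in this post.
REST_ENDPOINT = "http://geo.data.gov/geoportal/rest/find/document"


def build_search_url(search_text: str, output_format: str = "html") -> str:
    """Build a REST query restricted to content shared with Data.gov."""
    params = {
        "isPartOf": "data.gov",      # filter: only content marked for Data.gov
        "searchText": search_text,   # free-text search terms
        "f": output_format,          # response format; only "html" is confirmed above
    }
    # urlencode takes care of escaping spaces and special characters.
    return f"{REST_ENDPOINT}?{urlencode(params)}"


url = build_search_url("orthoimagery")
print(url)
```

Passing the parameters through urlencode rather than concatenating strings keeps search terms with spaces or ampersands from breaking the query.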

Doing this in the CS-W interface means creating an OGC CS-W request like this:

<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" xmlns:ows="http://www.opengis.net/ows" version="2.0.2" service="csw" xmlns:dc="http://purl.org/dc/elements/1.1/" resultType="results"> 
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>summary</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
        <ogc:And>

          <ogc:PropertyIsLike wildCard="%" escape="" singleChar="">
            <ogc:PropertyName>AnyText</ogc:PropertyName>
            <ogc:Literal>isPartOf:data.gov</ogc:Literal>
          </ogc:PropertyIsLike>

          <ogc:PropertyIsLike wildCard="%" escape="" singleChar="">
            <ogc:PropertyName>AnyText</ogc:PropertyName>
            <ogc:Literal>orthoimagery</ogc:Literal>
          </ogc:PropertyIsLike>

        </ogc:And>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>
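
For those who prefer to generate the request rather than type it, here is a small Python sketch that assembles a compact version of the same GetRecords body for any search term and checks that the result is well-formed XML. It only builds the document; it does not send it, since the address of the CS-W endpoint is not given here. The function and constant names are invented for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
OGC_NS = "http://www.opengis.net/ogc"

# One PropertyIsLike clause, with the same attributes as the request above.
LIKE_TEMPLATE = (
    '<ogc:PropertyIsLike wildCard="%" escape="" singleChar="">'
    "<ogc:PropertyName>AnyText</ogc:PropertyName>"
    "<ogc:Literal>{literal}</ogc:Literal>"
    "</ogc:PropertyIsLike>"
)


def build_get_records(search_text: str) -> str:
    """Return a GetRecords body filtered to Data.gov content and a search term."""
    terms = ["isPartOf:data.gov", search_text]
    # escape() protects the XML from characters such as < and & in user input.
    clauses = "".join(LIKE_TEMPLATE.format(literal=escape(term)) for term in terms)
    return (
        f'<csw:GetRecords xmlns:csw="{CSW_NS}" xmlns:ogc="{OGC_NS}" '
        'version="2.0.2" service="csw" resultType="results">'
        '<csw:Query typeNames="csw:Record">'
        "<csw:ElementSetName>summary</csw:ElementSetName>"
        '<csw:Constraint version="1.1.0">'
        f"<ogc:Filter><ogc:And>{clauses}</ogc:And></ogc:Filter>"
        "</csw:Constraint></csw:Query></csw:GetRecords>"
    )


body = build_get_records("orthoimagery")
root = ET.fromstring(body)  # raises ParseError if the XML is malformed
literals = [element.text for element in root.iter(f"{{{OGC_NS}}}Literal")]
print(literals)
```

The resulting string would be POSTed to the catalog's CS-W endpoint with an XML content type.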


More details on these interfaces to use the content of Geodata.gov and Data.gov’s Geodata Catalog are available in the API Documentation.

Whether you want to use the RESTful interface or prefer the CS-W + XML approach, the content in Data.gov and Geodata.gov is yours to discover. Use that content to make a nice map or two. Please don’t use it to plan your strategy to take over the world.

Friday, May 28, 2010

Close Encounters of the Semantic Kind

Explorers are we intrepid and bold
It was bound to happen. Some time ago I got curious about the whole semantic web thing. Working on the geoportal extension at ESRI, we're looking for ways to improve connecting users with producers of geospatial resources. With the advent of systems of systems (although sometimes feeling like turtles all the way down), assuming that a single catalog will do the trick is not an option. So I embarked on a journey into the world of linked data, RDF, and all the fun that comes with that.


Out in the world amongst wonders untold
A couple of months ago, I got invited to participate and present in a workshop at WMO about information access enablers. Tim Berners-Lee had suggested that the organizer look into the RDF model as a way to allow linking data across organizations.

Equipped with a wit, a map, and a snack
So after seeing data.gov experimenting with SPARQL I felt it was time to do some experimenting myself. I got some content from data.gov through the REST interface provided by geodata.gov (all 270,000+ geospatial datasets in data.gov actually are registered in geodata.gov and data.gov reuses this content through a web service. How Gov 2.0 is that!), downloaded Joseki, generated a Turtle file of the catalog, and had my own SPARQL server up and running. All while flying from Amsterdam to DC on my way from WMO to the Gov 2.0 Expo.

We're searching for fun, and we're on the right track
At Gov 2.0 I got a unique chance to sit down with TBL and discuss some of our work. You just don't pass on an opportunity like that! INFORMATION.ZIP. Later that day TBL met with Jack and it suffices to say that SPARQs flew through the room (pun intended). How to model spatial relations in RDF? How to handle relations that aren't explicitly expressed but are determined on-the-fly as a result of some question? What does 'nearby' actually mean?

Like any meeting with your professor at college, you leave said meeting with more work than you entered... I loaded various W3C documents, RFCs, and more prior to boarding the airplane for California. We're just starting to learn the possibilities of RDF and SPARQL. Providing a text box for someone to fill out an obscure query is not enough. But there already are some good examples available, such as the site This We Know.

to be continued...  


PS: rdf:about="http://www.blogger.com/post-create.g?blogID=1395973636692118549"

Sunday, April 18, 2010

foaf.rdf

<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:admin="http://webns.net/mvcb/">
<foaf:PersonalProfileDocument rdf:about="">
  <foaf:maker rdf:resource="#me"/>
  <foaf:primaryTopic rdf:resource="#me"/>
  <admin:generatorAgent rdf:resource="http://www.ldodds.com/foaf/foaf-a-matic"/>
  <admin:errorReportsTo rdf:resource="mailto:leigh@ldodds.com"/>
</foaf:PersonalProfileDocument>
<foaf:Person rdf:ID="me">
<foaf:name>Marten Hogeweg</foaf:name>
<foaf:title>Mr</foaf:title>
<foaf:givenname>Marten</foaf:givenname>
<foaf:family_name>Hogeweg</foaf:family_name>
<foaf:mbox_sha1sum>e67968237c1c331dd97df17ea80c345ba994548d</foaf:mbox_sha1sum>
<foaf:homepage rdf:resource="http://martenhogeweg.blogspot.com"/>
<foaf:depiction rdf:resource="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPX1K7Kshyphenhyphen343PJHhnT-vlE9WPpeGtq1MFWTJiu40k4XWu7cO4AAD6Ik6HABk0vgwKurGgXVZJsA1eXKc_j891oAkyXTeqH9yOSSb5Nwwda-G8P2fqW_tjRF1bJnWofynz-nxUOdgo0Tea/s220/marten.jpg"/>
<foaf:phone rdf:resource="tel:(909)-793-2853"/>
<foaf:workplaceHomepage rdf:resource="http://www.esri.com"/>
<foaf:workInfoHomepage rdf:resource="http://www.esri.com/geoportal"/>
<foaf:schoolHomepage rdf:resource="http://www.vu.nl/en/index.asp"/></foaf:Person>
</rdf:RDF>

Sunday, January 10, 2010

What type of resource are you?

As developers of discovery and access tools, we run into the situation where we discover a resource in a remote catalog and have to understand the specific type of that resource so that our clients can work with it. This is not a geo-specific problem. No one who has opened a link that ended in .pdf has ever been surprised that the resource on the other end of the URL was opened in Adobe Reader. Somehow my system recognized this to be a document and saw I have a client installed that can work with that document. This was not because of the .pdf extension, but thanks to the fact that the PDF came to me with a MIME type (maintained by IANA).

I found an old blog post on the OGC website where someone asked about dropping OGC-specific MIME types in WMS 1.3.0. The question has remained unanswered on the blog since May 10, 2005...

I propose that defining MIME types for various geospatial resources (from file storage types like ESRI Shapefile to web services like OGC WMS) will benefit users of these geospatial resources and developers alike. Here I'm not speaking of the data/images obtained from these services, but (in the OGC case) of recognizing the service endpoint as such.
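
To illustrate what that could buy a discovery client, here is a small Python sketch that picks a handler from the declared MIME type instead of from the file extension. The registry and function names are invented for this example. application/pdf and the KML type are registered with IANA; the WMS entry is a made-up placeholder, precisely because no such type for service endpoints exists to point at.

```python
# Hypothetical registry mapping MIME types to the client that can open them.
HANDLERS = {
    "application/pdf": "document viewer",
    "application/vnd.google-earth.kml+xml": "globe viewer",
    "application/x-example-wms": "map client",  # placeholder, not a registered type
}


def media_type(content_type_header: str) -> str:
    """Strip parameters such as charset and normalize the case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def pick_handler(url: str, content_type_header: str) -> str:
    """Choose a client from the declared MIME type; the URL is deliberately ignored."""
    return HANDLERS.get(media_type(content_type_header), "unknown")


# The extension-less URL still opens in the document viewer.
first = pick_handler("http://example.com/report", "Application/PDF; charset=binary")
# A misleading .pdf extension does not override the declared type.
second = pick_handler("http://example.com/map.pdf", "application/x-example-wms")
print(first, second)
```

With agreed MIME types for service endpoints, a catalog client would need only a lookup like this to decide whether it can work with a discovered resource.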