User:Vieglais/SOC GUID

From Phyloinformatics

GUID Test and Monitor Service

Rationale
Globally Unique Identifiers (GUIDs) are a critical component of open data sharing systems because they prevent collisions between data objects. Several systems implement resolvable GUIDs (RGUIDs), which enable a user to retrieve metadata about an object, the object itself, or both over the Internet. Widely used RGUID mechanisms include plain old URLs (URI, http://www.w3.org/Addressing/), persistent URLs (PURL, http://purl.oclc.org/), Digital Object Identifiers (DOI, http://www.doi.org/), and Life Sciences Identifiers (LSID, http://en.wikipedia.org/wiki/LSID). All of these schemes rely on at least one service endpoint that takes an RGUID and returns something useful about it, so all of them are subject to failure if the service(s) cannot be reached or do not function correctly. Likewise, systems that rely on these RGUID services, such as a distributed data archive framework, are subject to the same quality of service issues. An important addition to the universe of RGUID implementations would therefore be a mechanism for testing (e.g. Does the RGUID resolve? Is the service accessible? Is the service responding in a timely manner?) and monitoring (alerting administrators in the event of failure).
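As a rough illustration of the three test questions above, the following Python sketch checks a single URL-style RGUID. It is only a sketch under stated assumptions: the names `CheckResult`, `http_fetch` and `check_rguid`, the 2 second timeliness budget, and the rule that any 2xx or 3xx status counts as "resolved" are all hypothetical choices and not part of this proposal. DOI and LSID resolution would need scheme-specific fetch functions.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    rguid: str
    resolved: bool    # did the identifier resolve to a successful response?
    reachable: bool   # did the service answer at all?
    seconds: float    # wall-clock time taken by the attempt
    timely: bool      # did the answer arrive within the allowed budget?
    detail: str = ""


def http_fetch(url: str, timeout: float) -> int:
    """Return the final HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the service answered, but with an error status


def check_rguid(
    rguid: str,
    fetch: Callable[[str, float], int] = http_fetch,
    timeout: float = 10.0,
    max_seconds: float = 2.0,  # assumed threshold for "timely"
) -> CheckResult:
    """Run one resolve attempt and record the answers to the three questions."""
    start = time.monotonic()
    try:
        status = fetch(rguid, timeout)
    except OSError as err:  # covers URLError, DNS failures and socket timeouts
        elapsed = time.monotonic() - start
        return CheckResult(rguid, False, False, elapsed, False, str(err))
    elapsed = time.monotonic() - start
    return CheckResult(
        rguid=rguid,
        resolved=200 <= status < 400,
        reachable=True,
        seconds=elapsed,
        timely=elapsed <= max_seconds,
        detail=f"HTTP {status}",
    )
```

Passing the fetch function as a parameter keeps the check independent of any one scheme, and lets the check be exercised without network access by substituting a stub.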
Approach
  1. Define the testing and monitoring system requirements, architecture, internal data model, and programmatic and user interfaces.
  2. Identify requirements for a statistically appropriate measure of accessibility for a potentially very large number of individual RGUIDs for the various schemes.
  3. Implement or modify existing code to produce RGUID clients for the common schemes, with each client presenting a common API for integration with the testing and monitoring framework.
  4. Implement a simple, extensible framework for executing tests of all protocols at arbitrarily scheduled times.
  5. Implement the monitoring service, which is able to generate continuous reporting statistics about the availability of an arbitrary number of RGUID service endpoints.
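Steps 2 and 3 above might be approached along the following lines. This Python sketch is not a design decision of the project: it assumes a hypothetical `RGUIDClient` interface with a single `resolve` method, and it uses simple random sampling with a Wilson score interval as one possible "statistically appropriate measure" of accessibility. Other sampling designs, such as stratifying by endpoint, may turn out to be more suitable.

```python
import math
import random
from typing import Optional, Protocol, Sequence, Tuple


class RGUIDClient(Protocol):
    """Hypothetical common API that each scheme-specific client would present."""

    scheme: str

    def resolve(self, rguid: str) -> bool:
        """Return True if the identifier resolved successfully."""
        ...


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def estimate_availability(
    client: RGUIDClient,
    rguids: Sequence[str],
    sample_size: int,
    rng: Optional[random.Random] = None,
) -> Tuple[float, float, float]:
    """Test a random sample of identifiers instead of the full population.

    Returns (observed proportion, interval lower bound, interval upper bound).
    """
    rng = rng or random.Random()
    sample = rng.sample(list(rguids), min(sample_size, len(rguids)))
    successes = sum(1 for rguid in sample if client.resolve(rguid))
    low, high = wilson_interval(successes, len(sample))
    return successes / len(sample), low, high
```

Because every client exposes the same `resolve` call, the sampling and reporting code does not need to know which scheme it is testing. A scheduler for step 4 could call `estimate_availability` once per client at each scheduled run and hand the results to the monitoring service.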
Challenges
The different RGUID systems have conceptually similar operations, though they differ in their underlying implementations, so providing a common client API and implementation will present some difficulties. It is expected that greater challenges will be faced in the efficient implementation and operation of the monitoring service as a very large number of
Involved Toolkits or Projects
Mentors
Dave Vieglais (vieglais at ku edu)