Ben,
The organization of data into collections, both physically and logically,
is primarily an implementation issue. Ron is right that the THREDDS
catalog can be implemented in the ebRIM profile of CSW, with the data
hierarchies exposed to clients. Because the goal of our project (and, more
importantly, its resources) is to make data cataloged in THREDDS
searchable through the CSW that GMU has already implemented, we did not plan
to implement a new CSW server that can fully expose the THREDDS catalog,
including its hierarchies. Our focus has been on mapping the THREDDS metadata
items to the GMU CSW so that CSW clients can search THREDDS data, through the
mapping relationship, in the CSW server. This means that we are focused on
searching data based on client-provided criteria, rather than on displaying
the data products hierarchically, in the CSW. We need to include in our
logical data model the parent and child relationships between a data
product, either a collection or a direct data set, and its parent (for
any product other than the root, i.e., the highest collection) and its
children (for any product other than a leaf, i.e., a direct data set).
With that, we will be able to reconstruct the hierarchy
of the catalog. To expose the hierarchy to a client, the server needs to
know down to which level the client expects to view. It seems that in
THREDDS, the server always exposes the immediate child nodes of a
specific node to a client, and the client can then click on one of the child
nodes to see the next level of nodes. This can continue recursively until
the leaf nodes, the direct data sets, are exposed. In CSW, our current
implementation searches all direct data sets and returns the subset that
meets the search criteria. With the parent/child relationships
added, our CSW could also be implemented to search either the immediate
children of a certain node or n levels down from it. However, additional
information (parameters/values) will be needed so that a client can tell the
server the depth of the search and where the search begins. We need
to investigate whether such additional parameters/values are compliant with
the ISO 19115 profile (it should not be a problem in ebRIM because it is
flexibly extensible). ISO 19115 does provide a description of metadata
hierarchy levels, but the parameter/value information for specifying levels
of exposure is a different issue.
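
To make the parent/child idea concrete, here is a minimal Python sketch of
such a logical data model with a depth-limited search. It is only an
illustration, not the GMU CSW implementation: the names Product, descendants,
search, and build_example, the max_depth parameter, and the sample catalog
entries are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional


@dataclass(eq=False)
class Product:
    """A catalog entry: a collection (has children) or a direct data set (leaf)."""
    name: str
    parent: Optional["Product"] = field(default=None, repr=False)
    children: List["Product"] = field(default_factory=list)

    def add(self, child: "Product") -> "Product":
        # Record the relationship in both directions.
        child.parent = self
        self.children.append(child)
        return child

    @property
    def is_leaf(self) -> bool:
        return not self.children


def descendants(start: Product, max_depth: Optional[int] = None) -> Iterator[Product]:
    """Yield products below `start`, at most `max_depth` levels down (None = no limit)."""
    if max_depth is not None and max_depth <= 0:
        return
    next_depth = None if max_depth is None else max_depth - 1
    for child in start.children:
        yield child
        yield from descendants(child, next_depth)


def search(start: Product,
           matches: Callable[[Product], bool],
           max_depth: Optional[int] = None) -> List[Product]:
    """Return the products under `start`, within `max_depth`, that satisfy `matches`."""
    return [p for p in descendants(start, max_depth) if matches(p)]


def build_example() -> Product:
    """Build a small hypothetical catalog; the entry names are made up."""
    root = Product("catalog")
    models = root.add(Product("models"))
    models.add(Product("NAM"))
    models.add(Product("GFS"))
    radar = root.add(Product("radar"))
    level2 = radar.add(Product("level2"))
    level2.add(Product("KFTG"))
    return root
```

With max_depth=1 the search sees only the immediate children of the start
node, which mirrors the THREDDS browsing behavior; with max_depth=None and a
leaf-only predicate it mirrors the current search over all direct data sets.
The start node and the depth are exactly the two extra pieces of information
a client would have to send.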
In summary, I agree that the data collection/hierarchy information is
useful in many ways. If time and resources permit, we'll certainly explore
this. At the moment, we want to focus on our original plan to
complete the mapping of THREDDS to ISO 19115 at the data set level.
Regards,
Wenli
At 19:06 2006-11-19 -0700, Ben Domenico wrote:
Hello again,
I think I may need to clarify the situation. The primary focus of
the CSW/THREDDS gateway that GMU is working on is mapping
THREDDS catalog metadata to ISO 19115 so that THREDDS metadata can be made
available in an international standard form. It would be a mistake to
divert that project from its high-priority objectives.
On the other hand, the question of how to provide inventory catalogs of
"collections" of datasets, and catalogs of those collections -- as THREDDS
does -- keeps coming up in many different settings. It arose in the OGC
GALEON interoperability experiment; it came up in discussions at the 3rd
Interoperability Workshop on the Automated Harvesting of Data and Metadata
last week.
So I sent the message to the THREDDS and GALEON email lists in order to
get a wider group thinking about the issue, which I think is key to
making all these data services work together. For those of you who are
not familiar with THREDDS catalogs, an example of a hierarchical set of
catalogs is available for a variety of real-time data at:
http://motherlode.ucar.edu:8080/thredds/catalog.html
As you will note as you drill down through the collections, you can get
the underlying XML representation of any of these catalogs by
replacing the .html with .xml in the URL.
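
As a rough illustration of that URL substitution, the Python sketch below
derives the .xml address from the .html one and lists the links to nested
catalogs found in a catalog document. The function names are made up for this
example, and SAMPLE is a hand-written document based on my understanding of
THREDDS catalog XML (catalogRef elements carrying xlink title and href
attributes), not output from the server above. The code matches on local
element names, so it does not depend on the exact namespace.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def catalog_xml_url(html_url: str) -> str:
    """Swap a trailing .html for .xml, as described for THREDDS catalog pages."""
    if not html_url.endswith(".html"):
        raise ValueError("expected a URL ending in .html")
    return html_url[: -len(".html")] + ".xml"


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix, if present."""
    return tag.rsplit("}", 1)[-1]


def catalog_refs(catalog_xml: str) -> List[Tuple[str, str]]:
    """Return (title, href) for every catalogRef element in the document."""
    root = ET.fromstring(catalog_xml)
    refs = []
    for elem in root.iter():
        if local_name(elem.tag) != "catalogRef":
            continue
        attrs = {local_name(key): value for key, value in elem.attrib.items()}
        refs.append((attrs.get("title", ""), attrs.get("href", "")))
    return refs


# Hand-written sample for illustration only; not real server output.
SAMPLE = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink" name="Example">
  <dataset name="Realtime data">
    <catalogRef xlink:title="Model output" xlink:href="models/catalog.xml"/>
    <catalogRef xlink:title="Radar" xlink:href="radar/catalog.xml"/>
  </dataset>
</catalog>"""
```

Each href returned by catalog_refs points at the next catalog down, so a
client can walk the hierarchy one level at a time, just as a person does by
clicking through the HTML pages.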
From Ron Lake's notes, it sounds like CSW.ebRIM can be used to provide
this type of functionality via a standards-based interface.
It's important, though, that while we consider the long-range goals, we
also retain realistic expectations of the current project.
I hope this clarifies rather than confuses the issue.
-- Ben