[IGS-DCWG-12] Re: [IGS-DCWG-10] Resupplied Data, Comments from BKG
Michael Scharber
mscharber at josh.ucsd.edu
Mon Jun 16 14:55:06 PDT 2003
******************************************************************************
IGS-DCWG Mail 16 Jun 14:55:10 PDT 2003 Message Number 12
******************************************************************************
Author: Michael Scharber/SOPAC
I have been pleased to read the various points offered on this thread and
would like to throw in my own humble observations.
First of all, I'm ashamed to ask but I must:
Is there A SINGLE LIST of sites (four character code or whatever),
maintained by A SINGLE person/agency, pertaining to data that MUST be
archived at all global data centers? And, by list I mean something
available anonymously through ftp|http which is also easily parsed
AND limited only to these sites.
I only ask because I myself do not even know, specifically, which sites
comprise the set which all GDCs are required to archive, and therefore
mirror from/to other GDCs.
My ignorance aside, if I can get this topic straight, then I will at
least know what it is we're raising questions about in the first place.
Here then are three topics of personal interest to me.
a) IGS Network Topology --- PUSH, DON'T PULL --- I fully support the
notion that data for "these" sites (pending definition or my own
education) ALWAYS follow a PUSH strategy where (at least):
a1. The originator of data for a given site [re]supplies data to
an IGS archive (regional, global or whatever) whenever THEY
feel it is necessary.
a2. ONE AND ONLY ONE global data center supplies THE OTHER TWO
with data for a subset of sites from "the list". This subset
is mutually exclusive with regard to the subset of the other
global data centers.
a3. There may be originators of data for one or more sites from
"the list" who can very well PUSH their data to ALL THREE
global data centers. These sites would then NOT exist in the
subset lists of any of the GDCs.
a4. If I don't make any sense on any of a1-a3 then remember
this...no GDC PULLS data from another GDC. Ideally, to
best achieve the "mirror" among GDCs (in my opinion), there
should be NO PULLING of data among GDCs. Each GDC should have
an upload ftp server ready to receive data from other GDCs and
IGS data centers PUSHING data onward.
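
To make a1-a4 a little more concrete, here is a minimal Python sketch of how "the list" might be split into mutually exclusive subsets, one per GDC. Everything in it is an assumption for illustration only: the site codes (SIT1, SIT2, ...) are made up, the round-robin assignment is just one arbitrary way to divide the sites, and the function names are hypothetical. Only the three GDC names come from this thread.

```python
from itertools import cycle

# The three global data centers named in this thread.
GDCS = ["SOPAC", "CDDIS", "IGN"]


def build_push_plan(required_sites, direct_push_sites, gdcs=GDCS):
    """Assign every relayed site to exactly one responsible GDC (a2).

    Sites whose originator pushes to all GDCs directly (a3) are left
    out of every subset. Round-robin assignment is an assumption; any
    rule works as long as the subsets stay mutually exclusive.
    """
    plan = {gdc: [] for gdc in gdcs}
    relayed = sorted(set(required_sites) - set(direct_push_sites))
    for site, gdc in zip(relayed, cycle(gdcs)):
        plan[gdc].append(site)
    return plan


def push_targets(site, plan):
    """Return the GDCs that the responsible GDC must PUSH this site to."""
    for gdc, subset in plan.items():
        if site in subset:
            return [other for other in plan if other != gdc]
    return []  # direct-push site: no GDC relays it


def check_mutually_exclusive(plan):
    """Raise ValueError if any site appears in more than one subset."""
    seen = set()
    for gdc, subset in plan.items():
        overlap = seen & set(subset)
        if overlap:
            raise ValueError(f"{gdc} duplicates sites: {sorted(overlap)}")
        seen |= set(subset)
    return True


# Hypothetical site codes, for illustration only.
plan = build_push_plan(
    required_sites=["SIT1", "SIT2", "SIT3", "SIT4", "SIT5"],
    direct_push_sites=["SIT5"],
)
check_mutually_exclusive(plan)
print(plan)
print(push_targets("SIT2", plan))
```

Note that nothing in the sketch ever asks one GDC to fetch from another; each subset only says who is responsible for pushing.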
b) "Knowledge" of [re]submitted IGS raw data files
I agree with Heinz that the GSAC could serve very well as an effective
means of identifying data resubmissions, as well as primary submissions.
I think there is a great deal of utility inherent in the GSAC which is not
being used. One example, which Heinz points out, is the data "publication" time.
This piece of information, attributed to every file published to the GSAC,
is (by definition) in UTC. It also gets updated whenever data is
"republished" to the GSAC, thereby lending itself to
useful statistics gathering and informed re-retrieval on the part of the
user community.
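
As a rough illustration of "informed re-retrieval", the Python sketch below compares a publication time per file against the time a user last fetched that file. It does not read any real GSAC format; the dictionaries, the file names and the function name are all hypothetical stand-ins for whatever a GSAC query would return.

```python
from datetime import datetime, timezone


def files_to_refetch(published, retrieved):
    """Return the files a user should fetch (again).

    published: dict of file name -> latest publication time (UTC)
    retrieved: dict of file name -> time the user last fetched it (UTC)

    A file qualifies if it was never fetched, or if it was
    (re)published after the user's last fetch.
    """
    stale = []
    for name, pub_time in published.items():
        last_fetch = retrieved.get(name)
        if last_fetch is None or pub_time > last_fetch:
            stale.append(name)
    return sorted(stale)


# Hypothetical file names and times, for illustration only.
utc = timezone.utc
published = {
    "site1670.03d.Z": datetime(2003, 6, 18, 9, 0, tzinfo=utc),  # republished
    "site1680.03d.Z": datetime(2003, 6, 17, 1, 0, tzinfo=utc),
}
retrieved = {
    "site1670.03d.Z": datetime(2003, 6, 17, 2, 0, tzinfo=utc),
    "site1680.03d.Z": datetime(2003, 6, 17, 2, 0, tzinfo=utc),
}
print(files_to_refetch(published, retrieved))
```

Because both sides are in UTC, the comparison needs no knowledge of where the archive or the user sits.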
Of course, the catch is all GDCs must be GSAC Wholesalers. We're close to
this actually, with SOPAC and CDDIS already GSAC Wholesalers, and IGN
working on becoming one shortly.
I believe that if we can straighten out these two aspects of IGS data
archiving then we have a good base to begin approaching some of the many
good points offered by Edouard, Nacho, Carey and Heinz.
c) IGS Data Resubmissions.....Notification Service:
I like this concept a lot but feel it needs further discussion. Most
importantly, how many of these services would exist? One? Three (one for
each GDC)? Or dozens (one at each GPS archive sprinkled around the
world)? How many should a user know/care about? How many would be
posting redundant messages (messages already posted by another archive)?
I think there would be more trouble generated, and user confusion created,
if more than one such service exists. How can there only be ONE then?
I think that IF, at least to start, all GDCs perform as GSAC Wholesalers
then a single, third-party, agency/individual (perhaps an analysis center)
could use the GSAC (write a simple application to routinely check for
resubmissions of IGS data and weed out duplicate copies) to host a DCWG
Resubmission listserver similar to this one......creating emails with a
specific format that, perhaps, Nacho could supply. Then, the IGS mailing
list would be trimmed of such messages, and users could choose to
subscribe to the DCWG Resubmission listserver if they care to. The
emails, as Nacho explains, would be both human readable and machine
parsable. The catch (at least with regard to the GSAC) is there is
currently NOTHING in the GSAC to allow for a statement of "why" a
particular data file was resubmitted.
That's a problem.....but something the GSAC could possibly be adapted to
handle.
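
To show what the "simple application" might do, here is a small Python sketch that weeds out duplicate reports of the same file from several archives and writes one notice per file. The input tuples, the file names and the KEY: value message layout are assumptions made up for this example; the actual format would be whatever Nacho supplies. The REASON field is fixed to "unknown" because, as noted, the GSAC currently has no way to state why a file was resubmitted.

```python
from datetime import datetime, timezone


def merge_reports(reports):
    """Collapse reports of the same file from several archives.

    reports: iterable of (archive, file name, publication time in UTC).
    Returns a dict of file name -> {"archives", "first_seen"}.
    """
    merged = {}
    for archive, filename, when in reports:
        entry = merged.setdefault(
            filename, {"archives": set(), "first_seen": when}
        )
        entry["archives"].add(archive)
        entry["first_seen"] = min(entry["first_seen"], when)
    return merged


def format_notice(filename, entry):
    """Build one notice that a person can read and a script can parse."""
    stamp = entry["first_seen"].strftime("%Y-%m-%dT%H:%M:%SZ")
    return "\n".join([
        f"FILE: {filename}",
        f"PUBLISHED-UTC: {stamp}",
        f"ARCHIVES: {','.join(sorted(entry['archives']))}",
        "REASON: unknown",  # no such field exists in the GSAC today
    ])


# Hypothetical reports, for illustration only.
utc = timezone.utc
reports = [
    ("SOPAC", "site1670.03d.Z", datetime(2003, 6, 16, 21, 55, tzinfo=utc)),
    ("CDDIS", "site1670.03d.Z", datetime(2003, 6, 16, 23, 10, tzinfo=utc)),
    ("IGN", "site1680.03d.Z", datetime(2003, 6, 17, 1, 0, tzinfo=utc)),
]
for name, entry in sorted(merge_reports(reports).items()):
    print(format_notice(name, entry))
    print()
```

With one service doing the merging, a user sees a single notice per file no matter how many archives republished it.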
Sorry this message was sooooooooo long.
Best Regards everyone.
Michael
> Dear Colleagues,
>
> some remarks to the discussion about resupplied data:
>
> 1) It is practical to distinguish
>
> Case a) Data flow between various data centers (operational, local,
> ======= regional and global)
>
> and
>
> Case b) Data flow between a data center and an analysis center
> =======
>
> 2) If all data centers follow the "put approach", resubmitted files
> could easily and correctly be handled. As soon as an updated file
> occurs in the "incoming" directory of a data center it will be
> forwarded to the next level, e.g., from local to regional data center.
> "Case a)" could thus be satisfied. The mirror between the
> global data centers has to be arranged differently.
>
> 3) The announcement of a file resubmission by IGS-Mail is meaningful, but
> not reliable enough from the analysis center's point of view.
>
> 4) More difficult to handle is "Case b)". The analysis centers need to
> know about new submissions. The GSAC (GPS Seamless Archive Center)
> initiative stores the file creation date and the corresponding providers
> into the "data holding file (*.dhf)". GSAC could serve as the
> "central information source" for analysis centers to check for
> resupplied data (provided all data centers participate in GSAC).
>
> Perhaps my remarks could contribute to the discussion.
>
> With kind regards,
>
> Heinz
>
> -------------------------------------------------------------------------
> Dr. Heinz Habrich
>
> Bundesamt fuer Kartographie und Geodaesie
> Richard-Strauss-Allee 11, 60598 Frankfurt am Main, Germany
>
> Phone +49 69 6333267 E-Mail heinz.habrich at bkg.bund.de
> Fax +49 69 6333425 URL http://www.bkg.bund.de
> -------------------------------------------------------------------------
>
--
*******************************************************
Michael Scharber
Scripps Institution of Oceanography
Institute of Geophysics and Planetary Physics
8785 Biological Grade
IGPP Room 4212
La Jolla, CA 92037
mscharber at josh.ucsd.edu
(858)534-1750
*******************************************************