[IGS-DCWG-71] Re: [IGS-DCWG-70] Re: [IGS-DCWG-68] Re: [IGS-DCWG-67] Re: [IGS-DCWG-65] Change in compression scheme
Giovanni Sella
Giovanni.Sella at noaa.gov
Fri May 7 14:09:49 PDT 2010
******************************************************************************
IGS-DCWG Mail 07 May 14:10:01 PDT 2010 Message Number 71
******************************************************************************
Author: Giovanni Sella
Dear Colleagues,
With much delay I'm going to add my comments on this proposed change. I
have talked to the NOAA archive group that supports us, the National
Geophysical Data Center, and got a very strong recommendation for gz
over bzip2. They actually only pull our gz copies and not our Z copies
of our files. This is the basic logic:
- gzip is used by most archive centers, as it is much faster than bzip2.
- It supports archiving multiple files like tar; bzip2 does not.
- It does not compress as much as bzip2, but the tradeoff in faster speed
  makes it greatly preferred. This is a very important consideration
  given the ever increasing number of stations. AC would pay a very high
  price in the uncompressing of the files with bzip2.
- gzip has been used for many years and is tested on numerous platforms
  and understood; bzip2 is a new beast.
- gzip has the option to increase the compression rate from the default 6
  to 9 with no extra cpu overhead when checksums are run.
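
The size-versus-speed tradeoff in the points above can be checked on one's own files. Below is a minimal Python sketch, using only the standard gzip and bz2 modules, that compresses the same bytes at gzip level 6, gzip level 9, and bzip2 level 9 and reports the compressed size plus the compress and decompress times. The function name compare_codecs and the synthetic sample payload are illustrative assumptions, not anything NGDC or the data centers run; real observation files will give different numbers.

```python
import bz2
import gzip
import time


def compare_codecs(payload: bytes) -> dict:
    """Compress and decompress payload with each codec; report size and timings."""
    # (compress, decompress) pairs; 6 and 9 are the gzip levels mentioned above.
    codecs = {
        "gzip-6": (lambda d: gzip.compress(d, compresslevel=6), gzip.decompress),
        "gzip-9": (lambda d: gzip.compress(d, compresslevel=9), gzip.decompress),
        "bzip2-9": (lambda d: bz2.compress(d, compresslevel=9), bz2.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        t0 = time.perf_counter()
        packed = compress(payload)
        t1 = time.perf_counter()
        restored = decompress(packed)
        t2 = time.perf_counter()
        if restored != payload:
            raise ValueError(f"{name} round trip failed")
        results[name] = {
            "bytes": len(packed),
            "compress_s": t1 - t0,
            "decompress_s": t2 - t1,
        }
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for an observation file: repetitive numeric text.
    lines = [f"{i * 1.5:14.3f}{i * 2.5:14.3f}{i * 3.5:14.3f}" for i in range(50000)]
    sample = "\n".join(lines).encode("ascii")
    for name, stats in compare_codecs(sample).items():
        print(f"{name:8s} {stats['bytes']:9d} bytes  "
              f"compress {stats['compress_s']:.3f}s  "
              f"decompress {stats['decompress_s']:.3f}s")
```

Timings from a single run are noisy, so a fair comparison would repeat the measurement over a representative set of station files.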
NGDC has been after me for some time to add MD5 checksums as a more
robust checksum than those in gzip. Just this week I rolled out this new
set of files, one for every file they archive. I'm doing it forward in time
for the moment and then will go back and do the older data as well. The
very small MD5 checksum could easily be exchanged between data centers
to ensure that the same data holdings exist without having to query the
gzip file itself.
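
As a rough illustration of the checksum exchange described above, here is a minimal Python sketch that writes a small MD5 file next to each archived file and compares two such files without opening the gzip file itself. The ".md5" suffix, the "digest, two spaces, filename" line layout, and the function names are assumptions made for this example; they are not a description of the files I actually rolled out.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_md5_sidecar(path):
    """Write '<digest>  <filename>' to a sidecar file named <filename>.md5."""
    path = Path(path)
    sidecar = path.with_name(path.name + ".md5")  # assumed naming convention
    sidecar.write_text(f"{md5_of_file(path)}  {path.name}\n")
    return sidecar


def holdings_match(local_sidecar, remote_sidecar):
    """Compare the digests stored in two sidecar files."""
    local = Path(local_sidecar).read_text().split()[0]
    remote = Path(remote_sidecar).read_text().split()[0]
    return local == remote
```

Two data centers would then only need to exchange the small sidecar files and call holdings_match on each pair to confirm that their holdings agree.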
I personally am a very strong proponent of abandoning UNIX compress .Z
and switching to .gz and have discussed this a couple of times in the
past with Carey and Angie, but the re-processing was always looming and
that was daunting enough without having to mix in a change in compression.
My two cents. Have a nice weekend.
Giovanni
Carey Noll wrote:
> ******************************************************************************
> IGS-DCWG Mail 01 Apr 12:42:21 PDT 2010 Message Number 70
> ******************************************************************************
>
> Author: Carey Noll
>
> bzip2 will provide better compression according to my system
> administrator. He is currently running some tests of various
> compression utilities; I will send a summary report to the DCWG
> when he has finished his tests.
>
> I would recommend doing these changes in steps. Let's tackle
> this proposed compression change first before we start thinking
> about more "drastic" changes such as in filenaming, etc. That is
> my preference/recommendation anyways!
>
> I also would like to encourage more feedback on Mike Schmidt's
> earlier email proposal for OCs to push data to all global data
> centers. It sounds to me like an "easy" implementation to help
> keep data current and available at all GDCs.
>
> Thanks,
> Carey.
> -----
> On Apr 1, 2010, at 11:42 AM, Nacho Romero wrote:
--
Giovanni Sella, Ph.D.
CORS Program Manager, National Geodetic Survey
NOAA-NOS, SSMC3-8716, 1315 East-West Hwy.
Silver Spring, MD 20910, USA
Tel 001-301-713-3198x126, Fax 001-301-713-4324
For EXISTING CORS e-mail: ngs.corscollector @ noaa.gov
For PROPOSED CORS e-mail: ngs.proposed.cors @ noaa.gov
CORS Guidelines: www.ngs.noaa.gov/CORS/Establish_Operate_CORS.html