Re: [netcdfgroup] nf90_char size

  • To: Dave Allured - NOAA Affiliate <dave.allured@xxxxxxxx>
  • Subject: Re: [netcdfgroup] nf90_char size
  • From: Wei-Keng Liao <wkliao@xxxxxxxxxxxxxxxx>
  • Date: Sat, 2 May 2020 22:48:34 +0000
Hi, Dave

Thanks for following up with the correct information about the dimension 
objects.
I admit that I am not familiar with the NetCDF4 dimension representation in 
HDF5.

Wei-keng

> On May 2, 2020, at 5:28 PM, Dave Allured - NOAA Affiliate 
> <dave.allured@xxxxxxxx> wrote:
> 
> Wei-keng, thanks for the info on the latest release.  Minor detail: I found 
> that hidden dimension scales are still stored as arrays, but the arrays are 
> left unpopulated.  HDF5 stores these sparsely, which means no wasted space 
> for arrays that are never written.
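> 
> One way to see this (a minimal, untested sketch using the HDF5 C API; the 
> file and dataset names below are the ones from this thread) is to compare a 
> hidden dimension scale's logical length with the bytes HDF5 actually 
> allocated for it:
> 
> #include <stdio.h>
> #include <hdf5.h>
> 
> int main(void)
> {
>     /* Open the file and the hidden dimension-scale dataset by name. */
>     hid_t file  = H5Fopen("ndb.BS_COMPRESS0.005000_Q1", H5F_ACC_RDONLY,
>                           H5P_DEFAULT);
>     hid_t dset  = H5Dopen2(file, "/BS_K_linearized1", H5P_DEFAULT);
>     hid_t space = H5Dget_space(dset);
> 
>     /* Logical number of elements vs. bytes actually allocated in the file.
>      * A dimension scale that was never written reports 0 allocated bytes;
>      * one that was actually populated reports the full array size. */
>     hssize_t npoints = H5Sget_simple_extent_npoints(space);
>     hsize_t  stored  = H5Dget_storage_size(dset);
> 
>     printf("elements: %lld   allocated bytes: %llu\n",
>            (long long)npoints, (unsigned long long)stored);
> 
>     H5Sclose(space);
>     H5Dclose(dset);
>     H5Fclose(file);
>     return 0;
> }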
> 
> For Davide, I concur with Wei-keng that netcdf-C 4.7.4 is okay for your 
> purpose and should not waste storage.  Version 4.7.3 behaves the same 
> as 4.7.4.
> 
> I wonder when that changed, some time between your 4.4.1.1 and 4.7.3.  
> Also, you used HDF5 1.8.18 and I used 1.10.5.  That should not make any 
> difference here, but perhaps it does.
> 
> 
> On Sat, May 2, 2020 at 1:01 PM Wei-Keng Liao <wkliao@xxxxxxxxxxxxxxxx> wrote:
> 
> If you use the latest NetCDF 4.7.4, the dimensions will be stored as scalars.
> 
> Wei-keng
> 
> > On May 2, 2020, at 1:42 PM, Davide Sangalli <davide.sangalli@xxxxxx> wrote:
> > 
> > Yeah, but BS_K_linearized1 is just a dimension; how can it be 8 GB?
> > Same for BS_K_linearized2; how can it be 3 GB?
> > These are just two numbers:
> > BS_K_linearized1 = 2,025,000,000
> > (chosen as the maximum variable size in my code, to avoid overflowing the 
> > maximum integer representable in standard precision)
> > BS_K_linearized2 = 781,887,360
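> > 
> > For reference, defining these dimensions in the netCDF C API amounts to 
> > passing just those two numbers; whether the library then allocates a hidden 
> > coordinate dataset of that length is the version-dependent behaviour being 
> > discussed here.  An untested sketch, not the actual code of my program (the 
> > output file name is a placeholder):
> > 
> > #include <netcdf.h>
> > 
> > int main(void)
> > {
> >     int ncid, dim1, dim2;
> > 
> >     /* 2,025,000,000 stays below the 32-bit signed integer limit of
> >      * 2,147,483,647, which is why it was used as the cap.
> >      * Error checks omitted. */
> >     nc_create("example.nc", NC_NETCDF4 | NC_CLOBBER, &ncid);
> >     nc_def_dim(ncid, "BS_K_linearized1", (size_t)2025000000, &dim1);
> >     nc_def_dim(ncid, "BS_K_linearized2", (size_t)781887360,  &dim2);
> >     nc_close(ncid);
> >     return 0;
> > }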
> > 
> > D.
> > 
> > On 02/05/20 19:06, Wei-Keng Liao wrote:
> >> The dump information shows there are actually 8 datasets in the file.
> >> Below are the start offsets, sizes, and end offsets of the individual 
> >> datasets.  There is not much padding between them.
> >> According to this, your file is expected to be about 16 GB.
> >> 
> >> dataset name                    start offset    size            end offset
> >> BS_K_linearized1                2,379           8,100,000,000   8,100,002,379
> >> BSE_RESONANT_COMPRESSED1_DONE   8,100,002,379   2,025,000,000   10,125,002,379
> >> BSE_RESONANT_COMPRESSED2_DONE   10,125,006,475  2,025,000,000   12,150,006,475
> >> BS_K_linearized2                12,150,006,475  3,127,549,440   15,277,555,915
> >> BSE_RESONANT_COMPRESSED3_DONE   15,277,557,963  781,887,360     16,059,445,323
> >> complex                         16,059,447,371  8               16,059,447,379
> >> BS_K_compressed1                16,059,447,379  99,107,168      16,158,554,547
> >> BSE_RESONANT_COMPRESSED1        16,158,554,547  198,214,336     16,356,768,883
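> >> 
> >> (The size column can be reproduced programmatically; an untested sketch 
> >> using the HDF5 C API is below.  The start offsets come from the h5dump -Hp 
> >> output; H5Dget_offset reports them only for contiguous datasets and 
> >> returns HADDR_UNDEF otherwise.)
> >> 
> >> #include <stdio.h>
> >> #include <hdf5.h>
> >> 
> >> int main(void)
> >> {
> >>     const char *names[] = {
> >>         "BS_K_linearized1", "BSE_RESONANT_COMPRESSED1_DONE",
> >>         "BSE_RESONANT_COMPRESSED2_DONE", "BS_K_linearized2",
> >>         "BSE_RESONANT_COMPRESSED3_DONE", "complex",
> >>         "BS_K_compressed1", "BSE_RESONANT_COMPRESSED1"
> >>     };
> >>     hid_t file = H5Fopen("ndb.BS_COMPRESS0.005000_Q1", H5F_ACC_RDONLY,
> >>                          H5P_DEFAULT);
> >> 
> >>     for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
> >>         hid_t   dset = H5Dopen2(file, names[i], H5P_DEFAULT);
> >>         hsize_t size = H5Dget_storage_size(dset);  /* allocated bytes */
> >>         haddr_t off  = H5Dget_offset(dset);        /* contiguous only */
> >>         printf("%-30s offset %llu  size %llu\n", names[i],
> >>                (unsigned long long)off, (unsigned long long)size);
> >>         H5Dclose(dset);
> >>     }
> >>     H5Fclose(file);
> >>     return 0;
> >> }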
> >> 
> >> Wei-keng
> >> 
> >>> On May 2, 2020, at 11:28 AM, Davide Sangalli <davide.sangalli@xxxxxx> 
> >>> wrote:
> >>> 
> >>> h5dump -Hp ndb.BS_COMPRESS0.005000_Q1
> 
> <snip>