[thredds] Non-linear growth of query time for NCSS point request

  • To: "thredds@xxxxxxxxxxxxxxxx" <thredds@xxxxxxxxxxxxxxxx>
  • Subject: [thredds] Non-linear growth of query time for NCSS point request
  • From: Marcelo Andrioni <marceloandrioni@xxxxxxxxxxxxxxxx>
  • Date: Tue, 7 Jan 2020 16:46:31 +0000
Hello,

I am seeing performance problems when running very long requests for a single 
location using NCSS.
My dataset has 40 years of monthly files with hourly data (from ERA5). The 
dataset dimensions are: 
time 355728 X latitude 176 X longitude 172

When requesting one year of data for four variables for a single location (NCSS 
Grids As Point Data) it takes 5.2 seconds. I would expect that two years would 
take 10.4s and so forth, but I am seeing a non-linear growth in the time. I ran 
a few tests using wget to retrieve the data and got:

query (years)   real time (s)   expected time (s)
 1                5.2             5.2
 2               13              10.4
 3               23              15.6
 4               37              20.8
 5               53              26
 6               70              31.2
 7               91              36.4
 8              117              41.6
 9              142              46.8
10              175              52
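
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch in Python that times NCSS point requests for an increasing number of years. It is not the wget script used for the table above. The server address, dataset path, variable names, coordinates and start year are all hypothetical placeholders; only the query parameters (var, latitude, longitude, time_start, time_end, accept) are the standard NCSS ones.

```python
import time
import urllib.request
from urllib.parse import urlencode

# Hypothetical values: replace with the real server, dataset path,
# variable names and location before running.
BASE_URL = "http://localhost:8080/thredds/ncss/era5/aggregation.ncml"
VARIABLES = ["u10", "v10", "t2m", "msl"]
LAT, LON = -22.5, -40.0


def build_point_url(start_year, n_years):
    """Build an NCSS point-request URL covering n_years from start_year."""
    params = [("var", v) for v in VARIABLES]
    params += [
        ("latitude", LAT),
        ("longitude", LON),
        ("time_start", f"{start_year}-01-01T00:00:00Z"),
        ("time_end", f"{start_year + n_years - 1}-12-31T23:00:00Z"),
        ("accept", "csv"),
    ]
    return BASE_URL + "?" + urlencode(params)


def time_request(url):
    """Download the whole response and return the elapsed seconds."""
    t0 = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        # Read in 1 MiB chunks so the full body is transferred
        # without holding it all in memory.
        while resp.read(1 << 20):
            pass
    return time.perf_counter() - t0


if __name__ == "__main__":
    # 1979 is an assumed first year of the dataset.
    for n in range(1, 11):
        print(n, round(time_request(build_point_url(1979, n)), 1))
```

Each request is timed until the last byte arrives, which is comparable to the "real" time reported when timing wget.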

The query for the whole dataset (40 years), expected to take 208s (5.2 x 40), 
took 53 minutes (3180s). Has anyone faced a similar problem? Maybe I need to 
make some changes to the cache configuration?
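
To put a number on the growth, the timings from the table can be fitted with a power law, time = a * years^b, using least squares on the logarithms. This is only a rough sketch: the power-law form is an assumption made for illustration, not something known about how NCSS behaves. An exponent b of 1 would mean linear scaling; anything above 1 is super-linear.

```python
import math

# Measured timings from the table above (query length in years, seconds).
years = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
seconds = [5.2, 13, 23, 37, 53, 70, 91, 117, 142, 175]


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(lx, ly))
    sxx = sum((p - mx) ** 2 for p in lx)
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


a, b = fit_power_law(years, seconds)
predicted_40 = a * 40 ** b  # extrapolation to the full 40-year query

print(f"fitted exponent b = {b:.2f}")
print(f"extrapolated 40-year time = {predicted_40:.0f} s (measured: 3180 s)")
```

Comparing the extrapolated value with the measured 3180s shows whether the growth seen over the first ten years is enough to explain the full-dataset query time.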

The results are for a dataset aggregated with '<aggregation dimName="time" 
type="joinExisting">', but similar results were found for this dataset when 
using FMRC (a little slower actually).

The system configuration is:
THREDDS 4.6.14
Apache Tomcat/8.5.45
Linux Kernel 4.15.0-72-generic

Thank you.

Greetings
--
Marcelo Andrioni