Hi Stonie,
On Thu, Jan 22, 2026 at 3:52 PM Stonie Cooper <cooper@xxxxxxxx> wrote:
> Daryl and Mike - I did follow up with NOAA on the Canadian RADAR data. As
> a reference, a clip from the answer:
>
Thank you for that response, very informative.
> This, along with the Great Lakes wind/wave data output (^L), is ending up
> with a feedtype that doesn't align with expectations because the SBN
> Product header is being utilized differently than it was designed or has
> been used in the past, such that the noaaportIngester won't know that the
> Canadian RADAR is RADAR or that the wave/wind model output is NGRID or
> HDS.
>
> ...
>
> My initial thought was to simply update the LDM noaaportIngester to catch
> the errant data, but I was given an immediate brake-check by the car ahead
> of me: the fix would have to be propagated across the entirety of the IDD
> simultaneously, and across all versions that happen to be running at the
> same time. So the solution has to be one that can be utilized by all
> versions and permutations - and thus the regex filtering on the user side
> that doesn't touch the insertion point.
>
I guess this is where I'm getting confused. Hope you don't mind me rubber
ducking you on this...
- What would obviously be required is to change the noaaportIngester code
on the top-level nodes. For the IDD, iirc that's Unidata, Wisconsin, and a
couple of other sites that feed back into the top level @ UPC. I agree,
all those points would need to update more or less simultaneously, but
getting just a few key sites coordinated shouldn't be hard...
- Sites that rely on IDD exclusively would not need to take any further
action as the IDD is now the "clean feed."
- Sites that have their own NOAAPort installation would now have a
decision to make. If they rely on their own NOAAPort feed exclusively,
they could opt into this adjustment or not; it'd be their call. But sites
that have both NOAAPort and a redundant IDD feed would definitely need to
update their ingester code, or they'll see duplicate products across the
two feeds!
- Sites that receive data from multiple sources now have to be concerned
about what their blended upstream providers are doing. For example, if
I'm getting data from the IDD and COD, I need to ensure COD updated their
NOAAPort ingest code to match Unidata's, again at about the same time, or
I'll still see duplicates. While I'm sure COD would play nice, it would
still be out of my hands at that point; either I still need to filter as
you're describing or I'd need to drop them as a redundant source if they
neglected that update on their end. (One way to check what an upstream is
actually sending is sketched just after this list.)
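A quick aside on that last point: the LDM's notifyme utility will report
which product headers an upstream would send you without transferring the
data, which is one way to verify what a provider is actually emitting
before you change a REQUEST. A rough sketch - the hostname is purely a
placeholder:

    # Ask upstream.example.edu which DDPLUS|IDS products from roughly
    # the last hour match the mislabeled Canadian RADAR headers (^SDCN),
    # without actually requesting the data itself.
    notifyme -vl- -h upstream.example.edu -f "DDPLUS|IDS" -p "^SDCN" -o 3600

If those headers show up in the listing, that upstream's ingest doesn't
yet match yours.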
You're right, this gets complicated and unpredictable as one goes further
down the tree. It'd be situational depending on who a site is REQUESTing
from, and worse, it would transcend tiers: sites further down the tree
would be at the mercy of _all_ nodes up their branch. I had to think it
through, but yeah, that's ugly, with no solid way to prevent impactful
duplicates.
Given that, I gotta agree: the safe call is to avoid edits to the
noaaportIngester code that move products to other feedtypes. That'd be a
dangerous course correction.
> As you are both knowledgeable in the area of constructing regular
> expressions to the benefit of your operations, I post here for the benefit
> of others as a reminder that monitoring data and constructing regular
> expressions is your best defense against overutilization of storage. The
> filtering string that *may* be used:
>
> DDPLUS|IDS ^([^LS]|S[^D]|SD[^C]|SDC[^N])
>
Thanks for the regex. I wish we could add NOT patterns in easier ways
than this, but I understand, and no, that's not really an ask. We
appreciate you providing that to us & everyone!
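For anyone following along at home, that pattern would slot into the
ldmd.conf REQUEST entry for the upstream in question; a minimal sketch,
with the hostname purely a placeholder:

    # REQUEST everything on DDPLUS|IDS except products whose IDs begin
    # with "L" (the wind/wave output noted above) or "SDCN" (the
    # mislabeled Canadian RADAR headers)
    REQUEST DDPLUS|IDS "^([^LS]|S[^D]|SD[^C]|SDC[^N])" upstream.example.edu

The alternation is doing the job a negative lookahead would do in other
regex flavors: LDM patterns are, as I understand it, POSIX extended
regular expressions, which have no NOT operator, so each alternative
accepts an ID that diverges from an unwanted prefix at a successive
character.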
> My response does not address the metadata archaeology that has to be
> performed on the unidentifiable GRIB - that is a whole separate issue. For
> those that have students in data labs, this is an opportunity to provide
> those students with experience by giving them the task of hunting down the
> identity of these new products and providing clarity on the correct
> decoding.
>
I want to add onto this to say it was purely my curiosity that drove me
into that GRIB data earlier. If I could say one thing to the students out
there: GET CURIOUS! Go hunt down the identity of these products like I
was trying to do, even if just for your own understanding. Find some
GRIB2 tables, learn what they represent, and see what tools are out there
to interrogate this data (e.g. wgrib2 - a quick sketch follows below) and
answer these questions. That's how problems get solved. This is a
complicated technical arena with lots of moving parts, but people like
Daryl, myself, the Unidata staff, and others on these lists always love
to help. And that's what these lists are for, too. So if you're a
student, get curious; try to figure this stuff out, look up what the
keywords we're using here mean, and give it a try for yourself. Faculty
and program leaders, show your interested students these lists and
encourage them to participate and ask questions; that's why we're here.
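To make that concrete for anyone just starting out, a first pass with
wgrib2 might look like this - the filename is hypothetical:

    # Short one-line-per-record inventory of the file
    wgrib2 -s mystery_product.grib2

    # Verbose metadata (discipline, parameter category/number, grid
    # definition) for digging into records the tables can't name
    wgrib2 -V mystery_product.grib2

Records wgrib2 can't identify print their raw discipline/category/
parameter numbers instead of a variable name, and those numbers are
exactly the breadcrumbs to chase through the WMO and NCEP GRIB2 tables.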
Alright, putting that soap box away now. Thanks again for the intel,
Stonie; much appreciated!
Best,
-Mike