Yet more fun!
I just discovered that it doesn't fail every time. The Python script reads
the entire product from stdin, writes it to disk, and then sends a message
to QPID (AWIPS).
I think the read/write part is failing, as I have a zero-byte file
resulting from a failed run. I might just put a bash script in front of it
that executes "cat > filename" and then calls the Python script.
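
One way to avoid the zero-byte leftovers, sketched in Python: read the
input in binary chunks into a temporary file next to the destination, and
rename it into place only after a complete, non-empty read. The name
store_stream and the chunk size are made up for illustration; this is not
the actual storeAndIngestFile.py code.

```python
import os
import tempfile


def store_stream(stream, dest_path, chunk_size=65536):
    """Copy a binary stream to dest_path and return the byte count.

    Data is written to a temporary file in the destination directory and
    renamed into place only after a complete, non-empty read, so a failed
    run cannot leave a zero-byte file under the final name.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    os.makedirs(dest_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".incoming-")
    total = 0
    try:
        with os.fdopen(fd, "wb") as out:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
                total += len(chunk)
            out.flush()
            os.fsync(out.fileno())
        if total == 0:
            raise ValueError("no data received on input stream")
        # Atomic on POSIX when source and target share a filesystem.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Remove the partial temporary file, then re-raise the error.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return total
```

In the script this would be called as store_stream(sys.stdin.buffer,
sys.argv[1]) before the QPID message is sent. mkstemp creates the file
with mode 0600, so a chmod may be needed if other users read the data.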
Matt
On Thu, Apr 23, 2026 at 10:08 PM Mike Zuranski <mike@xxxxxxxxxxxxxxxxxxx>
wrote:
> Matt,
>
> I'll level with you, at this point I'm stumped too. In my experience this
> doesn't sound like an LDM issue, especially if it started spontaneously and
> is not resolved by remaking the product queue and restarting either LDM or
> the OS. There must be some clue, though. One option is to recreate the
> issue manually, invoking the script in different ways (e.g. cat, xargs) or
> with different python interpreters, and check the exit code. Another is to
> wrap the PQACT PIPE command in a shell script that captures any error
> output in a designated log file. There must be another breadcrumb
> somewhere...
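
The wrapper idea could look roughly like this. It is written in Python
rather than shell, only to match the decoder's language. The function
name run_and_log and the log format are assumptions for illustration.
Leaving stdin at its default lets the child inherit the wrapper's stdin,
so the product that pqact writes to the pipe passes straight through.

```python
import subprocess
import time


def run_and_log(cmd, log_path, stdin=None):
    """Run cmd, appending its stdout/stderr and exit status to log_path.

    With stdin=None the child inherits this process's stdin unchanged.
    """
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
    with open(log_path, "ab") as log:
        log.write(f"{stamp} start: {cmd!r}\n".encode())
        log.flush()
        # Send both output streams of the child to the log file.
        result = subprocess.run(
            cmd, stdin=stdin, stdout=log, stderr=subprocess.STDOUT
        )
        log.write(f"exit status: {result.returncode}\n".encode())
    return result.returncode
```

A wrapper script would call run_and_log with the decoder path plus its own
sys.argv[1:], then exit with the returned status so that pqact still sees
the failure. The PIPE action would point at the wrapper instead of the
decoder, with the same arguments.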
>
> Best,
> -Mike
>
> On Thu, Apr 23, 2026 at 4:12 PM Matthew Foster - NOAA Affiliate <
> matthew.foster@xxxxxxxx> wrote:
>
>> I have the python interpreter hard-coded and explicit in the script
>> shebang line.
>>
>>
>> On Thu, Apr 23, 2026 at 3:10 PM Pete Pokrandt <poker@xxxxxxxxxxxx> wrote:
>>
>>> I'm old school - I usually say
>>>
>>> which python
>>>
>>> at the command prompt where it works, which should return something like
>>>
>>> /usr/bin/python
>>>
>>> or
>>>
>>> ~/miniforge3/bin/python
>>>
>>> and then hard code that into my script on the top line - e.g.
>>>
>>> #!/home/poker/miniforge3/bin/python
>>>
>>> Pete Pokrandt - System Engineer IV
>>> UW-Madison Dept of Atmospheric and Oceanic Sciences
>>> 608-262-3086 - poker@xxxxxxxxxxxx
>>> ------------------------------
>>> *From:* Michael Taylor - NOAA Affiliate <michael.c.taylor@xxxxxxxx>
>>> *Sent:* Thursday, April 23, 2026 2:59 PM
>>> *To:* Pete Pokrandt <poker@xxxxxxxxxxxx>
>>> *Cc:* Mike Zuranski <mike@xxxxxxxxxxxxxxxxxxx>; Matthew Foster - NOAA
>>> Affiliate <matthew.foster@xxxxxxxx>; LDM <ldm-users@xxxxxxxxxxxxxxxx>
>>> *Subject:* Re: [ldm-users] Stumped by PIPE action
>>>
>>> I've had issues using a Python script in a pqact entry with the following
>>> shebang:
>>>
>>> #!/usr/bin/env python3
>>>
>>> The script won't have access to STDOUT or STDERR
>>>
>>> Mike Taylor
>>>
>>> On Thu, Apr 23, 2026 at 1:23 PM Pete Pokrandt <poker@xxxxxxxxxxxx>
>>> wrote:
>>>
>>> Is it running/finding the correct version of python? Sometimes if the
>>> script runs at the command line but not from cron or spawned by the LDM
>>> it's because it's running under a different environment and using a
>>> different version of python that doesn't have the modules that you need.
>>> Just another thing to check.
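
A quick way to compare the two environments is to have the script record
which interpreter it is running under and whether its modules can be
found. This is a minimal sketch; the function name describe_environment
is hypothetical, and the module list has to be filled in with whatever the
script really imports.

```python
import importlib.util
import os
import sys


def describe_environment(required_modules=()):
    """Return interpreter details and the required modules not found."""
    missing = []
    for name in required_modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is missing or the name is bad.
            found = False
        if not found:
            missing.append(name)
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "cwd": os.getcwd(),
        "path": os.environ.get("PATH", ""),
        "missing_modules": missing,
    }
```

Writing this dictionary to a log file once from the command line and once
from an LDM-spawned run makes any difference in interpreter, PATH or
installed modules visible side by side.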
>>>
>>> Pete
>>>
>>> Pete Pokrandt - System Engineer IV
>>> UW-Madison Dept of Atmospheric and Oceanic Sciences
>>> 608-262-3086 - poker@xxxxxxxxxxxx
>>> ------------------------------
>>> *From:* ldm-users <ldm-users-bounces@xxxxxxxxxxxxxxxx> on behalf of
>>> Mike Zuranski <mike@xxxxxxxxxxxxxxxxxxx>
>>> *Sent:* Thursday, April 23, 2026 1:18 PM
>>> *To:* Matthew Foster - NOAA Affiliate <matthew.foster@xxxxxxxx>
>>> *Cc:* LDM <ldm-users@xxxxxxxxxxxxxxxx>
>>> *Subject:* Re: [ldm-users] Stumped by PIPE action
>>>
>>> Okay, if we've ruled those out, I've got a few more thoughts...
>>>
>>> Broken pipes are usually a sign that the script is exiting before it
>>> reads from STDIN. Sometimes that means it's not executable, or the shebang
>>> path is incorrect, but I'm assuming if it's been working for you recently
>>> it's not a code issue. If it works when you run it manually then it could
>>> be environmental; how do you run it manually, and are you sure it's using
>>> the same python binary/environment as LDM would? IIRC you could also see
>>> this if the destination of that script is unreachable, e.g. downed mount,
>>> but again if it works manually that doesn't sound likely.
>>>
>>> Something else you can do to help debug this is wrap the python script's
>>> execution in a shell script which logs error output, and adjust your pqact
>>> to invoke that. That way you can see exactly how that script is failing by
>>> using that intended LDM pipeline.
>>>
>>> Hope this helps,
>>> -Mike
>>>
>>> On Thu, Apr 23, 2026 at 2:06 PM Matthew Foster - NOAA Affiliate <
>>> matthew.foster@xxxxxxxx> wrote:
>>>
>>> Thanks for the quick reply Mike.
>>>
>>> pqactcheck returns "syntactically correct"
>>>
>>> I've restarted LDM a couple of times this week in the process of trying
>>> to figure this out. We use a systemd script for stopping/starting that
>>> recreates the queue on start, so the queue is fresh. This machine (AWS
>>> instance, actually) has 16 CPU cores and 64 GB of RAM. RAM, CPU usage and
>>> disk space all look normal.
>>>
>>> I've administered LDMs for many years, and I've never had a pqact issue
>>> stump me like this one.
>>>
>>> Matt
>>>
>>>
>>> On Thu, Apr 23, 2026 at 12:49 PM Mike Zuranski <mike@xxxxxxxxxxxxxxxxxxx>
>>> wrote:
>>>
>>> Hi Matt,
>>>
>>> Some basic troubleshooting questions:
>>>
>>> Does ldmadmin pqactcheck show any errors? When was the last time LDM
>>> was restarted or the PQ remade? Does the health of your system (e.g. disk
>>> space, memory, cpu) appear normal?
>>>
>>> -Mike
>>>
>>> On Thu, Apr 23, 2026 at 1:44 PM Matthew Foster - NOAA Affiliate <
>>> matthew.foster@xxxxxxxx> wrote:
>>>
>>> I have a PIPE action in our pqact.conf that was working...until it
>>> stopped. I can't figure out what changed. Putting pqact in DEBUG yielded
>>> the following log messages...
>>>
>>> 20260423T171759.226677Z pqact[3286699] palt.c:processProduct:1332 INFO 928481a80643522717d70b2fa139c28e 35526969 20260423171755.509410 NOTHER 1130006 TICY70 KNES 231707
>>> 20260423T171759.226808Z pqact[3286699] palt.c:prodAction:1217 DEBUG pipe: {cmd: "-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/20260423/17/TICY70_KNES_231707_1130006.2026042317 TICX99", ident: "TICY70 KNES 231707"}
>>> 20260423T171759.227206Z pqact[3286699] filel.c:pipe_open:1854 DEBUG 6 3295768
>>> 20260423T171759.227247Z pqact[3286699] filel.c:pipe_prodput:2174 DEBUG 6 TICY70 KNES 231707
>>> 20260423T171759.227414Z pqact[3295768] filel.c:pipe_open:1819 INFO Executing decoder "/usr/local/ldm/agents/storeAndIngestFile.py"
>>> 20260423T171759.292967Z pqact[3286699] filel.c:fl_removeAndFree:421 DEBUG Deleting failed PIPE entry: cmd="-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/20260423/17/TICY70_KNES_231707_1130006.2026042317 TICX99", pid=3295768
>>> 20260423T171759.293001Z pqact[3286699] filel.c:pipe_close:1901 DEBUG 6, 3295768
>>> 20260423T171759.293050Z pqact[3286699] pbuf.c:pbuf_flush:111 ERROR Broken pipe
>>> 20260423T171759.293070Z pqact[3286699] pbuf.c:pbuf_flush:111 ERROR Couldn't write to pipe: fd=6, len=4096, cmd="-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/20260423/17/TICY70_KNES_231707_1130006.2026042317 TICX99"
>>> 20260423T171759.293088Z pqact[3286699] palt.c:prodAction:1229 ERROR Couldn't process product: feedtype=NOTHER, pattern="^(TICY70) (KNES) (..)(..)(..)", action=pipe, args="-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/(\3:yyyy)(\3:mm)\3/\4/\1_\2_\3\4\5_(seq).%Y%m%d%H TICX99"
>>> 20260423T171759.293385Z pqact[3286699] filel.c:reap:3056 WARN Child 3295768 exited with status 1
>>>
>>> The pqact.conf entry is...
>>> ANY<TAB>^(TICY70) (KNES) (..)(..)(..)
>>> <TAB>PIPE<TAB>-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/(\3:yyyy)(\3:mm)\3/\4/\1_\2_\3\4\5_(seq).%Y%m%d%H TICX99
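
As a sanity check on the pattern itself, it can be tried against the
product ID from the log. This sketch uses Python's re module, which is not
the regex engine pqact uses, so it only confirms that a simple pattern
like this one captures what is expected. The helper name
expand_product_id is made up, and the (\3:yyyy), (\3:mm), (seq) and
strftime parts of the real path are pqact features that are not emulated
here.

```python
import re

# Pattern copied from the pqact.conf entry.
PATTERN = re.compile(r"^(TICY70) (KNES) (..)(..)(..)")


def expand_product_id(product_id, template=r"\1_\2_\3\4\5"):
    """Return the template filled with the captured groups, or None."""
    match = PATTERN.search(product_id)
    return match.expand(template) if match else None


print(expand_product_id("TICY70 KNES 231707"))  # TICY70_KNES_231707
```

The result agrees with the start of the file name in the logged command,
which suggests the pattern and substitutions are not the problem.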
>>>
>>> I also ran the Python script from the command line, and it runs without
>>> errors.
>>>
>>> I'm thinking a clue might be the presence of "-close" in the cmd entries
>>> above, but I can't see anything wrong with the pqact.conf entry.
>>>
>>> Also, multiple products/patterns that use this PIPE action and script
>>> are all failing with these errors.
>>>
>>> I appreciate any input!
>>>
>>> Matt
>>>
>>>
>>> --
>>> Matt Foster
>>> Sr. Application Development Met.
>>> NOAA Affiliate - KBR
>>>
>>>
>>> _________________________________________________________
>>> NOTE: All exchanges posted to NSF Unidata maintained email lists are
>>> made publicly available through the web. Users who post to any of the
>>> lists we maintain are reminded to remove any personal information that
>>> they do not want to be made public.
>>>
>>> NSF Unidata ldm-users Mailing List
>>> (ldm-users@xxxxxxxxxxxxxxxx)
>>> For list information, to unsubscribe, or change your membership options,
>>> visit: https://mailinglists.unidata.ucar.edu/listinfo/ldm-users/
>>>
>>>
>>>
>>> --
>>> Matt Foster
>>> Sr. Application Development Met.
>>> NOAA Affiliate - KBR
>>>
>>>
>>>
>>>
>>
>> --
>> Matt Foster
>> Sr. Application Development Met.
>> NOAA Affiliate - KBR
>>
>>
>>
--
Matt Foster
Sr. Application Development Met.
NOAA Affiliate - KBR