Okay, if we've ruled those out, I've got a few more thoughts.
A broken pipe usually means the script exited before it finished reading
from STDIN. Sometimes that's because the script isn't executable or its
shebang path is wrong, but since it was working for you recently I'm
assuming it's not a code issue. If it works when you run it manually, the
problem could be environmental: how are you running it manually, and are
you sure it uses the same Python binary and environment that LDM would?
IIRC you can also see this when the script's destination is unreachable
(e.g. a downed mount), but again, if it works manually that doesn't sound
likely.
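
One way to check the environmental angle is to capture the same facts from
your login shell and from the LDM side, then diff the two. Here is a rough
sketch, assuming bash. The function name dump_env and its output labels are
made up for illustration (nothing LDM provides), and the writability test is
only a crude stand-in for "is the mount really there":

```shell
#!/bin/bash
# dump_env SCRIPT DEST_DIR
# Hypothetical helper (not part of LDM): prints facts that commonly differ
# between an interactive shell and a process started by pqact.
dump_env() {
    local script="$1" dest="$2"
    echo "user: $(id -un)"
    echo "PATH: $PATH"
    echo "python3: $(command -v python3 || echo 'not found')"
    # First line of the script, i.e. the interpreter that will be exec'd.
    echo "shebang: $(head -n 1 "$script")"
    if [ -x "$script" ]; then
        echo "executable: yes"
    else
        echo "executable: no"
    fi
    # Checks only that the directory exists and is writable by this user.
    if [ -d "$dest" ] && [ -w "$dest" ]; then
        echo "dest writable: yes"
    else
        echo "dest writable: no"
    fi
}
```

Calling dump_env /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar
once by hand and once from something pqact launches, each redirected to its
own file, gives you two outputs to compare.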
Something else you can do to help debug this is to wrap the Python script's
execution in a shell script that logs its error output, and adjust your
pqact entry to invoke that wrapper instead. That way you can see exactly how
the script fails when it runs through the actual LDM pipeline.
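
A minimal sketch of such a wrapper, assuming bash. The function name
run_logged and the log path /tmp/storeAndIngestFile_wrapper.log are
placeholders I picked, not anything LDM defines; the decoder path is the one
from your log:

```shell
#!/bin/bash
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND with stdin passed through untouched, appends its stderr and
# exit status to LOGFILE, and returns the same exit status.
run_logged() {
    local log="$1"
    shift
    echo "$(date -u +%Y%m%dT%H%M%SZ) start: $*" >> "$log"
    local status=0
    "$@" 2>> "$log" || status=$?
    echo "$(date -u +%Y%m%dT%H%M%SZ) exit status $status" >> "$log"
    return "$status"
}

# pqact passes the PIPE arguments (output path, etc.) to this file, so only
# run the decoder when arguments are present.
if [ "$#" -gt 0 ]; then
    run_logged /tmp/storeAndIngestFile_wrapper.log \
        /usr/local/ldm/agents/storeAndIngestFile.py "$@"
    exit $?
fi
```

Save it under a name of your choosing, make it executable, and replace the
script path in the PIPE action with the wrapper's path, leaving -close and
the remaining arguments unchanged. Because stdin is inherited, the product
still flows from pqact to the Python script, and any traceback ends up in
the log file.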
Hope this helps,
-Mike
On Thu, Apr 23, 2026 at 2:06 PM Matthew Foster - NOAA Affiliate <
matthew.foster@xxxxxxxx> wrote:
> Thanks for the quick reply Mike.
>
> pqactcheck returns "syntactically correct"
>
> I've restarted LDM a couple of times this week in the process of trying to
> figure this out. We use a systemd script for stopping/starting that
> recreates the queue on start, so the queue is fresh. This machine (AWS
> instance, actually) has 16 CPU cores and 64 GB of RAM. RAM, CPU usage and
> disk space all look normal.
>
> I've administered LDMs for many years, and I've never had a pqact issue
> stump me like this one.
>
> Matt
>
>
> On Thu, Apr 23, 2026 at 12:49 PM Mike Zuranski <mike@xxxxxxxxxxxxxxxxxxx>
> wrote:
>
>> Hi Matt,
>>
>> Some basic troubleshooting questions:
>>
>> Does ldmadmin pqactcheck show any errors? When was the last time LDM was
>> restarted or the PQ remade? Does the health of your system (e.g. disk
>> space, memory, cpu) appear normal?
>>
>> -Mike
>>
>> On Thu, Apr 23, 2026 at 1:44 PM Matthew Foster - NOAA Affiliate <
>> matthew.foster@xxxxxxxx> wrote:
>>
>>> I have a PIPE action in our pqact.conf that was working...until it
>>> stopped. I can't figure out what changed. Putting pqact in DEBUG yielded
>>> the following log messages...
>>>
>>> 20260423T171759.226677Z pqact[3286699] palt.c:processProduct:1332 INFO 928481a80643522717d70b2fa139c28e 35526969 20260423171755.509410 NOTHER 1130006 TICY70 KNES 231707
>>> 20260423T171759.226808Z pqact[3286699] palt.c:prodAction:1217 DEBUG pipe: {cmd: "-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/20260423/17/TICY70_KNES_231707_1130006.2026042317 TICX99", ident: "TICY70 KNES 231707"}
>>> 20260423T171759.227206Z pqact[3286699] filel.c:pipe_open:1854 DEBUG 6 3295768
>>> 20260423T171759.227247Z pqact[3286699] filel.c:pipe_prodput:2174 DEBUG 6 TICY70 KNES 231707
>>> 20260423T171759.227414Z pqact[3295768] filel.c:pipe_open:1819 INFO Executing decoder "/usr/local/ldm/agents/storeAndIngestFile.py"
>>> 20260423T171759.292967Z pqact[3286699] filel.c:fl_removeAndFree:421 DEBUG Deleting failed PIPE entry: cmd="-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/20260423/17/TICY70_KNES_231707_1130006.2026042317 TICX99", pid=3295768
>>> 20260423T171759.293001Z pqact[3286699] filel.c:pipe_close:1901 DEBUG 6, 3295768
>>> 20260423T171759.293050Z pqact[3286699] pbuf.c:pbuf_flush:111 ERROR Broken pipe
>>> 20260423T171759.293070Z pqact[3286699] pbuf.c:pbuf_flush:111 ERROR Couldn't write to pipe: fd=6, len=4096, cmd="-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/20260423/17/TICY70_KNES_231707_1130006.2026042317 TICX99"
>>> 20260423T171759.293088Z pqact[3286699] palt.c:prodAction:1229 ERROR Couldn't process product: feedtype=NOTHER, pattern="^(TICY70) (KNES) (..)(..)(..)", action=pipe, args="-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/(\3:yyyy)(\3:mm)\3/\4/\1_\2_\3\4\5_(seq).%Y%m%d%H TICX99"
>>> 20260423T171759.293385Z pqact[3286699] filel.c:reap:3056 WARN Child 3295768 exited with status 1
>>>
>>> The pqact.conf entry is...
>>> ANY<TAB>^(TICY70) (KNES) (..)(..)(..)
>>> <TAB>PIPE<TAB>-close /usr/local/ldm/agents/storeAndIngestFile.py /data_store/polar/(\3:yyyy)(\3:mm)\3/\4/\1_\2_\3\4\5_(seq).%Y%m%d%H TICX99
>>>
>>> I also ran the Python script from the command line, and it runs without
>>> errors.
>>>
>>> I'm thinking a clue might be the presence of "-close" in the cmd entries
>>> above, but I can't see anything wrong with the pqact.conf entry.
>>>
>>> Also, multiple products/patterns that use this PIPE action and script
>>> are all failing with these errors.
>>>
>>> I appreciate any input!
>>>
>>> Matt
>>>
>>>
>>> --
>>> Matt Foster
>>> Sr. Application Development Met.
>>> NOAA Affiliate - KBR
>>>
>>>
>>> _________________________________________________________
>>> NOTE: All exchanges posted to NSF Unidata maintained email lists are
>>> made publicly available through the web. Users who post to any of the
>>> lists we maintain are reminded to remove any personal information that
>>> they do not want to be made public.
>>>
>>> NSF Unidata ldm-users Mailing List
>>> (ldm-users@xxxxxxxxxxxxxxxx)
>>> For list information, to unsubscribe, or change your membership options,
>>> visit: https://mailinglists.unidata.ucar.edu/listinfo/ldm-users/
>>>
>>
>