Re: [lead-all] Testing status

To: Everette Joseph <ejoseph@xxxxxxxxxx>
Subject: Re: [lead-all] Testing status
From: Suresh Marru <smarru@xxxxxxxxxxxxxx>
Date: Wed, 07 Mar 2007 19:18:29 -0500

Hi Everette,

When were these workflows submitted, yesterday or today? If it's today, 10 workflows were submitted: 7 finished fine, 1 WRF run went crazy, and 2 are running fine now and should finish up some time soon.

But as I was saying, the workflow success message we are seeing might be misleading. The workflow engine sends a success message if WRF reports that it is done; we are not checking whether it finished with good output or bad output. We do need to get the experiment information for those bad outputs from users and pass those data files to the met. group to refine the configurations. Not sure what the notification problem could be; I look forward to seeing their bug reports.


Thanks,
Suresh

Everette Joseph wrote:

LEAD - Master Mailing List
We submitted runs between 15z and 21z per the discussion on Monday. The failure rate seemed high and there were some weird experiences with output and notification -- your email explained the latter, I think. Our testers should be sending reports to "contact us".
EJ

Suresh Marru wrote:

Hi All,

Just want to give an update on how the test runs have been going so far this week.
The workflows are running well for most of the configurations. We certainly came down quite a bit from a 100% success rate, but I am not sure exactly how much, hopefully not below 90%. With a good amount of testing from all of you, we have found a problem which prevented you from launching a workflow this morning. Restarting a couple of services helped and everything is fine now, but we still have not resolved the underlying problem and will continue to look into why the unscheduled and unwanted downtime happened.
As you may have noticed, yesterday between morning and 3pm (EST) workflows failed due to a software change. We reverted that change, and all failed workflows were resurrected late in the evening and finished successfully, but we are not sure if they did a forecast or a postcast. The consequence of this will be duplicate data in your workspace for the experiments created yesterday.
We have seen some WRF runs going unstable in the past 2 days; not sure if they gave a workflow success or failed status. I did not completely understand the reasons Dan Weber and Brian Jewett have analyzed, but you guys might. Their analysis is: "Looking at the plots Brian generated from the earlier run, convection starts on the southern boundary, which usually means doom for a forecast, especially if it is on an inflow condition (which is the case for the run). No convection was visible in the ADAS analysis during the forecast area. It seems that the main problem is the placement of the lateral boundary in a region that is mountainous and where convection developed. This is one of the problems with limited domain forecasts, such as the 5km 201x201x51 grid" -- We still need to parse this into English and then make the changes to the data search or namelists.
All of this aside, please continue to use our resource reservations at NCSA from 1pm to 6pm (EST) and test and report problems as usual.
Happy Testing,
Suresh
