Metgrid problem (stall)

Forum dedicated to older versions of EMS package (WRFEMS v3.2, v3.1 or older). Support is user-to-user based, so please help others if you can.
Antonix
Posts: 260
Joined: Fri Oct 16, 2009 8:53 am

Metgrid problem (stall)

Post by Antonix » Thu Apr 14, 2011 2:03 pm

Hello everyone.

I'm running WRF EMS on 8 cores on a single machine, but I have a problem.

When I run the model, it stalls while running metgrid.

Metgrid does its job ... but the run does not proceed.
It continues only if I press Ctrl+C, after which it passes through the subsequent phases.

It is a very strange problem, and it does not happen every time.

It does not happen in the benchmark.

I tried reinstalling from scratch, but the model still stalls.

Do you have any ideas?


I am attaching the two situations:

WRF EMS Program ems_prep (V3.2.1.5.17.beta) started on TDM at Thu Apr 14 13:53:17 2011 UTC

The WRF EMS Says: "Who's Awesome? You're Awesome!"

* Initial and Boundary Condition Data Set


EMS DEBUG : Most Current Date - 2011041406

EMS DEBUG : Available Dates - 2011041406
EMS DEBUG : Available Dates - 2011041400
EMS DEBUG : Available Dates - 2011041318
EMS DEBUG : Available Dates - 2011041312

DATA SET : gfsptile
CONFIG FILE : /NMM/wrfems/conf/grib_info/gfsptile_gribinfo.conf
DESCRIPTION : GFS Model 0.5 degree personal tile - Isobaric coordinate - 3hourly (variable size)
VERTICAL COORD : press
VTABLE NAME : Vtable.GFS

METGRID TABLE : METGRID.TBL.NMM

TIME DEPENDENT : Yes
AVAIL CYCLES : 00 06 12 18
BEST AVAIL HOUR : No
DATA SET DELAY : 03 Hours
ALLOWED AGE : 00 Days

SIMULATION START: Thu Apr 14 06:00:00 2011 UTC
SIMULATION END : Fri Apr 15 06:00:00 2011 UTC

DATA SET DATE : Thu Apr 14 06:00:00 2011 UTC
DATA SET INIT HR: Thu Apr 14 06:00:00 2011 UTC
INIT FCST HR : 00
FINL FCST HR : 24
DATA PERIOD : 24 Hours
BC frequency : 03 Hourly

REQUESTED TILES : NA
AVAILABLE FILES :
DATA USED FOR : Initial and Boundary Conditions
LOCAL FILE NAME : /NMM/wrfems/runs/it/grib/YYMMDDCC.gfs.tCCz.pgrb2fFF
ARCHIVE DIR :

METHOD HOST LOCATION
----------------------------------------------------------------
http emsdata1.comet.ucar.edu /cgi-bin/ptile_ems.pl?YYMMDDCC.gfs.tCCz.pgrb2fFF
http emsdata2.comet.ucar.edu /cgi-bin/ptile_ems.pl?YYMMDDCC.gfs.tCCz.pgrb2fFF
http soostrc.comet.ucar.edu /cgi-bin/ptile_ems.pl?YYMMDDCC.gfs.tCCz.pgrb2fFF


I. WRF EMS ems_prep Model Initialization Summary

Initialization Start Time : Thu Apr 14 06:00:00 2011 UTC
Initialization End Time : Fri Apr 15 06:00:00 2011 UTC
Boundary Condition Frequency : 180 Minutes
Initialization Data Set : gfsptile
Boundary Condition Data Set : gfsptile
Static Surface Data Sets : None
Land Surface Data Sets : None

II. Search out requested files for WRF model initialization


* Locating gfsptile files for model initial and boundary conditions

* All requested gfsptile files are available for model initialization


Excellent! - Your master plan is working!


III. Create the WPS NMM intermediate format files

* Processing gfsptile files for use as model initial and boundary conditions - Dyn-o-mite!!

NMM core intermediate file processing completed in 4.16 seconds


IV. Horizontal interpolation of the intermediate files to the computational domain


Now, if I press Ctrl+C, the model carries on normally.

Thanks for the help.

Antonix
Posts: 260
Joined: Fri Oct 16, 2009 8:53 am

Re: metgrid problem (stall)

Post by Antonix » Thu Apr 14, 2011 2:04 pm

I'm using WRF EMS 3.2 on CentOS 5.5 Final ;)

wrfems
Posts: 8
Joined: Mon Jun 07, 2010 7:36 pm
Location: Boulder, Colorado - USA

Re: metgrid problem (stall)

Post by wrfems » Fri Apr 15, 2011 5:38 pm

Greetings Antonix,

I am aware of this problem as others have reported it, although I have only been able to replicate it recently.

It appears that the problem lies in metgrid. Specifically, MPI_Finalize is being called before all the communication has completed. This issue has appeared in EMS V3.2 beta because I switched to running metgrid in distributed memory.

I've made a change to the metgrid code and tested it overnight without any failures. The update will be available in the next beta release, which should occur very soon.

BTW - Do not use the same ems_install.pl routine used for EMS V3.1 or you will have problems.

Antonix
Posts: 260
Joined: Fri Oct 16, 2009 8:53 am

Re: metgrid problem (stall)

Post by Antonix » Fri Apr 15, 2011 9:41 pm

Hi Robert,

Thanks for your attention!

I'll wait for your changes!
In the meantime I'll try downloading version 3.1, which works perfectly!

Thanks again for your work

leroygr
Posts: 14
Joined: Mon Sep 05, 2011 12:13 pm

Re: Metgrid problem (stall)

Post by leroygr » Mon Sep 05, 2011 12:34 pm

Hello everybody!

I am trying to run a 1-month simulation with CFSR historical reanalysis data using WRF-EMS 3.2.1.5.34.beta. There is a problem in the preprocessing, probably during metgrid: the program doesn't write all the met*.nc files needed to run the simulation properly. Here is the metgrid log:

WARNING: In METGRID.TBL, FLAG_SST is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM000010 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM010040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM040100 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM100200 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SST is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST000010 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST010040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST040100 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST100200 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM000 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM005 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM020 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM160 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM300 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT000 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT005 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT020 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT160 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT300 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM000010 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM010040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM040100 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SM100200 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST000010 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST010040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST040100 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_ST100200 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM000 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM005 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM020 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM160 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILM300 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT000 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT005 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT020 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT040 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT160 is given as a flag more than once.
WARNING: In METGRID.TBL, FLAG_SOILT300 is given as a flag more than once.
Processing domain 1 of 2
 Processing 2010-03-01_00:00
    /home/ilab/wrfems/runs/Belgian_Test_case/wpsprd/CFSR
INFORM: GHT at level 200100.000000 already exists; leaving it alone.
INFORM: LANDSEA at level 200100.000000 already exists; leaving it alone.
INFORM: PSFC at level 200100.000000 already exists; leaving it alone.
INFORM: RH at level 200100.000000 already exists; leaving it alone.
INFORM: Going to create the field ST
INFORM: ST at level 10.000000 already exists; leaving it alone.
INFORM: ST at level 200.000000 already exists; leaving it alone.
INFORM: Couldn't find ST000007 at level 200100.000000 to fill level 7.000000 of ST.
INFORM: Couldn't find ST007028 at level 200100.000000 to fill level 28.000000 of ST.
INFORM: ST at level 100.000000 already exists; leaving it alone.
INFORM: Couldn't find ST100255 at level 200100.000000 to fill level 255.000000 of ST.
INFORM: Going to create the field SM
INFORM: SM at level 10.000000 already exists; leaving it alone.
INFORM: SM at level 200.000000 already exists; leaving it alone.
INFORM: Couldn't find SM000007 at level 200100.000000 to fill level 7.000000 of SM.
INFORM: Couldn't find SM007028 at level 200100.000000 to fill level 28.000000 of SM.
INFORM: SM at level 100.000000 already exists; leaving it alone.
INFORM: Couldn't find SM100255 at level 200100.000000 to fill level 255.000000 of SM.
INFORM: Going to create the field SW
INFORM: Couldn't find SW000010 at level 200100.000000 to fill level 1.000000 of SW.
INFORM: Couldn't find SW010040 at level 200100.000000 to fill level 2.000000 of SW.
INFORM: Couldn't find SW040100 at level 200100.000000 to fill level 3.000000 of SW.
INFORM: Couldn't find SW100200 at level 200100.000000 to fill level 4.000000 of SW.
INFORM: Couldn't find SW000010 at level 200100.000000 to fill level 1.000000 of SW.
INFORM: Couldn't find SW010200 at level 200100.000000 to fill level 2.000000 of SW.
INFORM: Going to create the field SOIL_LAYERS
INFORM: Going to create the field SOILM
INFORM: Couldn't find SOILM000 at level 200100.000000 to fill level 0.000000 of SOILM.
INFORM: Couldn't find SOILM005 at level 200100.000000 to fill level 5.000000 of SOILM.
INFORM: Couldn't find SOILM020 at level 200100.000000 to fill level 20.000000 of SOILM.
INFORM: Couldn't find SOILM040 at level 200100.000000 to fill level 40.000000 of SOILM.
INFORM: Couldn't find SOILM160 at level 200100.000000 to fill level 160.000000 of SOILM.
INFORM: Couldn't find SOILM300 at level 200100.000000 to fill level 300.000000 of SOILM.
INFORM: Going to create the field SOILT
INFORM: Couldn't find SOILT000 at level 200100.000000 to fill level 0.000000 of SOILT.
INFORM: Couldn't find SOILT005 at level 200100.000000 to fill level 5.000000 of SOILT.
INFORM: Couldn't find SOILT020 at level 200100.000000 to fill level 20.000000 of SOILT.
INFORM: Couldn't find SOILT040 at level 200100.000000 to fill level 40.000000 of SOILT.
INFORM: Couldn't find SOILT160 at level 200100.000000 to fill level 160.000000 of SOILT.
INFORM: Couldn't find SOILT300 at level 200100.000000 to fill level 300.000000 of SOILT.
INFORM: Going to create the field SOIL_LEVELS
INFORM: Going to create the field PRES
INFORM: PRES at level 200100.000000 already exists; leaving it alone.
INFORM: GHT at level 200100.000000 already exists; leaving it alone.
INFORM: LANDSEA at level 200100.000000 already exists; leaving it alone.
INFORM: PSFC at level 200100.000000 already exists; leaving it alone.
INFORM: RH at level 200100.000000 already exists; leaving it alone.
INFORM: Going to create the field ST
INFORM: ST at level 10.000000 already exists; leaving it alone.
INFORM: ST at level 200.000000 already exists; leaving it alone.
INFORM: Couldn't find ST000007 at level 200100.000000 to fill level 7.000000 of ST.
INFORM: Couldn't find ST007028 at level 200100.000000 to fill level 28.000000 of ST.
INFORM: ST at level 100.000000 already exists; leaving it alone.
INFORM: Couldn't find ST100255 at level 200100.000000 to fill level 255.000000 of ST.
INFORM: Going to create the field SM
INFORM: SM at level 10.000000 already exists; leaving it alone.
INFORM: SM at level 200.000000 already exists; leaving it alone.
INFORM: Couldn't find SM000007 at level 200100.000000 to fill level 7.000000 of SM.
INFORM: Couldn't find SM007028 at level 200100.000000 to fill level 28.000000 of SM.
INFORM: SM at level 100.000000 already exists; leaving it alone.
INFORM: Couldn't find SM100255 at level 200100.000000 to fill level 255.000000 of SM.
INFORM: Going to create the field SW
INFORM: Couldn't find SW000010 at level 200100.000000 to fill level 1.000000 of SW.
INFORM: Couldn't find SW010040 at level 200100.000000 to fill level 2.000000 of SW.
INFORM: Couldn't find SW040100 at level 200100.000000 to fill level 3.000000 of SW.
INFORM: Couldn't find SW100200 at level 200100.000000 to fill level 4.000000 of SW.
INFORM: Couldn't find SW000010 at level 200100.000000 to fill level 1.000000 of SW.
INFORM: Couldn't find SW010200 at level 200100.000000 to fill level 2.000000 of SW.
INFORM: Going to create the field SOIL_LAYERS
INFORM: Going to create the field SOILM
INFORM: Couldn't find SOILM000 at level 200100.000000 to fill level 0.000000 of SOILM.
INFORM: Couldn't find SOILM005 at level 200100.000000 to fill level 5.000000 of SOILM.
INFORM: Couldn't find SOILM020 at level 200100.000000 to fill level 20.000000 of SOILM.
INFORM: Couldn't find SOILM040 at level 200100.000000 to fill level 40.000000 of SOILM.
INFORM: Couldn't find SOILM160 at level 200100.000000 to fill level 160.000000 of SOILM.
INFORM: Couldn't find SOILM300 at level 200100.000000 to fill level 300.000000 of SOILM.
INFORM: Going to create the field SOILT
INFORM: Couldn't find SOILT000 at level 200100.000000 to fill level 0.000000 of SOILT.
INFORM: Couldn't find SOILT005 at level 200100.000000 to fill level 5.000000 of SOILT.
INFORM: Couldn't find SOILT020 at level 200100.000000 to fill level 20.000000 of SOILT.
INFORM: Couldn't find SOILT040 at level 200100.000000 to fill level 40.000000 of SOILT.
INFORM: Couldn't find SOILT160 at level 200100.000000 to fill level 160.000000 of SOILT.
INFORM: Couldn't find SOILT300 at level 200100.000000 to fill level 300.000000 of SOILT.
INFORM: Going to create the field SOIL_LEVELS
INFORM: Going to create the field PRES
INFORM: PRES at level 200100.000000 already exists; leaving it alone.
INFORM: Field LANDSEA.mask does not have a valid mask and will not be checked for missing values
INFORM: Field LANDSEA.mask does not have a valid mask and will not be checked for missing values
 Processing 2010-03-07_12:00
    /home/ilab/wrfems/runs/Belgian_Test_case/wpsprd/CFSR
INFORM: GHT at level 200100.000000 already exists; leaving it alone.
INFORM: LANDSEA at level 200100.000000 already exists; leaving it alone.
INFORM: PSFC at level 200100.000000 already exists; leaving it alone.
INFORM: RH at level 200100.000000 already exists; leaving it alone.
Timeout of 4 minutes expired; job aborted
[0]0:Return code = 0, signaled with Interrupt
[0]1:Return code = 0, signaled with Interrupt

Any ideas?

Thanks

Greg
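
For a long log like the one above, it can help to pull out just the time steps metgrid reached and any timeout message. The following is a minimal sketch, assuming the log lines look like the excerpt in this post; the function name `summarize_metgrid_log` and the sample text are hypothetical and are not part of WRF EMS.

```python
import re

# Patterns based on the log excerpt in this post (assumed to be representative).
PROCESSING = re.compile(r"^\s*Processing (\d{4}-\d{2}-\d{2}_\d{2}:\d{2})\s*$")
TIMEOUT = re.compile(r"Timeout of (\d+) minutes expired")


def summarize_metgrid_log(text):
    """Return (list of time steps seen, timeout in minutes or None)."""
    times = []
    timeout_minutes = None
    for line in text.splitlines():
        step = PROCESSING.match(line)
        if step:
            times.append(step.group(1))
            continue
        expired = TIMEOUT.search(line)
        if expired:
            timeout_minutes = int(expired.group(1))
    return times, timeout_minutes


# Hypothetical shortened sample built from lines in the log above.
sample = """Processing domain 1 of 2
 Processing 2010-03-01_00:00
INFORM: PRES at level 200100.000000 already exists; leaving it alone.
 Processing 2010-03-07_12:00
Timeout of 4 minutes expired; job aborted
"""

times, limit = summarize_metgrid_log(sample)
print("last time step reached:", times[-1])
print("timeout (minutes):", limit)
```

If the last time step reached is earlier than the end of the simulation period and a timeout is reported, the job was probably cut off rather than failing on the data itself.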

smartie
Posts: 94
Joined: Sat May 21, 2011 7:34 am

Re: Metgrid problem (stall)

Post by smartie » Mon Sep 05, 2011 5:45 pm

leroygr wrote: Hello everybody!

I try to run a 1 month simulation with CFSR historical reanalysis with WRF-EMS 3.2.1.5.34.beta. There is a problem in the preprocessing, probably during metgrid: the program doesn't write all the necessary met*.nc files to run the simulation properly... Here is the metgrid log:

This might be a timeout due to the new mpich in 3.2.
In the file Prep_metgrid.pm, around line 160, try increasing
my $maxtime = 9999;

to accommodate your metgrid processing wall-clock time.
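
To illustrate why a wall-clock limit matters here, below is a minimal Python sketch of a generic "run with a time limit" wrapper. It is not the EMS code (which is Perl) and it does not show how `$maxtime` is actually used in Prep_metgrid.pm; the function `run_with_limit` and the dummy `slow_job` command are assumptions made up for the illustration.

```python
import subprocess
import sys


def run_with_limit(cmd, max_seconds):
    """Run cmd and return (finished, returncode).

    finished is False when the wall-clock limit expired first, in which
    case subprocess.run kills the child process before raising.
    """
    try:
        done = subprocess.run(cmd, timeout=max_seconds)
        return True, done.returncode
    except subprocess.TimeoutExpired:
        return False, None


# Stand-in for a long preprocessing step: a child process that sleeps 2 seconds.
slow_job = [sys.executable, "-c", "import time; time.sleep(2)"]

print(run_with_limit(slow_job, 0.5))  # limit too short: the job is aborted
print(run_with_limit(slow_job, 10))   # limit long enough: the job completes
```

The same job either completes or is aborted depending only on the limit, which matches the "Timeout of 4 minutes expired; job aborted" message in the log: a one-month run may simply need more wall-clock time than the limit allows.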

David

leroygr
Posts: 14
Joined: Mon Sep 05, 2011 12:13 pm

Re: Metgrid problem (stall)

Post by leroygr » Mon Sep 12, 2011 11:45 am

Thanks a lot, it worked!
