allocated memory

Please post all issues and questions about the EMS v3.4 package here.
Larf
Posts: 16
Joined: Tue Dec 02, 2014 10:03 am
Location: Hamburg

allocated memory

Post by Larf » Mon Nov 23, 2015 2:06 pm

Hi there,
I'm having difficulties modelling an area with a time series longer than two weeks.

Background:
we're trying to model from CFSR GRIB2 files stored on our external HDD, because we're not allowed to have internet access on a Linux machine. The WRF EMS is linked to the GRIB files, and prep works fine.
We need to model a ten-year time series to derive long-term condition statistics at a site of interest. I therefore define the meso area with about 100x100 tiles of about 4 km and nest it down twice, by a ratio of three (to domain 2) and five (to domain 3). (I also tried several other variations, but that was the base case.) The setup always crashes after an hour or so. I sent the crash reports to Robert, but unfortunately he seems busy with the release. :-)

The RAM usage never seems to climb higher than a certain level (approx. 5 GB of 256 GB!). I also tried to split the work into smaller chunks by setting numtile = 19 (of 20 cores, one reserved for the system).

The shell always prints the same error: "...run failed due to segmentation fault on your system. This failure is typically caused when the EMS attempts to access a region of memory that has not been allocated. Most often ... due to array bounds..."

Is there anything else I can do?
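One common first check for a WRF segmentation fault like this, independent of the advice later in this thread, is the shell's stack-size limit: WRF places large automatic arrays on the stack, and a small default limit can trigger exactly this failure. A sketch, assuming bash on Linux:

```shell
# Show the current per-process stack limit (in kB); small values such as
# 8192 are a frequent cause of WRF segmentation faults.
ulimit -s

# Lift the limit for this shell session before launching the run
# (add the line to ~/.bashrc to make it persistent):
ulimit -s unlimited
```

If the crash persists with an unlimited stack, the fault usually lies elsewhere (time step, CFL violations, domain setup).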

meteoadriatic
Posts: 1512
Joined: Wed Aug 19, 2009 10:05 am

Re: allocated memory

Post by meteoadriatic » Mon Nov 23, 2015 7:33 pm

Hi, a segmentation fault usually has nothing to do with memory size, but it can.

Please look at the log directory inside the domain, especially the rsl.error.xxxx files that should appear there after the model crashes. Those files will probably give you more information on why the segmentation fault occurred. I presume it is because of CFL violations, but you really need to look there first.
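As a sketch of that check (the sample log line below is fabricated purely for illustration; real rsl.error.* files sit in the run's log directory, so in practice you would cd there instead):

```shell
# Create a scratch directory with a fake rsl.error file so these commands
# are runnable anywhere; replace this with the domain's actual log dir.
workdir=$(mktemp -d)
cd "$workdir"
printf 'd01 2010-10-15_00:12:00 12 points exceeded cfl=2 in domain d01\n' > rsl.error.0000

# Count CFL-violation messages across all rsl.error files:
grep -ci "cfl" rsl.error.*

# The last lines of each log usually show the actual crash reason:
tail -n 5 rsl.error.*
```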

Larf
Posts: 16
Joined: Tue Dec 02, 2014 10:03 am
Location: Hamburg

Re: allocated memory

Post by Larf » Tue Nov 24, 2015 10:16 am

Thanks again, meteoadriatic.
I don't have such a file, but I have lots of wrfm.00XX.err and wrfm.00XX.out files, all stating something like

63318944 bytes allocated

Then there is run_real.log, stating the same; nothing else looks conspicuous to me.


Those were the settings:

&share
wrf_core = 'ARW'
max_dom = 3
start_date = '2010-10-15_00:00:00'
end_date = '2011-02-13_18:00:00'
interval_seconds = 21600
io_form_geogrid = 2
opt_output_from_geogrid_path = '/home/lzieren/wrf/wrfems/runs/hhmast6/static/'
debug_level = 0
/

&geogrid
parent_id = 1, 1, 2
parent_grid_ratio = 1, 3, 5
i_parent_start = 1, 36, 32
j_parent_start = 1, 36, 32
e_we = 100, 88, 126
e_sn = 100, 88, 126
geog_data_res = '30s', '30s', '30s'
dx = 4700
dy = 4700
map_proj = 'lambert'
ref_lat = 53.518
ref_lon = 10.113
truelat1 = 53.518
truelat2 = 53.518
stand_lon = 10.113
ref_x = 50
ref_y = 50
geog_data_path = '/home/lzieren/wrf/wrfems/data/geog'
opt_geogrid_tbl_path = '/home/lzieren/wrf/wrfems/runs/hhmast6/static/'
/

&ungrib
out_format = 'WPS'
prefix = 'FILE'
/

&metgrid
fg_name = CFSR
io_form_metgrid = 2
opt_output_from_metgrid_path = '/data1/runs/hhmast6/wpsprd'
opt_metgrid_tbl_path = '/data1/runs/hhmast6'
process_only_bdy =
/

&mod_levs
press_pa = 201300, 200100, 100000, 95000, 90000, 85000, 80000, 75000, 70000, 65000, 60000, 55000, 50000, 45000, 40000, 35000, 30000, 25000, 20000, 15000, 10000, 5000, 1000
/

&domain_wizard
grib_data_path = 'null'
grib_vtable = 'Vtable.GFS'
dwiz_center_over_gmt = true
dwiz_desc = hhmast6
dwiz_gridpt_dist_km = 4.7
dwiz_latlon_linecolor = -8355712
dwiz_latlon_space_in_deg = 10
dwiz_map_horiz_scrollbar_pos = 697
dwiz_map_scale_pct = 12.5
dwiz_map_vert_scrollbar_pos = 0
dwiz_mpi_command = null
dwiz_name = hhmast6
dwiz_show_political = true
dwiz_user_rect_x1 = 1049
dwiz_user_rect_x2 = 1089
dwiz_user_rect_y1 = 190
dwiz_user_rect_y2 = 220
dwiz_modis = false
dwiz_tcvitals = null
dwiz_bigmap = Y
dwiz_lakes = false
/

Can I post anything else?

Larf
Posts: 16
Joined: Tue Dec 02, 2014 10:03 am
Location: Hamburg

Re: allocated memory

Post by Larf » Thu Dec 10, 2015 1:58 pm

Anyone?
I can't try any more fixes, since I had to give the workstation back (it was only borrowed to see whether WRF is of use for us).
However, your advice is still appreciated.

smartie
Posts: 94
Joined: Sat May 21, 2011 7:34 am

Re: allocated memory

Post by smartie » Thu Dec 10, 2015 9:50 pm

Are you directly initialising a 4700 m grid from CFSR data, as the namelist implies? You probably need an intermediate grid at, say, 25-15 km, if not two intermediate grids. Also, your domains at 100x100, 88x88 and 126x126 are on the small side for comfort; you should get the dimensions closer to 200x200. Circulations entering the domain need time and space to spin up, and small or badly placed grids can generate spurious waves that can cause a crash (even without obvious error messages).
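For illustration only, a &geogrid layout along these lines (the ratios and dimensions are untested guesses, not a verified setup) would step down from a CFSR-friendly outer grid to the 4.7 km target while keeping each domain near 200x200:

```
&geogrid
 parent_id         = 1,   1,   2
 parent_grid_ratio = 1,   3,   3
 e_we              = 200, 202, 202
 e_sn              = 200, 202, 202
 dx                = 42300   ! outer grid ~42.3 km; /3 -> 14.1 km; /3 -> 4.7 km
 dy                = 42300
/
```

Note that (e_we - 1) of each nest is divisible by its parent_grid_ratio here ((202 - 1) / 3 = 67), which WPS requires.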

meteoadriatic
Posts: 1512
Joined: Wed Aug 19, 2009 10:05 am

Re: allocated memory

Post by meteoadriatic » Thu Dec 10, 2015 10:14 pm

Another common fix is to increase the EPSSM value inside run_dynamics.conf to 0.5 or more (even up to 1.0) from the default value, which I believe is 0.1. Of course, the first thing to check is whether the time step is too large.
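As a sketch of that change (the syntax and default may differ between EMS versions; the per-domain values are illustrative):

```
# run_dynamics.conf -- one value per domain
# EPSSM controls off-centering of vertically propagating sound waves;
# raising it damps the oscillations that often trigger these crashes.
EPSSM = 0.5, 0.5, 0.5
```

For the time step, a common rule of thumb is roughly 6 x dx (dx in km) seconds, i.e. about 28 s for a 4.7 km grid; anything much larger invites CFL violations.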

