Use of Crash logs

Questions and other topics related to UEMS 15.
smartie
Posts: 94
Joined: Sat May 21, 2011 7:34 am

Re: Use of Crash logs

Post by smartie » Mon Jan 11, 2016 8:48 am

It should find 'localhost' or 127.0.0.1 (the loopback IP).

Further thoughts:
- Did you have wrfems installed previously, and was that OK?
- I notice from the log files that uems is installed in ~/ems_usrs/... Did you accept the default settings on installation, and who owns the installation? Did you install as root?

I've installed uems on two test machines with openSUSE 'Harlequin' and all is well:

Code: Select all

% sudo ./uems_install.pl --install 

Code: Select all

~/wrf/uems
ownership assigned to user 'david', i.e. the home user

I think choosing the install directory and owner yourself is always a better idea than accepting the defaults. You could have bad permissions...
Other than that, perhaps someone with better Linux administration skills can help (mine are strictly 'need to know').
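
For what it's worth, one way to look for ownership problems is to walk the install tree and list anything not owned by the expected user. This is only a sketch in Python and not part of uems; the function name `find_foreign_files` is made up for illustration, and the path and user in the usage note are just the ones from my own install.

```python
import os


def find_foreign_files(top, expected_uid, limit=20):
    """Return up to `limit` paths under `top` not owned by `expected_uid`."""
    foreign = []
    for root, dirs, files in os.walk(top):
        for name in dirs + files:
            path = os.path.join(root, name)
            try:
                # lstat inspects a symlink itself, so broken links don't raise
                uid = os.lstat(path).st_uid
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if uid != expected_uid:
                foreign.append(path)
                if len(foreign) >= limit:
                    return foreign
    return foreign
```

Something like `find_foreign_files(os.path.expanduser('~/wrf/uems'), pwd.getpwnam('david').pw_uid)` (after `import pwd`) should return an empty list if everything belongs to 'david'. Any paths it does return are worth a closer look.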

Howard
Posts: 61
Joined: Wed Aug 14, 2013 3:13 am

Re: Use of Crash logs

Post by Howard » Mon Jan 11, 2016 8:49 pm

I have been working on the ssh connection ...
The command
ssh localhost -p22 # trying to connect on port 22, then pressing Enter
returns the prompt
Password:

I haven't set up a password ... my /etc/ssh/sshd_config file has
PasswordAuthentication no

which I thought would fix it ... so I think I'm nearly there.
Platform: openSUSE desktop

Howard of the antipodes still in a fix

meteoadriatic
Posts: 1543
Joined: Wed Aug 19, 2009 10:05 am

Re: Use of Crash logs

Post by meteoadriatic » Mon Jan 11, 2016 9:01 pm

No ssh access is required; you don't need to set that up unless you're clustering over a network.

Do you have a hostname set? What is the output of the command

Code: Select all

hostname
?

Howard
Posts: 61
Joined: Wed Aug 14, 2013 3:13 am

Re: Use of Crash logs

Post by Howard » Mon Jan 11, 2016 9:36 pm

Hi

The simulation log run_real.log shows a
fatal MPI error ... with a lot of MPI issues, like "can't contact localhost".

The nocpu config file has
MPICHECK = 0 (changed from 1, which I guess is the setting for using clusters)
Should that be
MPICHECK =
i.e. left null?

I was looking at SSH because it is used for MPI.
Your comments? I will post run_real.log.

Howard

Howard
Posts: 61
Joined: Wed Aug 14, 2013 3:13 am

Re: Use of Crash logs

Post by Howard » Mon Jan 11, 2016 9:42 pm

Terminal output from the crash:

Code: Select all

Starting UEMS Program ems_run (V15.52.3) on linux-meso at Mon Jan 11 21:37:26 2016 UTC


     I.  Preparing your EMS Run experience


         *  You are running the WRF ARW core. Hey Ho! Let's go! - model'n!

         *  Simulation start and end times:

              Domain         Start                   End            Parent

                1     2016-01-09_12:00:00     2016-01-10_00:00:00      
                2     2016-01-09_12:00:00     2016-01-10_00:00:00     1

         *  Simulation length will be 12 hours

         *  A large timestep of 100 seconds will be used for this simulation

         *  This is a 1-way nested simulation


    II.  Creating the initial and boundary condition files for the user domain(s)


         *  The WRF REAL program shall be run on the following systems and processors:

              3  processors on linux-meso     (1 tile per processor)

         *  Creating WRF initial and boundary condition files  - Failed (15)

            System Signal Code (SN) : 15 (Termination signal)


         !  UGH! Creation of model initial and boundary conditions failed!            
            
            I hate when this #%^!#!!% happens.  Hopefully nobody lost an eye!

         *  System information is available in static/ems_system.info


         !  Here's a little help from your friend at EMS world headquarters:

            The log files from your premature termination have been neatly bundled in log/2016011121.real_crash_logs.tgz

            Feel free to send them to a person who cares should you need some
            assistance in troubleshooting this problem.



    Your EMS party was busted at Mon Jan 11 21:37:42 2016 UTC - Ya know, 'cause stuff just happens

  The EMS Metaphysician says: "Think Globally, Model Locally!"
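
The bundle named in that output can be inspected without unpacking it by hand. The following is a minimal sketch using Python's standard `tarfile` module. The helper names are invented for illustration, and the names of the files inside the bundle are an assumption, since the output above does not list them.

```python
import tarfile


def list_crash_logs(bundle_path):
    """Return (name, size) for each regular file in a .tgz crash-log bundle."""
    with tarfile.open(bundle_path, "r:gz") as tar:
        return [(m.name, m.size) for m in tar.getmembers() if m.isfile()]


def read_log(bundle_path, member_name, max_bytes=4000):
    """Return the first `max_bytes` of one file in the bundle as text."""
    with tarfile.open(bundle_path, "r:gz") as tar:
        handle = tar.extractfile(member_name)  # KeyError if the name is absent
        if handle is None:
            raise ValueError(f"{member_name} is not a regular file")
        return handle.read(max_bytes).decode("utf-8", errors="replace")
```

For example, `list_crash_logs('log/2016011121.real_crash_logs.tgz')` gives the file names to pass to `read_log`.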

And the run_real.log

Code: Select all

Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(500)..............: 
MPID_Init(190).....................: channel initialization failed
MPIDI_CH3_Init(89).................: 
MPID_nem_init(320).................: 
MPID_nem_tcp_init(173).............: 
MPID_nem_tcp_get_business_card(420): 
MPID_nem_tcp_init(379).............: gethostbyname failed, linux-meso (errno 1)
This latest run is with MPICHECK = ... but the results are the same with 0 and 1 ...
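
The `gethostbyname failed` line suggests the machine cannot resolve its own hostname. That can be tested outside of MPI. Here is a minimal sketch with Python's standard `socket` module; the helper names `resolve` and `report` are made up for illustration.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address for `hostname`, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except (OSError, UnicodeError):
        return None


def report(hostname, address):
    """Format a one-line summary of a lookup result."""
    if address is None:
        return f"{hostname}: not resolved"
    return f"{hostname}: resolves to {address}"


if __name__ == "__main__":
    name = socket.gethostname()
    print(report(name, resolve(name)))
```

If this reports "not resolved" for the machine's own hostname, the lookup inside MPI_Init is probably failing for the same reason.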

Howard of the antipodes

Howard
Posts: 61
Joined: Wed Aug 14, 2013 3:13 am

Re: Use of Crash logs

Post by Howard » Mon Jan 11, 2016 9:43 pm

Hostname:
linux-meso

... I am using openSUSE ... so I have fixed the hostname.

Now:
howard@linux-workstation ... so I'm thinking the hostname is not the issue, as
it still falls over in ems_run ... while setting the initial and boundary conditions.

Interestingly, the run_real.log file is now blank ... I guess that means no issues there, and the hostname settings needed tweaking.

Howard

Howard
Posts: 61
Joined: Wed Aug 14, 2013 3:13 am

Re: Use of Crash logs

Post by Howard » Mon Jan 11, 2016 10:29 pm

Hi
During install I set it up as a uems_user directory ... (I thought usr1 was a bit too generic).
... I noticed the latest version gets all the environment variables set up in the .cshrc file, so I don't think there is any issue
with the directory.

Howard

meteoadriatic
Posts: 1543
Joined: Wed Aug 19, 2009 10:05 am

Re: Use of Crash logs

Post by meteoadriatic » Mon Jan 11, 2016 10:31 pm

Howard wrote:Hi

The simulation log run_real.log shows a
fatal MPI error ... with a lot of MPI issues, like "can't contact localhost".

The nocpu config file has
MPICHECK = 0 (changed from 1, which I guess is the setting for using clusters)
Should that be
MPICHECK =
i.e. left null?

I was looking at SSH because it is used for MPI.
Your comments? I will post run_real.log.

Howard
No, I believe you're wasting time looking in the wrong place. EMS runs fine without any ssh on the computer: ssh can even be uninstalled, port 22 closed, and so on. None of that is needed if you run only on localhost and don't use a network of multiple computers.

I thought the hostname would be the issue, but it looks like it is not.

Do you have anything unusual on the computer, like SELinux or similar security-enforcing software?

smartie
Posts: 94
Joined: Sat May 21, 2011 7:34 am

Re: Use of Crash logs

Post by smartie » Mon Jan 11, 2016 11:08 pm

I agree ssh should have nothing to do with it.

Have you actually checked /etc/hosts to see if it has the correct entries? What does

Code: Select all

% cat /etc/hosts
return?
Note that on my openSUSE machine I get

Code: Select all

# IP-Address  Full-Qualified-Hostname  Short-Hostname
#

127.0.0.1       localhost

# special IPv6 addresses
::1             localhost ipv6-localhost ipv6-loopback

fe00::0         ipv6-localnet
..............................
127.0.0.2       linux-vnhu.site linux-vnhu
The nearest thing I could find is
http://strc.comet.ucar.edu/library/list ... 01390.html
Maybe you should do a completely clean install as root?
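
To check whether a given hostname actually appears in /etc/hosts, the file can be parsed in a few lines. A rough sketch in Python follows. The function name `parse_hosts` is just for illustration, and it assumes the usual "address name alias..." layout shown above.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its list of addresses."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # hostnames are case-insensitive, so store them lowercased
            table.setdefault(name.lower(), []).append(address)
    return table
```

With that, `parse_hosts(open('/etc/hosts').read()).get('workstation1')` returns `None` when the hostname has no entry, which would be consistent with a failing lookup.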
Last edited by smartie on Mon Jan 11, 2016 11:33 pm, edited 1 time in total.

Howard
Posts: 61
Joined: Wed Aug 14, 2013 3:13 am

Re: Use of Crash logs

Post by Howard » Mon Jan 11, 2016 11:22 pm

Updated the hostname ... to fewer than 15 characters.
The crash continues.
The errors in run_real.log ... continue the theme of an MPI issue:

Code: Select all

Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(500)..............: 
MPID_Init(190).....................: channel initialization failed
MPIDI_CH3_Init(89).................: 
MPID_nem_init(320).................: 
MPID_nem_tcp_init(173).............: 
MPID_nem_tcp_get_business_card(420): 
MPID_nem_tcp_init(379).............: gethostbyname failed, workstation1 (errno 1)
namelist_real

Code: Select all

&time_control
 start_year                 = 2016, 2016
 start_month                = 01, 01
 start_day                  = 09, 09
 start_hour                 = 12, 12
 start_minute               = 00, 00
 start_second               = 00, 00
 end_year                   = 2016, 2016
 end_month                  = 01, 01
 end_day                    = 10, 10
 end_hour                   = 00, 00
 end_minute                 = 00, 00
 end_second                 = 00, 00
 interval_seconds           = 10800
 input_from_file            = T, T
 history_interval           = 60, 30
 history_outname            = 'wrfout_d<domain>_<date>'
 frames_per_outfile         = 1, 1
 io_form_history            = 2
 io_form_input              = 2
 io_form_restart            = 2
 io_form_boundary           = 2
 io_form_auxinput2          = 2
 restart                    = F
 output_ready_flag          = F
 auxhist1_interval          = 0, 0
 auxinput4_inname           = "wrflowinp_d<domain>"
 auxinput4_interval         = 360, 360
 io_form_auxinput4          = 2
 fine_input_stream          = 0, 2
 adjust_output_times        = T
 use_netcdf_classic         = T
/

&domains
 time_step                  = 100
 time_step_fract_num        = 0
 time_step_fract_den        = 10
 time_step_dfi              = 60
 max_dom                    = 2
 s_we                       = 1, 1
 e_we                       = 100, 202
 s_sn                       = 1, 1
 e_sn                       = 125, 301
 s_vert                     = 1, 1
 e_vert                     = 45, 45
 dx                         = 19200.0000, 6400.0000
 dy                         = 19200.0000, 6400.0000
 grid_id                    = 1, 2
 parent_id                  = 1, 1
 i_parent_start             = 1, 17
 j_parent_start             = 1, 13
 parent_grid_ratio          = 1, 3
 parent_time_step_ratio     = 1, 3
 feedback                   = 0
 smooth_option              = 0
 grid_allowed               = T, T
 max_dz                     = 1000.
 numtiles                   = 1
 nproc_x                    = -1
 nproc_y                    = -1
 hypsometric_opt            = 2
 num_metgrid_soil_levels    = 4
 num_metgrid_levels         = 27
 interp_type                = 2
 extrap_type                = 2
 t_extrap_type              = 2
 use_levels_below_ground    = T
 use_surface                = T
 lagrange_order             = 9
 zap_close_levels           = 500
 force_sfc_in_vinterp       = 2
 sfcp_to_sfcp               = T
 smooth_cg_topo             = T
 rh2qv_wrt_liquid           = T
 rh2qv_method               = 2
 p_top_requested            = 5000
 interp_method_type         = 2
 maxw_above_this_level      = 25000
 trop_horiz_pres_diff       = 7500
 maxw_horiz_pres_diff       = 7500
 use_maxw_level             = T
 use_trop_level             = T
 use_adaptive_time_step     = F
/

&dfi_control
 dfi_opt                    = 0
/

&physics
 cu_physics                 = 11, 0
 cudt                       = 5, 5
 kfeta_trigger              = 1
 cu_diag                    = 0
 mp_physics                 = 2, 2
 do_radar_ref               = 1
 shcu_physics               = 0, 0
 bl_pbl_physics             = 1, 1
 bldt                       = 0, 0
 grav_settling              = 0, 0
 topo_wind                  = 0, 0
 ysu_topdown_pblmix         = 0
 scalar_pblmix              = 1, 1
 tracer_pblmix              = 1, 1
 sf_sfclay_physics          = 1, 1
 isfflx                     = 1
 sf_surface_physics         = 2, 2
 num_land_cat               = 24
 num_soil_cat               = 16
 num_soil_layers            = 4
 surface_input_source       = 1
 usemonalb                  = T
 rdmaxalb                   = T
 rdlai2d                    = F
 ua_phys                    = T
 sf_surface_mosaic          = 1
 mosaic_cat                 = 3
 sf_urban_physics           = 0, 0
 ra_lw_physics              = 1, 1
 ra_sw_physics              = 2, 2
 radt                       = 18, 18
 swint_opt                  = 1
 ra_call_offset             = 0
 slope_rad                  = 0, 0
 topo_shading               = 0, 0
 cu_rad_feedback            = T, T
 icloud                     = 1
 sst_skin                   = 0
 seaice_threshold           = 271
 seaice_albedo_opt          = 1
 seaice_albedo_default      = 0.65
/

&noah_mp
/

&dynamics
 non_hydrostatic            = T, T
 rk_ord                     = 3
 h_mom_adv_order            = 5, 5
 h_sca_adv_order            = 5, 5
 v_mom_adv_order            = 3, 3
 v_sca_adv_order            = 3, 3
 moist_adv_opt              = 1, 1
 moist_adv_dfi_opt          = 0
 scalar_adv_opt             = 1, 1
 momentum_adv_opt           = 1, 1
 tke_adv_opt                = 1, 1
 diff_opt                   = 1
 km_opt                     = 4
 km_opt_dfi                 = 1
 w_damping                  = 1
 diff_6th_opt               = 0, 0
 diff_6th_factor            = 0.25, 0.25
 damp_opt                   = 0
 zdamp                      = 5000., 5000.
 time_step_sound            = 0, 0
 smdiv                      = 0.1, 0.1
 emdiv                      = 0.01, 0.01
 epssm                      = 0.1, 0.1
/

&scm
 scm_force                  = 0
/

&fdda
 grid_fdda                  = 0
/

&tc
 insert_bogus_storm         = F
/

&fire
/

&bdy_control
 spec_bdy_width             = 5
 spec_zone                  = 1
 relax_zone                 = 4
 spec_exp                   = 0
 specified                  = T, F
 nested                     = F, T
/

&stoch
 skebs                      = 0
/

&grib2
/

&namelist_quilt
 nio_tasks_per_group        = 0
 nio_groups                 = 1
/

&diags
 p_lev_diags                = 0
/

&afwa
 afwa_diag_opt              = 0
/

&logging
 compute_slaves_silent      = T
 io_servers_silent          = T
 stderr_logging             = 0
/
ems_system.info

Code: Select all

           Basic System Information for workstation1
           
               System Date           : Mon Jan 11 23:14:23 2016 UTC
               System Hostname       : workstation1
               System Address        : Not Resolved
           
               System OS             : Linux
               Linux Distribution    : openSUSE 42.1 (x86_64)
           VERSION = 42.1
           CODENAME = Malachite
           # /etc/SuSE-release is deprecated and will be removed in the future, use /etc/os-release instead
               OS Kernel             : 4.1.13-5-default
               Kernel Type           : x86_64
           
           Processor and Memory Information for workstation1
           
               CPU Name              : Intel(R) Core(TM) i7-4820K CPU @ 3.70GHz
               CPU Instructions      : sandybridge
               CPU Type              : 64-bit
               CPU Speed             : 1204.23 MHz
           
               EMS Determined Processor Count
                   Physical CPUs     : 1
                   Cores per CPU     : 4
                   Total Processors  : 4
           
               Hyper-Threading       : Off
                 
               System Memory         : 31.3 Gbytes
           
           EMS Release Information for workstation1
           
               EMS Release           : 15.52.3,WRF3.7.1
               EMS Binaries          : x64
The unresolved system address ("System Address : Not Resolved") would appear to be a problem.
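
That check can be automated by reading the "Key : Value" lines from static/ems_system.info. This is only a sketch in Python; the helper names are hypothetical, and it assumes the file keeps the layout shown above.

```python
def parse_system_info(text):
    """Collect 'Key : Value' pairs from ems_system.info style text."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # only lines that contain the ' : ' separator
            info[key.strip()] = value.strip()
    return info


def address_resolved(info):
    """Return False when the system address is missing or 'Not Resolved'."""
    return info.get("System Address", "Not Resolved") != "Not Resolved"
```

For the file above, `address_resolved(parse_system_info(text))` would be `False`.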

Howard :D
