UEMS gives System Signal Code 4 Illegal Instruction during Benchmark

With the upcoming inclusion of the NEMS NMM-B model in the EMS package, WRF EMS is being renamed UEMS.

Post by ryip » Thu Aug 10, 2017 9:50 pm

Hi, new WRF user here. I just installed UEMS and am trying to run the benchmark, but it fails with a "Signal Code 4 (Illegal Instruction)" error. This is a fresh CentOS 7 install.

$ ems_run --domain 2

  Starting UEMS Program ems_run (V15.99.8) on 20e379678867 at Thu Aug 10 21:08:20 2017 UTC

which: no mail in (.:/home/uems/strc:/home/uems/strc/EMSbin:/home/uems/domwiz/bin:/home/uems/bin:/home/uems/util/grads/data/bin:/home/uems/util/bin:/home/uems/util/grads/data:/home/uems/util/mpich2/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/emsuser/.local/bin:/home/emsuser/bin)

     I.  Preparing your EMS Run experience


         *  You are running the WRF ARW core 18-19 August 2008 benchmark case - Always a wise decision

         *  Simulation start and end times:

              Domain         Start                   End            Parent

                1     2011-04-27_06:00:00     2011-04-28_12:00:00
                2     2011-04-27_06:00:00     2011-04-28_12:00:00     1

         *  Simulation length will be 30 hours

         *  A large timestep of 90 seconds will be used for this simulation

         *  This is a 1-way nested simulation


    II.  Creating the initial and boundary condition files for the user domain(s)


         *  The WRF REAL program shall be run on the following systems and processors:

              4  processors on 20e379678867     (1 tile per processor)

         *  Creating WRF initial and boundary condition files  - Failed (132)

            System Signal Code (SN) : 4 (Illegal Instruction)


         !  UGH! Creation of model initial and boundary conditions failed!

            I hate when this #%^!#!!% happens.  Hopefully nobody lost an eye!

         !  While perusing the log/run_real.log file I determined the following:

               It appears that your run failed due to an illegal instruction error on
               your system. This failure is typically caused when the EMS binaries being
               executed are compiled for a CPU architecture that is different from the
               one on this machine.

               If this is a stand-alone system then simply run the "sysinfo" command and
               note the "CPU Instructions" information provided.

               If you are running on a cluster, use the "netcheck" utility and note the
               the "CPU Instructions" information for each machine.

               After you have determined the appropriate binary set for your system then
               use the "--binonly" option to replace the EMS binaries:  %  ems_install.pl
               --binonly <taget CPU>

               More information may be found on the STRC site:

               http://strc.comet.ucar.edu

               or you can contact Robert.Rozumalski@noaa.gov to get a new set of
               binaries.


         !  Here are the last few lines from the run_real.log file:

             starting wrf task 0 of 4
             starting wrf task 2 of 4
             starting wrf task 1 of 4
             starting wrf task 3 of 4

            ===================================================================================
            = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
            = PID 5615 RUNNING AT 20e379678867
            = EXIT CODE: 132
            = CLEANING UP REMAINING PROCESSES
            = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
            ===================================================================================
            YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Illegal instruction (signal 4)
            This typically refers to a problem with your application.
            Please see the FAQ page for debugging suggestions



         *  Benchmark information is available in static/ems_benchmark.info


         !  Here's a little help from your friend at EMS world headquarters:

            The log files from your premature termination have been neatly bundled in log/2017081021.real_crash_logs.tgz

            Feel free to send them to a person who cares should you need some
            assistance in troubleshooting this problem.



    Your EMS party was busted at Thu Aug 10 21:09:07 2017 UTC - Ya know, 'cause stuff just happens

  As Shields and Yarnell loved to gesticulate: "Think Globally, Model Locally!"
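
A note on the "Failed (132)" and "EXIT CODE: 132" lines in the log above: by common shell convention, an exit status above 128 means the process was killed by a signal whose number is the status minus 128. The short Python sketch below illustrates that arithmetic. The function name decode_exit_status is made up for this example and is not part of UEMS.

```python
import signal


def decode_exit_status(status: int) -> str:
    """Describe a shell-style exit status.

    Assumes the usual convention that a status above 128 means the
    process was terminated by signal number (status - 128).
    """
    if status > 128:
        signum = status - 128
        try:
            return signal.Signals(signum).name
        except ValueError:
            # The number does not map to a signal known on this platform.
            return f"unknown signal {signum}"
    return f"exit code {status}"


if __name__ == "__main__":
    # 132 - 128 = 4, which is SIGILL (illegal instruction) on Linux.
    print(decode_exit_status(132))
```

This matches the "Illegal instruction (signal 4)" string that the MPI launcher reports further down in the same log.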

Here is the output of sysinfo:

$ sysinfo

  Starting UEMS Program sysinfo (V15.99.8) on 20e379678867 at Thu Aug 10 21:11:14 2017 UTC



     *  Gathering information for localhost 20e379678867
        ------------------------------------------------------------------

          System Information for 20e379678867

              System Date           : Thu Aug 10 21:11:15 2017 UTC
              System Hostname       : 20e379678867
              System Address        : 172.17.0.4

              System OS             : Linux
              OS Kernel             : 3.10.0-514.26.2.el7.x86_64
              Kernel Type           : x86_64
              Linux Distribution    : CentOS Linux release 7.3.1611 (Core)

          Network Interface Information for 20e379678867

              Network Interface     : sh
              Interface Address     : None Assigned
              Address Resolves to   : Nothing
              Interface State       : Inactive

          Processor and Memory Information for 20e379678867

              CPU Name              : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
              CPU Instructions      : nehalem
              CPU Type              : 64-bit
              CPU Speed             : 1600 MHz

              EMS Determined Processor Count
                  Physical CPUs     : 1
                  Cores per CPU     : 4
                  Total Processors  : 4

              EMS.cshrc Specified Processor Count
                  Physical CPUs     : 1
                  Cores per CPU     : 4
                  Total Processors  : 4

              Hyper-Threading       : On

          Note: Attempting to use virtual "Hyper-threaded" CPUs while
          running the EMS may result in a degradation in performance.

              System Memory         : 5.6 Gbytes

          EMS User Information for emsuser on 20e379678867

              User  ID              : 1000
              Group ID              : 1000
              Home Directory        : /home/emsuser
              Home Directory Mount  : Local
              User Shell            : /bin/bash
              Shell Installed       : Yes
              Shell Login Files     :
              EMS.cshrc Sourced     :
              EMS.cshrc Port Range  : None Defined

          EMS Installation Information for 20e379678867

              EMS Release           : 15.99.8,WRF3.7.1
              EMS Home Directory    : /home/uems
              EMS Home Mount        : Local
              EMS User ID           : 1000
              EMS Group ID          : 1000
              EMS Binaries          : x64

              EMS Run Directory     : /home/uems/runs
              EMS Run Dir Mount     : Local
              EMS Run Dir User ID   : 1000
              EMS Run Dir Group ID  : 1000

              Run Dir Avail Space   : 1604.96 Gb
              Run Dir Space Used    : 8%

              EMS Util Directory    : /home/uems/util

Did I install it correctly via "perl uems_install.pl --install"?

Thanks in advance for any help you can provide.


Post by ryip » Fri Aug 11, 2017 2:58 am

I don't think an Intel i7 920 is so old that it lacks the instructions WRF needs to complete the benchmark. Either the CPU really is too old, or I have installed the wrong binaries.


Post by ryip » Fri Aug 11, 2017 10:25 pm

I have solved my own issue: it seems the i7 920 is too old and lacks some of the instructions that the installed UEMS WRF binaries need. I tried the same setup on an i7 4770K CPU and it works great.
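
For anyone hitting the same error, one way to compare two machines is to look at the "flags" line in /proc/cpuinfo on Linux. The sketch below is only an illustration: the thread never identifies which instruction triggered the fault, so the list of extensions checked here (sse4_2, avx, avx2) is an assumption, chosen because the Nehalem-based i7 920 has no AVX support while the Haswell-based i7 4770K supports both AVX and AVX2. The names EXTENSIONS_OF_INTEREST, parse_cpu_flags and report are hypothetical.

```python
from pathlib import Path

# Assumed list of extensions worth checking; the thread does not say which
# ones the UEMS binaries actually require.
EXTENSIONS_OF_INTEREST = ("sse4_2", "avx", "avx2")


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flag names from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(flags: set[str]) -> dict[str, bool]:
    """Map each extension of interest to whether the CPU advertises it."""
    return {ext: ext in flags for ext in EXTENSIONS_OF_INTEREST}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # Only present on Linux systems.
        for ext, present in report(parse_cpu_flags(cpuinfo.read_text())).items():
            print(f"{ext:8s} {'yes' if present else 'no'}")
```

If an extension shows "no" on the failing machine and "yes" on the working one, that is a plausible candidate for the missing instructions, although it does not prove which instruction the binary actually used.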
