Benchmark Test case 27/04/2011 WRF-ARW

ddale81
Posts: 11
Joined: Sun Feb 18, 2018 8:07 pm

Benchmark Test case 27/04/2011 WRF-ARW

Post by ddale81 » Sun Oct 27, 2019 8:41 pm

Hi, I'm running the test case to check the execution time. I attach a screenshot of CPU and RAM usage and the benchmark result.
Why can't the run use more RAM? Can you help me?
I did the automatic installation via the uems_install.pl file.
Thanks, Ale
https://ibb.co/QHY8x88

Code:

 *  Just because it's all about you:

              Basic System Information for pv
              
                  System Date           : Sun Oct 27 20:37:55 2019
                  System Hostname       : pv
                  System Address        : None available
              
                  System OS             : Linux
                  Linux Distribution    : Ubuntu 18.04.3 LTS
                  OS Kernel             : 4.15.0-66-generic
                  Kernel Type           : 0-66-generic
              
              Processor and Memory Information for pv
              
                  CPU Name              : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
                  CPU Microarchitecture : sandybridge
                  CPU Type              : x86_64
                  CPU Speed             : 2.593 MHz
              
                  UEMS Determined Processor Count
                      Sockets           : 2
                      Cores per Socket  : 8
                      Total Cores       : 16
              
                  Hyper-Threading       : Off   
              
                  System Memory         : 96 Gbytes
              
              UEMS Release Information for pv
              
                  UEMS Release          : 19.8.1
                  UEMS WRF Release      : 4.1.2
                  UEMS Binaries         : x64
               
              A summary of nodes and processors used for the benchmark simulation:
               
                16 Processors on pv       
              --------------------------------
               
                  *  16 Total Processors
                  *  1  Tile per Processor
                  *  1 x 16 Domain Decomposition


           ☺  Benchmark simulation accomplished in 27 minutes 22 seconds

meteoadriatic
Posts: 1600
Joined: Wed Aug 19, 2009 10:05 am

Re: Benchmark Test case 27/04/2011 WRF-ARW

Post by meteoadriatic » Thu Oct 31, 2019 6:20 pm

wrf.exe will use as much RAM as it needs to hold the domain data and do calculations on it. So, depending on your domain grid size, number of vertical levels, physics options and so on, usage will be higher or lower, but the rest of the RAM will stay free; there is no point in grabbing RAM and then not using it.

ddale81
Posts: 11
Joined: Sun Feb 18, 2018 8:07 pm

Re: Benchmark Test case 27/04/2011 WRF-ARW

Post by ddale81 » Tue Nov 05, 2019 8:37 am

Thanks for your answer, meteoadriatic. OK, I understand how the RAM is used.
Another question about the test case...
I have an HP ProLiant DL380p G8 with Ubuntu Desktop installed, and the test case runs in about 27 minutes. Compared with other results that seems a bit high. Could the operating system be affecting it? I have already tried using 14 or 15 cores to leave some headroom for the kernel, as suggested by Robert R., but the execution time does not improve.

meteoadriatic
Posts: 1600
Joined: Wed Aug 19, 2009 10:05 am

Re: Benchmark Test case 27/04/2011 WRF-ARW

Post by meteoadriatic » Tue Nov 05, 2019 11:46 pm

That's completely OK; the run time does not depend on the OS or kernel.

Depending on the CPU, you can also try an even smaller number of threads. Why? Because if you have significant turbo boost, it will probably kick in at full potential only when you use fewer cores. So, depending on the case, you might run faster at a higher frequency with fewer cores than at a lower frequency with more cores. And it is not that simple, because different core counts get decomposed into patches differently. For example, 10 cores can decompose as 1x10, 10x1, 2x5 and 5x2, and all those combinations will yield different speeds as far as I can tell. But 11 cores can decompose only into 1x11 and 11x1, which are both not very optimal (2x5 is better, for example, or 3x3 if 9 cores are used, but again this is highly case dependent). You can play with those numbers and see what gives you the fastest runs.
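
To make the decomposition point concrete, here is a minimal Python sketch that lists every way a core count can be split into a two-factor patch grid. The function name `decompositions` is hypothetical and is not part of UEMS or WRF; it only enumerates the candidates, and which layout actually runs fastest still has to be measured on your own hardware and domain.

```python
def decompositions(ncores):
    """Return every (a, b) pair of positive integers with a * b == ncores."""
    return [(a, ncores // a) for a in range(1, ncores + 1) if ncores % a == 0]


if __name__ == "__main__":
    # Core counts mentioned in this thread; a prime count such as 11
    # only offers the two strip layouts 1 x 11 and 11 x 1.
    for n in (9, 10, 11, 12, 16):
        layouts = ", ".join(f"{a}x{b}" for a, b in decompositions(n))
        print(f"{n:2d} cores: {layouts}")
```

A core count with more factor pairs simply gives you more layouts to try; it does not guarantee that any of them is faster.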

Pro tip: don't wait for the run to complete. Watch the rsl.out.0000 file (tail -F rsl.out.0000) right at the beginning of integration and note the number of seconds used for each time step. Tune your numbers so that this elapsed time per time step is minimized. That way you only need to run a few time steps before you Ctrl-C the run, change a parameter and start over.
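
If you would rather compare trials by a single number than by eye, the sketch below averages the per-step timings from rsl.out.0000. It assumes the timing lines look like "Timing for main: time <timestamp> on domain   1:    1.23456 elapsed seconds"; check that pattern against your own file, since it is an assumption here and not taken from this thread. The helper names `step_times` and `mean_step_time` are hypothetical. The first step is skipped by default on the assumption that it includes start-up overhead.

```python
import re
import sys

# Assumed format of a WRF timing line (verify against your own rsl.out.0000).
TIMING = re.compile(
    r"Timing for main: time (\S+) on domain\s+(\d+):\s+([\d.]+) elapsed seconds"
)


def step_times(lines, domain=1):
    """Collect elapsed seconds for every timing line of the given domain."""
    times = []
    for line in lines:
        match = TIMING.search(line)
        if match and int(match.group(2)) == domain:
            times.append(float(match.group(3)))
    return times


def mean_step_time(lines, domain=1, skip=1):
    """Average elapsed seconds per step, ignoring the first `skip` steps."""
    times = step_times(lines, domain)[skip:]
    return sum(times) / len(times) if times else None


# Only runs when a path is given, e.g.: python steptime.py rsl.out.0000
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        print(mean_step_time(handle))
```

Run it after a few time steps of each trial and keep the core count and decomposition that give the lowest average.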

ddale81
Posts: 11
Joined: Sun Feb 18, 2018 8:07 pm

Re: Benchmark Test case 27/04/2011 WRF-ARW

Post by ddale81 » Wed Nov 06, 2019 7:30 am

Tonight I will try changing the parameters.

Is REAL_NODECPUS always equal to WRFM_NODECPUS, or is it the value used to decompose the domain?

Example:

To set how much to decompose?

REAL_NODECPUS = ( local:N set to 3 ? )

To set the number of cores to use:

WRFM_NODECPUS = ( local:N set to 12 ? )

Is that correct? And then I watch the rsl.out.0000 file and adjust the values until I find the best setting for my hardware?


Thanks Ale

meteoadriatic
Posts: 1600
Joined: Wed Aug 19, 2009 10:05 am

Re: Benchmark Test case 27/04/2011 WRF-ARW

Post by meteoadriatic » Wed Nov 06, 2019 8:04 am

Those two lines control how many cores you will use. Below that section in the same file are the settings for domain decomposition.

ddale81
Posts: 11
Joined: Sun Feb 18, 2018 8:07 pm

Re: Benchmark Test case 27/04/2011 WRF-ARW

Post by ddale81 » Mon Nov 11, 2019 7:52 pm

I followed the advice and the binding suggestions from the forum, and I populated the fourth bank of RAM that was missing before. I would say I have arrived at a very good compromise. I am posting the two different tests.
Thanks for the information.


Before:

Code:

  *  Just because it's all about you:

              Basic System Information for ale
              
                  System Date           : Thu Nov  7 05:37:19 2019
                  System Hostname       : ale
                  System Address        : None available
              
                  System OS             : Linux
                  Linux Distribution    : Ubuntu 16.04.6 LTS
                  OS Kernel             : 4.15.0-66-generic
                  Kernel Type           : 0-66-generic
              
              Processor and Memory Information for ale
              
                  CPU Name              : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
                  CPU Microarchitecture : sandybridge
                  CPU Type              : x86_64
                  CPU Speed             : 2.594 MHz
              
                  UEMS Determined Processor Count
                      Sockets           : 2
                      Cores per Socket  : 8
                      Total Cores       : 16
              
                  Hyper-Threading       : Off   
              
                  System Memory         : 96 Gbytes
              
              UEMS Release Information for ale
              
                  UEMS Release          : 19.8.1
                  UEMS WRF Release      : 4.1.2
                  UEMS Binaries         : x64
               
              A summary of nodes and processors used for the benchmark simulation:
               
                12 Processors on ale      
              --------------------------------
               
                  *  12 Total Processors
                  *  4  Tiles per Processor
                  *  1 x 12 Domain Decomposition


           ☺  Benchmark simulation accomplished in 22 minutes 42 seconds

After:

Code:

  *  Just because it's all about you:

              Basic System Information for ale
              
                  System Date           : Mon Nov 11 19:49:31 2019
                  System Hostname       : ale
                  System Address        : None available
              
                  System OS             : Linux
                  Linux Distribution    : Ubuntu 16.04.6 LTS
                  OS Kernel             : 4.15.0-66-generic
                  Kernel Type           : 0-66-generic
              
              Processor and Memory Information for ale
              
                  CPU Name              : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
                  CPU Microarchitecture : sandybridge
                  CPU Type              : x86_64
                  CPU Speed             : 2.594 MHz
              
                  UEMS Determined Processor Count
                      Sockets           : 2
                      Cores per Socket  : 8
                      Total Cores       : 16
              
                  Hyper-Threading       : Off   
              
                  System Memory         : 128 Gbytes
              
              UEMS Release Information for ale
              
                  UEMS Release          : 19.8.1
                  UEMS WRF Release      : 4.1.2
                  UEMS Binaries         : x64
               
              A summary of nodes and processors used for the benchmark simulation:
               
                12 Processors on ale      
              --------------------------------
               
                  *  12 Total Processors
                  *  4  Tiles per Processor
                  *  1 x 12 Domain Decomposition


           ☺  Benchmark simulation accomplished in 14 minutes 51 seconds
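
For reference, the three wall-clock times reported in this thread can be compared with a few lines of Python. This is only arithmetic on the posted numbers; the labels are mine, and the runs are not strictly like-for-like, since the first was on host pv with 16 cores and 1 tile per processor while the later two were on host ale with 12 cores and 4 tiles per processor.

```python
def to_seconds(minutes, seconds):
    """Convert a 'M minutes S seconds' benchmark time to seconds."""
    return minutes * 60 + seconds


# Times copied from the benchmark summaries posted in this thread.
runs = {
    "16 cores, 96 GB (first post)": to_seconds(27, 22),
    "12 cores, 96 GB (before)": to_seconds(22, 42),
    "12 cores, 128 GB (after)": to_seconds(14, 51),
}

baseline = runs["16 cores, 96 GB (first post)"]

if __name__ == "__main__":
    for label, secs in runs.items():
        # Speedup is relative to the first 27-minute run.
        print(f"{label}: {secs} s, {baseline / secs:.2f}x vs first run")
```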
