Hydra

Questions and other topics related to UEMS 15.
Post by robncyns » Wed Jan 17, 2018 12:56 am

I have a fresh install of UEMS 15.x, and when I try to run a simulation over a Rocks cluster I get the following error in the run_wrfm.log file:

bash: /export/usr2/uems/util/mpich2/bin/hydra_pmi_proxy: No such file or directory

I checked the directory, and the hydra_pmi_proxy file is most certainly there.
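A check along these lines can confirm whether the binary is present and executable on a given host. This is only a minimal sketch: `check_proxy` is a hypothetical helper name, not part of UEMS or MPICH, and the path in the usage comment is the one from the error above. It only reports the view from the host it runs on, so it would need to be run on the compute node as well as on the head node.

```shell
#!/bin/bash
# Hypothetical helper (not part of UEMS or MPICH): report whether a file
# exists and is executable from the point of view of the current host.
check_proxy() {
    local path="$1"
    if [ ! -e "$path" ]; then
        echo "missing: $path"
        return 1
    elif [ ! -x "$path" ]; then
        echo "not executable: $path"
        return 2
    fi
    echo "ok: $path"
}

# Usage, with the path taken from the run_wrfm.log error:
#   check_proxy /export/usr2/uems/util/mpich2/bin/hydra_pmi_proxy
```

Comparing the result on the head node with the result on compute-0-0 would show whether both hosts see the file at the same path.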

When I terminate the simulation, the log file changes to this:

bash: /export/usr2/uems/util/mpich2/bin/hydra_pmi_proxy: No such file or directory
[mpiexec@rwmwx.com] Sending Ctrl-C to processes as requested
[mpiexec@rwmwx.com] Press Ctrl-C again to force abort
[mpiexec@rwmwx.com] HYDU_sock_write (utils/sock/sock.c:286): write error (Bad file descriptor)
[mpiexec@rwmwx.com] HYD_pmcd_pmiserv_send_signal (pm/pmiserv/pmiserv_cb.c:169): unable to write data to proxy
[mpiexec@rwmwx.com] ui_cmd_cb (pm/pmiserv/pmiserv_pmci.c:79): unable to send signal downstream
[mpiexec@rwmwx.com] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
[mpiexec@rwmwx.com] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:198): error waiting for event
[mpiexec@rwmwx.com] main (ui/mpich/mpiexec.c:344): process manager error waiting for completion

My /etc/hosts.local looks like this:

# Added by rocks report host #
# DO NOT MODIFY #
# Add any modifications to #
# /etc/hosts.local file #

127.0.0.1 localhost.localdomain localhost
localhost.localdomain localhost
10.1.1.254 compute-0-0.local compute-0-0
10.1.1.1 rwmwx.local rwmwx
192.168.108.240 rwmwx.com

Any thoughts?
