List Archive: gentoo-cluster
Headers:
To: gentoo-cluster@g.o
From: "Brian Budge" <brian.budge@...>
Subject: openib, no /dev/infiniband
Date: Wed, 2 Jan 2008 13:39:37 -0800
Hi all -

I'm new to InfiniBand and still getting my feet wet.  I am administering a very small cluster of 5 nodes, and have recently installed InfiniBand HCAs.  I have the InfiniBand modules built into the kernel, and I am using the openib-userspace package from the gentoo-science overlay.

The strange thing about my situation is that I have InfiniBand working with openmpi on 4 of my 5 nodes, but the 5th one is a mystery.

All 4 working nodes have a /dev/infiniband directory that looks roughly like this:

crw-rw---- 1 root root 231,  64 Dec 31 09:13 issm0
crw-rw-rw- 1 root root 231, 224 Dec 31 09:13 ucm0
crw-rw---- 1 root root 231,   0 Dec 31 09:13 umad0
crw-rw-rw- 1 root root 231, 192 Dec 31 09:13 uverbs0

The 5th node has no such directory, which could point to the problem.  It isn't the whole problem, though: I tried creating those device nodes by hand to match, and it didn't help.  I'm just not sure what the difference is, because I installed all the nodes the same way, they all have the same hardware, and they are all running the same kernel.

All 5 nodes have the same contents in the /sys/class/infiniband directory.

Here's the mpirun I am trying:

mpirun -np 2 -mca btl self,openib -machinefile burn_machine_file ./loadtest
[burn-3][0,1,1][btl_openib_component.c:437:init_one_hca] error obtaining device context for mthca0 errno says No such file or directory

--------------------------------------------------------------------------
WARNING: There were errors during IB HCA initialization on host 'burn-3'.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There is at least on IB HCA found on host 'burn-3', but there is
no active ports detected. This is most certainly not what you wanted.
Check your cables and SM configuration.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------

Any help would be appreciated!  Thanks.

  Brian
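The comparison between the working nodes and the failing one can be scripted. The following is a minimal Python sketch, not part of the original message. It assumes the device names and major/minor numbers shown in the listing from the working nodes; those numbers may differ on other kernels or driver versions. The names `EXPECTED_NODES` and `check_device_nodes` are hypothetical. The script only reports what is missing or mismatched under /dev/infiniband. It does not explain why the nodes were not created, and, as noted in the message, creating them by hand did not resolve the failure.

```python
import os
import stat

# Expected character devices, copied from the listing on the working nodes.
# Assumption: the same name -> (major, minor) mapping applies to every node.
EXPECTED_NODES = {
    "issm0": (231, 64),
    "ucm0": (231, 224),
    "umad0": (231, 0),
    "uverbs0": (231, 192),
}


def check_device_nodes(dev_dir="/dev/infiniband", expected=EXPECTED_NODES):
    """Return a list of problem descriptions; an empty list means all nodes match."""
    problems = []
    for name, (major, minor) in sorted(expected.items()):
        path = os.path.join(dev_dir, name)
        try:
            st = os.stat(path)
        except OSError as exc:
            # Covers a missing directory as well as a missing node.
            problems.append(f"{path}: missing ({exc.strerror})")
            continue
        if not stat.S_ISCHR(st.st_mode):
            problems.append(f"{path}: not a character device")
            continue
        found = (os.major(st.st_rdev), os.minor(st.st_rdev))
        if found != (major, minor):
            problems.append(
                f"{path}: device numbers {found}, expected {(major, minor)}"
            )
    return problems


if __name__ == "__main__":
    for line in check_device_nodes() or ["all expected device nodes present"]:
        print(line)
```

Running this on each of the 5 nodes and comparing the output shows whether the failing node differs only in the missing device nodes or also in their device numbers.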
Replies:
Re: openib, no /dev/infiniband
-- Bryan Green
Updated Jun 17, 2009
