Gentoo Archives: gentoo-cluster

From: Brian Budge <brian.budge@×××××.com>
To: gentoo-cluster@l.g.o
Subject: Re: [gentoo-cluster] openib, no /dev/infiniband
Date: Thu, 03 Jan 2008 00:07:43
Message-Id: 5b7094580801021606y6a65804ck115731926f2ba0a8@mail.gmail.com
In Reply to: Re: [gentoo-cluster] openib, no /dev/infiniband by Bryan Green
Hi Bryan -

I don't seem to have a 40-ib.rules file in /etc/udev/rules.d on any node.

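Since that file is absent everywhere, yet four nodes still get /dev/infiniband, it may be worth checking whether some other rules file creates those nodes. What follows is only a rough sketch: find_ib_rules is a made-up helper name, the match on "infiniband", "umad" and "uverbs" is a guess at what such rules would mention, and /lib/udev/rules.d is assumed as a second location that some udev versions use.

```shell
#!/bin/sh
# Hypothetical helper: print every udev rules file that appears to
# mention the InfiniBand userspace devices.
find_ib_rules() {
    # With no arguments, fall back to two conventional rule directories
    # (the second one is an assumption and may not exist on every system).
    [ $# -gt 0 ] || set -- /etc/udev/rules.d /lib/udev/rules.d
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        # -l prints only the names of matching files; errors from an
        # unmatched glob are discarded.
        grep -l -i -E 'infiniband|umad|uverbs' "$dir"/*.rules 2>/dev/null
    done
    return 0
}
```

Running `find_ib_rules` on a working node and on the broken one, then comparing the two lists, would show whether the nodes really carry the same rules.
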
My /sys/class/infiniband directory contains mthca0, which contains:

> ls -la /sys/class/infiniband/mthca0/
total 0
drwxr-xr-x 3 root root 0 Jan 2 20:54 .
drwxr-xr-x 3 root root 0 Jan 2 20:54 ..
-r--r--r-- 1 root root 4096 Jan 2 21:07 board_id
lrwxrwxrwx 1 root root 0 Jan 3 00:01 device -> ../../../devices/pci0000:20/0000:20:0a.0/0000:21:00.0
-r--r--r-- 1 root root 4096 Jan 2 21:07 fw_ver
-r--r--r-- 1 root root 4096 Jan 2 21:07 hca_type
-r--r--r-- 1 root root 4096 Jan 2 21:07 hw_rev
-rw-r--r-- 1 root root 4096 Jan 2 21:07 node_desc
-r--r--r-- 1 root root 4096 Jan 2 21:07 node_guid
-r--r--r-- 1 root root 4096 Jan 2 21:06 node_type
drwxr-xr-x 3 root root 0 Jan 2 21:07 ports
lrwxrwxrwx 1 root root 0 Jan 3 00:01 subsystem -> ../../../class/infiniband
-r--r--r-- 1 root root 4096 Jan 2 21:07 sys_image_guid
--w------- 1 root root 4096 Jan 2 20:54 uevent

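Seeing mthca0 here suggests the driver has bound to the HCA, so a possible next check is whether the kernel also advertises the userspace character devices and whether /dev/infiniband has matching nodes. This is a minimal sketch under assumptions: check_ib_nodes is a hypothetical helper, and the class directory names infiniband_verbs, infiniband_mad and infiniband_cm are assumed to be where uverbs, umad/issm and ucm show up. None of that is confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: list the character devices advertised under the
# (assumed) InfiniBand sysfs classes and report whether a node of the
# same name exists in the device directory.
# Both arguments default to the real locations; they are parameters only
# so the function can be pointed at a test tree.
check_ib_nodes() {
    sys_root=${1:-/sys/class}
    dev_dir=${2:-/dev/infiniband}
    for class in infiniband_verbs infiniband_mad infiniband_cm; do
        for entry in "$sys_root/$class"/*; do
            # Skip unmatched globs and entries without a dev attribute.
            [ -f "$entry/dev" ] || continue
            name=${entry##*/}
            majmin=$(cat "$entry/dev")   # "major:minor" as reported by sysfs
            # Only checks that a char device of that name exists; it does
            # not compare the node's own major/minor numbers.
            if [ -c "$dev_dir/$name" ]; then
                echo "ok      $name ($majmin)"
            else
                echo "missing $name ($majmin)"
            fi
        done
    done
}
```

Run with no arguments on a node, each output line reports `ok` or `missing` for one device, with the major:minor pair read from sysfs. A device that never appears in the output at all would suggest looking at the kernel side rather than at udev.
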
I don't have any ib modules loaded on any node; everything is built into
the kernel:

CONFIG_INFINIBAND=y
CONFIG_INFINIBAND_USER_MAD=y
CONFIG_INFINIBAND_USER_ACCESS=y
CONFIG_INFINIBAND_USER_MEM=y
CONFIG_INFINIBAND_ADDR_TRANS=y
CONFIG_INFINIBAND_MTHCA=y
CONFIG_INFINIBAND_MTHCA_DEBUG=y
# CONFIG_INFINIBAND_IPATH is not set
CONFIG_INFINIBAND_AMSO1100=y
# CONFIG_INFINIBAND_AMSO1100_DEBUG is not set
CONFIG_MLX4_INFINIBAND=y
CONFIG_INFINIBAND_IPOIB=y
# CONFIG_INFINIBAND_IPOIB_CM is not set
CONFIG_INFINIBAND_IPOIB_DEBUG=y
# CONFIG_INFINIBAND_IPOIB_DEBUG_DATA is not set
# CONFIG_INFINIBAND_SRP is not set
# CONFIG_INFINIBAND_ISER is not set

Thanks,
Brian

On Jan 2, 2008 2:11 PM, Bryan Green <bryan.d.green@××××.gov> wrote:

> "Brian Budge" writes:
> >
> > Hi all -
> >
> > I'm new to infiniband and still getting my feet wet. I am administering a
> > very small cluster of 5 nodes, and have recently installed infiniband HCAs.
> > I have the infiniband modules built into the kernel, and I am using the
> > openib-userspace package in the gentoo-science overlay.
> >
> > The strange thing with my situation is that I have infiniband working with
> > openmpi on 4 of my 5 nodes, but the 5th one is a mystery.
> >
> > All 4 working nodes have a /dev/infiniband directory that looks roughly
> > like this:
> >
> > crw-rw---- 1 root root 231, 64 Dec 31 09:13 issm0
> > crw-rw-rw- 1 root root 231, 224 Dec 31 09:13 ucm0
> > crw-rw---- 1 root root 231, 0 Dec 31 09:13 umad0
> > crw-rw-rw- 1 root root 231, 192 Dec 31 09:13 uverbs0
> >
> > But the 5th node doesn't, which could indicate the problem (it isn't
> > completely the problem, as I tried making those nodes myself to match, but
> > it didn't help). I'm just not sure what the difference is, because I
> > installed them all the same way, they all have the same hardware, and they
> > are all running the same kernel.
>
> The '/dev/infiniband' subdir is created by the udev rules in
> '/etc/udev/rules.d/40-ib.rules'
>
> Does the '/sys/class/infiniband' directory exist?
> If so, what does it contain? What loaded modules with an 'ib_' prefix does
> lsmod report?
>
> -bryan
>
> --
> gentoo-cluster@g.o mailing list