<html>
<head>
<title>ZFS on Linux</title>
<meta name="keyword" content="zfs, linux"/>
<meta name="description" content="Single node Lustre configuration." />
<meta name="robots" content="all" />
</head>
<body>
<center>
<a href="index.html"><img title="Native ZFS on Linux" alt="Native ZFS on Linux" src="images/zfs-linux.png"></a>
<table width=80%>
<tr>
<td>
<p>To get familiar with configuring a zfs-based Lustre filesystem it is
helpful to set up a small single-node filesystem. The configuration below
is designed to illustrate several of the available options. If you're not
already familiar with configuring a traditional ldiskfs-based Lustre
filesystem you should first review the
<a href="http://wiki.lustre.org/index.php/Lustre_Documentation">
Lustre documentation</a>.</p>
<p>The filesystem will be named <em>lustre</em> and it will consist of:</p>
<ul>
<li>1 MGS: 128 MiB zfs file-backed 2-vdev mirror array</li>
<li>1 MDT: 128 MiB zfs file-backed 2-vdev mirror array</li>
<li>2 OSTs: 1 GiB zfs file-backed 3-vdev raidz2 array</li>
</ul>
<p>The first thing you should do is ensure SELinux is disabled.
The filesystem types <em>ldiskfs</em>, <em>zfs</em>, and <em>lustre</em>
are not part of the default SELinux policy. This can cause mount
failures because SELinux is unaware that these filesystems support xattrs.
To disable SELinux edit <em>/etc/selinux/config</em> and change the
SELINUX line to SELINUX=disabled. Then reboot your system to remove
the existing policy.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
<b>SELINUX=disabled</b>
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
</pre></td></tr></table>
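<p>After the reboot you can confirm that no policy is loaded. For example,
the <em>getenforce</em> utility from the SELinux tools reports the current
mode; the output below is only an illustration of what you should see when
SELinux has been disabled.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ getenforce
Disabled
</pre></td></tr></table>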
<p>Next configure the required Lustre services. This process will be
slightly different from that described in the official
<a href="http://wiki.lustre.org/index.php/Lustre_Documentation">
Lustre documentation</a>. Aside from the zfs-specific changes, these
packages contain Livermore's init scripts, which provide
a traditional interface for starting and stopping services. Additionally,
they are integrated with the Linux
<a href="http://www.linux-ha.org/wiki/Heartbeat">Heartbeat package</a>
to provide automatic Lustre failover. While you do not have to use
the init scripts, this example does.</p>
<p>Create a 128 MiB MGS, a 128 MiB MDT, and two 1 GiB OSTs. To specify
zfs-based services use the <em>--backfstype=zfs</em> option to override
the ldiskfs default. When testing you can also include the
<em>--vdev-size</em> option to automatically create sparse files for the
vdevs. This avoids the need to set up traditional loopback devices.
For MDT and OST services you will also need to explicitly set their
service index and the NID of the MGS host. Finally, you must list
the dataset name and the desired zfs pool configuration.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ sudo mkfs.lustre --mgs --backfstype=zfs --vdev-size 131072 \
lustre-mgs/mgs mirror /tmp/lustre-mgs-disk0 /tmp/lustre-mgs-disk1
$ sudo mkfs.lustre --mdt --backfstype=zfs --vdev-size 131072 --index=0 --mgsnode=192.168.1.100@tcp \
lustre-mdt0/mdt0 mirror /tmp/lustre-mdt0-disk0 /tmp/lustre-mdt0-disk1
$ sudo mkfs.lustre --ost --backfstype=zfs --vdev-size 1048576 --index=0 --mgsnode=192.168.1.100@tcp \
lustre-ost0/ost0 raidz2 /tmp/lustre-ost0-disk0 /tmp/lustre-ost0-disk1 /tmp/lustre-ost0-disk2
$ sudo mkfs.lustre --ost --backfstype=zfs --vdev-size 1048576 --index=1 --mgsnode=192.168.1.100@tcp \
lustre-ost1/ost1 raidz2 /tmp/lustre-ost1-disk0 /tmp/lustre-ost1-disk1 /tmp/lustre-ost1-disk2
$ sudo zfs list
NAME USED AVAIL REFER MOUNTPOINT
lustre-mdt0 128K 90.9M 30K /lustre-mdt0
lustre-mdt0/mdt0 20K 90.9M 20K /lustre-mdt0/mdt0
lustre-mgs 128K 90.9M 30K /lustre-mgs
lustre-mgs/mgs 20K 90.9M 20K /lustre-mgs/mgs
lustre-ost0 127K 90.9M 29K /lustre-ost0
lustre-ost0/ost0 20K 90.9M 20K /lustre-ost0/ost0
lustre-ost1 130K 158M 30K /lustre-ost1
lustre-ost1/ost1 20K 158M 20K /lustre-ost1/ost1
</pre></td></tr></table>
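<p>Because <em>mkfs.lustre</em> created ordinary zfs pools, the standard
zfs tools can be used to inspect them. For example, <em>zpool status</em>
shows the vdev layout of one of the OST pools; the output below is
representative and may differ slightly on your system.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ sudo zpool status lustre-ost0
  pool: lustre-ost0
 state: ONLINE
config:

        NAME                        STATE     READ WRITE CKSUM
        lustre-ost0                 ONLINE       0     0     0
          raidz2-0                  ONLINE       0     0     0
            /tmp/lustre-ost0-disk0  ONLINE       0     0     0
            /tmp/lustre-ost0-disk1  ONLINE       0     0     0
            /tmp/lustre-ost0-disk2  ONLINE       0     0     0

errors: No known data errors
</pre></td></tr></table>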
<p>You may have noticed that we created a separate zfs pool and dataset
for each of our services. While all the services could be created in a
single zfs pool, this would result in the available free space being
over-reported. Just like normal zfs filesystems, each dataset may
use any available free space in the pool.</p>
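<p>To see why, consider a quick illustration with a throwaway pool. The
pool and dataset names here are hypothetical and the sizes shown are only
illustrative: every dataset in a pool reports the same pool-wide free
space, so summing per-service free space would count it multiple times.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ truncate -s 128M /tmp/demo-disk0 /tmp/demo-disk1
$ sudo zpool create demo mirror /tmp/demo-disk0 /tmp/demo-disk1
$ sudo zfs create demo/ost0
$ sudo zfs create demo/ost1
$ zfs list -o name,avail demo/ost0 demo/ost1
NAME        AVAIL
demo/ost0   95.5M
demo/ost1   95.5M
$ sudo zpool destroy demo
</pre></td></tr></table>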
<p>Now create a file called <em>/etc/ldev.conf</em>. It is used by
the init scripts and the failover infrastructure to control the Lustre
services. At a minimum this file must contain the hostname where the service
should run, the service label, and the ldiskfs block device or zfs dataset
name. Make sure you update this file with <em>your</em> hostname.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ cat /etc/ldev.conf
# example /etc/ldev.conf
#
# local foreign/- label [md|zfs:]device-path [journal-path]
#
toss-2-0-amd64 - mgs zfs:lustre-mgs/mgs
toss-2-0-amd64 - mdt0 zfs:lustre-mdt0/mdt0
toss-2-0-amd64 - ost0 zfs:lustre-ost0/ost0
toss-2-0-amd64 - ost1 zfs:lustre-ost1/ost1
</pre></td></tr></table>
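<p>If you are unsure which values to use, the local hostname can be taken
from the <em>hostname</em> command, and once the lustre modules are loaded
the local NIDs can be listed with <em>lctl list_nids</em>. The values below
simply repeat the hostname and NID used throughout this example.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ hostname
toss-2-0-amd64
$ sudo lctl list_nids
192.168.1.100@tcp
</pre></td></tr></table>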
<p>Everything is now in place to start the Lustre services. You should
start the MGS, then the MDT, and lastly the OSTs. Once everything is
running the Lustre filesystem can be mounted. A word of caution before
starting the services: the Lustre Orion code still requires some polish
and you may encounter a few bugs. It is not currently considered ready
for real production usage.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ sudo service lustre start mgs
Mounting lustre-mgs/mgs on /mnt/lustre/local/mgs
$ sudo service lustre start mdt0
Mounting lustre-mdt0/mdt0 on /mnt/lustre/local/mdt0
$ sudo service lustre start ost0
Mounting lustre-ost0/ost0 on /mnt/lustre/local/ost0
$ sudo service lustre start ost1
Mounting lustre-ost1/ost1 on /mnt/lustre/local/ost1
$ sudo mkdir -p /mnt/lustre/client
$ sudo mount -t lustre 192.168.1.100:/lustre /mnt/lustre/client
$ df -h /mnt/lustre/client
Filesystem Size Used Avail Use% Mounted on
192.168.1.100:/lustre
1.9G 75M 1.8G 5% /mnt/lustre/client
</pre></td></tr></table>
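<p>When you are done, unmount the client and stop the services in the
reverse order. Assuming the init scripts accept <em>stop</em> with a
service label in the same way they accept <em>start</em>, a typical
teardown might look like the following sketch.</p>
<table bgcolor="eeeeee" width=100%><tr><td><pre>
$ sudo umount /mnt/lustre/client
$ sudo service lustre stop ost1
$ sudo service lustre stop ost0
$ sudo service lustre stop mdt0
$ sudo service lustre stop mgs
</pre></td></tr></table>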
</td>
</tr>
</table>
</center>
</body>
</html>