Local and Global Storage
Global storage is that storage available to all nodes. We will use NFS to export (share) the /home directory and the /share directory on the master node to all compute nodes. The /home directory is where user files will be stored, and /share will hold common files, such as applications.
We define local storage as storage that is accessible only on a single node (the master node or a particular compute node) and is therefore only usable by processes on that node.
Global storage will be located on the master node, and the master node will function as an NFS server. First install the nfs-kernel-server package:
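For example, on a Debian/Ubuntu-based master node:

    sudo apt update
    sudo apt install nfs-kernel-server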
If successful, the NFS server should be installed and its supporting files created. In the /etc directory there will be a file called 'exports', which is used to configure access control for any directories that are to be exported.
Two lines need to be added to /etc/exports: one to export the users' home directories under /home, the other to export the /share directory for shared applications.
You should end up with a file that looks something like this:
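Something like the following sketch; the 10.0.0.0/24 subnet is only a placeholder for whatever address range your cluster network actually uses, and the export options shown are common defaults rather than requirements:

    # /etc/exports: access control list for NFS exports (see exports(5))
    /home    10.0.0.0/24(rw,sync,no_subtree_check)
    /share   10.0.0.0/24(rw,sync,no_subtree_check)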
Next, the /share directory needs to be created if it doesn't already exist, and the drives can be exported with the exportfs command. 'exportfs' with no command line options will list exported drives for confirmation.
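A minimal sketch of those steps on the master node:

    sudo mkdir -p /share
    sudo exportfs -ra    # (re-)export everything listed in /etc/exports
    sudo exportfs        # with no options, lists the current exports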
These steps are best done after pdsh is installed. See the pdsh section.
The compute nodes need to mount the exported directories when they boot. This can be done manually using the 'mount' command, at boot time by putting the mount into /etc/fstab, or via an automount mechanism where the directories are mounted on demand. For the sake of this exercise, we will put /share into fstab and configure users' home directories as automounts.
First, the mountpoints need to be created. /home should already exist on all nodes, but we need to create /share:
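For example, on a single compute node:

    sudo mkdir /share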
or
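on all compute nodes at once, once pdsh is configured (a sketch; the node list nano[01-08] is only an example, and this assumes pdsh is run as root or with passwordless sudo):

    pdsh -w nano[01-08] sudo mkdir /share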
On one of the compute nodes, you can quickly test if the export from the master node works by mounting the drive manually:
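For example, assuming the master node is reachable by the hostname 'master' (substitute your master node's hostname or IP address):

    sudo mount -t nfs master:/share /share
    # if the mount fails, check that the NFS client utilities (nfs-common)
    # are installed on the compute node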
The above should mount the master node's /share directory on the compute node. You can confirm a successful mount by running the 'mount' command with no arguments; the mounted drive should also show up in the output of 'df'.
You can then unmount the drive with the 'umount' command:
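For example:

    sudo umount /share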
For any drives that we want mounted at boot, we can add them to /etc/fstab:
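A sketch of the relevant entries, again assuming the master node's hostname is 'master' (the commented-out /home line is discussed below):

    master:/share   /share   nfs   defaults   0   0
    #master:/home   /home    nfs   defaults   0   0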
The 'mount -a' command will mount all drives in /etc/fstab, and can be used to confirm that the fstab file is set correctly. Once 'mount -a' seems to work, the node should be rebooted ('reboot') to confirm that the drive comes up on boot.
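For example:

    sudo mount -a    # mount everything listed in /etc/fstab
    df -h /share     # the NFS mount should appear here
    sudo reboot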
Once this is working on one node, you should be able to use pdsh or a similar tool to set it up on all compute nodes.
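One possible sketch, appending the fstab entry and mounting it on the remaining nodes (the node list and the 'master' hostname are examples; this assumes passwordless sudo):

    pdsh -w nano[02-08] "echo 'master:/share /share nfs defaults 0 0' | sudo tee -a /etc/fstab"
    pdsh -w nano[02-08] "sudo mkdir -p /share && sudo mount -a"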
Note that /home could also be mounted via /etc/fstab by uncommenting the commented-out line shown above. As a teaching/learning exercise, we will instead mount it using the automounter in the next section.
The automounter will mount directories on demand, whenever a user tries to access them, and will unmount them once they have been unused for a set period of time. The first step is to install the 'autofs' package on all the compute nodes:
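For example, either per node or from the master via pdsh (the node list is an example, and the pdsh variant assumes passwordless sudo):

    # on each compute node:
    sudo apt install autofs
    # or from the master node:
    pdsh -w nano[01-08] sudo apt install -y autofs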
This method for installing autofs assumes the compute nodes have network access to the outside world. See the IP forwarding section.
The autofs package will install a number of configuration files under /etc. We need to edit the /etc/auto.master configuration file to set up the mount point and reference the /etc/auto.home map file. One way to do this is to configure and test it on one of the compute nodes, say nano01, and then copy the configuration files to the other compute nodes.
On nano01, /etc/auto.master should include something like the following:
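A minimal example entry; the 60-second timeout is only an illustrative value:

    /home   /etc/auto.home   --timeout=60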
This file defines the mount point (/home), the map (/etc/auto.home) and other options (--timeout) for automount directories.
The /etc/auto.home file contains the map itself. Wildcards are permitted; in this case we will map each directory under /home on the local node to the master node's /home directory of the same name. It is also possible to be more explicit, with each user's home directory specified on its own line, replacing both the * and the & with the username.
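A sketch of the wildcard map, again assuming the master node's hostname is 'master':

    *   -fstype=nfs,rw   master:/home/&

After editing the files, restart the automounter so the changes take effect:

    sudo systemctl restart autofs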
Once this is tested and working on one node, pdcp or a similar tool can be used to copy the /etc/auto.master and /etc/auto.home files to the remaining compute nodes.
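For example (the node list is illustrative, and copying directly into /etc assumes pdcp is run with root privileges):

    pdcp -w nano[02-08] /etc/auto.master /etc/auto.home /etc/
    pdsh -w nano[02-08] sudo systemctl restart autofs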
Setting up local scratch storage is optional. This is often done if local storage offers better performance than network storage. Applications that benefit from the increased performance can write to local storage during a job run, and then copy results out to global storage after the run is finished.
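As one possible sketch, a world-writable scratch directory with the sticky bit set (like /tmp) could be created on each compute node; the path /scratch and the node list are only examples:

    pdsh -w nano[01-08] "sudo mkdir -p /scratch && sudo chmod 1777 /scratch"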