Node Management

The Operations staff at PlanetLab Central remotely maintains and administers your nodes. There is usually no need for you to become involved in day-to-day maintenance, although you may occasionally be asked to reboot or repair your nodes.

Because your site owns the nodes and their connectivity, however, you can exert some control over their operation. In particular, you may restrict the outbound bandwidth of your nodes, log in to a restricted administrative account on the nodes, and switch the nodes into an administrative debug state.

Setting bandwidth limits

By default, your PlanetLab nodes are not bandwidth limited. You may update the outbound bandwidth limits of each of your nodes by clicking the update link under the node networks heading of the node page on the website. The bandwidth limits cap the average transmission rate of the nodes.
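
Because the limit caps the average rate, it also bounds, roughly, how much data a node can send over a long period. The shell arithmetic below works through one example. The 10 Mbit/s figure is purely an illustrative assumption, not a PlanetLab default or recommendation.

```shell
# Hypothetical cap of 10 Mbit/s (decimal units); not a PlanetLab default.
cap_mbps=10
# bits/s -> bytes/s, times seconds per day, expressed in GB (10^9 bytes).
max_gb_per_day=$(( cap_mbps * 1000000 / 8 * 86400 / 1000000000 ))
echo "${max_gb_per_day} GB/day"   # prints "108 GB/day"
```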

Logging in as site_admin

You may log in to your own nodes remotely via SSH as the user site_admin. Before attempting to do so, ensure that you have already uploaded your public SSH key to the website and waited at least an hour for the key to propagate to your nodes. See the User's Guide for more information about creating and uploading an SSH key. To log in to one of your nodes, specify site_admin as the login name, the path to your private key file (e.g. ~/.ssh/identity), and the node to log in to (e.g. planetlab-1.cs.princeton.edu):

ssh -l site_admin -i ~/.ssh/identity planetlab-1.cs.princeton.edu
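
If you administer several nodes, the same invocation can be wrapped in a small helper. The following is only a sketch and not part of the PlanetLab tooling: the function name site_admin_ssh and the KEY and DRY_RUN variables are hypothetical names introduced for illustration.

```shell
# Path to the private key; defaults to the example path used above.
KEY="${KEY:-$HOME/.ssh/identity}"

# site_admin_ssh NODE [COMMAND...]
# Logs in to NODE as site_admin, optionally running COMMAND there.
# With DRY_RUN=1 the ssh command is printed instead of executed.
site_admin_ssh() {
    node="$1"; shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ssh -l site_admin -i $KEY $node $*"
    else
        ssh -l site_admin -i "$KEY" "$node" "$@"
    fi
}
```

Calling site_admin_ssh planetlab-1.cs.princeton.edu with no further arguments opens an interactive session, exactly as the command above does. If you pass a sudo command and sudo on the node insists on a terminal, adding ssh's -t option to the helper may be necessary; that is an assumption about the node's configuration.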

The site_admin account is not isolated in a VServer as all other accounts are. The site_admin user may execute a variety of administrative programs in /usr/local/planetlab/bin via sudo:

# See all processes (including those running in VServers)
sudo /usr/sbin/vps ax
# See /var/log/messages
# (Not currently supported. Contact support@planet-lab.org if you need this functionality.)
# See all network traffic
sudo /usr/sbin/tcpdump
# Shutdown (halt) the node in 5 minutes
sudo /sbin/shutdown -h 5 "Scheduled maintenance, please save your work and logout"
# Set my password for console login
sudo /usr/bin/passwd site_admin
# Disable my password for console login
sudo /usr/bin/passwd -d site_admin

The last two commands may be especially useful if your nodes are connected to monitors and keyboards, are located in a secured environment, and you wish to be able to log in to them on their consoles. To log in to a node on the console, specify site_admin as the user and the password you set with /usr/bin/passwd as in the example above.

Stopping or rebooting nodes

Before powering off or rebooting a node for scheduled maintenance, please try to notify all users of the node first by e-mailing the PlanetLab Users mailing list. To power off a node, log in to the node as the user site_admin and execute:

sudo /sbin/shutdown -h 5 "Scheduled maintenance, please save your work and logout"

To reboot a node, execute:

sudo /sbin/shutdown -r 5 "Scheduled maintenance, please save your work and logout"

Wait for the node to power off or reboot by itself. Allow at least 15 minutes before deciding to power off or reboot the node manually.
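
The waiting period can be scripted from another machine. The sketch below is a generic retry loop, not a PlanetLab tool: wait_until is a hypothetical helper that reruns a command until it succeeds or a deadline passes. Using ping as the probe is an assumption; reachability is only a rough proxy for the node's state, and the node stays reachable for the first five minutes because of the delay passed to shutdown.

```shell
# wait_until TIMEOUT_S INTERVAL_S COMMAND...
# Reruns COMMAND every INTERVAL_S seconds until it succeeds (returns 0)
# or roughly TIMEOUT_S seconds of sleeping have elapsed (returns 1).
# Time spent inside COMMAND itself is not counted.
wait_until() {
    timeout_s="$1"; interval_s="$2"; shift 2
    waited=0
    while :; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        if [ "$waited" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep "$interval_s"
        waited=$(( waited + interval_s ))
    done
}

# Example (not run here): after the node has gone down for a reboot,
# give it up to 15 minutes to answer again, probing every 30 seconds.
# wait_until 900 30 ping -c 1 planetlab-1.cs.princeton.edu
```

If the call returns non-zero, the node has not answered within the allowed time and manual intervention may be warranted.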

As a last resort, if you cannot log in to the node via SSH or the console, simply power off (or power cycle) the machine. If you do not have physical access to the nodes but have connected them to a remotely accessible power control unit (PCU), follow the instructions that came with your PCU for removing or cycling power to the outlets.

Boot States

After selecting an individual node from your site, you will be able to see what state PLC currently believes the node to be in. There are several recognized boot states:

  • Boot - The node is able to contact PLC, and should allow the creation of slices and for users to access these slices.
  • Safeboot/Diagnose - The node is running, but slice users are not permitted access. Only PlanetLab staff or site administrators can access the node. This is also called "Safe" mode. It is used for emergency situations, described below.
  • Failboot/Debug - This mode is entered if the machine has some problem reaching 'Boot' mode, for instance due to network failure, hardware failure, or an integrity problem with the installed software.
  • Disabled - A disabled node will remain in a 'safe' mode but will also be exempt from automated actions. This state is appropriate when scheduled downtime is needed or when replacing hardware, for instance a bad disk.
  • Install - When a machine is initially added to the PLC database, it is set to install mode. When booting, the machine will prompt the user to confirm that it is OK to wipe out the current contents of the disk.
  • Reinstall - Like Install but no confirmation is requested.
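
Several of the states above go by two names. If you track node states in your own scripts, a small lookup can fold the aliases together. This is a sketch only: canonical_boot_state is a hypothetical helper, and the lowercase labels it prints are this sketch's own choice rather than anything the website is known to display.

```shell
# canonical_boot_state NAME
# Maps any of the state names listed above (case-insensitive) to one
# lowercase label per state; returns 1 for an unrecognized name.
canonical_boot_state() {
    state=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$state" in
        boot)               echo boot ;;
        safeboot|diagnose)  echo safeboot ;;
        failboot|debug)     echo failboot ;;
        disabled)           echo disabled ;;
        install)            echo install ;;
        reinstall)          echo reinstall ;;
        *)                  echo "unknown boot state: $1" >&2; return 1 ;;
    esac
}
```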

Safeboot/Diagnose Mode

In an emergency situation, such as hardware failure or a verified security compromise, you may reboot your nodes into a secure administrative "diagnosis" state. Slices will not run in this state, and only staff at PlanetLab Operations will be able to access the nodes. Unless a hard drive has failed, slice data (or forensic evidence of compromise) will not be lost.

To place a node into safeboot or diagnose mode, visit the node's details page on the website and select Diagnose (or Safeboot) as its Boot State. Then log in to the node as the user site_admin and execute:

sudo /sbin/shutdown -r 5 "Emergency, please save your work and logout"

If you cannot log in to the node via SSH or the console, you may send the node a specially formed ICMP packet, called the PlanetLab Ping Of Death (POD), that will immediately reboot it. Click Stop to attempt this action. You may only Stop nodes that are in Boot state, so if you placed the node into Diagnose state, put it back into Boot state before clicking Stop.