Replacing a Storage Node Chassis FRU
Replacing the Hardware - Storage Node Chassis
This section describes how to replace a Lattus Storage Node chassis. The intent is to retain current hard disk drives and avoid having to rebuild an entire node. No decommissioning or repairs are required when replacing a chassis.
Replacing the Hardware
Prerequisites:
• An empty Storage node chassis
• Small-tip Phillips screwdriver
• Crash cart (VGA monitor and a USB keyboard) to be provided by the site.
Follow these steps to replace the storage node chassis:
1 - Identify the storage node to be worked on, using the location LED provided by the CMC GUI if needed.
2 - Shut down the storage node through the CMC GUI.
3 - Detach the safety on the power cables and unplug the power cables. If necessary, mark them.
4 - Unplug the network cables and mark them, if necessary.
5 - Safely unmount the node from the rack as previously described, and place it on a table.
Caution: Use two people to safely mount the node in the rack, or to unmount it.
IMPORTANT! - After you pull the node past the pull-safety, don't leave the node in the rack. Otherwise, the rack rails might break.
6 - Place the empty chassis next to the old one.
7 - Unscrew the 4 screws from the top plate and carefully slide it off.
8 - Unscrew the disks one at a time from the node and slide them out, using the handle.
9 - Carefully place the disks in the empty chassis, each in the same slot it occupied in the old chassis.
IMPORTANT! - Be sure to put the securing screw for each disk back in place.
10 - After all disks have been installed into the chassis, slide the top plate on the node and secure it with the 4 previously removed screws.
11 - Use two people to carefully mount the new storage node into the rack, in the same place as the old one.
For more information about mounting the storage node, refer to the instructions in the Lattus Installation Guide.
12 - Reconnect the power cables and their safety, BUT DO NOT plug in the ethernet cables at this time!
13 - Connect a monitor and keyboard to the node, then enter its BIOS: IMMEDIATELY after powering on the node,
press the "DEL" (or "Delete") key repeatedly for about 40 seconds, with about half a second between presses.
Note! - You may not see the American Megatrends logo after power-on,
but you must still press the DEL key repeatedly for about 40 seconds, with about half a second between presses.
Note! - It will take quite a few minutes for the BIOS home screen to appear.
14 - From the main BIOS screen use the right arrow to get to the "Boot" menu. From there:
a - Use the down arrow key to scroll down in its menu selections until the "Hard Drive BBS Priorities" item is selected and hit enter.
b - From the list of disks in the storage node that will be presented, highlight "Boot Option #1" and press enter.
c - From the list of detected disks, select the "P" disk with the lowest index number and press Enter.
For example, if Boot Option #1 currently shows anything other than P0 or P1, select whichever of those two appears in the list with the lowest index.
The list will refresh and show the disk you selected.
Explanation: The two disks that begin with the letter P and have the lowest numerical index are the two mirrored Operating System disks.
In most cases, this will be "P0" and "P1". In rare cases, the disks will be "P1" and "P2".
d - From the list of disks in the storage node that are presented, highlight "Boot Option #2" and press enter.
e - Select the next P# in line for this disk. In other words, if the Boot Option #1 you selected previously was "P0", select "P1" for this boot disk.
If the Boot Option #1 you selected previously was "P1", select "P2" for this boot disk.
15 - Hit the F10 key and select "Yes" to save.
16 - Reconnect the network cables. Disconnect the monitor and keyboard.
17 - Push the power button to boot the node. After the node has been successfully booted, it will appear in the uninitialized devices list in the CMC.
(To view uninitialized devices in the CMC, navigate to Dashboard > Administration > Hardware > Servers > Unmanaged Devices > Uninitialized.)
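The disk selection rule in step 14 (use the two lowest-indexed "P" disks as Boot Option #1 and Boot Option #2) can be sketched in a few lines of Python. This is only an illustration of the rule, not part of the procedure. The function name pick_boot_disks and the label format ("P" plus an index, optionally followed by a description) are assumptions; the BIOS may display the entries differently.

```python
import re


def pick_boot_disks(labels):
    """Return the two lowest-indexed "P" disks as (boot_option_1, boot_option_2).

    `labels` is a list of strings as they might appear in the BIOS disk list,
    for example "P1: <disk model>". The label format is an assumption.
    """
    indexes = []
    for label in labels:
        match = re.match(r"^P(\d+)\b", label.strip())
        if match:
            indexes.append(int(match.group(1)))
    indexes.sort()
    if len(indexes) < 2:
        raise ValueError("expected at least two P disks, found %d" % len(indexes))
    # The two lowest indexes are treated as the mirrored operating system disks.
    return "P%d" % indexes[0], "P%d" % indexes[1]


# Usual case: P0 and P1 are present.
usual = pick_boot_disks(["P0: os", "P1: os", "P2: data", "P3: data"])
# Rare case described above: the lowest two are P1 and P2.
rare = pick_boot_disks(["P4: data", "P2: os", "P1: os"])
```

Here `usual` is `("P0", "P1")` and `rare` is `("P1", "P2")`, matching the two cases described in the explanation under step 14c.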
Replacing the Node in the Software
Note: All MAC addresses must be entered in uppercase, unless otherwise specified.
Step(1) In the CMC, navigate to:
Dashboard > Administration > Hardware > Servers > Unmanaged Devices > Uninitialized.
Step(2) Note the IP address and MAC address of the new node.
Step(3) Open a new SSH session to the management controller node (and exit OSMI).
Step(4) Create a new SSH session to the IP address you noted earlier.
root@Controller1:~# ssh 10.10.1.247
root@10.10.1.247's password:
Welcome to Ubuntu 12.04.1 LTS (GNU/Linux 3.8.0-33-generic x86_64)
Step(5) Find the MAC addresses of the new node by running the command: "ip a"
root@nfsROOT:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:60:6e:44:98:1a brd ff:ff:ff:ff:ff:ff
inet 10.10.1.247/24 brd 10.10.1.255 scope global eth0
inet6 fe80::a60:6eff:fe44:981a/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 08:60:6e:44:98:1b brd ff:ff:ff:ff:ff:ff
Step(6) Note the new MAC addresses of eth0 and eth1 listed above and exit the storage node.
root@nfsROOT:~# exit
logout
Connection to 10.10.1.247 closed.
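If you prefer not to copy the MAC addresses by hand, the "ip a" output from Step 5 can be parsed with a short script. The following is a minimal sketch, assuming you have saved the output as text on a workstation with Python; the function name macs_from_ip_a is hypothetical and not part of Lattus or qshell. It returns the addresses in uppercase, as required by the note at the start of this section.

```python
import re


def macs_from_ip_a(output):
    """Map interface name -> uppercase MAC address from `ip a` text."""
    macs = {}
    current = None
    for line in output.splitlines():
        # Interface header lines look like "2: eth0: <BROADCAST,...> ...".
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            current = header.group(1)
            continue
        # Only Ethernet links are collected; "link/loopback" is skipped.
        link = re.search(r"link/ether\s+([0-9A-Fa-f:]{17})", line)
        if link and current:
            macs[current] = link.group(1).upper()
    return macs


# Trimmed copy of the Step 5 output shown above.
sample = """1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:60:6e:44:98:1a brd ff:ff:ff:ff:ff:ff
    inet 10.10.1.247/24 brd 10.10.1.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 08:60:6e:44:98:1b brd ff:ff:ff:ff:ff:ff
"""

new_macs = macs_from_ip_a(sample)
```

With the sample above, `new_macs["eth0"]` is `'08:60:6E:44:98:1A'` and `new_macs["eth1"]` is `'08:60:6E:44:98:1B'`, the values used in the qshell commands that follow.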
Step(7) Enter Qshell on the Controller Node.
root@Controller1:~# qshell
Welcome to qshell
Step(8) If the new machine is not already showing up under the CMC GUI in the FAILED section of Unmanaged Devices, enter the following QShell commands as shown and in the order listed for "In [#]:"
*** CMC GUI: Dashboard --> Administration --> Hardware --> Servers --> Unmanaged Devices --> Failed
In [1]: api=i.config.cloudApiConnection.find('main')
*** Get the "Name" listed for the replacement node from the CMC GUI:
Dashboard --> Administration --> Hardware --> Servers --> Unmanaged Devices --> Uninitialized
In [2]: machineguid=api.machine.find(name='PM-08:60:6E:44:98:1A')['result'][0]
In [3]: machineguid
Out[4]: '2c2f7c96-92d4-411f-9d62-f0d45624663d'
In [5]: deviceguid=api.machine.list(machineguid=machineguid)['result'][0]['deviceguid']
In [6]: deviceguid
Out[7]: '03b16686-b18f-401b-845b-05b47df9cbe6'
In [8]: api.device.updateModelProperties(deviceguid, status=str(q.enumerators.devicestatustype.FAILED))
Out[9]: {'jobguid': None, 'result': '03b16686-b18f-401b-845b-05b47df9cbe6'}
*** Now the machine should appear in the CMC GUI:
Dashboard --> Administration --> Hardware --> Servers --> Unmanaged Devices --> Failed
The machine must show as FAILED before you can run Step 9.
Step(9) Use the new MAC address for eth0 (found in Steps 2 and 5), as shown, to 'cleanup' and remove the machine from the unmanaged devices list:
In [10]: q.amplistor.cleanupMachine('08:60:6E:44:98:1A')
Out[11]: True
*** Now the machine disappears from the CMC GUI:
Dashboard --> Administration --> Hardware --> Servers --> Unmanaged Devices --> Failed
*** Now the machine appears with the correct 'Name' in CMC GUI:
Dashboard --> Administration --> Hardware --> Servers --> Storage Nodes
Step(10) Update the machine object as follows:
In [12]: cloudapi=i.config.cloudApiConnection.find('main')
In [13]: machine_guid = cloudapi.machine.find(name='Storage4')['result'][0]
*** See comment just above Step (10) for the 'Name'
In [14]: machine = cloudapi.machine.getObject(machine_guid)
Step(11) Use the following commands to update the MAC addresses:
In [15]: print machine.nics[0].name
eth0
In [16]: print machine.nics[1].name
eth1
In [17]: machine.nics[0].hwaddr = '08:60:6E:44:98:1A'
In [18]: machine.nics[1].hwaddr = '08:60:6E:44:98:1B'
Step(12) Use the following command to save the new settings:
In [19]: q.drp.machine.save(machine)
Step(13) Update the device object as follows:
In [20]: device = cloudapi.device.getObject(machine.deviceguid)
In [21]: device.nicports[0].hwaddr
Out[22]: '30:85:A9:A5:4E:BE' <----- Should be the old MAC address, before the change is made below.
In [23]: device.nicports[0].hwaddr = '08:60:6E:44:98:1A'
In [24]: q.drp.device.save(device)
In [25]: q.manage.dhcpd.restart()
In [26]: exit()
Step(14) IMPORTANT! - Reboot the node.
Step(15) Open a new SSH session from the Controller node and connect to the storage node again.
*** As a check of your work so far you can run the command: "ip a"
root@nfsROOT:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:60:6e:44:98:1a brd ff:ff:ff:ff:ff:ff
inet 10.10.1.247/24 brd 10.10.1.255 scope global eth0
inet6 fe80::a60:6eff:fe44:981a/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 08:60:6e:44:98:1b brd ff:ff:ff:ff:ff:ff
Step(16) Edit the following configuration file on the storage node:
root@nfsROOT:~# vi /opt/qbase3/cfg/qconfig/main.cfg
Replace the value seen at "nodename" with the new MAC-address of eth0
Note: The MAC address must be entered without colons. Case does not matter.
(example: 08:60:6e:44:98:1a would be entered as 08606e44981a).
[main]
lastlogcleanup = 1412979001
domain = somewhere.com
nodetype = STORAGENODE
nodename = 08606e44981a
logserver_loglevel = 6
logserver_port = 9998
logserver_ip = 127.0.0.1
qshell_firstrun = False
machineguid = 2c2f7c96-92d4-411f-9d62-f0d45624663d
Step(17) Save and close the configuration file.
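The nodename format from Step 16 (the eth0 MAC address with the colons removed) is easy to get wrong when typed by hand. The following is a minimal sketch of a check, assuming Python 3 on a workstation and a copy of the main.cfg contents as text. The function names nodename_from_mac and nodename_matches are hypothetical and not part of qshell.

```python
import configparser
import re


def nodename_from_mac(mac):
    """Return the MAC address without colons, in lowercase."""
    compact = mac.replace(":", "").lower()
    if not re.match(r"^[0-9a-f]{12}$", compact):
        raise ValueError("not a MAC address: %r" % mac)
    return compact


def nodename_matches(cfg_text, mac):
    """True if the nodename in the [main] section corresponds to `mac`."""
    parser = configparser.RawConfigParser()
    parser.read_string(cfg_text)
    # Case does not matter for nodename, so compare in lowercase.
    return parser.get("main", "nodename").lower() == nodename_from_mac(mac)


# Shortened copy of the configuration file shown in Step 16.
sample_cfg = "\n".join([
    "[main]",
    "nodetype = STORAGENODE",
    "nodename = 08606e44981a",
    "logserver_port = 9998",
])

nodename = nodename_from_mac("08:60:6E:44:98:1A")
ok = nodename_matches(sample_cfg, "08:60:6E:44:98:1A")
```

Here `nodename` is `'08606e44981a'`, the value shown in the example configuration file, and `ok` is `True`.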
Step(18) After changing main.cfg, restart the maintenance agents on the storage node from qshell:
In [1]: q.dss.maintenanceagents.restart()
In [2]: exit()
Step(19) Verify the Ethernet and MAC address settings. View the 70-persistent-net.rules file, then run the lshw -C network command. Compare the output of both and make sure that the eth0, eth1 and MAC address configuration is correct.
# cat /etc/udev/rules.d/70-persistent-net.rules
# lshw -C network
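The comparison in Step 19 can also be done with a short script instead of by eye. The following is a minimal sketch, assuming the udev rules file uses the usual layout in which each rule carries an ATTR{address}=="..." match and a NAME="..." assignment; the sample rule lines are illustrative and not taken from a Lattus node. The function names are hypothetical.

```python
import re

# One rule per line: capture the MAC address and the interface name it maps to.
RULE_RE = re.compile(r'ATTR\{address\}=="([0-9A-Fa-f:]{17})".*?NAME="([^"]+)"')


def macs_from_udev_rules(text):
    """Map interface name -> uppercase MAC address from the rules file text."""
    macs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = RULE_RE.search(line)
        if match:
            macs[match.group(2)] = match.group(1).upper()
    return macs


def mismatches(expected, actual):
    """Return {interface: (expected_mac, found_mac)} for every difference."""
    return {
        name: (mac.upper(), actual.get(name))
        for name, mac in expected.items()
        if actual.get(name) != mac.upper()
    }


# Illustrative rules text (assumed layout).
sample_rules = "\n".join([
    "# PCI device (example comment line)",
    'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
    'ATTR{address}=="08:60:6e:44:98:1a", KERNEL=="eth*", NAME="eth0"',
    'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", '
    'ATTR{address}=="08:60:6e:44:98:1b", KERNEL=="eth*", NAME="eth1"',
])

expected = {"eth0": "08:60:6E:44:98:1A", "eth1": "08:60:6E:44:98:1B"}
problems = mismatches(expected, macs_from_udev_rules(sample_rules))
```

An empty `problems` dictionary means the rules file agrees with the MAC addresses noted in Step 6. Any entry lists the expected address next to the one found, or `None` if the interface has no rule.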
Finished
This page was generated by the BrainKeeper Enterprise Wiki, © 2018 |