OVERVIEW
This information comes from Tim Slaton's dxwiki pages and describes the proper way to restart OST. Important: If you run service ost stop and then run service ost start as soon as the command prompt returns, you will most likely have problems.
- Remove processwatcher
rm /var/DXi/processwatcher
- Stop the ost service
/etc/rc.d/init.d/ost stop
- Wait until the ost service is stopped. You will see the following lines in the tsunami logs:
INFO - 09/12/11-19:26:51 - DXi_script DataPath(0) [qlog] - ost stopping
INFO - 09/12/11-19:26:51 - DXi_script DataPath(0) [qlog] - ost stopped
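The wait above can be scripted instead of watched by hand. The following is a minimal sketch, not part of the original procedure: wait_for_log_line is a hypothetical helper, and the tsunami log path is left as an argument because this page does not give it. The helper only looks at lines added after a recorded line count, so an "ost stopped" line left over from an earlier restart is not mistaken for the current one. It assumes the log is not rotated while waiting.

```shell
#!/bin/bash
# wait_for_log_line LOGFILE PATTERN [SKIP_LINES] [TIMEOUT_SECONDS]
# Hypothetical helper: polls LOGFILE once per second until a line containing
# the fixed string PATTERN appears after the first SKIP_LINES lines.
# Returns 0 when the line is found, 1 when TIMEOUT_SECONDS have elapsed.
wait_for_log_line() {
    local logfile="$1" pattern="$2" skip="${3:-0}" timeout="${4:-300}"
    local waited=0
    while :; do
        # Search only the lines written after the recorded position.
        if tail -n +"$((skip + 1))" "$logfile" 2>/dev/null | grep -qF -- "$pattern"; then
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

Assumed usage, with TSUNAMI_LOG set to the actual log file on the system: record before=$(wc -l < "$TSUNAMI_LOG"), run the stop command, then call wait_for_log_line "$TSUNAMI_LOG" "- ost stopped" "$before" 300 and continue only if it returns 0.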
- Check for open connections. Once OST stops, verify that no connections are open via OST to the LSUs. Restarting ostd while connections are still recorded as open can trigger LSU recovery-mode: on startup, OST reads its config file to find the storage-server connection count open to the LSU, and if that count is set (non-zero), OST goes into recovery-mode.
- The DXi file /data/hurricane/conf/OSTStorageSrvrLsu.conf will contain an entry like this under each storage-server:
<StorageServer>
...
<srv_nconnect>2</srv_nconnect>
- In the example above, srv_nconnect was 2 when ostd was shut down. This will trigger ostd to go into LSU recovery-mode.
- Set srv_nconnect to 0 before restarting OST.
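The edit above can also be made non-interactively. This is a hedged sketch rather than a documented DXi tool: nonzero_srv_nconnect and reset_srv_nconnect are hypothetical helper names, and the code assumes each srv_nconnect element sits on a single line and holds only digits, as in the excerpt above. It keeps a timestamped backup next to the config file and should only be run while ostd is stopped.

```shell
#!/bin/bash
# nonzero_srv_nconnect CONF
# Hypothetical helper: prints (with line numbers) every srv_nconnect entry
# whose value is non-zero. Returns 0 if any were found, 1 otherwise.
nonzero_srv_nconnect() {
    grep -n '<srv_nconnect>0*[1-9][0-9]*</srv_nconnect>' "$1"
}

# reset_srv_nconnect CONF
# Hypothetical helper: backs up CONF, then rewrites every srv_nconnect
# value to 0. Writing from the backup into CONF keeps the original file's
# ownership and permissions.
reset_srv_nconnect() {
    local conf="$1"
    local backup
    backup="${conf}.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$conf" "$backup" || return 1
    sed 's|<srv_nconnect>[0-9][0-9]*</srv_nconnect>|<srv_nconnect>0</srv_nconnect>|g' \
        "$backup" > "$conf"
}
```

Assumed usage: run nonzero_srv_nconnect /data/hurricane/conf/OSTStorageSrvrLsu.conf to see which entries are set, then reset_srv_nconnect on the same file, and run the first command again to confirm nothing is listed.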
- Restart the ost service
/etc/rc.d/init.d/ost start
- Verify /snfs/common/data/NS_OSTD exists for a few minutes after startup.
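One way to script this check is sketched below. It reads the step as meaning that the path should remain present throughout the first few minutes after startup, which is an interpretation and not something this page states outright. path_stays_present is a hypothetical helper, and the 180-second duration and 10-second interval defaults are arbitrary choices.

```shell
#!/bin/bash
# path_stays_present PATH [DURATION_SECONDS] [INTERVAL_SECONDS]
# Hypothetical helper: checks every INTERVAL_SECONDS that PATH exists,
# for roughly DURATION_SECONDS. Returns 1 as soon as PATH is missing,
# 0 if it was present at every check.
path_stays_present() {
    local path="$1" duration="${2:-180}" interval="${3:-10}"
    local elapsed=0
    while [ "$elapsed" -lt "$duration" ]; do
        if [ ! -e "$path" ]; then
            echo "missing after ${elapsed}s: $path" >&2
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    # Final check at the end of the observation window.
    [ -e "$path" ]
}
```

Assumed usage right after the start command: path_stays_present /snfs/common/data/NS_OSTD 180 10, treating a non-zero return as a reason to inspect the tsunami log before continuing.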
- Verify ost is started before starting other services such as replicationd.
- In the tsunami log, verify that you see the following line:
INFO - XX/XX/11-XX:XX:XX - DXi_script DataPath(0) [qlog] - ost started
- Replace processwatcher
touch /var/DXi/processwatcher