Backups are Failing (DRAFT)

Overview

In this scenario, the customer has called with this problem: Backups are failing

 
Problem Description: The customer attempts to disjoin a DXi from Active Directory and when he gets to the form that requests the administrator username and password, the field does not allow any input.
 
When the customer configures the system for segmentation, he is left with an inaccessible GUI or a GUI that does not reflect accurate information. 
 
Available Information: Tech Support has the logs from Samba, the OS, and the DXi software.
 
Tech Support’s Initial Thoughts: Tech Support tries to ssh to the system and remove it from the domain manually (remove the .cifs_configured file, refresh the GUI page, and add the system to a workgroup). The Tech Support person also runs grep -r for a particular string against the entire filesystem in an attempt to find the relevant configuration file.
 
Questions from Tech Support: Why doesn’t the GUI allow input when disjoining from Active Directory?  Why does it happen sometimes and not others? What is the most appropriate way to help the customer? Should Service/Support have looked at something else in the Active Directory? If so, what and why?
 
 
--------------------------------------------------------------------------------------------

Next Steps: Responses from Engineering

Starting with Galaxy 1.4, disjoining always succeeds. See the “Solution for Galaxy 1.4 and later” section below for an explanation. Most of the following discussion applies to releases before Galaxy 1.4.

What should I look for?

Open the collect log, browse to the node1-collection/nas-info directory, and check the following files:
 

Why is there a failure to disjoin from ADS?

There are many reasons:

Why doesn’t the GUI allow input when disjoining from Active Directory?  Why does it happen sometimes and not others?

The short answer is that the CIFS configuration file (/etc/samba/smb.conf) has been tampered with or corrupted, and the ADS information in it is lost. As a result, the GUI can no longer determine the domain state (ADS or workgroup) of the CIFS server.
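As a rough illustration of what "losing the ADS information" can look like, the sketch below reads the [global] section of smb.conf-style text and reports which domain state the file describes. It relies on the standard Samba parameters security and realm; the function names and the "ads-incomplete" label are hypothetical, and this is not the check the DXi GUI itself performs, which this page does not describe.

```python
def read_global_section(text):
    """Return the [global] parameters of smb.conf-style text as a dict."""
    params = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments ('#' and ';' both start a comment).
        if not line or line[0] in "#;":
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "global" and "=" in line:
            key, value = line.split("=", 1)
            # Parameter names are case-insensitive; normalise inner spacing.
            params[" ".join(key.lower().split())] = value.strip()
    return params


def domain_state(text):
    """Classify the file as 'ads', 'ads-incomplete', 'workgroup' or 'unknown'.

    Assumption: 'security = ads' plus a 'realm' means ADS membership, and
    'security = user' means workgroup mode. Anything else is 'unknown'.
    """
    params = read_global_section(text)
    security = params.get("security", "").lower()
    if security == "ads":
        return "ads" if params.get("realm") else "ads-incomplete"
    if security == "user":
        return "workgroup"
    return "unknown"


# Hypothetical sample: an ADS configuration whose realm line was removed.
SAMPLE = """
[global]
   workgroup = EXAMPLE
   security = ads
"""
print(domain_state(SAMPLE))
```

To check a real system, the text of /etc/samba/smb.conf can be read and passed to domain_state(). A result other than "ads" or "workgroup" suggests the file was edited by hand.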

 

The proper procedure is to disjoin from the DXi GUI first. This step can then optionally be followed by using the MMC tool to delete the DXi machine account from the domain. Some ADS administrators, however, tend to do the reverse.

The following example explains how this happened in the past:

 

A common scenario at customer sites in the past involved experienced ADS administrators who are very familiar with disjoining Windows systems from an ADS domain. Their typical mistake is to start by logging on to the ADS server and using the MMC (Microsoft Management Console) to remove the DXi from the domain. This changes the ADS database without informing the DXi system.

 

Therefore, the DXi system still believes it is in the domain, and the GUI reflects this. Because the customers see the DXi still in the domain, they then disjoin it using the DXi GUI. As part of the disjoin process, the DXi asks the ADS server to disable the DXi computer account. Because the ADS server has no record of the DXi (the account was already deleted), it rejects the DXi request.

 

When this happens, the disjoin fails, and the GUI offers customers no way to repair the damage. Some customers then ssh to the DXi system and edit the smb.conf file by hand. If the ADS information in this file is changed, the GUI no longer displays any input text boxes, and a support call is needed to fix the CIFS configuration.
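The two orderings described above can be sketched as a toy model. This is only an illustration of the sequence of events on releases before Galaxy 1.4, as this page describes them. The class and function names are hypothetical, and the ADS server is reduced to a dictionary of computer accounts.

```python
class AdsServer:
    """Stand-in for the ADS database: account name -> enabled flag."""

    def __init__(self):
        self.accounts = {}


class Dxi:
    """Stand-in for a DXi; 'joined' is what the DXi and its GUI believe."""

    def __init__(self, name):
        self.name = name
        self.joined = False

    def join(self, ads):
        ads.accounts[self.name] = True
        self.joined = True

    def disjoin_pre_1_4(self, ads):
        """Ask ADS to disable the computer account; fail if ADS has no record."""
        if self.name not in ads.accounts:
            return False  # request rejected, DXi still believes it is joined
        ads.accounts[self.name] = False  # account disabled, not deleted
        self.joined = False
        return True


def mmc_delete(ads, name):
    """Model of an administrator deleting the machine account with MMC."""
    ads.accounts.pop(name, None)


# Proper order: disjoin from the DXi first, then optionally delete with MMC.
ads, dxi = AdsServer(), Dxi("dxi01")
dxi.join(ads)
proper_ok = dxi.disjoin_pre_1_4(ads)
mmc_delete(ads, dxi.name)

# Reverse order: MMC delete first, then the disjoin request fails.
ads2, dxi2 = AdsServer(), Dxi("dxi02")
dxi2.join(ads2)
mmc_delete(ads2, dxi2.name)
reverse_ok = dxi2.disjoin_pre_1_4(ads2)

print(proper_ok, dxi.joined)    # disjoin succeeded, DXi is unjoined
print(reverse_ok, dxi2.joined)  # disjoin failed, DXi still believes it is joined
```

In the reverse ordering the DXi is left believing it is joined while ADS has no account for it, which is the stuck state that leads customers to edit smb.conf by hand.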

Solutions

Manual solution for all Galaxy releases

Note that there is no need to look at the ADS. The DXi is in the unjoined state, but its computer account is still active in the ADS domain. To remove it from the domain completely, the customer has to log on to the ADS server and use the MMC tool to remove the DXi account.

 

Solution for Pre-Galaxy 1.2

See the manual procedure above. That’s the only solution.

Solution for Galaxy 1.2 and later

The manual solution still works. In addition, the following syscli command can be used on all Galaxy releases starting with 1.2: syscli --cifs --deactivate
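A minimal sketch of how a support script might choose between the fixes named on this page and invoke the syscli command. The command line is taken verbatim from the text above. Any further arguments it may require are not shown on this page and are not guessed here. The helper names and the dry_run flag are hypothetical.

```python
import subprocess

# Command exactly as given on this page; extra arguments, if any, are unknown.
SYSCLI_DEACTIVATE = ["syscli", "--cifs", "--deactivate"]


def parse_release(text):
    """Turn a release string such as '1.2' into a comparable tuple (1, 2)."""
    return tuple(int(part) for part in text.split("."))


def disjoin_options(galaxy_release):
    """List the fixes this page names for the given Galaxy release."""
    options = ["manual"]  # the manual solution applies to all releases
    if parse_release(galaxy_release) >= (1, 2):
        options.append("syscli")
    return options


def run_deactivate(dry_run=True):
    """Return the command as a string, or run it when dry_run is False."""
    if dry_run:
        return " ".join(SYSCLI_DEACTIVATE)
    # Must be run on the DXi itself; returns the command's exit status.
    return subprocess.run(SYSCLI_DEACTIVATE, check=False).returncode


print(disjoin_options("1.1"))
print(disjoin_options("1.2"))
print(run_deactivate())
```

The default dry run only shows the command, so the sketch can be tried on a workstation where syscli is not installed.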

Solution for Galaxy 1.4 and later

The manual solution still works. But starting with Galaxy 1.4, the disjoin operation has been redesigned to always succeed whether it is run from the GUI or the syscli command-line tool, even if the user enters a bogus username and password. The only difference is as follows:

 

In both cases, the user has to use the MMC tool to delete the DXi machine account from the domain.

 

Note: This redesign has been adopted to address the following valid scenarios, which can easily arise if the DXi is moved from one company to another:

Solution for Galaxy 2.0

CT SAID: What is the response here?

 

 

 

