Please Upgrade to Continue Using Checklist
Article: 100043856
Last Published: 2019-03-12
Problem
Veritas Engineering has created a pre-upgrade checklist that addresses a number of known issues that can occur during an Appliance upgrade to 3.1.1-2. Veritas Engineering recommends reviewing each step below before attempting the upgrade.
This checklist is also attached to this article in PDF form.
Solution
NetBackup Appliance upgrade process
*** Please ensure that you have the latest version of this checklist. You may check with your Backline support for the latest version ***
a. The upgrade starts with a pre-flight check to see if the appliance is ready for upgrade (same code as AURA tool)
b. If pre-flight is successful, then it performs the following steps:
- Backs up the appliance configuration in a checkpoint and copies the product configuration.
- Performs the upgrade: upgrades the RPM packages, OS, Appliance, and Application.
- Restores the configuration from the backup taken.
- Cleans up the temporary files generated during the upgrade process and completes the upgrade.
- Runs the post-upgrade self-test to make sure that backups can run on the server that was just upgraded.
c. If pre-flight fails, the upgrade process aborts.
Note: Pre-flight includes a number of checks like status of NetBackup services, partitions, mongodb, media-master communication etc.
Upgrade duration
Each upgrade takes around 90-120 minutes. In addition to the built-in pre-flight checks, there are manual pre-upgrade checks that cover issues discovered after release. These manual checks take around 45-60 minutes, depending on the state of the appliance. If an issue occurs during the upgrade, the rollback can be paused so that you can troubleshoot and then continue the upgrade. Currently, 60 minutes are allowed for troubleshooting before automatic rollback starts. The rollback process takes around 45 minutes.
Total upgrade time ranges from approximately 2 hours 30 minutes to 4 hours.
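For planning a maintenance window, the durations quoted above can be added up as in the following rough sketch. The helper name window_minutes and the way the phases are combined are assumptions made for illustration; they are not a formula published by Veritas Engineering.

```shell
#!/bin/sh
# Rough maintenance-window arithmetic using the durations quoted in this article.
# window_minutes is a hypothetical helper: it sums phase durations given in minutes.
window_minutes() {
    total=0
    for phase in "$@"; do
        total=$((total + phase))
    done
    echo "$total"
}

# Fastest case: 45 min manual checks + 90 min upgrade.
window_minutes 45 90          # prints 135
# Slow case: 60 min manual checks + 120 min upgrade + 60 min troubleshooting pause.
window_minutes 60 120 60      # prints 240
# Same slow case, but the pause ends in a rollback (about 45 min more).
window_minutes 60 120 60 45   # prints 285
```

The 240-minute result corresponds to the 4-hour upper bound. The fastest case adds up to 2 hours 15 minutes, slightly under the approximate lower bound quoted above, and a rollback pushes the window past 4 hours.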
Secure Comm Pre Upgrade Checks
On every Media Server to be upgraded, check the host ID certificate to prevent Secure Comm issues during and after the upgrade.
Check whether the host ID certificates were issued by the current trusted CA certificate.
Cert Validation Steps:
>> Run this command on Master Server
nbcertcmd -listCAcertdetails
Note the Start Date Section of the output.
Example: Start Date: Jul 20 21:45:29 2018 GMT
>> Run the following command on Media Server
nbcertcmd -listCertDetails
Note the Expiry Date section of the output.
Example: Expiry Date: Jul 24 22:47:03 2019 GMT
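Result: Subtract one year from the "Expiry Date" of the Media Server certificate. If the resulting date is after the "Start Date" of the Master Server CA certificate, the validation is successful. With the example values above, Jul 24 2019 minus one year is Jul 24 2018, which is after Jul 20 2018, so the validation is successful.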
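The date comparison can be scripted as in the following minimal sketch. It assumes GNU date (for the -d option) and approximates one year as 365 days. The helper name cert_issued_after_ca is hypothetical, and the two dates still have to be copied by hand from the nbcertcmd output.

```shell
#!/bin/sh
# Sketch of the Cert Validation date comparison. Assumes GNU date (-d option).
# "One year" is approximated as 365 days.
# cert_issued_after_ca is a hypothetical helper name.
#   $1 = "Start Date" of the CA certificate (from the Master Server)
#   $2 = "Expiry Date" of the host ID certificate (from the Media Server)
# Returns 0 if validation succeeds, 1 if it fails, 2 if a date cannot be parsed.
cert_issued_after_ca() {
    ca_start=$(date -u -d "$1" +%s) || return 2
    cert_expiry=$(date -u -d "$2" +%s) || return 2
    approx_issued=$((cert_expiry - 365 * 24 * 60 * 60))
    [ "$approx_issued" -gt "$ca_start" ]
}

# Example values taken from the steps above.
if cert_issued_after_ca "Jul 20 21:45:29 2018 GMT" "Jul 24 22:47:03 2019 GMT"; then
    echo "validation successful"
else
    echo "validation failed: run the resolution steps"
fi
```

With the example dates from this article, the sketch reports a successful validation.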
If the validation fails, perform the following resolution.
Resolution:
- Log in to the media server and elevate to root.
- Check the folder /usr/openv/var/vxss/credentials to confirm that the certificate exists:
n38-h45:/usr/openv/var/vxss/credentials# ll
total 8
-rw------- 1 root root 2491 Jan 30 18:40 5573bf78-76e9-4b2e-9863-33d44159bb3a
drw------- 2 root root 4096 Jan 30 18:40 keystore
- Move the certificate to another folder (e.g. /tmp/certificate):
n38-h45:/usr/openv/var/vxss/credentials# mv 5573bf78-76e9-4b2e-9863-33d44159bb3a /tmp/certificate
- Run the following two commands:
nbcertcmd -getCACertificate
nbcertcmd -getCertificate -force
- Check the folder /usr/openv/var/vxss/credentials to confirm that the new host-id based certificate was re-created.
- Re-run the Cert Validation Steps.
Note: Perform the following steps if the Cert Validation still fails after the previous resolution.
On Master Server
- Log in to the master server and create the NetBackupCLI user "nbcliuser" from the CLISH menu: Manage > NetBackupCLI > Create
- Elevate to root.
- Run the following command sequence:
n39-h63:/cat/cgd/vxss/credentials# bpnbat -login -loginType WEB
Authentication Broker: n39-h63
Authentication port [0 is default]:
Authentication type (NIS, NISPLUS, WINDOWS, vx, unixpwd, ldap): unixpwd
Domain: n39-h63
Login Name: nbcliuser
Password:
Operation completed successfully
- Revoke the host-id based certificate of media server (n38-h19 is the media server hostname):
n39-h63:/cat/cgd/vxss/credentials# nbcertcmd -revokeCertificate -host n38-h19
Certificate revoke request processed successfully.
- Dissociate the host mapping of the media server (n38-h19 is the media server hostname):
n39-h63:/cat/cgd/vxss/credentials# nbcertcmd -dissociatehost -host n38-h19
Request to dissociate host from host ID is successful.
On Media Server
- Check the folder /usr/openv/var/vxss/credentials to validate the certificate exists
n38-h45:/usr/openv/var/vxss/credentials# ll
total 8
-rw------- 1 root root 2491 Jan 30 18:40 5573bf78-76e9-4b2e-9863-33d44159bb3a
drw------- 2 root root 4096 Jan 30 18:40 keystore
- Move the certificate to another folder (e.g. /tmp/certificate)
n38-h45:/usr/openv/var/vxss/credentials# mv 5573bf78-76e9-4b2e-9863-33d44159bb3a /tmp/certificate
- Run the following two commands:
nbcertcmd -getCACertificate
nbcertcmd -getCertificate -force
- Check the folder /usr/openv/var/vxss/credentials to validate the new host-id based certificate was re-created.
- Re-run the Cert Validation Steps.
Pre Upgrade Checks
Before starting the upgrade, perform the following steps. If any step fails, contact Veritas Support. This is the complete list; not every step applies to every environment.
- Ensure that you have the correct version of the 3.1.1-2 upgrade rpm:
https://www.veritas.com/support/en_US/article.100041985
Validate the MD5/SHA-1 checksum.
- Confirm the last backup ran successfully on the Appliance to be upgraded.
- Confirm that no backup jobs are scheduled during the upgrade period.
- Confirm all NBU processes can be stopped (bp.kill_all) and restarted (bp.start_all) successfully.
- Reboot the Appliance before starting the upgrade. Make sure it completes the reboot without any problem.
- Run Support> Test Software (confirm nothing failed).
Note: Disk storage might report down if this command is run immediately after a reboot. Re-run the command 2-3 minutes after a successful reboot.
- Run Support> Test Hardware (confirm nothing failed)
- Run Monitor> Hardware ShowHealth (check for any failures)
- Run the following commands from root shell (on the Media Server) for Media Server upgrade
nbcertcmd -getCACertificate
nbcertcmd -getCertificate -force
Note: It is recommended to perform the resolution listed in the section "Secure Comm Pre Upgrade Checks" above, irrespective of the cert validation results.
- Follow the checks specified in the following article for IPSec, OST/third-party plugins, HA, and disk firmware. Not all checks described in the article apply; it depends on the specific customer environment.
https://www.veritas.com/content/support/en_US/doc/113205077-130377097-0/v113630230-130377097
- Check the current MSDP capacity. If the appliance is using 80% or more of its MSDP capacity, do not start the upgrade until additional MSDP space is available. Depending on the MSDP content, the conversion logic (MD5 to SHA256) requires an additional 20-25% of disk space. The disk space is reclaimed after the conversion is complete. Please contact Veritas Support for additional information and next steps for the upgrade.
- Check available space to avoid the "No enough space for the 'LVM' request" issue during the upgrade. If the filesystems are full, clean them up.
Execute "df -lh" to check the "Avail" and "Use%" columns; pay attention to high Use% values, such as those above 80-90%.
Execute "lvdisplay | grep INACTIVE" to see whether anything is reported as "INACTIVE".
Please contact Veritas Support before beginning the upgrade, with a reference to note id: 100040469
- On 5220/5230 appliances, run the lvs command to check whether the /log partition has ~185G of space.
If not, resize the /log partition to 185G with the following steps:
>> In the CLISH menu, stop the InfraServices:
Support> InfraServices
Entering InfraServices view...
InfraServices> Stop All
>> Elevate to root user, and stop the services that use the '/log' filesystem
# /usr/openv/netbackup/bin/goodies/netbackup stop
# /etc/init.d/nbappws stop
# /etc/init.d/sisidsagent stop
# /etc/init.d/sisipsagent stop
# /etc/init.d/syslog stop
# /etc/init.d/ntp stop
# /etc/init.d/smb stop
>> If the appliance is master, do the following in addition to the above steps
# /opt/VRTSpbx/bin/vxpbx_exchanged stop
>> Check for any other processes which may have handles open to files in /log:
# lsof | grep /log
Stop (kill -9) the processes with handles open to files in /log.
>> Unmount /log
# umount /log
>> Resize the filesystem
# e2fsck -f /dev/system/log
# resize2fs /dev/system/log 185G
>> Resize the logical volume:
# lvresize --size 185G /dev/system/log
>> Mount /log:
# mount /log
>> Start all the services that were stopped in the previous steps
If required, please contact Veritas Support and refer to ET:3937193.
- If upgrading the Media Server, ensure that there is access to the Master Server and the Master Server Admin console so that "Secure Comm" token regeneration can be performed, if required.
- Check if /etc/sysconfig/network-scripts/route-* exists before the upgrade. If the file(s) exist, static routes have been added on devices.
>> Elevate to root
Add all the static routes to /etc/sysconfig/static-routes before the upgrade, in the following format:
any net xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx gw xxx.xxx.xxx.xxx dev xxx
Example:
any net 10.99.192.0 netmask 255.255.240.0 gw 10.242.48.3 dev bond2
>> Restart network service
service network restart
Note: Please check with your network administrator that all the entries in the file are in the correct format.
Also take a backup of those files, and copy or note the CLISH outputs of Network>Show Status and Network>Gateway Show. For any assistance, please reach out to Veritas Support with a reference to note id: 100042876
- Check if WanOpt is disabled. If it is disabled, apply the following workaround so that the upgrade can succeed:
Elevate to root user.
Edit "/opt/IMAppliance/appex/etc/config", change accif="" to accif=" " (add a space between the two quotation marks), and save the file.
Note: If you see any other value for accif when WanOpt is disabled, please contact Veritas Support.
- If FCR is configured, make sure before the upgrade that the handlehelper process is not running or that the msdp partition is not used by handlehelper, by running the following commands:
# fuser -uvm /msdp/cat
# fuser -uvm /msdp/data/dp1/pdvol
If FCR is configured and the handlehelper process is using the msdp partition, unmounting the vxfs partition will fail, so stop the process. If you need any assistance with this check, please contact Veritas Support with a reference to ET3952940
- Check if STIG is enabled. This applies only when upgrading from version 3.1 or later; ignore it for earlier versions. Please contact Veritas Support with a reference to note id: 100041852
- Check if add-ons are installed. If they are, roll back the add-ons before the upgrade. The upgrade from 3.1 to 3.1.1 fails if an add-on was installed and not rolled back on the 3.1 appliance. Please contact Veritas Support with a reference to ET3945463
- Check the permissions on /tmp to prevent a "permission issue".
Go to maintenance mode and check the permissions on /tmp by running "ls -al /". If they are not "drwxrwxrwt.", execute: chmod 1777 /tmp
- Back up the following files before the upgrade, as they may be needed in some situations:
/msdp/data/dp1/pdvol/etc/pdregistry.cfg
/msdp/data/dp1/pdvol/etc/puredisk/controller.conf
/msdp/data/dp1/pdvol/etc/puredisk/vpfsd_config.json
Also backup MSDP configuration before upgrade:
nbdevconfig -getconfig -storage_server <MSDP server name> -stype PureDisk > msdp.config
- For Media Server upgrades, check that the media server name is consistent (either short name or FQDN) in the bp.conf of the Master Server and the Media Server, and that it also matches the server name in the Secure Comm UI: NBU Master Security Management > Host Management > Mapped Host Names/IP
If you need any assistance, please contact Veritas Support with a reference to article id 100041470.
- Before the upgrade, enable verbose logging in /usr/openv/netbackup/bp.conf: "VERBOSE = 5". This adds debug messages to the NetBackup logs in case issues occur during the upgrade. Also comment out the PREFERRED_NETWORK entry if it exists in bp.conf. Revert to the required settings after a successful upgrade.
- There might be a 'split brain issue' during the upgrade.
Please contact Veritas Support with a reference to article id 100033639 if the scenario is applicable to the environment.
- If the Master Server is being upgraded, ensure that the NetBackup catalog is appropriately backed up.
- If the Master Server is being upgraded, install hotfix ET3941845 (NBDB database update failure) before the upgrade. Please contact Veritas Support with a reference to note id: https://www.veritas.com/support/en_US/article.100042241
Without this hotfix, the upgrade to 3.1.1 might fail.
Note: If you do not know the EMM database password when installing the hotfix, create a new password with the following command:
Example: /usr/openv/db/bin/nbdb_admin -dba nbusql
Re-run the hotfix script with the updated password.
- Before starting a major upgrade (3.0 → 3.1.1), disable mongod. There are scenarios where mongod is corrupted after a failed upgrade followed by a successful rollback.
Note: Run these commands only after the pre-upgrade reboot and after all the remaining checks are complete, just before the start of the upgrade. Do not reboot the appliance between turning off these services and starting the upgrade.
chkconfig mongod off
chkconfig crond off
- In case of upgrade failure, after a successful rollback, run the following commands:
chkconfig mongod on
chkconfig crond on
Start the mongod and crond services.
If the upgrade is successful, there is no need to enable and start mongod and crond; the upgrade process takes care of starting the required services.
- Check that all InfraServices are running with the following command:
Main_Menu > Support > InfraServices > Show All
If any are not running, start them.
- Run the Appliance Upgrade Readiness Analyzer Tool (AURA)
https://www.veritas.com/support/en_US/article.100040055
If all the reports from the Hardware and Software tests listed above and from the AURA tool are good, and every item on the pre-upgrade checklist has been checked, start the upgrade process.
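The free-space check in the list above (filesystems above roughly 80% used) can be automated with a small filter such as the following sketch. The helper name flag_full_filesystems is hypothetical, and the sample device names and sizes are made up for illustration. The sketch reads "df -P" style output, because the POSIX format keeps each filesystem on one line; mount points that contain spaces are not handled.

```shell
#!/bin/sh
# Sketch of the free-space check from the pre-upgrade list.
# flag_full_filesystems is a hypothetical helper: it reads `df -P`-style output
# on stdin and prints each mount point whose Use% is at or above the threshold.
#   $1 = threshold in percent (default 80)
flag_full_filesystems() {
    awk -v limit="${1:-80}" '
        NR > 1 {                       # skip the header line
            used = $5
            sub(/%$/, "", used)        # "85%" -> "85"
            if (used + 0 >= limit + 0)
                print $6 " " $5        # mount point and usage
        }'
}

# Made-up sample input for illustration:
flag_full_filesystems 80 <<'EOF'
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/mapper/system-log 1000 850 150 85% /log
/dev/mapper/system-tmp 1000 100 900 10% /tmp
EOF
# prints: /log 85%
```

On the appliance, from the root shell, the filter would be fed with real data, for example: df -lP | flag_full_filesystems 80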
Post Successful Upgrade
1. After a successful upgrade to NBA 3.1.1-2, install the latest version of EEB 3942191 (MSDP Conversion Performance Fixes) on the Master and Media servers that have an MSDP pool configured.
Stop NetBackup services before installing this EEB. (Support>Processes>NetBackup Stop)
Install the latest version of this EEB. (Manage>Software>Install <EEB>)
Start NetBackup services after the EEB installation is complete. (Support>Processes>NetBackup Start)
Please contact Veritas Support to obtain the latest version of the EEB. This EEB fixes performance issues related to MSDP conversions.
2. Configure the passphrase for the Disaster Recovery Package after the Master Server upgrade. In the NetBackup Java Admin Console, go to Security Management -> Global Security Settings -> Disaster Recovery tab (lower pane) and update the passphrase.
3. If you exported IPSec certificates before the upgrade, import them back on the upgraded appliance using the Network>IPSec>Import CLISH menu.
4. Restore the bp.conf settings (VERBOSE and PREFERRED_NETWORK) to their values prior to the upgrade. Restart the NetBackup services using either CLISH (Support>Processes>NetBackup Stop, Support>Processes>NetBackup Start) or the root shell (netbackup stop, netbackup start).
Issues during upgrade
Appliance upgrade stops after reboot. Appliance does not boot up:
Contact Veritas support with incident number: ET3943926
Post-Upgrade Self-Test failure - Troubleshooting
If the post-upgrade self-test fails, the upgrade pauses for one hour, during which the following troubleshooting steps can be performed.
- Check if the static route files are restored and available on the host. Refer to the copies of these files that were backed up before the upgrade.
Restart the network service (service network restart)
Restart NetBackup services (bp.kill_all, bp.start_all) and retry the post-upgrade self-test
For any assistance, please contact Veritas Support with a reference to note id: 100042876
- Check whether the Master Server name in bp.conf on the media server matches the entry in Security Management > Host Management > Mapped Host Names/IP Addresses (in the Java Admin Console).
Ensure that the Master Server name matches in the bp.conf of the Master and Media servers and in the Mapped Host Names entry in the Security Management GUI. For any queries, contact Veritas Support with a reference to note id: 100041470
- Check if there are any Preferred Network entries in the media server bp.conf
Comment out the PREFERRED_NETWORK parameter in the bp.conf file on the server that is being upgraded. For any queries, contact Veritas Support with a reference to note id: 100041468
Restart NetBackup services on the Media Server using bp.kill_all, followed by bp.start_all.
- Check if there are any issues with Secure Comm certificates during the upgrade process
Symptom: NetBackup commands cannot be run on the media server (example: bptestbpcd and nbemmcmd fail to contact the Master). Use the following commands:
nbcertcmd -ping ( To verify if the Master server NBWMC NetBackup service is ready )
nbcertcmd -listAllCertificates ( To list all the certificates on the host )
nbcertcmd -listCertDetails ( To list details of all the certificates on the host )
nbcertcmd -getCACertificate (Refresh CA certs from Master onto the host )
nbcertcmd -getCertificate -force (Refresh hostid certs from Master onto the host )
Re-Run the command to test master connectivity: bptestbpcd -host <master_server> -verbose
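The Secure Comm commands above can be wrapped in a small script that runs them in the listed order and stops at the first failure, as in the following sketch. The helper names run_step and secure_comm_checks, the DRY_RUN switch, and the host name master.example.com are hypothetical. The sketch assumes that nbcertcmd and bptestbpcd are on the PATH, and it defaults to dry-run mode, which only prints the commands.

```shell
#!/bin/sh
# Sketch: run the Secure Comm checks in the order listed and stop at the first failure.
# run_step, secure_comm_checks and DRY_RUN are hypothetical names for illustration.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}

run_step() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] && return 0
    "$@"
}

#   $1 = master server host name
secure_comm_checks() {
    master=$1
    run_step nbcertcmd -ping &&
    run_step nbcertcmd -listAllCertificates &&
    run_step nbcertcmd -listCertDetails &&
    run_step nbcertcmd -getCACertificate &&
    run_step nbcertcmd -getCertificate -force &&
    run_step bptestbpcd -host "$master" -verbose
}

# Preview the sequence (master.example.com is a placeholder host name):
secure_comm_checks master.example.com
```

Keeping dry-run as the default is a deliberate choice, because the two refresh commands change the certificates on the host.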
How to disable Rollback when Upgrade is in Paused state due to Self-Test failure (major upgrade)
Move the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh to another location.
# mv /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh /log/rollback_after_os_installation.sh
Disable rollback
# systemctl disable rollback_after_os_installation.service
How to re-enable Rollback when Upgrade is in Paused state due to Self-Test failure (major upgrade)
Move the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh back
# mv /log/rollback_after_os_installation.sh /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh
Enable rollback
# systemctl enable rollback_after_os_installation.service
How to disable Rollback when Upgrade is in Paused state due to Self-Test failure (minor upgrade)
Move the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh to another location.
# mv /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh /log/rollback_after_reboot_minor.sh
Disable rollback
# systemctl disable rollback_after_reboot_minor.service
How to re-enable Rollback when Upgrade is in Paused state due to Self-Test failure (minor upgrade)
Move back the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh
# mv /log/rollback_after_reboot_minor.sh /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh
Enable rollback
# systemctl enable rollback_after_reboot_minor.service
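The four rollback procedures above differ only in the script name (major or minor upgrade) and in the direction of the move. The following sketch prints the matching commands for a given case so that they can be reviewed before being run by hand. The helper name rollback_commands is hypothetical; the function only prints and executes nothing, and the version directory must be supplied by the operator.

```shell
#!/bin/sh
# Sketch: print the commands for disabling or re-enabling rollback.
# rollback_commands is a hypothetical helper; it only prints, it executes nothing.
#   $1 = disable | enable
#   $2 = major | minor
#   $3 = appliance version directory (the <nba_version> placeholder in this article)
rollback_commands() {
    case $2 in
        major) name=rollback_after_os_installation ;;
        minor) name=rollback_after_reboot_minor ;;
        *) echo "unknown upgrade type: $2" >&2; return 1 ;;
    esac
    script=/inst/patch/appliance/installed/$3/scripts/$name.sh
    parked=/log/$name.sh
    case $1 in
        disable)
            echo "mv $script $parked"
            echo "systemctl disable $name.service"
            ;;
        enable)
            echo "mv $parked $script"
            echo "systemctl enable $name.service"
            ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
}

# Example: commands to disable rollback during a paused minor upgrade.
rollback_commands disable minor "<nba_version>"
```

The example call prints the same two commands shown above for the minor upgrade, with the <nba_version> placeholder left unfilled.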
mongodb is corrupted during the upgrade/rollback
Please reach out to Veritas Support with reference to article id 100039139
Logs to be collected after a failed upgrade attempt
DataCollect
/log/patch*
/var/log/messages
vxlogview -p 409 -t 08:00:00 ( Last 8 hours logs of NetBackup Appliance )
vxlogview -p 51216 -t 08:00:00 ( Last 8 hours logs of NetBackup )
/usr/openv/netbackup/bp.conf
cat /etc/nbapp-release
df -h
NetBackup logs if NetBackup issues found (including secure comms)
/usr/openv/wmc/webserver/logs/* (on master server)
/log/webgui/webserver/catalina.out
/log/mongodb/mongod.log
/var/log/mongodb/mongod.log
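The file-based items in the list above can be bundled into a single archive for a support case, as in the following sketch. The helper names existing_paths and collect_upgrade_logs and the archive path are hypothetical. The sketch does not replace DataCollect, it does not capture the vxlogview or df output, and it does not handle paths that contain spaces.

```shell
#!/bin/sh
# Sketch: bundle whichever of the listed log files exist into one archive.
# existing_paths and collect_upgrade_logs are hypothetical helper names.

# Print only those arguments that exist on disk, one per line.
existing_paths() {
    for p in "$@"; do
        [ -e "$p" ] && echo "$p"
    done
    return 0
}

#   $1 = archive to create, remaining arguments = candidate files
collect_upgrade_logs() {
    archive=$1
    shift
    files=$(existing_paths "$@")
    [ -n "$files" ] || { echo "no log files found" >&2; return 1; }
    # Word splitting of $files is intentional; paths with spaces are not supported.
    tar -czf "$archive" $files
}

# On the appliance (root shell), something like:
#   collect_upgrade_logs /tmp/upgrade-logs.tar.gz /log/patch* /var/log/messages \
#       /usr/openv/netbackup/bp.conf /etc/nbapp-release \
#       /log/webgui/webserver/catalina.out /log/mongodb/mongod.log \
#       /var/log/mongodb/mongod.log
```

Missing files are skipped rather than treated as errors, because not every path exists on every appliance (for example, the two mongod.log locations).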
Source: https://www.veritas.com/support/en_US/article.100043856