Please Upgrade to Continue Using Checklist

Article: 100043856

Last Published: 2019-03-12


Problem

Engineering has created a pre-upgrade checklist designed to address a number of known issues that could occur during an Appliance upgrade to 3.1.1-2. Veritas Engineering recommends reviewing each step below before attempting the upgrade.

This checklist is also attached to this article in PDF form.

Solution

NetBackup Appliance upgrade process

*** Please ensure that you have the latest version of this checklist. You may check with your Backline support for the latest version ***

a. The upgrade starts with a pre-flight check to see if the appliance is ready for upgrade (same code as AURA tool)

b. If pre-flight is successful, the upgrade performs the following steps:
          - Backs up the appliance configuration in a checkpoint and copies the product configuration
          - Performs the upgrade: upgrades the RPM packages, OS, Appliance and Application
          - Restores the configuration from the backup taken
          - Cleans up temporary files generated during the upgrade process and completes the upgrade
          - Runs a post-upgrade self-test to make sure backups can be run using the server that was just upgraded.

c. If pre-flight fails, the upgrade process aborts.

Note: Pre-flight includes a number of checks, such as the status of NetBackup services, partitions, mongodb, and media-master communication.

Upgrade duration

Each upgrade takes around 90-120 minutes. In addition to the built-in pre-flight checks, there may be additional manual checks that were discovered after release. These manual pre-upgrade checks take around 45-60 minutes, depending on the state of the appliance. If issues occur during the upgrade, it is possible to pause the rollback, troubleshoot, and then continue the upgrade. Currently, 60 minutes are allowed for troubleshooting before automatic rollback starts. The rollback process takes around 45 minutes.

Total time to upgrade ranges from approximately 2 hours 30 minutes to 4 hours.

Secure Comm Pre Upgrade Checks

On all Media Servers to be upgraded, check the hostid certificate to prevent Secure Comm issues during or after the upgrade.

Check if hostid certs were issued by current trusted CA cert

Cert Validation Steps:

>> Run this command on Master Server

     nbcertcmd -listCAcertdetails

     Note the Start Date Section of the output.

     Example: Start Date: Jul 20 21:45:29 2018 GMT

>> Run the following command on Media Server

    nbcertcmd -listCertDetails

    Note the Expiry Date section of the output.

    Example: Expiry Date: Jul 24 22:47:03 2019 GMT

Result: Subtract one year from the "Expiry Date" of the Media Server cert. If the resulting date is after the "Start Date" of the Master CA cert, the validation is successful.
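The date comparison above can be sketched as a small shell helper. This is only an illustration: the function name cert_issued_after_ca is hypothetical, it assumes GNU date (for the -d option), and it approximates "one year" as 365 days. The two dates are the example values shown in the validation steps and would normally be copied by hand from the nbcertcmd output.

```shell
#!/bin/sh
# Sketch: compare a media server host ID certificate with the master CA start date.
# Assumes GNU date (-d) and a one-year certificate validity, as the checklist implies.

# cert_issued_after_ca EXPIRY_DATE CA_START_DATE
# Returns 0 when (expiry - 1 year) is later than the CA start date, 1 otherwise,
# and 2 when a date cannot be parsed.
cert_issued_after_ca() {
    expiry_epoch=$(date -u -d "$1" +%s) || return 2
    ca_start_epoch=$(date -u -d "$2" +%s) || return 2
    # One year is approximated as 365 days (an assumption of this sketch).
    issued_epoch=$((expiry_epoch - 365 * 24 * 3600))
    [ "$issued_epoch" -gt "$ca_start_epoch" ]
}

# Example using the dates shown in the validation steps.
if cert_issued_after_ca "Jul 24 22:47:03 2019 GMT" "Jul 20 21:45:29 2018 GMT"; then
    echo "validation successful"
else
    echo "validation failed: follow the Resolution steps"
fi
```

Comparing epoch seconds avoids string comparison of dates; if the estimated issue date falls close to the CA start date, check the two values manually rather than relying on the 365-day approximation.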

If the above validation fails, perform the following resolution.

Resolution:

  1. Log in to the media server and elevate to root
  2. Check the folder /usr/openv/var/vxss/credentials to validate the certificate exists
    n38-h45:/usr/openv/var/vxss/credentials# ll
    total 8
    -rw------- 1 root root 2491 Jan 30 18:40 5573bf78-76e9-4b2e-9863-33d44159bb3a
    drw------- 2 root root 4096 Jan 30 18:40 keystore
  3. Move the certificate to another folder (e.g. /tmp/certificate)
    n38-h45:/usr/openv/var/vxss/credentials# mv 5573bf78-76e9-4b2e-9863-33d44159bb3a /tmp/certificate
  4. Run the following two commands:
    nbcertcmd -getCACertificate
    nbcertcmd -getCertificate -force
  5. Check the folder /usr/openv/var/vxss/credentials to validate the new host-id based certificate was re-created.
  6. Re-run the Cert Validation Steps.

Note: Perform the following steps if the Cert Validation fails even after executing the previous resolution

On Master Server

  1. Log in to the master server and create the NetBackupCLI user "nbcliuser" using the CLISH menu Manage > NetBackupCLI > Create
  2. Elevate to root.
  3. Run the following command sequence:
    n39-h63:/cat/cgd/vxss/credentials# bpnbat -login -loginType WEB
    Authentication Broker: n39-h63
    Authentication port [0 is default]:
    Authentication type (NIS, NISPLUS, WINDOWS, vx, unixpwd, ldap): unixpwd
    Domain:  n39-h63
    Login Name: nbcliuser
    Password:
    Operation completed successfully
  4. Revoke the host-id based certificate of media server (n38-h19 is the media server hostname):
    n39-h63:/cat/cgd/vxss/credentials# nbcertcmd -revokeCertificate -host n38-h19
    Certificate revoke request processed successfully.
  5. Dissociate the host mapping of the media server (n38-h19 is the media server hostname):
    n39-h63:/cat/cgd/vxss/credentials# nbcertcmd -dissociatehost -host n38-h19
    Request to dissociate host from host ID is successful.

On Media Server

  1. Check the folder /usr/openv/var/vxss/credentials to validate the certificate exists
    n38-h45:/usr/openv/var/vxss/credentials# ll
    total 8
    -rw------- 1 root root 2491 Jan 30 18:40 5573bf78-76e9-4b2e-9863-33d44159bb3a
    drw------- 2 root root 4096 Jan 30 18:40 keystore
  2. Move the certificate to another folder (e.g. /tmp/certificate)
    n38-h45:/usr/openv/var/vxss/credentials# mv 5573bf78-76e9-4b2e-9863-33d44159bb3a /tmp/certificate
  3. Run the following two commands:

    nbcertcmd -getCACertificate
    nbcertcmd -getCertificate -force

  4. Check the folder /usr/openv/var/vxss/credentials to validate the new host-id based certificate was re-created.

Re-run the Cert Validation Steps.
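The "move the certificate to another folder" step used in both resolutions can be sketched as a shell helper. This is a hedged illustration, not a Veritas-supplied script: the function name move_hostid_certs is hypothetical, and it assumes that every regular file in the credentials folder is a host ID certificate while the keystore directory must stay in place.

```shell
#!/bin/sh
# Sketch: move host ID certificate files out of the credentials folder.

# move_hostid_certs CRED_DIR BACKUP_DIR
# Moves every regular file in CRED_DIR into BACKUP_DIR (created if missing)
# and prints the name of each moved file. Directories such as keystore are skipped.
move_hostid_certs() {
    cred_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$cred_dir"/*; do
        [ -f "$f" ] || continue      # skip keystore/ and the empty-directory case
        mv "$f" "$backup_dir"/ && basename "$f"
    done
}

# On the media server, as root (not executed by this sketch):
#   move_hostid_certs /usr/openv/var/vxss/credentials /tmp/certificate
#   nbcertcmd -getCACertificate
#   nbcertcmd -getCertificate -force
```

Keeping the old certificate in a backup folder rather than deleting it leaves a way back if the certificate refresh does not succeed.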

Pre Upgrade Checks

Before starting the upgrade, execute the following steps. If you encounter a failure in any of these steps, please contact Veritas support. Please note that this is the complete list, although not all items may apply to every environment.

  1.  Ensure that you have the correct version of 3.1.1-2 upgrade rpm.
        https://www.veritas.com/support/en_US/article.100041985. Validate MD5/SHA-1 check sum
  2.  Confirm the last backup ran successfully on the Appliance to be upgraded.
  3.  Confirm that no backup jobs are scheduled during the upgrade period.
  4.  Confirm all NBU processes can be stopped (bp.kill_all) and restarted (bp.start_all) successfully.
  5.  Reboot the Appliance before starting the upgrade. Make sure it completes the reboot without any problem.
  6.  Run Support> Test Software (confirm nothing failed).
        Note: Disk storage might report down if this command is run immediately after a reboot. Re-run the command after 2-3 minutes of a successful reboot.
  7.  Run Support> Test Hardware (confirm nothing failed)
  8.  Run Monitor> Hardware ShowHealth (check for any failures)
  9.  Run the following commands from root shell  (on the Media Server) for Media Server upgrade
    nbcertcmd -getCACertificate
    nbcertcmd -getCertificate -force
        Note: It is recommended to perform the resolution listed in the above section "Secure Comm Pre Upgrade Checks" irrespective of the cert validation results.
  10.  Follow the checks specified in the following article for IPSec, OST/third-party plugins, HA and disk firmware. Not all checks described in the article apply to every upgrade; it depends on the specific customer environment.
    https://www.veritas.com/content/support/en_US/doc/113205077-130377097-0/v113630230-130377097
  11.  Check the current MSDP capacity. If the appliance is at or above 80% of its MSDP capacity, do not start the upgrade until additional MSDP space is available. Depending on the MSDP content, the conversion logic (MD5 to SHA256) requires an additional 20-25% of disk space. The disk space is reclaimed after the conversion is complete. Please contact Veritas Support for additional information and next steps for upgrade.
  12.  Check available space to avoid the "No enough space for the 'LVM' request" issue during the upgrade. If the filesystems are full, they need to be cleaned up.
    Execute "df -lh" to check "Avail" and "Use%", paying attention to high Use% values, such as greater than 80-90%.
    Execute "lvdisplay | grep INACTIVE" to see whether there is any "INACTIVE".
    Please contact VERITAS support before beginning upgrade with a reference to note id: 100040469
  13.  On 5220/5230 appliances, run the lvs command to check if the /log partition has ~185G space.
    If not, re-size the /log partition to 185G with the following commands:

      >> In the CLISH menu, Stop the infraservices :

       Support> InfraServices

       Entering InfraServices view...

       InfraServices> Stop All

       >> Elevate to root user, and stop the services that use the '/log' filesystem

        # /usr/openv/netbackup/bin/goodies/netbackup stop

        # /etc/init.d/nbappws stop

        # /etc/init.d/sisidsagent stop

        # /etc/init.d/sisipsagent stop

        # /etc/init.d/syslog stop

        # /etc/init.d/ntp stop

        # /etc/init.d/smb stop

       >> If the appliance is master, do the following in addition to the above steps

        # /opt/VRTSpbx/bin/vxpbx_exchanged stop

       >> Check for any other processes which may have handles open to files in /log:

        # lsof | grep /log

        Stop (kill -9)  the processes with handles open to files in /log.

        >> Unmount /log

        # umount /log

        >> Resize the filesystem

        # e2fsck -f /dev/system/log

        # resize2fs /dev/system/log 185G

        >> Resize the logical volume:

        # lvresize  --size 185G /dev/system/log

        >> Mount /log:

        # mount /log

       >> Start all the services that were stopped in the previous steps

       If required, please contact VERITAS support and refer to ET:3937193.


  14. If upgrading the Media Server, ensure that there is access to Master Server and Master server Admin console so that it is possible to perform "Secure Comm" token regeneration, if required.


  15. Check if /etc/sysconfig/network-scripts/route-* exists before the upgrade. If such files exist, static routes have been added on devices.

       >> Elevate to the root

       Add all the static routes into /etc/sysconfig/static-routes before the upgrade with the following format

       any net xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx gw xxx.xxx.xxx.xxx dev xxx

       Example:

       any net 10.99.192.0 netmask 255.255.240.0 gw 10.242.48.3 dev  bond2

       >> Restart network service

         service network restart

       Note: Please check and validate with your Network administrator that all the entries in the file are in the correct format.

       Also take a backup of those files. Copy/Note the CLISH outputs of Network>Show Status and Network>Gateway Show. For any assistance please reach out to Veritas support with a reference to note id: 100042876

  16.  Check if WanOpt is disabled. If it is disabled, do the following as workaround to make the upgrade successful.

       Elevate to root user.

       Edit "/opt/IMAppliance/appex/etc/config", update accif="" to accif=" " (Add a space between the two quotation marks)  and save the file.

    Note: if you see something else for accif when WanOpt is disabled, please contact Veritas Support.


  17. If FCR is configured, make sure handlehelper process is not running or msdp partition is not used by handlehelper before upgrade by running the following commands

       # fuser -uvm /msdp/cat

       # fuser -uvm /msdp/data/dp1/pdvol

       If FCR is configured and the handlehelper process is using the msdp partition, unmounting the vxfs partition will fail, so stop the process. If you need any assistance on this check, please contact VERITAS support with a reference to ET3952940

  18.  Check if STIG is enabled. This applies only when upgrading from version 3.1 or later; ignore it for earlier versions. Please contact Veritas Support with a reference to note id: 100041852

  19.  Check if add-ons are installed. If they are, the add-ons must be rolled back before the upgrade. The upgrade from 3.1 to 3.1.1 will fail if an add-on was installed and not rolled back on the 3.1 appliance. Please contact Veritas Support with a reference to ET3945463

  20.  Check permissions on /tmp to prevent "permission issue"

       Go to maintenance mode and check the permissions on /tmp by running "ls -al  /". If they are not "drwxrwxrwt.", then execute: chmod 1777 /tmp


  21. Back up the following files before the upgrade, as they may be needed in some situations:

       /msdp/data/dp1/pdvol/etc/pdregistry.cfg

       /msdp/data/dp1/pdvol/etc/puredisk/controller.conf

       /msdp/data/dp1/pdvol/etc/puredisk/vpfsd_config.json

       Also back up the MSDP configuration before the upgrade:

       nbdevconfig -getconfig -storage_server <MSDP server name> -stype PureDisk > msdp.config


  22. For Media Server upgrades, check that the media server name is consistent (either short name or FQDN) in the bp.conf of the Master Server and the Media Server, and that it also matches the server name under the secure comm UI: NBU Master Security Management > Host Management > Mapped Host Names/IP
    If you need any assistance, please contact Veritas Support with a reference to article id 100041470.
  23.  Before the upgrade, enable verbose logging in /usr/openv/netbackup/bp.conf: "VERBOSE = 5". This adds debug messages to the NetBackup logs in case any issues are experienced during the upgrade. Also comment out the PREFERRED_NETWORK entry if it exists in bp.conf. Revert to the required settings after a successful upgrade.
  24.  There might be a 'split brain issue' during the upgrade.
    Please Contact Veritas Support if the scenario is applicable to the environment with a reference to article id 100033639.
  25.  If the Master Server is being upgraded, ensure the NetBackup catalog is appropriately backed up.
  26.  If the Master Server is being upgraded, install hotfix ET3941845 (NBDB database update failure) before the upgrade. Please contact Veritas Support with a reference to note id: https://www.veritas.com/support/en_US/article.100042241.
    Without this hotfix installation upgrade to 3.1.1 might fail.
    Note: As part of hotfix installation, if you are not aware of EMM database password create a new password with the command:
    Example: /usr/openv/db/bin/nbdb_admin -dba nbusql
    Re-Run the hotfix script with updated password.
  27.  Before starting a major upgrade (3.0 → 3.1.1), disable mongod. There are scenarios where mongod is corrupted after a failed upgrade followed by a successful rollback.
    Note: Run these commands only after the pre-upgrade reboot and after all the remaining checks are complete. Run them just before the start of the upgrade. Do not reboot the appliance between turning off the following services and starting the upgrade.
    chkconfig  mongod off
    chkconfig  crond off
  28.  In case of upgrade failure, after successful rollback run the following commands:
    chkconfig mongod on
    chkconfig crond on
    Start the mongod and crond services

    If the upgrade is successful, we don't need to enable and start mongod & crond. Upgrade process takes care of starting the required services.

  29.  Check that all InfraServices are running with the following command:
    Main_Menu > Support > InfraServices > Show All
    If any are not running, start them
  30.  Run the Appliance Upgrade Readiness Analyzer Tool (AURA)
    https://www.veritas.com/support/en_US/article.100040055

   If all the reports from the Hardware and Software tests listed above and from the AURA tool are good, and all items on the pre-upgrade checklist have been checked, then start the upgrade process.
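The disk space checks in steps 11 and 12 both come down to spotting filesystems at or above a usage threshold. The following is a minimal sketch of that filter, assuming POSIX-format df output (df -P) and mount points without spaces; the function name filesystems_over and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Sketch: list filesystems whose usage is at or above a threshold.

# filesystems_over THRESHOLD
# Reads `df -P` style output on stdin and prints "mountpoint use%" for every
# filesystem whose usage is at or above THRESHOLD percent.
filesystems_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # strip the percent sign before comparing
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example with made-up sample rows; only /log is reported.
filesystems_over 80 <<'EOF'
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/mapper/system-log 100 91 9 91% /log
/dev/mapper/system-tmp 100 10 90 10% /tmp
EOF

# On an appliance, local filesystems could be checked with:
#   df -lP | filesystems_over 80
```

The -P option is used instead of -h so that each filesystem stays on one line with fixed columns, which keeps the awk field positions predictable.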

Post Successful Upgrade

1. After a successful upgrade to NBA 3.1.1-2, install the latest version of EEB 3942191 (MSDP Conversion Performance Fixes) on Master and Media servers that have an MSDP pool configured.

Stop NetBackup Services before installing this EEB. (Support>Processes>NetBackup Stop )

Install the latest version of this EEB. ( Manage>Software>Install <EEB> )

Start NetBackup Services after EEB installation is complete.  ( Support>Processes>NetBackup Start )

Please contact Veritas Support to obtain the latest version of the EEB. This EEB fixes performance issues related to MSDP conversions

2. Configure the passphrase for the Disaster Recovery Package after the Master Server upgrade. With the NetBackup Java Admin Console, go to Security Management -> Global Security Settings -> Disaster Recovery tab (lower pane) and update the passphrase.

3. If you exported IPSec certificates before the upgrade, import them back on the upgraded appliance using

     Network>IPSec>Import CLISH menu.

4. Restore the bp.conf settings (VERBOSE and PREFERRED_NETWORK) to the values used prior to the upgrade. Restart the NetBackup services either using CLISH (Support>Processes>NetBackup Stop, Support>Processes>NetBackup Start) or the root shell (netbackup stop, netbackup start)

Issues during upgrade

Appliance upgrade stops after reboot. Appliance does not boot up:

Contact Veritas support with incident number: ET3943926

Post-Upgrade Self-Test failure - Troubleshooting

If the post-upgrade self-test fails, upgrade pauses for one hour during which the following troubleshooting steps can be performed.

  1. Check if static route files are restored and are available on the host. Refer to the copies of this file that were backed up before upgrade.
    Restart the network service (service network restart)
    Restart NetBackup Services (bp.kill_all , bp.start_all ) and re-try the post-upgrade self-test
    For any assistance, please contact Veritas Support with a reference to note id: 100042876
  2. Master server name in bp.conf on the media server does not match the entry in Security Management > Host Management > Mapped Host Names/IP Addresses ( with Java Admin Console ).
    Ensure that Master Name is matching with bp.conf of Master, Media and with Mapped HostName with Security Management GUI. For any queries, Contact Veritas Support with a reference to note id: 100041470
  3. Check if there are any entries of Preferred Network with media server bp.conf
    Comment out PREFERRED_NETWORK parameter in the bp.conf file on the server that is being upgraded. For any queries, Contact Veritas Support with a reference to note id: 100041468
    Restart NetBackup Services on Media Server using bp.kill_all, followed by bp.start_all.
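Commenting out the PREFERRED_NETWORK parameter, as described in step 3 above, can be sketched with sed. This is an illustration only: the function name comment_out_preferred_network is hypothetical, and it assumes that a leading "#" is how an entry is commented out in bp.conf. The function writes the result to standard output so the change can be reviewed before the real file is touched.

```shell
#!/bin/sh
# Sketch: comment out active PREFERRED_NETWORK entries in a bp.conf file.

# comment_out_preferred_network BP_CONF
# Prints BP_CONF to stdout with every active PREFERRED_NETWORK line prefixed
# by "#". Lines that are already commented out are left unchanged.
comment_out_preferred_network() {
    sed 's/^[[:space:]]*PREFERRED_NETWORK/#&/' "$1"
}

# On the server being upgraded, as root (not executed by this sketch):
#   cp /usr/openv/netbackup/bp.conf /tmp/bp.conf.before_upgrade
#   comment_out_preferred_network /tmp/bp.conf.before_upgrade > /tmp/bp.conf.new
#   diff /tmp/bp.conf.before_upgrade /tmp/bp.conf.new
# After reviewing the diff, copy the new file into place and restart the
# NetBackup services with bp.kill_all followed by bp.start_all.
```

Working on a copy keeps the original settings available, which is also what is needed later when the values are restored after a successful upgrade.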

Check if there are any issues with Secure Comm certificates during the upgrade process

Unable to run NetBackup commands on the media server. ( Example: bptestbpcd and nbemmcmd fail to contact Master)

nbcertcmd -ping  ( To verify if the Master server NBWMC NetBackup service is ready )

nbcertcmd -listAllCertificates  ( To list all the certificates on the host )

nbcertcmd -listCertDetails ( To list details of all the certificates on the host )

nbcertcmd -getCACertificate   (Refresh CA certs from Master onto the host )

nbcertcmd -getCertificate -force  (Refresh hostid certs from Master onto the host )

Re-Run the command to test master connectivity: bptestbpcd  -host <master_server> -verbose

How to disable Rollback when Upgrade is in Paused state due to Self-Test failure (major upgrade)

Move the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh to another location.

# mv /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh   /log/rollback_after_os_installation.sh

Disable rollback

# systemctl disable rollback_after_os_installation.service

How to re-enable Rollback when Upgrade is in Paused state due to Self-Test failure (major upgrade)

Move the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh back

# mv /log/rollback_after_os_installation.sh /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_os_installation.sh

Enable rollback

# systemctl enable rollback_after_os_installation.service

How to disable Rollback when Upgrade is in Paused state due to Self-Test failure (minor upgrade)

Move the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh to another location.

# mv /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh  /log/rollback_after_reboot_minor.sh

Disable rollback

# systemctl disable rollback_after_reboot_minor.service

How to re-enable Rollback when Upgrade is in Paused state due to Self-Test failure (minor upgrade)

Move back the file /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh

# mv  /log/rollback_after_reboot_minor.sh /inst/patch/appliance/installed/<nba_version>/scripts/rollback_after_reboot_minor.sh

Enable rollback

# systemctl enable rollback_after_reboot_minor.service

mongodb is corrupted during the upgrade/rollback

Please reach out to Veritas Support with reference to article id 100039139

Logs to be collected after a failed upgrade attempt

DataCollect

/log/patch*

/var/log/messages

vxlogview -p 409 -t 08:00:00   (  Last 8 hours logs of NetBackup Appliance )

vxlogview -p 51216 -t 08:00:00 ( Last 8 hours logs of NetBackup )

/usr/openv/netbackup/bp.conf

cat /etc/nbapp-release

df -h

NetBackup logs if NetBackup issues found (including secure comms)

/usr/openv/wmc/webserver/logs/* (on master server)

/log/webgui/webserver/catalina.out

/log/mongodb/mongod.log

/var/log/mongodb/mongod.log
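Gathering the files listed above into a single archive can be sketched as follows. The function name collect_upgrade_logs and the archive location are hypothetical, and the sketch assumes the listed paths contain no spaces. It covers only the plain files; DataCollect, vxlogview, cat and df output still have to be captured separately.

```shell
#!/bin/sh
# Sketch: bundle whichever of the listed log files exist into one archive.

# collect_upgrade_logs ARCHIVE PATH...
# Creates a gzip-compressed tar ARCHIVE containing every PATH that exists.
# Returns 1 if none of the paths exist.
collect_upgrade_logs() {
    archive=$1
    shift
    found=""
    for p in "$@"; do
        [ -e "$p" ] && found="$found $p"
    done
    [ -n "$found" ] || { echo "no listed paths exist" >&2; return 1; }
    # Word splitting on $found is intentional; the paths contain no spaces.
    tar -czf "$archive" $found
}

# On the appliance, as root (not executed by this sketch):
#   collect_upgrade_logs /tmp/upgrade_logs.tar.gz \
#       /log/patch* /var/log/messages /usr/openv/netbackup/bp.conf \
#       /log/webgui/webserver/catalina.out \
#       /log/mongodb/mongod.log /var/log/mongodb/mongod.log
```

Skipping missing paths matters here because some of the files exist only on a master server or only on certain appliance versions.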


Source: https://www.veritas.com/support/en_US/article.100043856
