Tips and Techniques to Increase iSeries Availability Seminar

3/21: Grand Rapids - WMSUG                       2001                                             3/22: Kalamazoo - I-94
 
---------------------------------------------------------------------------------------------------------------------------------------------


Comments by Shae Hinds,  Midwest Integrated Systems Resources:

Sue Baker of IBM's iSeries Advanced Technical Support team in Rochester, MN presented Tips and Tricks for Increasing iSeries Availability to the Grand Rapids and Kalamazoo user groups in March.  Sue gave a lively and detailed presentation on High Availability issues.  Both meetings were well attended (102 people) and Sue answered many audience questions.  The handout contained extensive documentation, including comments, Redbook numbers, other IBM publications and web pages.  It is attached at the end of this article.  She also shared a number of specific features coming in V5R1 that cannot appear in print prior to general release; you had to be there!

High availability can mean different things depending on your business.  It can be as straightforward as shortening your backup time or as complex as 24X7 continuous access for mission critical applications.  Most outages on the AS/400-iSeries are considered "planned", including new software installation, PTFs, upgrades, saves, etc.  Unplanned outages and disasters account for a very small percentage of downtime.

Sue stressed that High Availability does not equal Continuous Availability.  Your company must decide on its target availability percentage.  For example, 90% availability allows 36.5 days of downtime per year, while 99.999% allows only about 5.3 minutes.  No surprise, the higher the availability requirement, the higher the cost!
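
The downtime figures above follow from simple arithmetic, which can be sketched as below.  This is only an illustration, not material from the seminar: the function name downtime_minutes_per_year and the 365-day year are assumptions made for the example.

```python
# Illustrative sketch: convert a target availability percentage into the
# downtime it allows per year. The function name and the 365-day year are
# assumptions for this example, not part of the presentation.

MINUTES_PER_DAY = 24 * 60


def downtime_minutes_per_year(availability_percent, days_per_year=365):
    """Return the downtime (in minutes) allowed per year at the given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * days_per_year * MINUTES_PER_DAY


if __name__ == "__main__":
    for target in (90, 99, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        days = minutes / MINUTES_PER_DAY
        print(f"{target}% available: {days:.3f} days ({minutes:.1f} minutes) of downtime per year")
```

Running the loop over a few targets makes the cost trade-off concrete: each additional "nine" cuts the allowed downtime by a factor of ten.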

Sue described a number of techniques that can be used to shorten backup time.  Some of them are listed below.  She also presented several scenarios illustrating the concepts of "Clustering" and the detailed planning required to implement it.

"A Cluster is a collection of interconnected complete computers, used as a single, unified computing resource."  This collection could range from 2 systems configured to share the same DASD set to duplicate data sets replicated across multiple systems.  She also presented one configuration that used parallel web servers to ensure that no internet transaction was lost.
 
Some Backup & Recovery tips:  
o Using the SAVE menu options provides the easiest, but not the fastest, saves and restores.
o Invest in new high performance save & restore devices
o Investigate concurrent & parallel save/restore techniques using multiple tape devices.
o Use the OMITLIB, OMITOBJ and MEDDFN parameters to direct named objects to a specific device.
o Ensure the optimum blocking parameter is on: USEOPTBLK(*YES) (can reduce backup time by as much as 30%)
o Investigate BRMS/400 monitoring & reporting product.
o Use save-while-active on read-only objects.  The time to reach a checkpoint depends on the number of objects involved, not the size of the objects.  Journaling must be used for a complete save.
o Consider using remote journaling feature with multiple systems.
o Consider "Clustering" for continuous availability
o Consider saving less by archiving historical data & doing incremental saves
o Initialize each backup tape with Unique Name, ONLY ONCE in its lifetime. 
o Throw out tapes if too many read/write errors occur.  An informational APAR documents the allowable number of temporary errors per tape before it should be discarded.
o Change the SAVOBJ command to report only error messages for UNSAVED objects by specifying INFTYPE(*ERR)
o IPL only when necessary, such as for upgrades or applying PTFs, not more frequently
o Clear QRPLOBJ manua