Rocket UniData and UniVerse Replication best practices: REPLOGGER
Part 3 of 3
This blog post offers best practices related to implementing and using Replication, as part of a High Availability/Disaster Recovery (HA/DR) strategy. In Part 3, I’ll focus on Replogger.
REPLOGGER is a server-side BASIC program that lets you monitor replication performance. It writes the information it collects to a single sequential file called rep_monitor.log, which lives in the account where the command is running. This information describes, at a group level, how the replication system performs over a period of time, and you can use it to understand how the system responds to changing workloads. Because the file can grow very large, we suggest that you use a Java tool called the ‘U2 Replication Analyzer’ to graph the result sets. The graphs are a helpful way to analyze and visualize the data. You can find the U2 Replication Analyzer tool on GitHub.
At ECL or TCL, enter REPLOGGER. You will be prompted for the following information:
- The number of replication groups
- Even if this is a single server replication, the publisher and subscriber each count as one system – so there are two systems
- Do you want to clear previous logs?
- The capturing interval time, in seconds
- The maximum log file size, in megabytes
These responses can be entered directly via the command line – e.g. REPLOGGER 6 Y 5 100
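If you start REPLOGGER from a script, it can help to build the command line from named values so the positional arguments are not mixed up. The following Python sketch is illustrative only: the helper name build_replogger_command is hypothetical, the argument order is inferred from the prompt order and the example above, and the use of N for a negative answer is an assumption.

```python
def build_replogger_command(groups, clear_logs, interval_seconds, max_log_mb):
    """Build a REPLOGGER command line from named values.

    Assumed positional order (inferred from the prompts and the example):
    number of groups, clear previous logs (Y/N), interval in seconds,
    maximum log file size in megabytes.
    """
    for label, value in (
        ("groups", groups),
        ("interval_seconds", interval_seconds),
        ("max_log_mb", max_log_mb),
    ):
        if not isinstance(value, int) or value < 1:
            raise ValueError(f"{label} must be a positive integer, got {value!r}")

    # "N" as the negative response is an assumption; the post only shows "Y".
    clear_flag = "Y" if clear_logs else "N"
    return f"REPLOGGER {groups} {clear_flag} {interval_seconds} {max_log_mb}"


if __name__ == "__main__":
    # Reproduces the example from the post.
    print(build_replogger_command(6, True, 5, 100))
```

The helper only builds the string. How the command is then submitted at ECL or TCL depends on your environment.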
The screenshot above shows the raw data that REPLOGGER collects, as displayed in the client-side tool, including some RSN numbers and dates and times.
Above you can see Replication performance in a graph which shows peaks and troughs, and where things are loading.
Standard Naming Conventions
For system names, Replication defaults to the name ‘system_version’. Changing the names to something like ‘primary’ and ‘standby’ is more useful.
Standard Naming: Replication Group Names
Internally, Replication numbers groups starting at 0, in the order it encounters them in the repconfig file. The internal group number appears in the output of several U2 Replication commands and tools, so you can make that output easier to read by building the number into the group name. As a best practice, I suggest creating group names that combine the account, the replication level, and the group number. In this example, payroll is the account, the “a” means it’s an account-level group, and “g0” means the group number is 0.
payroll_ag0 , payroll_fg1 , payroll_fg2 , inventory_ag3, inventory_fg4 etc.
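The naming scheme can be generated mechanically, which helps keep the names in step with the order of the groups in repconfig. The Python sketch below is a minimal illustration, not part of any U2 tool: the function names are hypothetical, and reading “f” as a file-level group is an assumption based on the example names.

```python
# Level codes: "a" comes from the post; "f" for file level is an assumption
# inferred from names such as payroll_fg1.
LEVEL_CODES = {"account": "a", "file": "f"}


def group_name(account, level, group_number):
    """Return a name such as payroll_ag0 for one replication group."""
    if level not in LEVEL_CODES:
        raise ValueError(f"unknown replication level: {level!r}")
    if group_number < 0:
        raise ValueError("group numbers start at 0")
    return f"{account}_{LEVEL_CODES[level]}g{group_number}"


def name_groups(groups):
    """Name groups in the order they are listed, numbering from 0.

    `groups` is a list of (account, level) pairs in the same order as
    the groups appear in the repconfig file.
    """
    return [
        group_name(account, level, number)
        for number, (account, level) in enumerate(groups)
    ]


if __name__ == "__main__":
    layout = [
        ("payroll", "account"),
        ("payroll", "file"),
        ("payroll", "file"),
        ("inventory", "account"),
        ("inventory", "file"),
    ]
    print(", ".join(name_groups(layout)))
```

Because the number in each name is derived from list position, reordering groups in repconfig means regenerating the names so they still match the internal numbering.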
This can be helpful since these file names and groups tie back to some of our monitoring tools and are useful for diagnostics and debugging.
Replication Config Checker – Another Tool
CHECK.REP.CFG is currently installed as part of the ‘Monitor Phantoms’ deployment. We built this tool in response to problems customers can and have encountered with incorrect configurations. We are currently testing this tool and we welcome your feedback as we work on making the tool part of the standard UniVerse and UniData releases.
The Replication Config Checker will scan/check and report back before you turn Replication on:
- It checks to see if it can talk to all of the machines involved in Replication
- It makes sure you haven’t used duplicate names in the repsys file or in the UV and UD account file
- It makes sure the names that you specify actually exist in the UV and UD account file
- It checks some base configurations to make sure the values aren’t set too low
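To make the duplicate-name and missing-name checks concrete, here is a small Python sketch of the same kind of validation. It is not how CHECK.REP.CFG is implemented and it does not parse any real U2 file. It assumes the names have already been extracted into plain lists, and the function names are hypothetical.

```python
from collections import Counter


def find_duplicates(names):
    """Return the names that appear more than once, sorted."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def find_missing(configured, known):
    """Return configured names that are not in the known list, sorted."""
    known_set = set(known)
    return sorted({name for name in configured if name not in known_set})


if __name__ == "__main__":
    # Illustrative data only; real names would come from your own
    # replication configuration and account file.
    configured_accounts = ["payroll", "inventory", "payroll", "sales"]
    known_accounts = ["payroll", "inventory"]

    print("Duplicates:", find_duplicates(configured_accounts))
    print("Missing:", find_missing(configured_accounts, known_accounts))
```

Running checks like these before turning Replication on catches naming mistakes while they are still cheap to fix.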
Produce a “Recovery Blueprint” – VERY IMPORTANT BEST PRACTICE
A Recovery Blueprint documents your recovery procedures or failover processes, specific to your implementation. The Blueprint describes your solution as it was built and answers all the implementation questions in advance. The Blueprint will save valuable time during recovery. You can also use it to educate your staff.
Another highly recommended best practice is to perform a Live Failover at regular intervals. A good rule of thumb is at least twice a year. By following this best practice, you ensure the Recovery Blueprint remains up to date and you ensure that your staff are familiar with the recovery procedures.
Finally, we don’t recommend a completely automated failover, since any decision to fail over should be a conscious business decision.
I hope you found this series of posts useful. Also, if you want to listen to the entire Replication Best Practices webinar, please feel free to listen and share it within your organization.