Four tips for dealing with challenging backup environments


It wouldn’t be an exaggeration to say that Backup Administrators (along with Network and Systems Admins) are the unsung heroes of the IT world. Whether it’s finding new ways to get more out of existing hardware, making sure critical business data stays secure and up-to-date, or keeping the auditors happy, they help keep everything humming along smoothly.

But the Backup Admin’s job has changed a lot recently. Complexity has increased as companies add virtual tape libraries (VTLs) and virtual machines (VMs), and as corporate acquisitions introduce additional backup technologies (and more VMs) to be managed, often in remote locations. Now add in reporting: every department wants a specialized report showing the metrics that matter most to it, finance wants individual departments to pay for the storage they use, and don’t even get started on the compliance reporting the auditors require.

A good Backup Admin can still manage all of this, but everything takes longer, which leaves less time for managing the actual backup environment. How can anyone keep up? If you’re a Backup Admin dealing with a backup environment that’s spiraling out of control, here are a few things that might help restore your sanity.

Build a daily client backup status

Defining a daily backup status for each client is key when you’re trying to report on backup solutions. First, decide which job information to use as a base. Your backup applications will expose different indicators, such as the backup jobs themselves and the scheduler events (the schedulers’ results), and some products also have a concept of “parent jobs.”

The second step is to think about your backup window. A typical example would be 7:00 am to 7:00 am the following day, which means that the daily backup status will not follow the calendar day. Another key point is the missed job, where the job never started because the scheduler failed. In this situation there is no job record at all, so a poor implementation might report that everything is OK simply because there is nothing to report on. Instead, the client should be marked as MISSED, either by checking the scheduler rules or, if an external scheduler is used, by associating the external scheduler data with the client data in the backup product.
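As a minimal sketch of the backup window idea, the Python function below assigns a job to a "backup day" based on its start time. The 7:00 am boundary comes from the example above; the function name and the choice to key on the job's start time (rather than its end time) are assumptions made for illustration.

```python
from datetime import date, datetime, time, timedelta

# Assumed window boundary, taken from the 7:00 am example in the text.
WINDOW_START = time(7, 0)


def backup_day(job_start: datetime, window_start: time = WINDOW_START) -> date:
    """Return the backup day that a job belongs to.

    A job that starts before the window opens is counted toward the
    window that opened on the previous calendar day.
    """
    if job_start.time() < window_start:
        return job_start.date() - timedelta(days=1)
    return job_start.date()
```

With this rule, a job that starts at 6:59 am on March 2 still belongs to the March 1 backup day, while one that starts at 7:00 am belongs to March 2.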

Finally, consider how you want to handle multiple jobs. If multiple jobs ran during the day, and some failed while others succeeded, do you consider this a success, a failure, or a partial success? All of these questions need to be answered before you start implementing a daily backup status. Once you’ve answered them, it becomes a simple matter of programming: getting the data, aggregating it, and saving the result per node or VTL, per day, in order to produce reports.
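One way the aggregation could look is sketched below in Python. It assumes job records have already been reduced to (client, day, outcome) tuples and that the scheduler rules can be turned into a list of clients expected on each day; both shapes, the function name, and the policy of calling a mixed day PARTIAL are illustrative choices, not the behavior of any particular backup product.

```python
from collections import defaultdict


def daily_status(jobs, expected):
    """Aggregate job outcomes into one status per (client, day).

    jobs:     iterable of (client, day, outcome) tuples, where outcome
              is "SUCCESS" or "FAILED" (assumed, simplified values).
    expected: mapping of day -> set of clients the scheduler should
              have run that day (assumed to come from scheduler rules).
    """
    outcomes = defaultdict(set)
    for client, day, outcome in jobs:
        outcomes[(client, day)].add(outcome)

    # Cover every scheduled client, plus any client that actually ran.
    keys = set(outcomes)
    for day, clients in expected.items():
        keys.update((client, day) for client in clients)

    result = {}
    for key in keys:
        seen = outcomes.get(key, set())
        if not seen:
            result[key] = "MISSED"    # scheduled, but no job record exists
        elif seen == {"SUCCESS"}:
            result[key] = "SUCCESS"
        elif seen == {"FAILED"}:
            result[key] = "FAILED"
        else:
            result[key] = "PARTIAL"   # mixed outcomes; policy is a choice
    return result
```

The important detail is that the loop is driven by what was expected to run, not only by the jobs that exist, which is what allows a client with no job at all to show up as MISSED.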

Set up reporting for individual business units

If you’re reading this, you’re probably dealing with a large number of PCs, servers, and databases. To you, many of them are just names on a screen, and it can be hard to understand the value of the data on each machine. On the other hand, these names make a lot of sense to the users. A good practice is to associate the backup clients with an extract of the company’s configuration management database (CMDB) in order to collect information such as business unit or application name. That way you can start reporting at the application or business unit level (or whatever logical entity you use in your company) and share the information with the end users.

There are numerous CMDB tools around, and it can be challenging to extract this data programmatically. A technique that often works is to get a copy of the CMDB as a CSV file, with columns such as hostname, business unit name, and application name. It’s then fairly simple to map this information to the storage or backup status you’re collecting per client. The result can be shared with the end users, which provides great value.
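The mapping can be sketched in a few lines of Python. The column names `hostname` and `business_unit` are assumptions about how the CSV extract is laid out, and matching on the lowercase short hostname is just one plausible way to reconcile names that differ between the backup product and the CMDB.

```python
import csv


def short_name(hostname):
    """Normalize a hostname: trim, lowercase, drop any domain suffix."""
    return hostname.strip().lower().split(".")[0]


def load_cmdb(lines):
    """Build a lookup from short hostname to CMDB row.

    lines: an open CSV file or any iterable of CSV text lines with a
           header row (column names here are assumed, not standard).
    """
    return {short_name(row["hostname"]): row for row in csv.DictReader(lines)}


def status_by_business_unit(client_status, cmdb):
    """Count backup statuses per business unit.

    client_status: mapping of backup client name -> status string.
    Clients missing from the CMDB are grouped under "UNKNOWN".
    """
    counts = {}
    for client, status in client_status.items():
        row = cmdb.get(short_name(client), {})
        unit = row.get("business_unit", "UNKNOWN")
        per_unit = counts.setdefault(unit, {})
        per_unit[status] = per_unit.get(status, 0) + 1
    return counts
```

To read a real extract, pass a file opened with `open(path, newline="")` to `load_cmdb`. The UNKNOWN group is worth reviewing regularly, since it shows which backup clients the CMDB does not know about.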

Report on your storage

Many managers and users want to know how much storage they’re using, and teams want to forecast future storage needs and anticipate additional storage purchases. When reporting on storage, consider keeping daily data points for all of the key elements. This usually means recording the raw data size, then the size after compression, then the size after deduplication (when applicable). The level of granularity should be as fine as possible, starting with the storage device itself, then going to lower levels such as storage pools, file systems, or shares (as applicable).

This data will only be useful after a few months of reporting, when it becomes possible to start showing trends. When considering the VTL or other deduplication devices, keeping track of the deduplication ratio over time is also important, as a degradation typically means not only extra storage costs per TB of raw data, but also extra processing cycles on the deduplication device.
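Once daily data points exist, a simple trend can be computed from them. The Python sketch below assumes one usage value per day in a consistent unit, defines the deduplication ratio as raw size divided by stored size, and uses a plain least-squares line as the forecast; all function names are hypothetical, and a straight line is only a rough first approximation of real storage growth.

```python
def dedup_ratio(raw_size, stored_size):
    """Deduplication ratio, defined here as raw size / stored size."""
    if stored_size <= 0:
        raise ValueError("stored_size must be positive")
    return raw_size / stored_size


def linear_trend(values):
    """Least-squares slope and intercept for equally spaced daily values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(used_per_day, capacity):
    """Estimate days until the fitted usage line reaches capacity.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_trend(used_per_day)
    if slope <= 0:
        return None
    fitted_today = intercept + slope * (len(used_per_day) - 1)
    return max(0.0, (capacity - fitted_today) / slope)
```

Running `linear_trend` over the daily deduplication ratios instead of the usage values gives a quick signal of degradation: a persistently negative slope means the ratio is falling.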

Automate!

If all of the above sounds like a lot of manual work, there are software solutions that can help automate many of these processes and develop a proactive—not reactive—backup reporting strategy. With a tool like Rocket Servergraph you can see your entire backup environment using a single reporting view—even across different backup technologies. You get a better idea of what’s actually going on across your network, and scheduled (as well as real-time) reporting can help reduce the need to run reports manually. You can even segment reports by business unit, backup technology, or other options.

With a tool like Servergraph, management will be happy and you’ll get back to doing what you were hired to do. Want to learn more? Watch our archived Servergraph webinar below.

Contact us to learn more about Rocket Servergraph.