Cloud-era disaster recovery planning: Staff training, incident and media management

Source is ComputerWeekly.com

In the first article in this series on disaster recovery (DR) planning, we examined risk and business impact assessment as the initial building block. In the second, we looked in detail at developing the DR plan.

In this article, we look at staff awareness of and training in disaster recovery plans, as well as how to manage an incident and the media.

Your people, incident management, the media

In the course of developing IT – or ICT – DR plans, additional activities must be developed, documented and approved before disaster recovery provision is complete.

These activities, which must also be factored into any planning that uses cloud technologies for disaster recovery, include:

  • Preparing and delivering awareness and training programmes to ensure all employees are prepared to respond in an emergency.
  • Preparing and documenting incident management (IM) plans to provide an initial response to an event.
  • Preparing to deal with members of the media when they appear on site.

This article references the global standard for IT disaster recovery planning for end user organisations, which can help them develop and implement disaster recovery programmes. That standard is ISO/IEC 27031:2011, Information technology – security techniques – guidelines for information and communication technology readiness for business continuity.

As has been noted in previous articles in this series, disaster recovery provides strategies and procedures to help organisations protect their investments in IT systems and operating infrastructures. The essential mission for disaster recovery is to return IT operations to an acceptable level of performance to support mission-critical business functions, and to do this as quickly as possible following a disruptive event.

Figure 1 depicts the IT disaster recovery lifecycle, and is adapted from ISO/IEC 27031. It shows where awareness and training, incident management and media management fit into the overall disaster recovery lifecycle and framework. These three important activities will be discussed in greater detail in the course of this article.

Figure 1: The IT disaster recovery lifecycle

Awareness and training strategies

According to ISO/IEC 27031 Section 7.5, “a coordinated programme should be implemented to ensure that processes are in place to regularly promote ICT DR awareness in general, as well as assess and enhance competency of all relevant personnel key to the successful implementation of ICT DR activities”.

Even with a commitment to use cloud services for DR, awareness and training are still desirable so that employees understand how cloud services will be used.

Start the process by identifying strategies and activities to raise awareness of IT disaster recovery. Perhaps the most important strategy is to secure senior management support and funding for DR programmes. Visible, frequent endorsements from senior management will help raise awareness of the programme and increase participation in it.

Next, engage the human resources (HR) department to help organise and conduct awareness activities, such as department briefings and messages on employee bulletin boards. Encourage HR to incorporate briefings on IT DR as well as business continuity in new employee orientation programmes.

Organisations with their own intranet can engage employees with specially prepared web pages on IT DR that describe what the programme does and offer frequently asked questions (FAQs), links to forms and services, schedules and other useful materials. Adjust the content to discuss cloud services based on how they will be used.

Building an awareness and training plan

When building awareness and training programmes, consider the following additional activities:

  • Conduct an awareness and training needs analysis.
  • Assess existing staff competencies and understanding.
  • Establish an ongoing awareness and training programme.
  • Establish record-keeping of staff training and awareness activities.
  • Establish competency levels for IT staff and how they should be maintained.
  • Develop and conduct training on technical recovery activities.
  • Develop and conduct training on emergency response activities, such as situation assessment and evacuation.
  • Develop and conduct training on specialised recovery, such as recovering to cloud DR services.
  • Develop and conduct training on return-to-normal activities.
  • Develop and conduct training on restoration of business systems and processes.
  • Conduct staff performance assessments post-disaster and re-evaluate training.
  • Establish continuous improvement programmes for awareness and training.
  • When working with one or more cloud service providers, examine their training programmes to see if they can work with internally developed training activities.
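
The record-keeping and competency-maintenance items above can be sketched in a few lines of Python. This is a minimal illustration, not something taken from ISO/IEC 27031: the TrainingRecord fields, the course names and the one-year refresher interval are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical refresher interval; neither the article nor the standard prescribes one.
REFRESH_INTERVAL = timedelta(days=365)


@dataclass
class TrainingRecord:
    employee: str
    course: str          # e.g. "cloud DR failover" (illustrative course name)
    completed_on: date


def overdue_refreshers(records, today, interval=REFRESH_INTERVAL):
    """Return sorted (employee, course) pairs whose latest completion is older than interval."""
    latest = {}
    for record in records:
        key = (record.employee, record.course)
        # Keep only the most recent completion date per employee and course.
        if key not in latest or record.completed_on > latest[key]:
            latest[key] = record.completed_on
    return sorted(key for key, done in latest.items() if today - done > interval)


records = [
    TrainingRecord("alice", "cloud DR failover", date(2023, 1, 10)),
    TrainingRecord("alice", "cloud DR failover", date(2024, 3, 1)),
    TrainingRecord("bob", "evacuation", date(2022, 6, 15)),
]
print(overdue_refreshers(records, today=date(2024, 6, 1)))
# prints [('bob', 'evacuation')]
```

A report like this gives the awareness and training programme a simple, auditable trigger for scheduling refresher sessions.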

Incident management planning

When an unplanned event occurs, incident management (IM) plans help to:

  • Assess the nature of the event.
  • Identify potential implications of the event if it increases (or decreases) in severity.
  • Establish lines of communication regarding the event.
  • Help assemble and launch trained response team(s) to handle the event.
  • Serve as a decision point for launching IT disaster recovery plans, business continuity plans, evacuation plans, fire emergency plans and other emergency response activities.

When cloud services are used, IM plans may be activated after an alert from the cloud service provider. A disruption involving cloud services should be taken as seriously as an on-premises event. Cloud service providers may launch their own IM plan, so it is essential to be able to communicate with the cloud provider’s support teams as soon as possible. This will ensure cloud users are aware of the impact on services that depend on the cloud.

Figure 2 provides a simplified view of how an incident management plan is used to launch a set of response activities to identify and assess the event, and make decisions on how a response will be initiated.

Figure 2: Steps in an incident response plan

 

In a cloud environment, the same steps are advisable, except that communications with the cloud provider must be launched as quickly as possible.

Once the event occurs and has been detected, three things need to happen quickly:

  • Ensure the safety of employees (for example, consider an evacuation).
  • Assess the situation.
  • Identify and launch the next steps.
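
As a rough sketch of that ordering (safety first, then assessment, then next steps, with early contact to the cloud provider where one is involved), the logic might look like the following. The Severity scale, the Event fields and the rule that a MAJOR or worse event launches the IT DR plan are hypothetical choices for illustration; a real IM plan would define its own criteria.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical three-level scale, not defined in the article.
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


@dataclass
class Event:
    description: str
    staff_at_risk: bool
    it_services_down: bool
    cloud_provider_involved: bool


def assess(event):
    """Very coarse situation assessment based on the flags above."""
    if event.staff_at_risk:
        return Severity.CRITICAL
    if event.it_services_down:
        return Severity.MAJOR
    return Severity.MINOR


def initial_response(event):
    """Return the ordered list of first actions for an event."""
    steps = []
    # 1. Employee safety comes before anything else.
    if event.staff_at_risk:
        steps.append("evacuate affected area")
    # In a cloud environment, open communications with the provider early.
    if event.cloud_provider_involved:
        steps.append("contact cloud provider support")
    # 2. Assess the situation.
    severity = assess(event)
    steps.append(f"assessed severity: {severity.name}")
    # 3. Identify and launch next steps (assumed threshold: MAJOR or worse).
    if severity >= Severity.MAJOR:
        steps.append("launch IT disaster recovery plan")
    else:
        steps.append("monitor and log incident")
    return steps


outage = Event("cloud region outage", staff_at_risk=False,
               it_services_down=True, cloud_provider_involved=True)
for step in initial_response(outage):
    print(step)
```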

Incident response plans address these and other time-critical activities following the onset of an incident. The following incident response plan outline lists additional useful activities:

  • Define the scope and objectives of the plan – that is, define the fundamental elements of the plan, what it is supposed to achieve, and what it addresses.
  • Define incident response assumptions and limitations – this defines activities the plan can and cannot initiate. These are important steps if a cloud provider is involved, because contact with the provider will be needed as quickly as possible.
  • Incident response teams, contact data and responsibilities – this section lists the names and contact data for individuals assigned to an incident response team. It may specify their duties and responsibilities, such as team leader, damage assessment specialist, liaison with first responders or evacuation coordinator.
  • Notification process steps – provides information about the incident to designated individuals as quickly as possible, especially cloud service providers. It defines who should be contacted, how quickly contact must be made, and the data to be communicated.
  • Damage assessment steps – defines the steps to take when assessing the damage from an incident. This will need to be coordinated with cloud providers.
  • Declaration process steps – defines criteria for the incident response team to either declare a disaster or provide information to designated individuals (for example, top management) so they can officially declare a disaster. This will need to be coordinated with cloud providers.
  • Escalation process steps – defines what to do if the severity of the incident increases (or scales back) by communicating to first responders, cloud service providers or others.
  • Decision to launch additional emergency activities – based on progress of the incident and assessments by experts, it may be necessary to launch additional activities such as escalating the response by the cloud provider.
  • Incident response plan deactivation steps – procedures for deactivating the plan and standing down the incident response team should be coordinated with cloud service providers.
  • Plan testing – periodic IM plan exercises are advised to ensure plan procedures are relevant and team members are properly trained and understand their roles and responsibilities. Coordinating testing with a cloud provider may be an issue, depending on the provider’s approach to testing.
  • Plan maintenance and review activities – schedule reviews and updates to plans to validate team member names and contact details, contacts for cloud providers, and that the plan is current.
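
Two items in the outline above lend themselves to a small data-driven sketch: the notification process (who to contact and how quickly) and plan maintenance (keeping contact details current). The example below is illustrative only. The Contact fields, the notification deadlines and the 90-day verification window are assumptions, not values from the article or the standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Contact:
    name: str
    role: str
    notify_within_minutes: int   # hypothetical deadline set by the plan
    last_verified: date          # when the contact details were last checked


def notification_order(contacts):
    """Order contacts so those with the tightest deadline are called first."""
    return sorted(contacts, key=lambda c: c.notify_within_minutes)


def stale_contacts(contacts, today, max_age_days=90):
    """Names whose details have not been verified within max_age_days (assumed window)."""
    return [c.name for c in contacts
            if (today - c.last_verified).days > max_age_days]


contacts = [
    Contact("Senior management duty officer", "declares a disaster", 30, date(2024, 5, 1)),
    Contact("Cloud provider support desk", "cloud service status", 5, date(2023, 11, 20)),
    Contact("Incident response team leader", "coordinates response", 10, date(2024, 4, 15)),
]

print([c.name for c in notification_order(contacts)])
# prints ['Cloud provider support desk', 'Incident response team leader',
#         'Senior management duty officer'] (on one line)
print(stale_contacts(contacts, today=date(2024, 6, 1)))
# prints ['Cloud provider support desk']
```

Running the stale-contact check as part of each scheduled plan review helps catch the case shown here, where the contact that must be reached fastest is also the one whose details are least recently verified.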

Media management planning

Dealing with the media requires careful preparation and the realisation that the way the media reports on an incident can have a significant impact on the organisation.

The most important strategies for managing the media are:

  • Designate one or more senior employees as company spokespersons – these are the only people who should speak to the media.
  • Designate an area where the media can be positioned for interviews and comments.
  • Have pre-written press release forms that can be easily updated based on the incident.
  • Prepare a media policy describing what can and cannot be said.
  • Organise media training for designated spokespersons so they will present the best possible face to the media when being interviewed.
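
The pre-written press release mentioned above can be kept as a fill-in template so that only incident-specific details change under pressure. The sketch below uses Python's standard string.Template; the wording, the field names and the example values are all hypothetical, and any real statement should follow the organisation's media policy.

```python
from string import Template

# Hypothetical wording and field names, for illustration only.
PRESS_RELEASE = Template(
    "$company statement, $date\n"
    "At approximately $time, $company experienced $incident_summary.\n"
    "Current status: $status.\n"
    "Media enquiries: $spokesperson."
)


def draft_release(**fields):
    """Fill in the template; raises KeyError if a required field is missing."""
    return PRESS_RELEASE.substitute(**fields)


print(draft_release(
    company="Example Ltd",
    date="1 June 2024",
    time="09:30",
    incident_summary="a disruption to its online services",
    status="services are being restored from the recovery site",
    spokesperson="the designated company spokesperson",
))
```

Using substitute rather than safe_substitute is deliberate: a missing field fails loudly instead of leaving a raw placeholder in a statement that may reach the media.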

Summary

This article has examined IT disaster recovery awareness and training, developing an incident management plan, and how to manage the media during an incident, especially when cloud services are involved.
