3. Incident Handling Process


A key part of the work of many DSHCSs is incident response. Every help desk should clarify beforehand the steps and resources needed for dealing with a request for support.

The incident handling process is continuous, and generally consists of the following four stages:

Stage 1: Preparation
Stage 2: Detection and Analysis
Stage 3: Containment, Eradication and Recovery
Stage 4: Post-incident Activity

The incident handling process should not be limited to the containment, eradication and recovery stage: other steps should be taken, for example, in the preparation phase and post-incident activity. This process should be documented and organised so that incident handlers keep each step in mind at every stage and can reduce the chances of the same incident happening again.

It is very important for a DSHCS to have a common workflow that every staff member is familiar with, so we share duties and know what we do when an incident happens - how we handle it and how we work from phase to phase. This should be written, it should not be something that just happens, because that leads to some mistakes. This workflow has not only been used by us, but also by other CERTs, and it's agreed upon, and also it's framed in a way that it takes into consideration not only the kind of beneficiaries we are helping, but also our capabilities. So it helps any handler to navigate incidents from the detection to the recovery, but also to get prepared for it.

Hassen Selmi, Incident Response Lead, Access Now Digital Security Helpline (Interview, November 2021).

You can see below a flowchart representing Access Now Digital Security Helpline's incident handling workflow:

[Figure: Access Now Digital Security Helpline's incident handling workflow]

The incident handling process should also be adapted to the needs and threat model of a helpline's constituency. For example, a DSHCS should consider that attacks targeting civil society are usually sophisticated and aimed at people who are not prepared for such attacks. So a DSHCS will need to focus on the preparation and post-incident activity stages, to go beyond simple recovery and turn an incident into an opportunity to prevent similar attacks in the future.

We have seen a higher level of threats with a low level of security and authority: our help desks have no authority over the beneficiaries. We can recommend things, we can advise them to do things, but it's up to them if they do them or not. Most of the time they are working remotely, and it's really hard to execute exactly what we wish we could execute like other CERTs in other sectors would do, so we have to adapt this also.

Hassen Selmi, Incident Response Lead, Access Now Digital Security Helpline (Interview, November 2021).

Learn more

An example of an incident response plan is NIST (2012). Computer Security Incident Handling Guide. Check pages 21-44 for more information on the incident handling process.
Kral, Patrick (2021). Incident Handler's Handbook. SANS Institute. https://www.sans.org/reading-room/whitepapers/incident/incident-handlers-handbook-33901.

3.1 Preparation

The incident handling process starts when a request is received: at that point the handler notes down basic information on the case, assigns it a priority and an owner, and acknowledges receipt of the request to the requester. Then there is a mandatory check to make sure both that the requester is within the DSHCS's list of beneficiaries and that the requested service can be provided by the DSHCS.

If the request falls outside the DSHCS's mandate, the case is closed, possibly by sending the requester a list of available alternative resources. If, on the other hand, the request is within the DSHCS's mandate, vetting of the beneficiary is a recommended second preliminary step. Each DSHCS should have a vetting policy and implement it in this step, making sure that every beneficiary is who they say they are before handling their request.
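The intake steps above can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Request` fields, the `Outcome` names and the example mandate sets are hypothetical and are not taken from any helpline's actual tooling.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    REFERRED = "outside mandate: close the case and share alternative resources"
    NEEDS_VETTING = "within mandate: vet the beneficiary before handling"
    ACCEPTED = "within mandate and vetted: start handling"


@dataclass
class Request:
    # All fields are hypothetical; a real ticketing system defines its own.
    requester_group: str
    service: str
    priority: str = "normal"
    owner: str = "unassigned"
    vetted: bool = False


def triage(request, mandate_groups, mandate_services):
    """Apply the mandate check first, then the vetting check."""
    in_mandate = (
        request.requester_group in mandate_groups
        and request.service in mandate_services
    )
    if not in_mandate:
        return Outcome.REFERRED
    if not request.vetted:
        return Outcome.NEEDS_VETTING
    return Outcome.ACCEPTED
```

Keeping the mandate check ahead of vetting mirrors the order described above: no vetting effort is spent on requests that will be referred elsewhere.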

After checking the mandate and vetting, the incident handling process starts, but this also needs to be prepared. The preparation stage of the incident handling process also consists of creating the proper documentation that will allow handlers to promptly respond to a set of known incidents, and of training the staff to follow these instructions (see below, section 3.5 Documentation of Procedures).

If you look at the workflow diagram, it looks like it starts when the incident starts, but in reality and in practice the preparation phase should go on continuously before any incidents happen: it is a proactive phase.

Hassen Selmi, Incident Response Lead, Access Now Digital Security Helpline (Interview, November 2021).

The preparation stage also includes outreach campaigns and training for beneficiaries who want to better secure themselves or their organisation. During this phase a help desk may also work on networking by establishing relationships with service providers or creating partnerships with other helplines and defenders, to improve its capacity to escalate cases that need collaboration to be solved.

The work spent on the preparation stage will determine the speed, efficiency and quality of a DSHCS's response.

3.2 Detection and Analysis

Ideally, the incident response process starts with the preparation stage. Yet, realistically it often begins at the second stage of the incident handling process, when a request for support is received.

The first step of this stage is detection, which is aimed at making sure that what the beneficiary is observing is an actual digital security incident, i.e. that the behaviour the requester has observed is abnormal.

The incident handler should ask the requester for all the information available: system and network log files, screenshots, error messages, antivirus reports, suspicious emails, perceived symptoms of changes in normal behaviour, and other evidence that may indicate that an event is a security incident. Incident handlers should be open to any possibility and not let any digital security incident go unconsidered.

If the person asking for support is in emotional distress, gathering the necessary information to determine the incident can be revictimizing. In these cases, all the data can be gathered with the help of a person appointed by the requester.

The following step in this stage is to analyse the incident to better understand what is happening and what its causes are. The more evidence available, the better the insight the incident handler can provide.

Different pieces of evidence may be symptomatic of the same particular incident or different ones. Making correlations between pieces of evidence in the wrong way can lead to misinterpretation of the facts. A helpful way to prevent this is to conduct the analysis collectively in regular incident discussion meetings.

A tool to start analysing some of the most common digital security issues affecting civil society is the Digital First Aid Kit, a free resource to help rapid responders to troubleshoot the most common types of digital emergencies.

As in the previous stage, incident handlers should remember to record all relevant information and document every step taken.

3.3 Containment, Eradication and Recovery

Once they have confirmed that what the beneficiary is facing is a digital security incident, the incident handler will move on to the containment, eradication and recovery stage. The first step is containment - a "stopping the bleeding" intervention to ensure the attacker can't have further access to the beneficiary's digital assets. The incident handler should promptly provide the instructions for containment to limit the damage quickly.

Of course, the procedure required for containment is based on the type of asset that is under attack. For more information on the various procedures, a good shared resource is Access Now Helpline's Community Documentation.

Eradication is about removing anything the attacker may have added. This isn't always easy because malicious actors are usually very creative in devising new approaches for their attacks.

Afterwards, the recovery step is intended to restore the affected systems and take the necessary measures to prevent new incidents. Monitoring, therefore, becomes essential to detect any other methods an attacker can use and any further data exfiltration. Since civil society helplines often cannot monitor beneficiaries' assets directly, this step can be replaced by training the beneficiaries on how to do monitoring on their own.

Sometimes a case can't be closed by its owner due to a lack of time or capacity. In these cases, it may be necessary to involve other members of the DSHCS team, or to outsource the handling of the case or part of it, especially the analysis. This is one of the situations where networking and collaboration among DSHCSs can be particularly helpful.

3.4 Post-Incident Activity

The last stage of the incident handling process aims to gather what the incident handler has observed when working on the case. While already known opportunities to mitigate the digital security incidents might have been identified and provided to the beneficiary, some new ways of approaching an issue might also have been found, and should be documented.

The last steps are also part of the process. If you come up with a new solution or you find your documentation is not as efficient to navigate that incident as you wished, as a handler you are also requested to suggest solutions based on what you have seen. Sometimes you can also suggest this before the process itself, because you realize a process will not really work so it has to be improved.

Hassen Selmi, Incident Response Lead, Access Now Digital Security Helpline (Interview, November 2021).

These lessons learned will improve the help desk's documentation with a more creative and accurate approach to incident handling. We recommend not delaying this documentation once the case is closed, as small details tend to be forgotten.

Sometimes an incident can be connected to a series of attacks other groups of people should be warned about, so outreach and networking with partners are often needed at this stage in order to spread public alerts describing this kind of incident to potential targets.

3.5 Documentation of Procedures

The term "documentation" is quite broad. It may refer to several elements, and if not defined clearly, it can lead to confusion. In incident response, there are two different kinds of documentation, both equally important: the documentation of cases, on the one hand, and the documentation of procedures, on the other.

The documentation of cases, which usually happens through a ticketing system (see section 2.5 Infrastructure and Tooling in Chapter 2) or another secure platform, consists of noting down all communications with the beneficiary, as well as how the handler decided on a technical solution for the case, what evidence they collected, what led them to make those suggestions and what resources they consulted. This makes it possible to trace how a case was solved and, if a new solution is found, feeds into the second kind of documentation.
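As a rough illustration of what such a case record could hold, here is a minimal sketch. The entry kinds and class names are assumptions made for this example; a real ticketing system will have its own data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical entry kinds, mirroring the items listed above.
ENTRY_KINDS = ("communication", "evidence", "decision", "resource")


@dataclass
class CaseEntry:
    kind: str
    note: str
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class CaseRecord:
    case_id: str
    entries: list = field(default_factory=list)

    def log(self, kind, note):
        """Append a timestamped entry, rejecting unknown kinds."""
        if kind not in ENTRY_KINDS:
            raise ValueError(f"unknown entry kind: {kind!r}")
        entry = CaseEntry(kind, note)
        self.entries.append(entry)
        return entry

    def trail(self, kind):
        """Return the notes of one kind, in the order they were logged."""
        return [entry.note for entry in self.entries if entry.kind == kind]
```

Recording decisions separately from evidence makes it easier to reconstruct later why a given solution was suggested.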

The second kind of documentation, which is developed during the preparation stage of the incident handling process and reviewed during the whole cycle, is the documentation of procedures. This is the technical documentation which contains strategies to troubleshoot the incidents faced by our constituency.

In the work of a DSHCS, the documentation of procedures is critical to make sure that incidents are handled correctly and quality is assured. By documenting your procedures, your team can rely on a constantly updated knowledge base that will speed up their response. Therefore, the information that incident handlers rely upon must be accurate, up-to-date and easy to access.

In the preparation phase of the incident handling process we try to have a set of articles, or playbooks, that allow us to respond to a set of incidents that we know, that we understand, or that happened in our helpline or in other organisations or CERTs. So we try to always have them ready for us, train our handlers to follow them, and when an incident that meets the criteria of that article happens, this is where the handler should go. If there is documentation for that kind of incident, then the handler should follow it.

Hassen Selmi, Incident Response Lead, Access Now Digital Security Helpline (Interview, November 2021).

This chapter will focus on the different aspects to consider when creating the documentation of your incident handling procedures: the guiding principles, planning, platforms and formats, collaboration strategies and style guidelines.

The Basic Principles of Technical Documentation

The creation and maintenance of a DSHCS's technical knowledge base is an ongoing collaborative effort both within individual CERTs and helplines and in the community of digital security organisations for civil society at large. This collaborative effort has led to the adoption of some of the best practices established in the tech industry.

Whether it is an end-user guide for a phone app or a piece for the knowledge base included in a digital security helpline's ticketing system, every kind of technical documentation should be:

Participatory: it should include everyone who will be using it, so there should be clear ways to contribute to it and all changes should be tracked.
Current and updated: incorrect documentation can be more misleading than missing documentation.
Unique: there should be only one place where the documentation is maintained, to avoid inconsistencies between versions.
Discoverable: documentation needs to be found where it is needed.
Comprehensible for its end users: technical slang should be avoided.
Protected from unauthorised attempts at changing its content.
Easy to reproduce for other projects.
Easy to deploy to different formats: like websites, mobile apps or PDF files, among others.

Planning the Creation of New Documentation

A DSHCS can document technical solutions both for their incident handlers and for their beneficiaries, but sometimes writing can also be required for collaborations with partners, advocacy campaigns, media communications and so on. Especially when a piece of documentation is addressed to basic users, it is always worth asking whether the specific solution you want to document has already been presented by other reputable digital security websites. If so, instead of writing from scratch, you could, for example, link to a good resource from your knowledge base.

Before you start writing, it is good practice to explore existing documentation, to make sure both that you aren't duplicating efforts and that you have a clear idea of the technical solutions required to solve a particular incident.

Once you have a good idea of what you want to write, try to develop your new piece of documentation so it can be used in other cases and is not specific to a case you've just seen. To do so, you can answer the following questions:

Who are you addressing?

Will you send this piece of documentation to individual beneficiaries by email or will you publish an advisory on your website for everybody to read? You could also be writing for incident handlers working in other organisations, for someone running an advocacy campaign or even for a talk at a specialised conference.

What do you want to accomplish?

Would you like your incident handlers to find quick technical solutions for the cases their beneficiaries are facing? Or are you writing a template for messages you often send to your beneficiaries? Are you preparing a security advisory to warn all your constituency about a new kind of digital attack? Or would you like to prepare a public report that can be sent to the media?

What kind of content best meets your audience's needs?

Consider the background of your readers. Are they IT professionals or basic users? Do they need accurate technical details or simple step-by-step instructions with screenshots? Will you need to add pictures to your guide or would it be even better to create a video or an infographic?

How will your content be found by its audience?

Will you include this content in your ticketing system? Will it be published on your website? Are you creating a manual that will be turned into a printable PDF or an app for mobile devices? Will your content be available both online and offline?

Will the content be translated or localised?

Based on your intended audience, you may want to translate your content into the languages, and adapt it to the cultural references, that are most used by the people you would like to reach.

By answering these questions, you can define the content, style and format of your piece of documentation. For example:

If you need to warn your constituency about a new threat, you should write quickly and polish your message later.
If the budget and timeline are tight, you might choose to share a simple text with the relevant people as quickly as possible and think of a nicer format when resources are available.
If the audience is large and the topic complex, a short video with subtitles might be helpful.
If you are writing technical instructions for incident handlers, you should include technical details and make the documentation available in the same platform where your incident handlers document their cases (e.g. a ticketing system).
If you are writing documentation that can be used by other civil society organisations, it is a good idea to use simple language that can be easily translated and publish your documentation with a licence and in a format that can be re-used by others.

Platforms and Formats for Technical Documentation

The most common tool used to develop documentation in line with the guiding principles above, both in the IT industry and in the movement offering digital protection to civil society, is git, a technology for version control. In most cases, it is used with Markdown, a simple markup language, and a static site generator to deploy the content to a searchable and user-friendly website.

Git for Version Control

Git is the most commonly used version control software for writing technical documentation collaboratively. Its main feature is to allow users to track the changes made to every file in a folder, so there is a record of every individual edit. It also allows reverting changes to a specific version, if needed.
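To make the idea of tracked edits and reverting more concrete, the following sketch drives git from Python in a throwaway folder. It assumes git is installed; the file name, commit messages and author identity are placeholders invented for this example.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Placeholder identity passed on each call, so the sketch does not
# depend on (or modify) the user's global git configuration.
GIT = [
    "git",
    "-c", "user.name=Example Handler",
    "-c", "user.email=handler@example.org",
    "-c", "commit.gpgsign=false",
]


def git(repo, *args):
    """Run one git command inside repo and return its standard output."""
    result = subprocess.run(
        GIT + list(args), cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def demo_history():
    """Commit two versions of an article, then restore the first one.

    Returns (number_of_commits, restored_text), or None if git is missing.
    """
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        article = repo / "phishing.md"
        git(repo, "init", "--quiet")
        article.write_text("Step 1: ask for the full email headers.\n")
        git(repo, "add", "phishing.md")
        git(repo, "commit", "--quiet", "-m", "Add phishing article")
        article.write_text("Step 1: (accidentally overwritten)\n")
        git(repo, "commit", "--quiet", "-am", "Edit phishing article")
        commits = git(repo, "log", "--oneline").splitlines()
        # Bring the file back to the state it had one commit earlier.
        git(repo, "checkout", "HEAD~1", "--", "phishing.md")
        return len(commits), article.read_text()
```

Each commit is one entry in the history, and the final `checkout` restores the earlier version of a single file without discarding the record of the later edit.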

Git makes collaboration easier by allowing changes made by multiple people to be merged into one source. Git hosting platforms add further helpful features: private repositories that are only accessible to a selected group of users, which helps protect the collaborators' identities, as well as tools to report issues, manage contributors, assign different roles, document the process, access analytics, etc.

Documentation managed in git repositories is usually hosted on third-party platforms like Github or Gitlab, or in self-hosted Gitlab instances. Some examples of git-based documentation developed collaboratively by civil society are:

Access Now Digital Security Helpline Community Documentation, which is deployed from this Gitlab.com repository.
The Digital First Aid Kit, which is deployed from this Gitlab.com repository.
SAFETAG - Security Auditing Framework and Evaluation Template for Advocacy Groups, which is deployed from this Github.com repository.

There are many resources online to learn how to use git. Look until you find the one that best suits your learning needs. The guide "git - the simple guide" can be a good starting point. Although git is not complex for a regular contributor, some experience is required to get familiar with its logic and commands.

Markdown for Writing

In all the examples above, documents are written in Markdown, a lightweight markup language created by Aaron Swartz and John Gruber in 2004 to enable people "to write using an easy-to-read and easy-to-write plain text format, and optionally convert it to structurally valid XHTML (or HTML)".

Markdown documents can be converted into many different formats, allowing for the creation of websites, mobile apps, e-books and PDFs starting from the same source.
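As a toy illustration of such a conversion, the sketch below turns a very small subset of Markdown (headings, bullet lists and one-line paragraphs) into HTML. It is not a real Markdown parser and is not how any of the projects mentioned here convert their content; in practice an established converter would be used.

```python
import html


def markdown_to_html(text):
    """Convert a tiny Markdown subset (# headings, - lists, paragraphs)."""
    out = []
    in_list = False
    for line in text.splitlines():
        stripped = line.strip()
        is_item = stripped.startswith("- ")
        # Close an open list as soon as a non-item line appears.
        if in_list and not is_item:
            out.append("</ul>")
            in_list = False
        if not stripped:
            continue
        if stripped.startswith("#"):
            hashes = len(stripped) - len(stripped.lstrip("#"))
            level = min(hashes, 6)  # HTML only defines h1 to h6
            title = stripped[hashes:].strip()
            out.append(f"<h{level}>{html.escape(title)}</h{level}>")
        elif is_item:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(stripped[2:])}</li>")
        else:
            out.append(f"<p>{html.escape(stripped)}</p>")
    if in_list:
        out.append("</ul>")
    return "\n".join(out)
```

The same plain-text source stays readable on its own, which is what makes it practical to keep under version control and to convert into several output formats.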

It is worth noting that, although most projects led by the civil society community use Markdown, other markup languages are used for technical documentation, in particular AsciiDoc and reStructuredText (reST).

If you are new to Markdown, it helps to keep a syntax cheat sheet at hand for reference.

Static Site Generators to Create Websites

To convert Markdown into searchable websites, static site generators like Jekyll, Gatsby or Metalsmith are commonly used.

Static site generators are an alternative to content management systems like WordPress or Drupal, where content is managed and stored in a database on the webserver. So instead of retrieving content from a database each time there is a request for web content, the static site generator deploys the entire website after each update and creates a tree of HTML files ready to visit.
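The build step described above can be sketched in a few lines. This is a deliberately simplified stand-in for tools like Jekyll: it only mirrors a folder of Markdown files into a tree of HTML files and shows the source text as-is, whereas a real generator would also render the Markdown and apply a theme. The folder and file names are made up for the example.

```python
import html
import tempfile
from pathlib import Path

PAGE = "<!doctype html>\n<html><body><pre>{body}</pre></body></html>\n"


def build_site(source_dir, output_dir):
    """Write one HTML file per Markdown file, mirroring the folder tree."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    built = []
    for source in sorted(source_dir.rglob("*.md")):
        relative = source.relative_to(source_dir).with_suffix(".html")
        target = output_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        # A real generator would render Markdown and apply a theme here.
        body = html.escape(source.read_text(encoding="utf-8"))
        target.write_text(PAGE.format(body=body), encoding="utf-8")
        built.append(relative.as_posix())
    return built


def demo():
    """Build a two-page site in a temporary folder and list the output."""
    with tempfile.TemporaryDirectory() as tmp:
        content = Path(tmp) / "content"
        (content / "guides").mkdir(parents=True)
        (content / "index.md").write_text("# Knowledge base\n", encoding="utf-8")
        (content / "guides" / "phishing.md").write_text(
            "# Phishing\n", encoding="utf-8"
        )
        return build_site(content, Path(tmp) / "site")
```

Because the whole site is regenerated as plain files after each update, the web server only has to serve those files and no database is involved at request time.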

A nice plus to this git-based infrastructure is that it is relatively simple to maintain. Static sites are robust against the attacks and trolling that are common in platforms like wikis - especially if they are open to editing by any user - or other web applications or dynamic websites, which require a lot of work to keep secure and make sure that content is not edited maliciously or by mistake.

Collaborative Documentation

By using a documentation infrastructure based on git, it also becomes possible for any other helpline or individual who has access to that git repository to use the same knowledge base to create their own website, mobile app, e-book, etc., and also to receive and submit updates to it.

This is made possible by the very architecture of git-hosting hubs like Gitlab or Github, which allows for making a copy of a project, known as a "fork", and submitting merge (or pull) requests to the original project after changes have been made in the copy.

Given the limited amount of resources available in the civil society sphere to create technical documentation that is constantly updated, it has become an established practice to collaborate on shared technical documentation resources. This requires avoiding formats that are not easy to download and duplicate and are not subject to version control, like wikis, websites, documents hosted on Google Drive, or PDFs, and licensing content in a way that allows for collaboration and the creation of derivative works.

The collaboration approach also makes it possible to avoid duplicating efforts, as existing resources can be re-used instead of being written from scratch more than once.

Style Guidelines

Documentation of technical procedures for DSHCSs should be written in a language that is simple to read and inclusive, considering that incident handlers are often not native English speakers and that nobody is an expert on everything, especially in the civil society sphere.

In general, it's good to apply some basic rules that are recommended to all technical writers:

Write short sentences that sound natural and friendly.
Use common words as much as possible; don't use jargon or acronyms unless you really have to (and in that case, explain them at least once).
Remember to be inclusive of all genders by using gender-neutral words and pronouns.
Use active voice (actor + verb + target) as much as possible.
Lists are a good resource for visualising information quickly.
Link useful external resources in case the beneficiary needs more information on an issue.

There are many resources out there on how to write good technical documentation. What follows is just a short list:

A list of technical writing resources in Google's Technical Writing Courses for Engineers
Microsoft Style Guide
- Tips on Bias-Free Communication
- Writing Step-by-Step Instructions