---
layout: page
title: Redacting Sensitive Information
---
# Redacting Sensitive Information
In some countries, local requirements mean that requests need to contain personal information such as the address or ID number of the person asking for information. Usually requesters do not want this information to be displayed to the general public.
Alaveteli has some ability to deal with this through the use of Censor Rules.
The [theme](https://github.com/mysociety/derechoapreguntar-theme) we'll use as an example requires a National Identity Card Number and what's known as General Law in Nicaragua (Date of Birth, Domicile, Occupation and Marital Status).

## Identity Card Number
We'll start off by looking at the National Identity Card Number (ID Number from here). It's a good example of something that is relatively easy to redact. It's unique for each user, and it has a specified format to match against.
To send the ID Number to the authority we'll override the [initial request template](https://github.com/mysociety/alaveteli/blob/master/app/views/outgoing_mailer/initial_request.text.erb) (code snippet shortened):

```erb
<%= raw @outgoing_message.body.strip %>

-------------------------------------------------------------------
<%= _('Requestor details') %>
<%= _('Identity Card Number') %>: <%= @user_identity_card_number %>
```

When a request is made, the user's ID Number is now added to the footer of the outgoing email.
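
To see roughly what the authority receives, the footer can be rendered outside Rails with Ruby's standard `erb` library. This is only a sketch: the template is a simplified stand-in for the override above (the Rails helpers `raw` and `_` are left out), and `render_initial_request`, the request body, and the ID Number are made up for illustration.

```ruby
require 'erb'

# Simplified stand-in for the overridden template. The Rails helpers
# (`raw`, `_`) are omitted so this runs with the standard library only.
FOOTER_TEMPLATE = <<~TEMPLATE
  <%= body.strip %>

  <%= '-' * 67 %>
  Requestor details
  Identity Card Number: <%= identity_card_number %>
TEMPLATE

# Render the outgoing email text for a request body and an ID Number.
# Hypothetical helper, not part of Alaveteli.
def render_initial_request(body, identity_card_number)
  ERB.new(FOOTER_TEMPLATE).result_with_hash(
    body: body,
    identity_card_number: identity_card_number
  )
end

puts render_initial_request(
  "Dear Ministry,\n\nPlease send me the 2013 budget.\n",
  '001-010180-0005K' # made-up ID Number
)
```

Because the ID Number sits in the footer of every outgoing email, it is also part of whatever the authority quotes back.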

At this point we haven't added any Censor Rules. When the authority replies, the responder is unlikely to remove the quoted section of the email, so the ID Number appears in the published response.

We could add a Censor Rule for the individual request, but as every request will contain a user's ID Number it's better to add some code to do it automatically.
To illustrate this we'll patch the `User` model with a callback that creates a Censor Rule when the user is created or updated.

```ruby
# THEME_ROOT/lib/model_patches.rb
User.class_eval do
  after_save :update_censor_rules

  private

  # Make sure a Censor Rule exists for this user's current ID Number
  def update_censor_rules
    censor_rules.where(:text => identity_card_number).first_or_create(
      :text => identity_card_number,
      :replacement => _('REDACTED'),
      :last_edit_editor => THEME_NAME,
      :last_edit_comment => _('Updated automatically after_save')
    )
  end
end
```

The new Censor Rule appears in the admin interface, and the ID Number is now redacted in the quoted section of the response.

It is also redacted if the public body uses the ID Number in the main email body.

A Censor Rule added to a user is only applied to correspondence on requests created by that user. It is not applied to annotations made by the user.

**Warning:** Redaction in this way requires the sensitive text to be in exactly the same format as the Censor Rule. If it differs even slightly, the redaction can fail. If the public body were to remove the hyphens from the number, it would not be redacted.
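
The failure mode can be reproduced in plain Ruby. The sketch below is not Alaveteli code: the ID Number is made up, and `flexible_id_pattern` and `redact` are hypothetical helpers. It compares a literal match with a pattern that treats the separators as optional.

```ruby
# Made-up ID Number, used only for illustration.
ID_NUMBER = '001-010180-0005K'

# Build a pattern that matches the ID Number whether or not the
# hyphens are present (or have been replaced by whitespace).
def flexible_id_pattern(id_number)
  parts = id_number.split('-').map { |part| Regexp.escape(part) }
  Regexp.new(parts.join('[-\s]?'))
end

# Replace every match of the pattern in the text.
def redact(text, pattern, replacement = 'REDACTED')
  text.gsub(pattern, replacement)
end

literal  = Regexp.new(Regexp.escape(ID_NUMBER))
flexible = flexible_id_pattern(ID_NUMBER)

quoted      = 'Identity Card Number: 001-010180-0005K'
reformatted = 'Identity Card Number: 0010101800005K'

puts redact(quoted, literal)       # the exact format is redacted
puts redact(reformatted, literal)  # hyphens removed: left untouched
puts redact(reformatted, flexible) # the tolerant pattern catches it
```

A pattern like this could in principle be stored as a regular expression rule (`:regexp => true`), but that is an assumption that has not been tested here, and it still cannot anticipate every way an authority might rewrite the number.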

**Warning:** Alaveteli also attempts to redact the text from any attachments. It can only do this if it can find the exact string, which is often not possible in binary formats such as PDF or Word.
Alaveteli can usually redact the sensitive information when converting a PDF or text-based attachment to HTML.

In this example the PDF does not contain the string in its raw binary, so the redaction is _not_ applied when downloading the original PDF document.
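
One common reason the string is missing from the raw file is compression: PDF content streams are often stored deflate-compressed. The sketch below uses Ruby's standard `zlib` library to show the effect on a made-up sentence. It illustrates the general problem only and is not how Alaveteli processes attachments. `literal_redact` is a hypothetical helper.

```ruby
require 'zlib'

ID_NUMBER = '001-010180-0005K' # made-up ID Number

# Naive redaction: replace the exact byte sequence wherever it occurs.
def literal_redact(data, secret, replacement = 'REDACTED')
  data.gsub(secret.b, replacement.b)
end

text = 'In reply to the request from the holder of Identity Card ' \
       "Number #{ID_NUMBER}, the requested documents are attached.\n"

# Compress the text, as a PDF writer might do for a page's content.
stream = Zlib::Deflate.deflate(text)

puts text.include?(ID_NUMBER)                # present in the plain text
puts stream.include?(ID_NUMBER.b)            # absent from the compressed bytes
puts literal_redact(stream, ID_NUMBER) == stream # nothing was replaced
puts Zlib::Inflate.inflate(stream).include?(ID_NUMBER) # still recoverable
```

The sensitive value is still fully recoverable by anyone who opens the file, even though a search for the exact string finds nothing to replace.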

## General Law
The General Law information is much harder to automatically redact. It is not as structured, and the information is unlikely to be unique (e.g. Domicile: London).
We'll add the General Law information to the [initial request template](https://github.com/mysociety/alaveteli/blob/master/app/views/outgoing_mailer/initial_request.text.erb) in the same way as the ID Number:

```erb
<%= _('Requestor details') %>:
<%-# !!!IF YOU CHANGE THE FORMAT OF THE BLOCK BELOW, ADD A NEW CENSOR RULE!!! -%>
===================================================================
# <%= _('Name') %>: <%= @user_name %>
# <%= _('Identity Card Number') %>: <%= @user_identity_card_number %>
<% @user_general_law_attributes.each do |key, value| %>
# <%= _(key.humanize) %>: <%= value %>
<% end %>
===================================================================
```

Note that the information is now contained in a specially formatted block of text.

This allows a Censor Rule to match the special formatting and remove anything contained within. This Censor Rule is global, so it will act on matches in all requests.

```ruby
# THEME_ROOT/lib/censor_rules.rb
# If not already created, make a CensorRule that hides personal information
regexp = '={67}\s*\n(?:[^\n]*?#[^\n]*?: ?[^\n]*\n){3,10}[^\n]*={67}'

unless CensorRule.find_by_text(regexp)
  Rails.logger.info("Creating new censor rule: /#{regexp}/")
  CensorRule.create!(:text => regexp,
                     :allow_global => true,
                     :replacement => _('REDACTED'),
                     :regexp => true,
                     :last_edit_editor => THEME_NAME,
                     :last_edit_comment => 'Added automatically')
end
```
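
The pattern can be tried out in plain Ruby before it is installed as a rule. The sketch below compiles the same expression with `Regexp.new` and applies it with `gsub`. How Alaveteli itself compiles and applies the rule is not shown here, and the personal details, `details_block`, and `redact` are made up for illustration.

```ruby
# Same expression as the global Censor Rule above.
PATTERN = Regexp.new(
  '={67}\s*\n(?:[^\n]*?#[^\n]*?: ?[^\n]*\n){3,10}[^\n]*={67}'
)

SEPARATOR = '=' * 67

# Build the formatted block, optionally prefixed as a quoted reply.
def details_block(prefix = '')
  [
    SEPARATOR,
    '# Name: Jane Example',
    '# Identity Card Number: 001-010180-0005K',
    '# Date of birth: 01/02/1980',
    '# Domicile: Managua',
    SEPARATOR
  ].map { |line| "#{prefix}#{line}" }.join("\n")
end

def redact(text)
  text.gsub(PATTERN, 'REDACTED')
end

# The authority quotes the whole block: everything between the
# separator lines is replaced.
full_quote = "Thank you for your request.\n\n#{details_block('> ')}\n"
puts redact(full_quote)

# The authority quotes only two lines of the block: nothing matches.
partial_quote = "You wrote:\n> # Domicile: Managua\n> # Date of birth: 01/02/1980\n"
puts redact(partial_quote)
```

The rule needs both separator lines and at least three `#` lines between them, so a reply that quotes only part of the block passes through unredacted.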

**Warning:** Redacting unstructured information is a very fragile approach, as it relies on authorities always quoting the entire formatted block.
For example, an authority might reveal the user's Date of Birth and Domicile in the body of its reply, outside the formatted block.

It's really difficult to add a Censor Rule to remove this type of information. One suggestion might be to remove all mentions of the user's Date of Birth, but you would have to account for [every type of date format](http://en.wikipedia.org/wiki/Calendar_date#Date_format). Likewise, you could redact all occurrences of the user's Domicile, but if they ask a question about their local area (very likely) the request would become unintelligible.

The redaction has been applied, but a Censor Rule has no way of knowing the context in which the sensitive word is used.
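
Both problems can be sketched in a few lines of Ruby. The format list, the helper names `date_variants` and `redact_all`, and the sample text are assumptions made for illustration, and the list of formats is far from exhaustive (a date written out in Spanish, for instance, would not be covered).

```ruby
require 'date'

# A handful of ways the same date might be written. Not exhaustive.
DATE_FORMATS = [
  '%d/%m/%Y', '%d-%m-%Y', '%Y-%m-%d', '%m/%d/%Y',
  '%-d/%-m/%Y', '%-d %B %Y', '%B %-d, %Y', '%d.%m.%y'
].freeze

# Every distinct rendering of the date in the formats above.
def date_variants(date)
  DATE_FORMATS.map { |format| date.strftime(format) }.uniq
end

# Replace each term wherever it appears, regardless of context.
def redact_all(text, terms, replacement = 'REDACTED')
  terms.reduce(text) { |result, term| result.gsub(term, replacement) }
end

variants = date_variants(Date.new(1980, 2, 1))
puts variants.length # one rule would be needed per variant

# Redacting the Domicile also hides the subject of the request.
request = 'How much did the Managua council spend on roads in Managua?'
puts redact_all(request, ['Managua'])
```

Each extra format means another rule to maintain, and the Domicile example shows why blanket replacement is a poor fit when the sensitive word is also the topic of the request.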
