In the literal sense, to anonymize is to present data about people or entities without any reference to their identity. But how can that reference be removed? And how does data anonymization differ from pseudonymization? Both are data masking techniques, and they are often confused.
Anonymization is a data masking technique that irreversibly alters the data so that the data subject (the interested party) can no longer be identified, directly or indirectly.
The key concept here is irreversibility. Because of it, anonymized data is not considered personal data: it does not identify any person or entity, and the process cannot be reversed to recover the original data.
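As a minimal sketch of what "irreversibly altering" a record can look like, the Python snippet below drops direct identifiers and coarsens the remaining attributes. The field names (`name`, `email`, `age`, `postal_code`, `purchase_total`) and the `anonymize` helper are hypothetical, chosen only for illustration. Whether the output is truly anonymous in a real data set depends on how many records share each combination of values, which this sketch does not check.

```python
def anonymize(record: dict) -> dict:
    """Return a new record without direct identifiers (hypothetical schema).

    Name and email are dropped entirely; age and postal code are
    generalized so the exact values cannot be recovered from the output.
    """
    decade = (record["age"] // 10) * 10
    return {
        "age_band": f"{decade}-{decade + 9}",           # 34 becomes "30-39"
        "region": record["postal_code"][:2] + "***",    # keep only the prefix
        "purchase_total": record["purchase_total"],     # non-identifying value kept
    }


original = {
    "name": "Ana Ruiz",
    "email": "ana@example.com",
    "age": 34,
    "postal_code": "28013",
    "purchase_total": 120.5,
}
anonymous = anonymize(original)
print(anonymous)
```

No key or lookup table is kept anywhere, so nothing in `anonymous` allows the function to be run in reverse.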
Pseudonymization is a process that replaces an original data item (for example, an email address) with an alias or pseudonym. Pseudonymization is reversible: it masks the data but allows re-identification by reversing the process with a conversion key. The technique can also become irreversible if the conversion key, which is what makes it possible to return to the original data, is destroyed.
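The idea of a conversion key can be sketched in Python as a lookup table that maps each original value to a random alias. The `Pseudonymizer` class and its method names are hypothetical, and the in-memory table stands in for whatever protected key store a real system would use; this is an illustration of the concept, not a production design.

```python
import secrets


class Pseudonymizer:
    """Replace values with random aliases; the mapping is the conversion key."""

    def __init__(self) -> None:
        self._forward: dict[str, str] = {}   # original -> alias
        self._reverse: dict[str, str] = {}   # alias -> original

    def pseudonymize(self, value: str) -> str:
        # Reuse the existing alias so the same input always maps to the same output.
        if value not in self._forward:
            alias = "user_" + secrets.token_hex(8)
            self._forward[value] = alias
            self._reverse[alias] = value
        return self._forward[value]

    def reidentify(self, alias: str) -> str:
        # Only possible while the conversion key still exists.
        return self._reverse[alias]

    def destroy_key(self) -> None:
        # Without the mapping, the aliases can no longer be reversed.
        self._forward.clear()
        self._reverse.clear()


masker = Pseudonymizer()
alias = masker.pseudonymize("ana@example.com")
recovered = masker.reidentify(alias)   # returns the original email
masker.destroy_key()                   # reidentify(alias) now raises KeyError
```

Because the aliases are random rather than derived from the input, destroying the table leaves no way to work back to the original values from the aliases alone.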
Therefore, the main difference is that pseudonymization is reversible, while anonymization is not. As we will see later, this has important implications at the regulatory level.
As detailed in the previous section, irreversibility is the main characteristic that differentiates one data masking technique from another. This differential characteristic has implications at the regulatory level since the General Data Protection Regulation (GDPR) establishes a different treatment for anonymized and pseudonymized data.
Does the GDPR apply to anonymized data? No. Anonymized data is outside the scope of the GDPR, which applies only to the processing of personal data, that is, data that allows an individual to be identified directly or indirectly. If the data is anonymized so that the data subject is no longer identifiable (directly or indirectly), the requirements established in the regulation do not apply.
Does the GDPR apply to pseudonymized data? Yes. The GDPR still considers pseudonymized data to be personal data. Why? As explained in Recital 26: “Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person”. Therefore, in this case the requirements established in the regulation apply.
Once the differences between the two techniques are understood, the choice between them depends on the use case and, ultimately, on the objective pursued by masking sensitive data.
Anonymized data is outside the scope of data protection regulations. However, it does not allow the original data to be consulted again. For this reason, anonymization suits use cases where the main objective is to guarantee privacy and avoid exposing personal data, and where the recipient of the information does not need to know the original data. For example:
Because the process can be reversed to consult the original data, pseudonymization is the better fit for use cases in which the objective is to guarantee privacy and data protection while the data is stored or sent, and in which the original values must remain recoverable.
With the differences between the two techniques established, the question is: what are the benefits of anonymizing or pseudonymizing personal data?
In addition to supporting GDPR compliance and thus mitigating regulatory risk, data anonymization protects information at the data level against the cyber attacks and data leaks that are of great concern to companies today. What value does the information have in the hands of third parties if no individual can be identified from it?
In addition, thanks to the different substitution methods available, ranging from redaction (blacking out values) to synthetic data, anonymization allows organizations to extract value from their information without putting personal data at risk. How? By enabling data analytics or big data processing on information in which the personal data has previously been anonymized.
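To make the two ends of that range concrete, here is a small Python sketch that applies redaction and synthetic substitution to the same records and then computes an aggregate on the masked data. The `redact` and `synthesize` helpers, the field names, and the sample customers are all hypothetical. Masking only the direct identifiers, as done here, does not by itself guarantee anonymization in the regulatory sense.

```python
import random
from statistics import mean


def redact(record: dict, fields: set) -> dict:
    """Black out the listed fields with a fixed placeholder."""
    return {k: ("[REDACTED]" if k in fields else v) for k, v in record.items()}


def synthesize(record: dict, fields: set, rng: random.Random) -> dict:
    """Replace the listed fields with generated values unrelated to the originals."""
    masked = dict(record)
    for field in fields:
        masked[field] = f"synthetic_{rng.randrange(10**6):06d}"
    return masked


customers = [
    {"name": "Ana Ruiz", "email": "ana@example.com", "spend": 120.0},
    {"name": "Luis Gil", "email": "luis@example.com", "spend": 80.0},
]
PII_FIELDS = {"name", "email"}   # assumed list of direct identifiers
rng = random.Random(0)           # seeded only to make the illustration repeatable

redacted = [redact(c, PII_FIELDS) for c in customers]
synthetic = [synthesize(c, PII_FIELDS, rng) for c in customers]

# Analytics still work because the non-identifying values are untouched.
average_spend = mean(c["spend"] for c in synthetic)
print(average_spend)
```

Redaction keeps the record shape but removes all information from the masked fields, while synthetic values keep the data looking realistic for systems that expect populated fields. In both cases the `spend` column is unchanged, so the aggregate matches the one computed on the original data.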