Pseudonymization is a reversible data masking process that replaces personal data with an alias or pseudonym. The key factor is reversibility, which differentiates this masking method from anonymization.
Pseudonymization is often incorrectly defined as the replacement of personal data with aliases or pseudonyms. That definition is inaccurate, since data can also be anonymized by replacing them with pseudonyms. So how does pseudonymization differ from anonymization, if not by the use of a pseudonym? It differs in the reversibility of the process.
Anonymization removes the data permanently and writes aliases, pseudonyms or simple black bars in their place. Pseudonymization carries out the same replacement, but lets the user return to the original data by reversing the process.
Therefore, both anonymization and pseudonymization can be applied by replacing personal data with pseudonyms. The only difference lies in the ability to revert to the original data, usually thanks to a key generated in the pseudonymization process.
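To make the distinction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of how any particular tool works: the function names, the `[PERSON_n]` token format and the idea of storing the key as a plain dictionary are all hypothetical. Both functions perform the same replacement; the only difference is whether the key is kept.

```python
def pseudonymize(text, personal_values):
    """Replace each personal value with a token and return the masked text plus the key.

    The [PERSON_n] token format is an assumption made for this example.
    """
    key = {}
    masked = text
    for index, value in enumerate(personal_values, start=1):
        token = f"[PERSON_{index}]"
        key[token] = value  # the key maps each pseudonym back to the original
        masked = masked.replace(value, token)
    return masked, key


def reverse(masked, key):
    """Restore the original text using the key produced by pseudonymize()."""
    for token, value in key.items():
        masked = masked.replace(token, value)
    return masked


def anonymize(text, personal_values):
    """Same replacement, but the key is discarded, so the process cannot be reversed."""
    masked, _discarded_key = pseudonymize(text, personal_values)
    return masked


original = "Alice Smith met Bob Jones."
masked, key = pseudonymize(original, ["Alice Smith", "Bob Jones"])
restored = reverse(masked, key)  # equals the original text again
```

In this sketch the masked text reads "[PERSON_1] met [PERSON_2]." in both cases. With pseudonymization, whoever holds the key can recover the original; with anonymization, nobody can.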
This data masking process is ideal when the data must stay protected while certain users still need access to the original data, whether occasionally or frequently.
Consider databases that accumulate sensitive customer, patient or employee data and need to be consulted periodically by business or HR employees. In this case, anonymization would not allow access to the original personal data, so pseudonymization is the necessary choice. The data remain protected, replaced by pseudonyms or synthetic data, until a user with access permissions asks to consult them, at which point the process is reversed to show the original personal data.
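The access pattern described here can be sketched as a small in-memory store that hands out pseudonyms to everyone but reveals originals only to authorized roles. This is a simplified illustration: the `PseudonymVault` class, the role names and the `[VALUE_n]` token format are hypothetical, and a real system would persist the mapping securely rather than hold it in memory.

```python
class PseudonymVault:
    """Keeps the pseudonym-to-original mapping and enforces who may reverse it."""

    def __init__(self, authorized_roles):
        self._authorized_roles = set(authorized_roles)
        self._token_to_value = {}
        self._value_to_token = {}

    def mask(self, value):
        """Return the pseudonym for a value, creating one on first use."""
        if value not in self._value_to_token:
            token = f"[VALUE_{len(self._value_to_token) + 1}]"
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def reveal(self, token, role):
        """Reverse the pseudonym, but only for a role with access permissions."""
        if role not in self._authorized_roles:
            raise PermissionError(f"role {role!r} may not view original data")
        return self._token_to_value[token]


vault = PseudonymVault(authorized_roles={"hr"})
token = vault.mask("Jane Doe")            # what most users see
original = vault.reveal(token, role="hr")  # only authorized roles get this far
```

Masking the same value twice returns the same pseudonym, which keeps records consistent across a database while the original stays hidden.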
Sending or sharing documents and databases is another case in which reversibility makes pseudonymization the best method of data protection. The sender pseudonymizes the personal data before sending. The recipient receives the pseudonymized documents or databases and can access the original content thanks to the key generated in the initial process, which allows the process to be reversed.
Pseudonymization is notable for its reversibility and not for the use of pseudonyms. However, when is there value in replacing personal data with pseudonyms instead of asterisks or black lines?
The value of replacing personal data with pseudonyms lies in maintaining the readability and context of the documents. Anyone can read and understand an anonymized or pseudonymized document in which pseudonyms appear instead of the original personal data. Thanks to this method of substitution, training and knowledge management become much easier tasks while still complying with the GDPR.
The need to protect customers’ or patients’ personal data before giving third parties access to documents or databases frequently arises in sectors such as legal or healthcare. Training takes on greater importance in these sectors of activity, where experience is crucial for good practice and must be transferred and shared among professionals.
By replacing personal data with pseudonyms, it is possible to turn original documents that were created for a particular customer or employee into templates. The document then displays pseudonyms instead of the original personal data, guiding the user in replacing these pseudonyms with the data of new customers or employees.
This process of automation and template generation does not necessarily require human participation, as there are applications that consume pseudonyms or tokens directly.
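As a rough illustration of an application consuming tokens directly, the sketch below uses Python's standard `string.Template`. The placeholder names (`$CUSTOMER_NAME`, `$CONTRACT_ID`, `$START_DATE`) and the sample record are invented for this example; an actual pseudonymized document would use whatever token format its masking tool produces.

```python
from string import Template

# A pseudonymized document acting as a template: the placeholders stand where
# the original customer's personal data used to be.
contract_template = Template(
    "Dear $CUSTOMER_NAME, your contract $CONTRACT_ID starts on $START_DATE."
)


def fill_template(template, record):
    """Fill every placeholder from a record; raises KeyError if a field is missing."""
    return template.substitute(record)


new_customer = {
    "CUSTOMER_NAME": "Maria Lopez",
    "CONTRACT_ID": "C-2041",
    "START_DATE": "1 March",
}
letter = fill_template(contract_template, new_customer)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing field fails loudly instead of leaving a stray pseudonym in the generated document.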
Having analyzed the differences between anonymization and pseudonymization, along with the benefits of replacing personal data with pseudonyms, the remaining question is how to do this automatically.
Nymiz is a data anonymization and pseudonymization tool that automates this process to protect personal data in both documents and databases. We allow our customers to customize the data masking process as much as possible by choosing between: