Anonymize your data with Symfony and the DbToolsBundle

Simon Mellerin

Makina Corpus

  • Digital Service Company
  • ~50 people
  • working exclusively with FLOSS

Makina Corpus

  • Services
  • Products
  • Training

DbToolsBundle

Setting Up Anonymization

A GDPR-friendly workflow

Performance

Custom Anonymizers

Contribute

DbToolsBundle

Built from real-world needs

DbToolsBundle

A set of Symfony Console Commands

  • Backup

        console db-tools:backup

  • Restore

        console db-tools:restore

  • Anonymize

        console db-tools:anonymize

  • Display statistics

        console db-tools:stats

DbToolsBundle

Works on top of Doctrine DBAL with the most popular database vendors:

PostgreSQL, SQLite, MariaDB, MySQL, SQL Server

Setting Up Anonymization

Anonymize? 🤔

Setting Up Anonymization

Map each table column you want to anonymize to an anonymizer.

  • Email
  • Password
  • Int
  • Float
  • String
  • Firstname
  • Lastname
  • Address
  • etc.

Setting Up Anonymization

On Doctrine entities, with the #[Anonymize] attribute and one of the available anonymizers:


    namespace App\Entity;

    use Doctrine\ORM\Mapping as ORM;
    use MakinaCorpus\DbToolsBundle\Attribute\Anonymize;

    #[ORM\Entity()]
    #[ORM\Table(name: 'customer')]
    class Customer
    {
        #[ORM\Id]
        #[ORM\GeneratedValue]
        #[ORM\Column]
        private ?int $id = null;

        #[ORM\Column(length: 180, unique: true)]
        #[Anonymize(type: 'email')]
        private ?string $emailAddress = null;

        // Options must be passed as a named argument: PHP forbids
        // positional arguments after named ones.
        #[ORM\Column(length: 180, unique: true)]
        #[Anonymize(
            type: 'password',
            options: ['password' => '123456789']
        )]
        private ?string $password = null;

        #[ORM\Column]
        #[Anonymize(
            type: 'integer',
            options: ['min' => 10, 'max' => 99]
        )]
        private ?int $age = null;

        #[ORM\Column(length: 255)]
        #[Anonymize(
            type: 'string',
            options: ['sample' => ['none', 'bad', 'good', 'expert']]
        )]
        private ?string $level = null;

        // No #[Anonymize] attribute: this column is left untouched.
        #[ORM\Column]
        private ?\DateTime $lastLogin = null;

        // ...
    }

Setting Up Anonymization

Check your config with:


    console db-tools:anonymization:dump-config

    Table: customer
    ---------------

     ----------- ------------ -----------------------------------
      Target      Anonymizer   Options
     ----------- ------------ -----------------------------------
      email       email
      password    password
      lastname    lastname
      firstname   firstname
      age         integer      min: 10, max: 99
      level       string       sample: [none, bad, good, expert]
     ----------- ------------ -----------------------------------

A GDPR-friendly workflow


    user@prod:~$ console db-tools:restore --list

    user@preprod:~$ scp user@prod:/path/to/prod.dump /tmp/prod.dump

    user@preprod:~$ console db-tools:anonymize /tmp/prod.dump

    user@local:~$ scp user@preprod:/tmp/prod.dump /tmp/prod-anonymized.dump

    user@local:~$ console db-tools:restore --filename /tmp/prod-anonymized.dump

Performance

Try it on our benchmark app:

  • 4 DBAL connections (PostgreSQL, SQLite, MySQL, MariaDB)
  • the same Customer entities for each one with:
    • email
    • password
    • lastname
    • firstname
    • level (string among a sample of 4)
    • age
    • postal address

Performance

    Rows     PostgreSQL   SQLite   MySQL     MariaDB
    100K     ~9s          ~10s     ~32s      ~24s
    500K     ~15s         ~16s     ~1m38s    ~55s
    1000K    ~33s         ~26s     😬        ~1m36s

Performance

  • Anonymizing with SQL only
  • One update query per table
  • Built with a complete query builder
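
To make the "SQL only, one update query per table" idea concrete, here is a minimal sketch using plain PDO and an in-memory SQLite database. It is not the SQL that DbToolsBundle generates: the table layout, the replacement expressions and the sample data are all assumptions chosen for illustration. It only shows that a single UPDATE can rewrite several columns for every row without loading any row into PHP.

```php
<?php
declare(strict_types=1);

// Illustration only: plain PDO + in-memory SQLite (requires the pdo_sqlite
// extension). This is NOT DbToolsBundle code nor the SQL it generates.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table and sample data.
$pdo->exec('CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT, age INTEGER, level TEXT)');
$pdo->exec("INSERT INTO customer (email, age, level) VALUES
    ('alice@real.example', 34, 'expert'),
    ('bob@real.example', 51, 'good'),
    ('carol@real.example', 27, 'none')");

// One UPDATE for the whole table: every sensitive column is replaced
// by an SQL expression evaluated by the database for each row.
$affected = $pdo->exec("
    UPDATE customer SET
        email = 'user' || id || '@example.com',
        age   = 10 + abs(random() % 90),
        level = CASE abs(random() % 4)
                    WHEN 0 THEN 'none'
                    WHEN 1 THEN 'bad'
                    WHEN 2 THEN 'good'
                    ELSE 'expert'
                END
");

// Read the anonymized rows back to inspect the result.
$rows = $pdo->query('SELECT id, email, age, level FROM customer ORDER BY id')
    ->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    printf("%d %s %d %s\n", $row['id'], $row['email'], $row['age'], $row['level']);
}
```

Because the values are computed inside the database, the cost is dominated by one table scan per table rather than one round trip per row, which is plausibly what keeps the timings above in the range of seconds.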

Custom Anonymizers


    namespace App\Anonymizer;

    use MakinaCorpus\DbToolsBundle\Anonymization\Anonymizer\AbstractAnonymizer;
    use MakinaCorpus\DbToolsBundle\Attribute\AsAnonymizer;

    #[AsAnonymizer(
        name: 'my_anonymizer', // a snake_case string
        pack: 'my_app', // a snake_case string
        description: 'Describe here if you want how your anonymizer works.'
    )]
    class MyAnonymizer extends AbstractAnonymizer
    {
        // ...
    }

Contribute

  • 📢 · Share
    • Talk about it
    • Star it on GitHub
  • 🪲 · Report
    • Missing documentation
    • Issues in code
  • 🧑‍💻 · Code

We need you to make the DbToolsBundle awesome!

Thanks! Any questions?

Resources: