This repository contains the code for the paper "Differentially Private Release of Hierarchical Origin/Destination Data with a TopDown Approach".
To run the code, first create the conda environment from the environment.yml file:
conda env create -f environment.yml
Then activate the environment:
conda activate top-down
To generate the synthetic datasets, run the shell files in the /run_command/run_preprocess/ folder. The folder contains
two shell files, one for the binary tree and one for the random tree. You can change the parameters of the
synthetic dataset in the shell files, such as the sparsity, the number of levels, and the seed of the random number generator.
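To give a feel for what these three parameters control, here is a minimal Python sketch. It is not the generator used by the shell files: the function names synthetic_od_matrix and aggregate_one_level, the max_count argument, and the choice of uniform random counts are all assumptions made for illustration. It builds a sparse origin/destination matrix over the leaves of a complete binary tree and aggregates it one level up.

```python
import random


def synthetic_od_matrix(levels, sparsity, seed, max_count=100):
    """Hypothetical helper: OD counts between the 2**levels leaves of a binary tree.

    Each entry is zero with probability `sparsity`; otherwise it is a uniform
    random count in [1, max_count]. The same seed always gives the same matrix.
    """
    rng = random.Random(seed)
    n_leaves = 2 ** levels
    return [
        [0 if rng.random() < sparsity else rng.randint(1, max_count)
         for _ in range(n_leaves)]
        for _ in range(n_leaves)
    ]


def aggregate_one_level(od):
    """Hypothetical helper: merge sibling leaves pairwise (2i, 2i+1) into parent i."""
    n_parents = len(od) // 2
    return [
        [od[2 * i][2 * j] + od[2 * i][2 * j + 1]
         + od[2 * i + 1][2 * j] + od[2 * i + 1][2 * j + 1]
         for j in range(n_parents)]
        for i in range(n_parents)
    ]


if __name__ == "__main__":
    od = synthetic_od_matrix(levels=3, sparsity=0.8, seed=0)
    parents = aggregate_one_level(od)
    print(len(od), "leaves,", len(parents), "parent nodes")
```

Higher sparsity values give more zero flows, and each additional level doubles the number of leaves. Aggregation preserves the total number of trips, which is the kind of hierarchical consistency the parent and child levels are expected to satisfy.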
The Italian dataset must be downloaded from the ISTAT website.
Place the downloaded files in the /preprocess_data directory.
Then run the Python script:
python preprocess_data/preprocess_ISTAT_data.py
This generates a data folder containing the datasets.
The experiment on the Italian dataset can be run using the shell file /run_command/Italy.sh:
cd run_command
./Italy.sh
The experiments on the synthetic dataset can be run using the shell file /run_command/all_synthetic.sh:
cd run_command
./all_synthetic.sh
The experiments will generate new results.pickle files in the /results folder. The results presented in the paper
are already present in that folder as results_1.pickle. To plot the results, you can use the Jupyter notebook in the
/notebooks folder.
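If you want to inspect a results file outside the notebook, the following Python sketch loads it with the standard pickle module. The helper names load_results and describe are hypothetical, and the structure of the stored object is not documented here, so the sketch only reports its type and top-level keys or length. It assumes you run it from the repository root with the top-down environment active, since unpickling may need the same packages that produced the file.

```python
import pickle
from pathlib import Path


def load_results(path):
    """Hypothetical helper: load one results pickle. Only open files you trust."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def describe(obj):
    """Hypothetical helper: one-line summary of the loaded object."""
    summary = type(obj).__name__
    if isinstance(obj, dict):
        summary += " with keys " + ", ".join(repr(key) for key in obj)
    elif hasattr(obj, "__len__"):
        summary += f" of length {len(obj)}"
    return summary


if __name__ == "__main__":
    # Path taken from the text above, relative to the repository root.
    results_path = Path("results") / "results_1.pickle"
    if results_path.exists():
        print(describe(load_results(results_path)))
    else:
        print(f"{results_path} not found; run this from the repository root")
```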