causal_nie

Reinforcement learning of causal variables using mediation analysis

    Installation & setup

    First, install pipenv in your preferred way (see documentation). Then, create a virtual environment for the project and install the needed packages into it by running pipenv install in the project root folder.
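
    For example, a typical setup might look like the following (pip is just one of several ways to install pipenv; see the pipenv documentation for alternatives):

        pip install --user pipenv    # install pipenv (one option among several)
        cd causal_nie                # project root folder
        pipenv install               # create the environment and install packages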

    Some scripts log data using Neptune. This requires either setting up a free account or, for testing purposes, using the anonymous API token. To use the anonymous API, set the environment variable NEPTUNE_API_TOKEN='ANONYMOUS' and pass --neptune_project 'shared/onboarding' as an argument to the script. If you instead set up a Neptune user, set the API token to your personal key and pass a personal project name to the script. When a script runs, it prints a URL to the online Neptune UI on standard output.
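
    For example, an anonymous test run could be configured as follows, where scripts.experiment stands for any of the scripts described below:

        export NEPTUNE_API_TOKEN='ANONYMOUS'
        pipenv run python -m scripts.experiment --neptune_project 'shared/onboarding'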

    Running scripts for the twostage experiment

    It is recommended to start with this experiment due to its relatively short training time. The main way to interact with the code is through the scripts located in the scripts/ folder. To run a script in the pipenv environment created above, run pipenv run python -m scripts.experiment, where experiment is the name of the script; arguments can be passed as --arg val.
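
    For example, assuming the script for this experiment is named scripts/twostage.py (check the scripts/ folder for the exact name), a run with the anonymous Neptune setup could look like:

        pipenv run python -m scripts.twostage --neptune_project 'shared/onboarding'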

    Running scripts for the doorkey experiment

    The doorkey experiment is divided into three phases:

    • Train the policy π_a (scripts/doorkey_pia)
    • Train the causal variable Φ (scripts/doorkey_phi)
    • Train π_b (scripts/doorkey_pib)

    In addition, it is possible to specify a cross-entropy target for the second script. Example invocations for the three phases are sketched below.
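
    Using the script paths above, the phases could be run in order as follows (any additional arguments, such as the cross-entropy target, are passed as --arg val):

        pipenv run python -m scripts.doorkey_pia    # phase 1: train the policy pi_a
        pipenv run python -m scripts.doorkey_phi    # phase 2: train the causal variable Phi
        pipenv run python -m scripts.doorkey_pib    # phase 3: train pi_b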

    The scripts were originally run in parallel on a cluster; however, since that implementation is specific to the cluster (machine names, directories, etc.), we include a simplified version of the scripts that executes locally. The workflow is similar (a minimal sketch follows the list):

    • Load a resource dictionary
    • Load saved models/recorded statistics from disk
    • Run the model for a number of episodes (typically ~1000); running in chunks like this avoids tying up the same machine for an excessive time, but is not necessary
    • Save model and exit
    • Repeat
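
    A minimal Python sketch of this loop is shown below. All names here (CHECKPOINT, run_episode, the pickle-based checkpoint format) are illustrative assumptions, not the repository's actual interfaces:

        import os
        import pickle

        CHECKPOINT = "checkpoint.pkl"   # hypothetical checkpoint file
        EPISODES_PER_CHUNK = 1000       # run in chunks to avoid tying up one machine

        def run_episode(state):
            # Placeholder for a single training episode; a real implementation
            # would step the environment and update the model and statistics.
            state["episode"] += 1

        def run_chunk():
            # Load saved model/recorded statistics from disk, if present.
            if os.path.exists(CHECKPOINT):
                with open(CHECKPOINT, "rb") as f:
                    state = pickle.load(f)
            else:
                state = {"episode": 0}   # fresh resource/state dictionary

            # Run the model for a fixed number of episodes.
            for _ in range(EPISODES_PER_CHUNK):
                run_episode(state)

            # Save the model and exit; calling run_chunk() again resumes ("Repeat").
            with open(CHECKPOINT, "wb") as f:
                pickle.dump(state, f)

        if __name__ == "__main__":
            run_chunk()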

    Citing

    @inproceedings{herlau20,
    	title={Reinforcement Learning of Causal Variables using Mediation Analysis},
    	author={Herlau, Tue and Larsen, Rasmus},
    	booktitle={Proceedings of the 36th AAAI Conference on Artificial Intelligence},
    	year={2022}
    }