Skip to content

asappresearch/emergent-comms-negotiation

Repository files navigation

"Emergent Communication through Negotiation"

Reproduce https://openreview.net/forum?id=Hk6WhagRW&noteId=Hk6WhagRW , "Emergent Communication through Negotation", ICLR 2018 anonymous submission.

To install

  • install pytorch 0.2, https://pytorch.org
  • download this repo, git clone https://github.com/asappinc/emergent_comms_negotiation

To run

python ecn.py [--disable-comms] [--disable-proposal] [--disable-prosocial] [--enable-cuda] [--term-entropy-reg 0.5] [--utterance-entropy-reg 0.0001] [--proposal-entropy-reg 0.01] [--model-file model_saves/mymodel.dat] [--name gpu3box]

Where options are:

  • --enable-cuda: use NVIDIA GPU, instead of CPU
  • --disable-comms: disable the comms channel
  • --disable-proposal: disable the proposal channel (ie agent can create proposals, but other agent cant see them)
  • --disable-prosocial: disable prosocial reward
  • --term-entropy-reg VALUE: termination policy entropy regularization
  • --utterance-entorpy-reg VALUE: utterance policy entropy regularization
  • --proposal-entropy-reg VALUE: proposal policy entropy regularization
  • --model-file models_saves/FILENAME: where to save the model to, and where to look for it on startup
  • --name NAME: this is used in the logfile name, just to make it easier to find/distinguish logfiles, no other purpose

Stdout layout

eg if we have:

   000000 4:4/0 7:5/5 9:4/4
                                      000000 4:5/0 6:1/5 7:2/4
   000000 4:0/0 7:0/5 9:1/4
                                      ACC
  r: 0.91

Then:

  • each of the first 4 lines here is the action of a single agent
  • the ACC line is the agent accepting previous proposal
  • each proposal line is laid out as:
  [utterance]   [utility 0]:[proposal 0]/[pool 0] ... etc ...
  • if the agents run out of time, last line will be [out of time]

One negotation is printed out every 3 seconds or so, using the training set; the other negotations are executed silently. There is no test set for now.

Results so far, summary

Agent sociability Proposal Linguistic Both None
Self-interested, random term >=0.80
Prosocial, random term ~0.91 ~0.83 ~0.96 >= 0.90

Notes:

  • prosocial runs all use termreg=0.5, uttreg=0.0001, propreg=0.01
  • self-interested run uses: termreg=0.05, uttreg=0.0001, propreg=0.005

Scenario details

Prop? Comm? Soc? Rend term? Term reg Utt reg Prop reg Subjective variance Reward Greedy ratios
Y Y Y Y 0.5 0.0001 0.01 Low ~0.96 term=0.7345 utt=0.7635 prop=0.8304
Y - Y Y 0.5 0.0001 0.01 Medium-High ~0.91 term=0.6965 utt=0.0000 prop=0.8741
- Y Y Y 0.5 0.0001 0.01 High ~0.83 term=0.6889 utt=0.7849 prop=0.8222
- - Y Y 0.5 0.0001 0.01 Very low >= 0.90 (climbing) term=0.7781 utt=0.0000 prop=0.6006
Y Y - Y 0.5 0.0001 0.01 Very High ~0.25 term=0.7467 utt=0.9284 prop=0.8137
Y Y - Y 0.05 0.0001 0.005 Very Low >= 0.80 (climbing) term=0.9820 utt=0.7040 prop=0.6523

Training curves

proposal, comms, prosocial

Three training runs, identical settings:

Proposal, no comms, prosocial

No proposal, comms, prosocial

No proposal, no comms, prosocial

Proposal, comms, no social

Run 1, same entropy regularization as prosocial graphs:

Run 2, with reduced entropy regularization:

Unit tests

  • install pytest, ie conda install -y pytest, and then:
py.test -svx
  • there are also some additional tests in:
python net_tests.py

(which allow close examination of specific parts of the network, policies, and so on; but which arent really 'unit-tests' as such, since neither termination criteria, nor success criteria)

Plotting graphs

Assumptions:

  • running the training on remote Ubuntu 16.04 instances
    • ssh access, as user ubuntu, to these instances
    • remote has home directory /home/ubuntu
    • logs are stored in subdirectory logs of current local directory
    • the location of logs relative to ~ is identical on local computer and remote computer

Setup/configuration:

  • copy instances.yaml.templ to ~/instances.yaml, on your own machine
    • configure ~/instances.yaml with:
      • name and ip of each instance (names are arbitrary)
      • the path to your private sshkey, that can access these instances

Procedure

  • run:
python merge.py --hostname [name in instances.yaml] [--logfile logs/log_20171104_1234.log] \
    [--title 'my graph title'] [--y-min 75 --y-max 85]

This will:

  • rsync the logs from the remote instance identified by --hostname
  • if --logfile is specified, load the results from that logfile
    • else, will look for the most recent logfile, ordered by name
  • plots the graph into /tmp/out-reward.png