PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Actor Critic using Kronecker-Factored Trust Region (ACKTR), and Generative Adversarial Imitation Learning (GAIL).
Multi-Robot Warehouse (RWARE): A multi-agent reinforcement learning environment
My solution to Project 3 - Collaboration and Competition using MADDPG
Code for paper 'Learning transferable cooperative behaviors in multi-agent teams' (ICML 2019)
Environment generation code for the paper "Emergent Tool Use From Multi-Agent Autocurricula"
Source code for "Efficient training techniques for multi-agent reinforcement learning in combatant tasks".