Release Notes


  • v1.4 [20Q1]
    • 1.4.1:
      • tensorforce reintegrated (due to an incompatibility between tfagents and tensorforce, tensorforce must be explicitly activated by calling agents.activate_tensorforce())
      • upgrade to tfagents 0.3, tensorflow 2.0.1
      • kwargs for register_with_gym
    • 1.4.0: agent saving & loading (see intro 'Saving & loading a trained policy'); lineworld included as test environment
  • v1.3 [19Q4]
    • 1.3.1: agent.score replaced by agent.evaluate
    • 1.3.0: migration to tensorflow 2.0; support for tensorforce and keras-rl suspended until support for tf 2.0 is available
  • v1.2 [19Q3]
    • 1.2.2: fix for CemAgent and SacAgent default backend registration
    • 1.2.1: SacAgent for tfagents preview; notebook on 'Agent logging, seeding and jupyter output cells'
    • 1.2.0: Agent.score
  • v1.1 [19Q3]
    • 1.1.23: CemAgent for keras-rl backend; DqnAgent, RandomAgent for tensorforce
    • 1.1.22: DuelingDqnAgent, DoubleDqnAgent with keras-rl backend
    • 1.1.21: keras-rl backend (dqn)
    • 1.1.20: fix for #54 (logging in jupyter notebook); doc updates
    • 1.1.19:
      • jupyter plotting performance improved
      • plot.ToMovie with support for animated gifs
    • 1.1.18: tensorforce backend (ppo, reinforce)
    • 1.1.11:
      • plot.StepRewards, plot.Actions
      • default_plots parameter (instead of default_callbacks)
  • v1.0.1 [19Q3]
    • api based on pluggable backends and callbacks (for plotting, logging, training durations)
    • backend: tf-agents, default
    • algorithms: dqn, ppo, random
    • plots: State, Loss (including actor-/critic loss), Steps, Rewards
    • support for creating a mp4 movie (plot.ToMovie)
  • v0.1 [19Q2]
    • prototype implementation / proof of concept
    • hard-wired support for Ppo, Reinforce, Dqn on tf-agents
    • hard-wired plots for loss, sum-of-rewards, steps and state rendering
    • hard-wired mp4 rendering

Design guidelines


  • separate the "public api" from the concrete implementation using a frontend / backend architecture (inspired by scikit-learn, matplotlib, keras)
  • pluggable backends
  • extensible through callbacks (inspired by keras), with separate callback types for training, evaluation and monitoring
  • pre-configurable, algorithm specific train & play loops
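
The guidelines above can be illustrated with a minimal, self-contained sketch. It is not the library's actual code: every name in it (TrainCallback, LossLogger, BackendAgent, DummyDqnBackend, register_backend, and this simplified DqnAgent) is hypothetical and chosen only to show how a frontend class might delegate to a pluggable backend and notify callbacks from a pre-configured train loop.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class TrainCallback:
    """Hypothetical callback base class; hooks are called by the train loop."""

    def on_train_begin(self) -> None:
        pass

    def on_iteration_end(self, iteration: int, loss: float) -> None:
        pass


class LossLogger(TrainCallback):
    """Example callback that records the loss reported after each iteration."""

    def __init__(self) -> None:
        self.losses: List[float] = []

    def on_iteration_end(self, iteration: int, loss: float) -> None:
        self.losses.append(loss)


class BackendAgent(ABC):
    """Backend side: the concrete, algorithm-specific implementation."""

    @abstractmethod
    def train(self, num_iterations: int, callbacks: List[TrainCallback]) -> None:
        ...


class DummyDqnBackend(BackendAgent):
    """Stand-in backend (assumption); a real one would wrap an RL library."""

    def train(self, num_iterations: int, callbacks: List[TrainCallback]) -> None:
        for callback in callbacks:
            callback.on_train_begin()
        for iteration in range(num_iterations):
            loss = 1.0 / (iteration + 1)  # placeholder value, not a real loss
            for callback in callbacks:
                callback.on_iteration_end(iteration, loss)


# Registry that makes backends pluggable: backend name -> factory.
_BACKENDS: Dict[str, Callable[[], BackendAgent]] = {}


def register_backend(name: str, factory: Callable[[], BackendAgent]) -> None:
    _BACKENDS[name] = factory


class DqnAgent:
    """Frontend ("public api"): selects a backend by name and delegates to it."""

    def __init__(self, backend: str = "dummy") -> None:
        if backend not in _BACKENDS:
            raise ValueError(f"unknown backend: {backend!r}")
        self._backend_agent = _BACKENDS[backend]()

    def train(self, num_iterations: int = 3,
              callbacks: Optional[List[TrainCallback]] = None) -> None:
        self._backend_agent.train(num_iterations, list(callbacks or []))


register_backend("dummy", DummyDqnBackend)

logger = LossLogger()
DqnAgent(backend="dummy").train(num_iterations=3, callbacks=[logger])
print(logger.losses)
```

In this sketch, user code only touches the frontend class and the callbacks, so a further backend can be added through register_backend without changing the public api.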

Class diagram


[Figure: class diagram]