Release Notes
v1.4 [20Q1]
1.4.1:
tensorforce reintegrated (due to an incompatibility between tfagents and tensorforce, tensorforce must be explicitly activated by a call to agents.activate_tensorforce())
upgrade to tfagents 0.3, tensorflow 2.0.1
kwargs for register_with_gym
1.4.0: agent saving & loading (see intro 'Saving & loading a trained policy');
lineworld as test environment included
v1.3 [19Q4]
1.3.1: agent.score replaced by agent.evaluate;
1.3.0: migration to tensorflow 2.0;
support for tensorforce and keras-rl suspended until support for tf 2.0 is available
v1.2 [19Q3]
1.2.2: fix for CemAgent and SacAgent default backend registration
1.2.1: SacAgent for tfagents preview; notebook on 'Agent logging, seeding and jupyter output cells'
1.2.0: Agent.score
v1.1 [19Q3]
1.1.23: CemAgent for keras-rl backend; DqnAgent, RandomAgent for tensorforce
1.1.22: DuelingDqnAgent, DoubleDqnAgent with keras-rl backend
1.1.21: keras-rl backend (dqn)
1.1.20: #54 logging in jupyter notebook solved, doc updates
1.1.19:
jupyter plotting performance improved
plot.ToMovie with support for animated gifs
1.1.18: tensorforce backend (ppo, reinforce)
1.1.11:
plot.StepRewards, plot.Actions
default_plots parameter (instead of default_callbacks)
v1.0.1 [19Q3]
api based on pluggable backends and callbacks (for plotting, logging, training durations)
backend: tf-agents, default
algorithms: dqn, ppo, random
plots: State, Loss (including actor-/critic loss), Steps, Rewards
support for creating a mp4 movie (plot.ToMovie)
v0.1 [19Q2]
prototype implementation / proof of concept
hard-wired support for Ppo, Reinforce, Dqn on tf-agents
hard-wired plots for loss, sum-of-rewards, steps and state rendering
hard-wired mp4 rendering
Design guidelines
separate "public api" from concrete implementation using a frontend / backend architecture (inspired by scikit learn, matplotlib, keras)
pluggable backends
extensible through callbacks (inspired by keras), with separate callback types for training, evaluation and monitoring
pre-configurable, algorithm specific train & play loops
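The guidelines above can be illustrated with a minimal, self-contained sketch. It is not the library's actual code: all names here (Backend, DummyBackend, TrainCallback, LossLog, register_backend, Agent) are hypothetical and chosen only to show how a frontend class can delegate to a pluggable backend and report progress through callbacks.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional


class TrainCallback:
    """Hypothetical training callback; override only the hooks you need."""

    def on_train_begin(self) -> None:
        pass

    def on_iteration_end(self, iteration: int, loss: float) -> None:
        pass

    def on_train_end(self) -> None:
        pass


class Backend(ABC):
    """Hypothetical backend interface: one concrete subclass per library."""

    @abstractmethod
    def train(self, iterations: int, callbacks: List[TrainCallback]) -> None:
        ...


class DummyBackend(Backend):
    """Stand-in backend that reports a fake, decreasing loss."""

    def train(self, iterations: int, callbacks: List[TrainCallback]) -> None:
        for cb in callbacks:
            cb.on_train_begin()
        for i in range(iterations):
            loss = 1.0 / (i + 1)  # placeholder for a real training step
            for cb in callbacks:
                cb.on_iteration_end(i, loss)
        for cb in callbacks:
            cb.on_train_end()


# Registry mapping backend names to factories; this makes backends pluggable.
_BACKENDS: Dict[str, Callable[[], Backend]] = {}


def register_backend(name: str, factory: Callable[[], Backend]) -> None:
    _BACKENDS[name] = factory


class Agent:
    """Frontend ("public api"): knows nothing about the concrete backend."""

    def __init__(self, backend: str = "dummy"):
        if backend not in _BACKENDS:
            raise ValueError(f"unknown backend: {backend!r}")
        self._backend = _BACKENDS[backend]()

    def train(self, iterations: int = 3,
              callbacks: Optional[List[TrainCallback]] = None) -> None:
        self._backend.train(iterations, list(callbacks or []))


class LossLog(TrainCallback):
    """Example callback that records the loss of every iteration."""

    def __init__(self) -> None:
        self.losses: List[float] = []

    def on_iteration_end(self, iteration: int, loss: float) -> None:
        self.losses.append(loss)


register_backend("dummy", DummyBackend)
```

In this sketch, calling Agent(backend="dummy").train(callbacks=[LossLog()]) runs the loop of the registered backend and fills the callback's loss list. Adding another backend only requires a further register_backend call, and the frontend class stays unchanged.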
Class diagram