Nice little demo of how to use TensorFlow to implement reinforcement learning. I didn’t get results anywhere near as good as indicated, though: after a day and a half on my GTX 1070, the score never got much above 40. It’s worth noting that training with the display on slows things down by a factor of 2 or 3. It took me a while to realize this, so a lot of my training ran at about half the speed it should have, which may be the explanation.
I was seeing 50 to 70 iterations per second with the display off and 25 per second with it on. Another interesting thing is that memory use grows continuously: I had to restart it a few times because it was up to 20GB!
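For anyone who wants to check the same two numbers on their own machine, here is a rough sketch of how they could be measured. It is not taken from the demo's code: `measure`, `peak_rss_mb` and the `step` callable are hypothetical names, and `step` stands in for whatever runs one training iteration. It assumes a Unix-like system, since the standard-library `resource` module is not available on Windows, and note that `ru_maxrss` is the peak resident memory so far rather than the current figure.

```python
import sys
import time

try:
    import resource  # Unix only; missing on Windows
except ImportError:
    resource = None


def peak_rss_mb():
    """Peak resident memory of this process in MB, or None if unavailable."""
    if resource is None:
        return None
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    if sys.platform == "darwin":
        return rss / (1024 * 1024)
    return rss / 1024


def measure(step, n_steps):
    """Call step() n_steps times; return (iterations per second, peak RSS in MB)."""
    start = time.perf_counter()
    for _ in range(n_steps):
        step()  # hypothetical: one training iteration
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return n_steps / elapsed, peak_rss_mb()
```

Calling `measure` on successive batches of iterations, once with the display on and once with it off, would show both the speed difference and whether the peak memory figure keeps climbing between batches.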
Incidentally, the code implements the ideas in this paper from DeepMind. Definitely worth a read.