Q-Learning with TensorFlow

Nice little demo here of how to use TensorFlow to implement reinforcement learning on Breakout. I didn’t get results anywhere near as good as indicated, though. After a day and a half of training on my GTX 1070, I didn’t see the score get much above 40. It’s worth noting that training with the display on slows the process down by a factor of 2 or 3. It took me a while to realize this, so a lot of the training ran at about half the speed it should have, and maybe that’s the explanation.

I was seeing 50 to 70 iterations per second with display off, 25 per second with the display on. Another interesting thing is that it continuously chews up memory – I had to restart it a few times because it was up to 20GB!

Incidentally, the code implements the ideas in this paper from DeepMind. Definitely worth a read.
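
For anyone who hasn’t met Q-learning before, here is a minimal sketch of the update rule at its core. This is a tabular toy and not the demo’s code: the demo trains a neural network in TensorFlow to estimate the Q-values, whereas this just keeps them in a dictionary. The names (`td_target`, `q_update`, `alpha`, `gamma`) and the default values are made up for illustration.

```python
from collections import defaultdict


def td_target(reward, next_q_values, done, gamma=0.99):
    """Target value: r if the episode ended, else r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q, state, action, reward, next_state, done,
             n_actions, alpha=0.1, gamma=0.99):
    """Move Q(state, action) a fraction alpha of the way toward the target."""
    next_q = [q[(next_state, a)] for a in range(n_actions)]
    target = td_target(reward, next_q, done, gamma)
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical single step: in state 0, take action 1, get reward 1.0,
# and land in state 1. All Q-values start at zero.
q = defaultdict(float)
q_update(q, state=0, action=1, reward=1.0, next_state=1,
         done=False, n_actions=2)
```

A deep Q-network uses the same target, but takes a gradient step on the squared difference between the network’s prediction and the target instead of editing a table entry directly.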

