Cracking Flappy Bird with Python

Posted by 猿友, 2018-08-06 18:27:50


This article is reposted from the personal Zhihu column of Zhihu user Charles（白露未晞）.


Introduction

Yesterday I was browsing highly starred open-source deep learning projects on GitHub and came across this interesting one:

Cracking the Flappy Bird game with deep reinforcement learning (deep Q-learning).

Reference

The content is based mainly on the following open-source GitHub project:

Using Deep Q-Network to Learn How To Play Flappy Bird

Link:

https://github.com/yenchenlin/DeepLearningFlappyBird

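How It Works

This project follows the deep Q-learning algorithm from deep reinforcement learning and shows that the algorithm generalizes to the Flappy Bird game. In other words, the project trains with a variant of Q-learning whose input is raw pixels and whose output is a value function that estimates the future reward of each action.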
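As a rough illustration of the Q-learning idea described above, the sketch below computes the standard one-step Q-learning target for a single transition. The function names and the discount factor `gamma=0.99` are assumptions made for this example and are not taken from the project's code.

```python
def q_learning_target(reward, next_q_values, terminal, gamma=0.99):
    """Return the one-step Q-learning target for a single transition.

    reward        -- reward received after taking the action
    next_q_values -- estimated Q-values of every action in the next state
    terminal      -- True if the episode ended (e.g. the bird crashed)
    gamma         -- discount factor (assumed value, for illustration only)
    """
    if terminal:
        # No future reward is available after a terminal state.
        return reward
    # Bellman target: immediate reward plus discounted best future value.
    return reward + gamma * max(next_q_values)


def squared_td_error(q_value, target):
    """Squared difference between the current estimate and its target."""
    return (target - q_value) ** 2
```

During training, a network would be adjusted so that its estimate for the chosen action moves toward this target, which is what minimizing `squared_td_error` expresses.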

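PS:

If you are interested in deep reinforcement learning, the files shared through the official account also include a paper titled Demystifying Deep Reinforcement Learning, which the original author strongly recommends.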

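Network architecture:

Before the frames reach the network, they are preprocessed as follows:

(1) Convert the image to grayscale;

(2) Resize the image to 80×80;

(3) Stack every 4 frames into an 80×80×4 input array.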
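The three preprocessing steps above can be sketched with NumPy alone. This is a minimal illustration, not the project's actual code: the function names, the luminance weights, and the nearest-neighbor resize are assumptions, and in practice OpenCV's `cv2.cvtColor` and `cv2.resize` would be a typical choice for steps (1) and (2).

```python
import numpy as np


def preprocess(frame_rgb, size=80):
    """Turn an H x W x 3 frame into a size x size grayscale image."""
    # Step 1: grayscale via a luminance-weighted sum of the color channels.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = frame_rgb.astype(np.float32) @ weights
    # Step 2: nearest-neighbor resize by sampling evenly spaced rows/columns.
    height, width = gray.shape
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return gray[rows][:, cols]


def stack_frames(frames):
    """Step 3: stack the 4 most recent frames along a new last axis."""
    return np.stack(frames[-4:], axis=2)
```

For example, calling `preprocess` on four consecutive frames and passing the results to `stack_frames` yields one array of shape `(80, 80, 4)`, matching the input described above.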

The network's final output is a 2×1 matrix that decides whether the bird acts (that is, whether to tap the screen).
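To show how a two-element output can be turned into a decision, here is a small sketch that picks the action with the larger estimated value. The index convention (0 = do nothing, 1 = flap), the function name, and the epsilon-greedy exploration step are assumptions for illustration; epsilon-greedy is a common choice in DQN training, but the text above does not state which exploration scheme is used.

```python
import random


def choose_action(q_values, epsilon=0.1, rng=random):
    """Pick an action index from a list of Q-values.

    Assumed convention for this sketch: index 0 = do nothing, index 1 = flap.
    With probability epsilon a random action is taken (exploration);
    otherwise the action with the highest Q-value is chosen (exploitation).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Setting `epsilon=0.0` makes the choice purely greedy, which is the usual setting when only watching a trained agent play.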

Test Environment

Operating system: Win10

Python version: 3.5.4

Third-party Python libraries:

TensorFlow_GPU version: 1.4.0

Pygame version: 1.9.3

OpenCV-Python version: 3.3.0

For configuration details, please refer to the relevant online documentation.

Running the Demo

In a command-line window, change into the DeepLearningFlappyBird folder, type py -3.5 deep_q_network.py, and press Enter to run:

The result is as follows:

Further References

(1) Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis. Human-level Control through Deep Reinforcement Learning. Nature, 518, 529-533, 2015.

(2) Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. Playing Atari with Deep Reinforcement Learning. NIPS Deep Learning Workshop.

(3) Kevin Chen. Deep Reinforcement Learning for Flappy Bird. Report | YouTube result.

Link:

https://youtu.be/9WKBzTUsPKc

(4) https://github.com/sourabhv/FlapPyBird

(5) https://github.com/asrivat1/DeepLearningVideoGames
