Studying DQN and A3C
I’ve been reading and studying the above algorithms in anticipation of implementing the game-playing agent for my project.
https://jaromiru.com/2016/09/27/lets-make-a-dqn-theory/
https://jaromiru.com/2017/02/16/lets-make-an-a3c-theory/
https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-8-asynchronous-actor-critic-agents-a3c-c88f72a5e9f2?fbclid=IwAR2eYgYjd3izbrxKmd1BzcVMLb7ekFfPIiiFUFSvWuBx4QoCe2GzCesfX4U
The Discovery of A3C!
So... It’s been a while... 
While I don’t really have an excuse, I can assure you that it’s not because I haven’t been working.
The amount of work I’ve done has not been astronomical, but the main reason I’ve not been posting is a mix of laziness and not really knowing what to say.
So far, I have been studying the basics of reinforcement learning. This includes the k-armed bandit problem, action-value methods, and algorithms for the explore-exploit dilemma (epsilon-greedy, UCB1, Thompson sampling), and I have made a start on MDPs. I have also (kind of) implemented a toy Tic-Tac-Toe agent that will never lose a game.
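For reference, here is a minimal sketch of the epsilon-greedy action-value method on a k-armed bandit. It is an illustration only, not the code written for this project: the function name `run_epsilon_greedy`, the Gaussian reward noise, and the arm means are all assumptions made for the example.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Play a k-armed bandit with epsilon-greedy action selection.

    true_means: hypothetical mean reward of each arm (unknown to the agent).
    Returns the agent's reward estimates and how often each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # running sample-average estimate per arm
    counts = [0] * k       # number of pulls per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(k), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian noise around the arm's true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental sample-average update: Q <- Q + (r - Q) / n.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.0, 0.5, 2.0])
    print("estimates:", [round(q, 2) for q in estimates])
    print("pull counts:", counts)
```

With a small epsilon the agent spends most of its pulls on whichever arm currently looks best, while the occasional random pull keeps the other estimates from going stale. UCB1 and Thompson sampling replace the random exploration step with something more targeted.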
While this is all well and good, I don’t feel that I have spent my time well over the last month and a half. I feel as if I could be so much further into my project, especially since all that I have done so far is study. I haven’t even begun any kind of implementation of the final piece.
So in light of this, I started looking at different algorithms used to implement an RL agent that can play Atari games. One that I came across was A3C, the successor to DeepMind’s Deep Q-Learning algorithm.
I plan on attempting an implementation of this algorithm using TensorFlow, and I also plan to attempt making posts like this more regularly (ideally every day, but I doubt that will last long).
You can expect the next slew of posts to be about me struggling with Ubuntu to get dependencies working and a whole host of other fun stuff.
Stefan
Here’s the link to my Proposal in Google Docs
Submitted my Project Proposal
I recently submitted my project proposal for my Final Project. I decided against my original idea as it seemed to be a very hard problem to tackle, and in the words of my supervisor would be “better suited to a PhD student”.
The main reasoning behind his words was that the idea wasn’t very well developed (regarding previous research in the relevant fields), and in order to pull it off I would have to do a large amount of not only self-motivated learning (something I’m not against, but which in the context of a 3rd year undergraduate project would be overkill), but also cross-departmental inquiry, as it would involve computing, sociology and psychology.
I instead decided to go for a project more in line with work that I have done/am currently doing: developing a Reinforcement Learning agent to play Atari 2600 games.
This, I feel, is a much better choice as it’s more focused and there is a large amount of previous research already done on the topic which I can use to propel my progress. 
Overall I’m quite excited about it. I look forward to learning more about RL and Machine Learning in general.
First Thoughts (and Affective Computing)
Before I begin: while browsing the tensorflowtutorial blog here on Tumblr, I found a mention of something called “Affective Computing” (“the study and development of systems and devices that can recognise, interpret, process, and simulate human affects”). This seems to have some relation to my project idea, in that I want to detect and measure (or at least see whether it is possible to detect and measure) small intricacies in a user’s interaction with a simulation, in order to extract some metric of embedded bias born from their personal life experiences and to measure how this bias affects the decisions they make. The simulation could be related to a real-world problem, maybe a political one (something where you can get an effective reaction), or be some kind of gamified version of that.
I suppose the first problem in my project would be to make the software. To do that we need to think about what is needed of the software: some kind of metric of user bias, and a methodology for obtaining that data in order to implement it.
Metric of user bias
There are a few ways to go about this that I can think of off the top of my head: classification, some non-discrete measurement, or some form of psychometrics? (must look into this)
Methodology
There’s this paper on Affect Classification, where the researchers have three people read out lines, grouped into approving and disapproving sets, as if they were talking to children. They classified these spoken lines using “an optimal combination of six acoustic measurements”, with results obtained “using Gaussian probability models for each feature in isolation”, i.e. they measured specific vocal features (energy change, spectral tilt), achieving accuracies upwards of 65%. I suppose the simulation would have to have an air of that about it: trying to get the user to react in the same or a similar way as they would to a real-world problem, then measuring that reaction and classifying it. The only concern I have is one that is also noted in the paper: are these simulated reactions going to be accurate representations of real-world reactions? (This leads me to want to use real-world problems/data with a lot of opposing opinions.)
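To make the “Gaussian probability model for a single feature” idea concrete, here is a rough sketch of how such a classifier could look. It is not the paper’s method or data: the feature values are made-up numbers standing in for something like spectral tilt, and the names `fit_gaussian`, `log_likelihood` and `classify` are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_gaussian(values):
    """Fit a 1-D Gaussian to one feature: return (mean, standard deviation)."""
    return mean(values), stdev(values)


def log_likelihood(x, mu, sigma):
    """Log of the Gaussian density N(x; mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def classify(x, models):
    """Return the label whose Gaussian gives the feature value x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, *models[label]))


# Made-up training values for a single acoustic feature (illustrative only).
approving = [0.8, 1.1, 0.9, 1.3, 1.0]
disapproving = [-0.7, -1.0, -0.4, -0.9, -1.2]

# One Gaussian per class, fitted on that single feature in isolation.
models = {
    "approving": fit_gaussian(approving),
    "disapproving": fit_gaussian(disapproving),
}

if __name__ == "__main__":
    for value in (1.0, -0.8):
        print(value, "->", classify(value, models))
```

Combining several features, as the quoted “optimal combination of six acoustic measurements” suggests, would need some way of merging the per-feature models; that step is left out here.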
Not quite sure yet, will do further research into these ideas.
Project Ideation submitted
Submitted my project ideation on Monday. I don’t think I’m 100% solid on my initial idea yet though. It seems very ‘conceptual’, and I’m not really sure how I’d go about doing it... Either way, there’s still some time to try and come up with solid foundations so I guess we shall see.
Also, I hope it doesn’t come across as pretentious? I sometimes feel I can get a bit lofty when I express topics I’m passionate about. xD
Anyway, here is the full document:
Stefan Roesch Project Ideation
Contexts
1. Research Project: As I intend to do a PhD, I am mainly focused on improving my ability to do good research.
2. Using games to articulate some problem: I love games and would love to use them in some way to solve real world problems. I believe that they have a unique ability to not only provide entertainment, but articulate concepts, both intuitively and with great depth.
3. Interactive simulation of some problem to explore outcomes: I want to not only define problems in an intuitive way, but use the fact that games can be very personal, in respect to user input, to understand how personal bias can affect an output.
4. Exploring a technical intricacy: While I understand that using games can be a good way to tap into an individual’s personal or social experiences, I would be interested in exploring this application with technical problems, e.g. how one’s educational background can affect methods used to solve specific problems.
5. Accessible Interfaces
Techniques
1. Machine Learning (Keras/TensorFlow or MATLAB)
2. Reinforcement Learning 
3. Simulation 
4. Using Vulkan or some other graphics API
5. Artificial Intelligence
Inspirations
DeepMind’s Alpha Projects: In these projects, Google’s DeepMind use Reinforcement Learning to train a model to play both board games (Chess, Go) and some Atari 2600 games to a superhuman level. This is relevant to my choices as the technologies they use are machine learning based, and they are used to try and solve a problem, namely intelligence.
Mini Metro: Mini Metro is a game where the aim, essentially, is to manage a tube system. The player does this through manipulating a tube map style interface. The main inspiration this game gave me was its simple art design, which is also very intuitive and familiar (at least to a Londoner).
Python: Python is a programming language that is easy to use for people who aren’t computer scientists. I add this as an inspiration because it is a tool that is used to do sophisticated computations, but the user is not necessarily highly literate in the intricacies of computing. E.g. Keras: Keras is a high level deep learning library, and it is very usable without a large amount of prior knowledge in the intricacies of machine learning; it can be used as a first step in the field.
Initial Ideas: 
1. A research project using some form of Machine Learning to create an interactive simulation of real data to articulate some problem and to extract and process a user’s solution to it. This solution would be based on the user’s interactions with the simulation and would highlight certain biases as to why the user took specific steps to solve the problem. The main interest is to provide insight into why people make the decisions they make given their situation.
2. A Machine Learning model that would learn to play a specific game to a superhuman or near superhuman level, providing strategies and play styles that a human might never come up with. This harks back to research done by the company DeepMind to create similar AIs that play games better than humans. The point of interest here, however, would be to learn new ways of playing a game to a high level without this human bias, instead of “Solving Intelligence”.
3. A “god game” that uses machine learning to teach NPCs certain instincts and have them learn and react to the player’s inputs in an interesting way. E.g. the AI is told that it needs to eat to survive, and the player rewards the AI for doing some arbitrary task. The AI learns that by doing that task it is rewarded with what it needs to survive, and thus starts to not only repeat that task, but maybe come up with interesting ways of performing that task with less effort.
Project Ideation Started
I’m thinking of some research project that uses Machine Learning to build an interactive simulation of some problem, where through the manipulation of this simulation you can come to a ‘custom’ solution to it.