Feb 28, 2018 2 min read Contest Theory

Contest Theory

What is Contest Theory? Contest theory is a tool for describing situations where agents compete, through costly effort, to win a scarce prize. Quick tips:
- agent: an abstraction of an economic decision maker; examples include workers, sports people, etc.
- costly effort: effort is limited, so the agent has to decide how much of it to spend. That is, they do not have infinite effort to throw at winning. Think of the contestants as tennis players: a player cannot give 110%; it is physically impossible.

Feb 28, 2018 4 min read Contest Theory

Discouragement Effect

What is the Discouragement Effect? The discouragement effect occurs when the future consequences of winning or losing the current contest lead to decreased effort (Konrad 2012). This is due to the future contest reducing the overall value of winning. Think about a tennis match with 3 sets. If a player loses the first set, then to win the entire match they must win 2 sets, whereas the other player only needs to win 1 set.
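The tennis example can be made concrete with a little arithmetic. The sketch below is an illustration only, written in Python rather than the R used elsewhere on this blog. It assumes (this is not in the text above) that the players are evenly matched and that each set is won independently with a fixed probability `p_set`, which ignores the effort choice itself. The function name `match_win_probability` is hypothetical.

```python
from fractions import Fraction


def match_win_probability(sets_needed, opponent_sets_needed, p_set=Fraction(1, 2)):
    """Chance of winning the match, given how many sets each player still needs.

    Assumption: every set is won independently with probability p_set.
    """
    if sets_needed == 0:
        return Fraction(1)
    if opponent_sets_needed == 0:
        return Fraction(0)
    # Either we win the next set (we need one fewer) or the opponent does.
    win_next = match_win_probability(sets_needed - 1, opponent_sets_needed, p_set)
    lose_next = match_win_probability(sets_needed, opponent_sets_needed - 1, p_set)
    return p_set * win_next + (1 - p_set) * lose_next


# Before the first set, both players need 2 sets.
before = match_win_probability(2, 2)
# After losing the first set, the trailing player needs 2 sets and the leader needs 1.
after_loss = match_win_probability(2, 1)

print(before, after_loss)
```

Under these assumptions the trailing player's chance of taking the match falls from 1/2 to 1/4, so the expected payoff from effort in the next set is lower for them than for the leader.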

Feb 10, 2018 1 min read Reinforcement Learning

Dynamic Programming: Chapter 4 IRL

Introduction

Policy Evaluation (Prediction)

Algorithm

Example

Policy Improvement

Algorithm

Example

Policy Iteration

Algorithm

Example

Value Iteration

Algorithm

Example

Asynchronous Dynamic Programming

Don't sweep the entire state space on every iteration; update states in any order instead.

Generalized Policy Iteration

Evaluation, then improvement, then evaluation, then improvement, and so on until the policy stops changing.
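The evaluate-then-improve loop can be sketched on a toy problem. This is a minimal illustration in Python (not the R used elsewhere on this blog), and it is not taken from the post itself. The 4-state chain, the discount `GAMMA = 0.9`, and the names `step`, `evaluate`, `improve`, and `policy_iteration` are all assumptions made for the example.

```python
# Hypothetical 4-state chain: states 0..3, state 3 is terminal.
# Actions: 0 = left, 1 = right. Every move costs -1 until the terminal state.
N_STATES, TERMINAL, GAMMA = 4, 3, 0.9
ACTIONS = (0, 1)


def step(state, action):
    """Deterministic transition: returns (next_state, reward)."""
    if state == TERMINAL:
        return state, 0.0
    next_state = max(state - 1, 0) if action == 0 else state + 1
    return next_state, -1.0


def evaluate(policy, theta=1e-8):
    """Iterative policy evaluation: sweep until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue
            next_state, reward = step(s, policy[s])
            new_value = reward + GAMMA * values[next_state]
            delta = max(delta, abs(new_value - values[s]))
            values[s] = new_value
        if delta < theta:
            return values


def improve(values):
    """Policy improvement: act greedily with respect to the current values."""
    policy = []
    for s in range(N_STATES):
        returns = []
        for a in ACTIONS:
            next_state, reward = step(s, a)
            returns.append(reward + GAMMA * values[next_state])
        policy.append(max(ACTIONS, key=lambda a: returns[a]))
    return policy


def policy_iteration():
    """Alternate evaluation and improvement until the policy is stable."""
    policy = [0] * N_STATES  # start from "always go left"
    while True:
        values = evaluate(policy)
        new_policy = improve(values)
        if new_policy == policy:
            return policy, values
        policy = new_policy


policy, values = policy_iteration()
print(policy, values)
```

Starting from a poor policy, each round of evaluation gives the improvement step better values to be greedy against, and the loop ends with "go right" chosen in every non-terminal state.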

Feb 10, 2018 2 min read Reinforcement Learning

Finite Markov Decision Processes: Chapter 3 IRL

Introduction In Markov Decision Processes you have:
* Agent: The decision maker / learner. The agent sends an action to the environment.
* Environment: Everything that is not the agent. The environment sends a reward back to the agent.
* Reward: The signal that the agent tries to maximize.

Example GridWorld Let's say we have a 5x5 grid. There are four possible actions: left, right, up, and down. If you reach the point (1,2) and move in any direction you receive a reward of 10 and are moved to the point (5,2).
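The agent/environment exchange in the GridWorld example can be sketched as a single `step` function that takes a state and an action and returns the next state and the reward. This is a rough Python illustration (the blog itself uses R) and it makes several assumptions not stated above: coordinates are read as (row, column), 1-indexed with row 1 at the top; an ordinary move earns a reward of 0; and a move that would leave the grid keeps the agent in place with a reward of -1.

```python
GRID_SIZE = 5
SPECIAL_STATE = (1, 2)   # from the text: any move from here pays 10 ...
SPECIAL_TARGET = (5, 2)  # ... and moves the agent here
SPECIAL_REWARD = 10

# Assumption: coordinates are (row, column), 1-indexed, row 1 at the top.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Environment response to an action: returns (next_state, reward)."""
    if state == SPECIAL_STATE:
        return SPECIAL_TARGET, SPECIAL_REWARD
    d_row, d_col = MOVES[action]
    row, col = state[0] + d_row, state[1] + d_col
    if not (1 <= row <= GRID_SIZE and 1 <= col <= GRID_SIZE):
        # Assumption (not in the excerpt): bumping into the edge costs -1.
        return state, -1
    # Assumption (not in the excerpt): ordinary moves earn nothing.
    return (row, col), 0


print(step((1, 2), "left"))
print(step((3, 3), "left"))
print(step((1, 1), "up"))
```

The agent only ever sees the pair returned by `step`; everything inside the function belongs to the environment.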

Feb 10, 2018 1 min read Reinforcement Learning

Monte Carlo Methods: Chapter 5 IRL


Feb 10, 2018 4 min read Reinforcement Learning

Multiarmed Bandits: Chapter 2 IRL

Introduction This is going to be part of a series where I illustrate examples and questions from the brilliant book by Sutton and Barto (1998). You can download the PDF version of the newly updated book online, just google it. I am planning on going through each chapter and illustrating 1 or 2 examples from each. Multi-Armed Bandits What on earth is a multi-armed bandit? It might be easier to think of it as choosing which pokie (slot machine) to play down at the local.
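To make the pokie picture concrete, here is a small simulation sketch in Python (not the R used in the post). It uses an epsilon-greedy player, which is one of several approaches and not necessarily the one the full post illustrates. The win probabilities, the value of `epsilon`, and the function name `run_bandit` are assumptions made for the example.

```python
import random


def run_bandit(win_probabilities, steps=5000, epsilon=0.1, seed=0):
    """Play a row of pokies with an epsilon-greedy rule.

    Each machine pays 1 with its own fixed probability and 0 otherwise.
    Returns the estimated payout of each machine and how often it was played.
    """
    rng = random.Random(seed)
    n_arms = len(win_probabilities)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a machine at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the machine that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < win_probabilities[arm] else 0.0
        counts[arm] += 1
        # Incremental sample average of the rewards seen on this machine.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Hypothetical machines with payout probabilities 0.2, 0.5 and 0.8.
estimates, counts = run_bandit([0.2, 0.5, 0.8])
print(estimates, counts)
```

The player never sees the true probabilities, only the payouts, so it has to balance trying every machine against sticking with the one that has paid best so far.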

Feb 10, 2018 1 min read Reinforcement Learning

Temporal Difference Learning: Chapter 6 IRL


Feb 2, 2018 3 min read Deep Learning

XOR Deep Learning Example

The Problem OpenAI recently released some open research questions. As a beginner in AI I decided to tackle the beginner ‘Warmups’ they have offered. You can view their blog post here: ⭐ Train an LSTM to solve the XOR problem: that is, given a sequence of bits, determine its parity. The LSTM should consume the sequence, one bit at a time, and then output the correct answer at the sequence’s end.
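The target the LSTM has to learn is simple to state in code. The sketch below, in Python, only builds the data and labels (it does not train a model). The helper names `parity` and `make_dataset`, and the dataset sizes, are assumptions made for illustration.

```python
import random
from functools import reduce
from operator import xor


def parity(bits):
    """XOR of all bits: 1 if the sequence has an odd number of ones, else 0."""
    return reduce(xor, bits, 0)


def make_dataset(n_sequences, length, seed=0):
    """Random bit sequences paired with their parity labels."""
    rng = random.Random(seed)
    sequences = [[rng.randint(0, 1) for _ in range(length)]
                 for _ in range(n_sequences)]
    labels = [parity(seq) for seq in sequences]
    return sequences, labels


# Hypothetical sizes, chosen only to keep the example small.
sequences, labels = make_dataset(n_sequences=4, length=8)
for seq, label in zip(sequences, labels):
    print(seq, "->", label)
```

Because the label flips with every additional 1 in the sequence, the network has to carry a running state across all the bits, which is what makes this a useful warmup for a recurrent model.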

Jan 31, 2018 12 min read Data Analysis

Decision Trees

Have you been struggling to learn what decision trees are? Finding it difficult to link pictures of trees with machine learning algorithms? If you answered yes to these questions then this post is for you. Decision trees are an amazingly powerful predictive machine learning method that all Data Analysts should know. When I was researching tree-based methods I could never find a hand-worked problem. Most other sources simply list the maths, or show the results of a grown tree.

Jan 31, 2018 3 min read Data Analysis

Random Forests

Introduction Following on from the previous post about decision trees let us move on to Random Forests. Let us use the Soybean data from the ‘mlbench’ package. There are 35 features and 683 observations with 16 varieties of Soybean. Why care about Random Forests? Let us look at how our decision trees predict previously unseen data. First we will load the data in:

    library(mlbench)
    library(caret)
    data("Soybean")
    dim(Soybean)

Let us now split the data up into a training and test data set.
