Experimenting Algorithms to Win AWS DeepRacer

Daniel Dominguez
3 min read · Dec 16, 2022


AWS DeepRacer is a cloud-based 3D racing simulator paired with a fully autonomous 1/18th-scale race car, designed to teach machine learning concepts through hands-on experience. It’s a fun way to learn about reinforcement learning, a branch of machine learning in which a model learns to make decisions in an environment by being rewarded for good behavior.

DALL-E Mini Generated Sketch
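
In DeepRacer specifically, the main thing you write yourself is the reward function: a small Python function the simulator calls on every step with a dictionary of the car’s current state. The sketch below uses a few of the standard input parameters (all_wheels_on_track, distance_from_center, track_width, speed); the weighting is purely illustrative, not a tuned recipe.

def reward_function(params):
    # A few of DeepRacer's standard input parameters
    all_wheels_on_track = params['all_wheels_on_track']
    distance_from_center = params['distance_from_center']
    track_width = params['track_width']
    speed = params['speed']

    # Heavily penalize leaving the track
    if not all_wheels_on_track:
        return 1e-3

    # Reward staying close to the center line and carrying speed
    # (the 0.5 weight on speed is an arbitrary, illustrative choice)
    centering = 1.0 - (distance_from_center / (track_width / 2.0))
    return float(max(centering, 0.0) + 0.5 * speed)

Whatever this function returns is the reward signal that the algorithms below try to maximize.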

If you’re interested in winning with AWS DeepRacer, using the right algorithms and techniques is key. Here are a few algorithms you might want to consider when designing your model:

  • Q-learning: Q-learning is a popular reinforcement learning algorithm that repeatedly updates an estimate of the expected reward for each possible action in a given state. It’s built around the “Q-function,” which represents the maximum expected cumulative reward for taking a given action in a given state and acting optimally afterwards.
import numpy as np

# env, state_space_size and action_space_size are assumed to come from a
# discretized, Gym-style wrapper around the DeepRacer simulator.

# Initialize Q-table with all zeros
q_table = np.zeros([state_space_size, action_space_size])

# Hyperparameters
learning_rate = 0.1
discount_factor = 0.95
epsilon = 0.1  # exploration rate for the epsilon-greedy policy
num_episodes = 1000

# Epsilon-greedy action selection: explore with probability epsilon,
# otherwise take the action with the highest current Q-value
def choose_action(state, q_table):
    if np.random.rand() < epsilon:
        return np.random.randint(action_space_size)
    return int(np.argmax(q_table[state]))

for episode in range(num_episodes):
    # Reset the environment at the start of each episode
    state = env.reset()

    # Run the episode
    done = False
    while not done:
        # Choose an action based on the current state
        action = choose_action(state, q_table)

        # Take the action and observe the next state and reward
        next_state, reward, done, _ = env.step(action)

        # Update the Q-value for the current state and action
        q_table[state, action] = (1 - learning_rate) * q_table[state, action] + \
            learning_rate * (reward + discount_factor * np.max(q_table[next_state]))

        # Update the current state
        state = next_state
  • SARSA: SARSA (State-Action-Reward-State-Action) is another algorithm often used in reinforcement learning. It updates the expected reward for an action in a given state based on the reward received and the action actually taken in the next state. Unlike Q-learning, SARSA is on-policy: it learns the value of the policy it is actually following rather than the greedy maximum.
import numpy as np

# Reuses env, state_space_size, action_space_size and the epsilon-greedy
# choose_action from the Q-learning example above.

# Initialize Q-table with all zeros
q_table = np.zeros([state_space_size, action_space_size])

# Hyperparameters
learning_rate = 0.1
discount_factor = 0.95
num_episodes = 1000

for episode in range(num_episodes):
    # Reset the environment at the start of each episode
    state = env.reset()

    # Choose an action based on the current state
    action = choose_action(state, q_table)

    # Run the episode
    done = False
    while not done:
        # Take the action and observe the next state and reward
        next_state, reward, done, _ = env.step(action)

        # Choose the next action based on the next state
        next_action = choose_action(next_state, q_table)

        # Update the Q-value for the current state and action using the
        # action actually taken next (this is what makes SARSA on-policy)
        q_table[state, action] = (1 - learning_rate) * q_table[state, action] + \
            learning_rate * (reward + discount_factor * q_table[next_state, next_action])

        # Update the current state and action
        state = next_state
        action = next_action
  • Deep Q-network (DQN): A DQN is a neural network designed specifically for reinforcement learning. It’s based on the Q-learning algorithm, but it uses a neural network to approximate the Q-function instead of storing it in a table, which lets it handle large or continuous state spaces such as camera images.
import tensorflow as tf

# Define a simple Q-network that maps a state vector to one Q-value per action
# (state_space_size and action_space_size are assumed, as in the examples above)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(units=64, activation='relu', input_shape=[state_space_size]))
model.add(tf.keras.layers.Dense(units=64, activation='relu'))
model.add(tf.keras.layers.Dense(units=action_space_size, activation='linear'))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='mse')
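
Defining the network is only half the story; it also needs a training step. The following is a minimal, hypothetical sketch of one DQN update on a minibatch drawn from an experience replay buffer. replay_buffer and batch_size are assumptions for illustration, not DeepRacer’s actual training code; model and discount_factor are the ones defined above.

import random
import numpy as np

# Assumed: replay_buffer is a list of (state, action, reward, next_state, done)
# tuples collected while driving, where states are feature vectors of length
# state_space_size.
batch_size = 32
batch = random.sample(replay_buffer, batch_size)
states, actions, rewards, next_states, dones = map(np.array, zip(*batch))

# Current Q-value estimates and bootstrapped targets from the Bellman update
q_values = model.predict(states, verbose=0)
next_q_values = model.predict(next_states, verbose=0)
targets = q_values.copy()
targets[np.arange(batch_size), actions] = (
    rewards + discount_factor * np.max(next_q_values, axis=1) * (1 - dones)
)

# One gradient step that moves the network's predictions toward the targets
model.fit(states, targets, epochs=1, verbose=0)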

By experimenting with different algorithms and techniques, you can find the one that works best for your DeepRacer model. Beyond the choice of algorithm, there are other levers you can pull to improve performance, such as fine-tuning the hyperparameters and using evaluation data to identify areas for improvement; a small example of such a sweep is sketched below. Good luck!
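
As a rough illustration, here is a hypothetical sweep over the two hyperparameters used in the tabular examples above. train_and_evaluate is a placeholder, not a real DeepRacer API: assume it runs the training loop with the given settings and returns an evaluation score such as average reward per episode.

best_score, best_params = float('-inf'), None
for learning_rate in [0.05, 0.1, 0.2]:
    for discount_factor in [0.9, 0.95, 0.99]:
        # train_and_evaluate is a hypothetical helper (see the note above)
        score = train_and_evaluate(learning_rate, discount_factor)
        if score > best_score:
            best_score, best_params = score, (learning_rate, discount_factor)

print(f"Best (learning_rate, discount_factor): {best_params}, score: {best_score:.2f}")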


Daniel Dominguez

Engineer specialized in #MachineLearning | Software Product Manager | AI/ML Editor @InfoQ | #AWSCommunityBuilder