Tuning Reward Functions for AWS DeepRacer
The key to success in autonomous racing is finding the right balance of speed and accuracy. One important way to achieve this balance is through the use of reward functions, which define the goals of your racing agent and the rewards it receives for reaching those goals.
Tuning the parameters of your reward function can have a significant impact on your DeepRacer’s performance.
1 Position on track:
The parameters x and y describe the position of the vehicle in meters, measured from the lower-left corner of the environment.
2 Heading:
The heading parameter describes the orientation of the vehicle in degrees, measured counter-clockwise from the X-axis of the coordinate system.
Type: float
Range: -180:+180
Heading direction, in degrees, of the agent with respect to the x-axis of the coordinate system.
3 Waypoints:
The waypoints parameter is an ordered list of milestones placed along the track center. The related closest_waypoints parameter identifies the two milestones nearest the vehicle.
Type: [int, int]
Range: [(0:Max-1),(1:Max-1)]
The zero-based indices of the two neighboring waypoints closest to the agent's current position of (x, y). The distance is measured by the Euclidean distance from the center of the agent. The first element refers to the closest waypoint behind the agent. The second element refers to the closest waypoint in front of the agent. Max is the length of the waypoints list.
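One common use of the waypoint indices is to estimate the local direction of the track and compare it with the vehicle's heading. The sketch below assumes the usual DeepRacer convention of a reward_function(params) entry point that receives the parameters as a dictionary, and that each waypoint is an (x, y) pair. The 10-degree threshold and the 0.5 penalty factor are illustrative values chosen for this example, not recommended settings.

```python
import math


def reward_function(params):
    """Reward the agent for pointing along the local track direction."""
    waypoints = params["waypoints"]
    closest_waypoints = params["closest_waypoints"]
    heading = params["heading"]

    # Waypoint just behind and just ahead of the agent.
    prev_point = waypoints[closest_waypoints[0]]
    next_point = waypoints[closest_waypoints[1]]

    # Direction of the track segment in degrees, in the range -180..180.
    track_direction = math.degrees(
        math.atan2(next_point[1] - prev_point[1], next_point[0] - prev_point[0])
    )

    # Smallest absolute difference between the two angles (0..180).
    direction_diff = abs(track_direction - heading)
    if direction_diff > 180:
        direction_diff = 360 - direction_diff

    # Illustrative threshold (assumption): halve the reward when the
    # heading deviates from the track direction by more than 10 degrees.
    direction_threshold = 10.0
    reward = 1.0
    if direction_diff > direction_threshold:
        reward *= 0.5
    return float(reward)
```

The wrap-around step matters because both angles live in the range -180 to +180: a heading of -175 and a track direction of 180 differ by only 5 degrees, not 355.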
4 Track width:
The track_width parameter is the width of the track in meters.
5 Distance from center line:
The distance_from_center parameter measures the displacement of the vehicle from the center of the track.
Type: float
Range: 0:~track_width/2
Displacement, in meters, between the agent center and the track center. The observable maximum displacement occurs when any of the agent’s wheels are outside a track border and, depending on the width of the track border, can be slightly smaller or larger than half the track_width.
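As a rough illustration of how track_width and distance_from_center can work together, the sketch below rewards the agent in bands around the center line. It assumes the standard reward_function(params) entry point with a parameter dictionary. The band edges (10%, 25%, and 50% of the track width) and the reward values are assumptions made for this example and are meant to be tuned.

```python
def reward_function(params):
    """Reward the agent more the closer it stays to the center line."""
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Illustrative band edges (assumptions), as fractions of the track width.
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        # Beyond half the track width the agent is likely off track.
        reward = 1e-3
    return float(reward)
```

Expressing the bands as fractions of track_width, rather than as fixed distances, keeps the same function usable on tracks of different widths.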
6 All wheels on track:
The all_wheels_on_track parameter is a boolean which is true if all four wheels of the vehicle are inside the track borders, and false if any wheel is outside the track.
Type: Boolean
Range: (True:False)
A Boolean flag to indicate whether the agent is on-track or off-track. It’s off-track (False) if any of its wheels are outside. It’s on-track (True) if all of the wheels are inside track borders.
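The simplest use of this flag is to give a near-zero reward whenever the vehicle leaves the track. The sketch below assumes the standard reward_function(params) entry point. The two reward values are placeholders chosen for illustration.

```python
def reward_function(params):
    """Give a near-zero reward whenever any wheel leaves the track."""
    if params["all_wheels_on_track"]:
        reward = 1.0
    else:
        # Illustrative low value (assumption) to discourage going off track.
        reward = 1e-3
    return float(reward)
```

On its own this signal says nothing about speed or position within the track, so in practice it is usually combined with other parameters.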
7 Speed:
The speed parameter is the observed speed of the vehicle, in meters per second.
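To show how speed might be balanced against staying on the track, the sketch below gives the full reward only when the vehicle is both on track and above a minimum speed. It assumes the standard reward_function(params) entry point. The 1.0 m/s threshold and the reward values are assumptions for illustration and would need tuning for a given track and action space.

```python
def reward_function(params):
    """Reward staying on track, with a bonus for keeping up speed."""
    speed = params["speed"]
    all_wheels_on_track = params["all_wheels_on_track"]

    # Illustrative minimum speed in meters per second (assumption).
    speed_threshold = 1.0

    if not all_wheels_on_track:
        reward = 1e-3  # off track: near-zero reward regardless of speed
    elif speed < speed_threshold:
        reward = 0.5   # on track but slow
    else:
        reward = 1.0   # on track and fast enough
    return float(reward)
```

Checking the on-track flag first means a fast but off-track vehicle is never rewarded for its speed.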
8 Steering angle:
The steering_angle parameter is the steering angle of the vehicle, in degrees.
Type: float
Range: -30:30
Steering angle, in degrees, of the front wheels from the center line of the agent. The negative sign (-) means steering to the right and the positive (+) sign means steering to the left. The agent center line is not necessarily parallel with the track center line.
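Because the sign of steering_angle only encodes direction, a reward function that wants to discourage sharp turns typically works with the absolute value. The sketch below assumes the standard reward_function(params) entry point. The 15-degree threshold and the 0.8 penalty factor are illustrative assumptions.

```python
def reward_function(params):
    """Penalize sharp steering in either direction."""
    # The sign only indicates left (+) or right (-), so use the magnitude.
    abs_steering = abs(params["steering_angle"])

    # Illustrative threshold in degrees (assumption).
    abs_steering_threshold = 15.0

    reward = 1.0
    if abs_steering > abs_steering_threshold:
        reward *= 0.8
    return float(reward)
```

A mild multiplicative penalty like this reduces zig-zagging without forbidding the sharp turns that tight corners require.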
Overall, tuning your reward function parameters is an important part of optimizing your AWS DeepRacer’s performance. By defining your goals clearly, experimenting with different reward values, and using negative rewards to discourage undesirable behavior, you can fine-tune your reward function to help your DeepRacer achieve the best possible results on the track.