@@ -450,17 +450,14 @@ predefined timestep preprocessors to add a reward.
                                    observation['panda_tcp_pos'])
      return np.clip(1.0 - goal_distance, 0, 1)
 
-   reward = rewards.ComputeReward(
-       goal_reward,
-       validation_frequency=timestep_preprocessor.ValidationFrequency.ALWAYS)
+   reward = rewards.ComputeReward(goal_reward)
 
    panda_env.add_timestep_preprocessors([reward])
 
 ``ComputeReward`` is a timestep preprocessor that computes a reward based on a callable that takes
 an observation and returns a scalar which is added to the timestep. The callable ``goal_reward``
 computes a reward based on the distance between the robot's end-effector and the ball's pose
-observation which we added above. This reward is computed for every timestep. Alternatively rewards
-may also be computed only at the end of an epiode.
+observation which we added above.
 
 
 Domain Randomization
@@ -92,10 +92,7 @@ def goal_reward(observation: spec_utils.ObservationValue):
 
   # ComputeReward is a timestep preprocessor that accepts a callable which computes
   # a scalar reward based on the observation and adds it to the timestep.
-  # We configure the validation frequency so this reward is computed for every timestep.
-  reward = rewards.ComputeReward(
-      goal_reward,
-      validation_frequency=timestep_preprocessor.ValidationFrequency.ALWAYS)
+  reward = rewards.ComputeReward(goal_reward)
 
   # Instantiate props
   ball = Ball()
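
Note for readers of this diff: both hunks pass the same ``goal_reward`` callable to ``ComputeReward``, but only its last two lines are visible in the context. The following is a minimal standalone sketch of such a callable, not the repository's exact code. The ``ball_pose`` key and the slicing of the pose to its first three components are assumptions (only ``panda_tcp_pos`` appears in the diff), and the observation is modelled as a plain dict of arrays.

```python
import numpy as np


def goal_reward(observation):
    """Returns a reward in [0, 1] that grows as the end-effector nears the ball."""
    # Assumption: the ball pose observation is stored under 'ball_pose' and
    # its first three entries are the Cartesian position.
    ball_pos = np.asarray(observation['ball_pose'], dtype=float)[:3]
    tcp_pos = np.asarray(observation['panda_tcp_pos'], dtype=float)[:3]
    goal_distance = np.linalg.norm(ball_pos - tcp_pos)
    # Clip so that distances of one unit or more give zero reward.
    return float(np.clip(1.0 - goal_distance, 0, 1))


if __name__ == '__main__':
    # Hypothetical observation: position plus a unit quaternion for the ball.
    obs = {
        'ball_pose': [0.5, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
        'panda_tcp_pos': [0.2, 0.0, 0.0],
    }
    print(goal_reward(obs))
```

Because the function takes only an observation and returns a scalar, it can be exercised without an environment, which is convenient for checking the clipping behaviour before handing it to ``ComputeReward``.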