Compute the Reward for Epidemic Simulation — reward

This function calculates the reward for a single time step and action scenario from three components: the case prediction error, the deviation of the reproduction number from its target, and the intervention cost.

Usage

reward_fun(episimdata, episettings, actions, ii, jj)

Arguments

episimdata: A data frame containing epidemic simulation data.
episettings: The simulation settings object. The parameters listed below from `alpha` to `R_target` do not appear in the usage signature, so they are presumably supplied through this object.
actions: A matrix containing intervention actions, with a column specifying the cost of non-pharmaceutical interventions (NPI).
ii: The current time step in the simulation.
jj: The index of the action scenario being evaluated.
alpha: A penalty factor for case prediction errors.
ovp: A penalty value applied if the predicted cases exceed the threshold `C_target_pen`.
C_target: The target number of cases for the given day.
C_target_pen: The case threshold above which the penalty `ovp` is applied.
R_target: The target effective reproduction number.

Value

A numeric value representing the reward for the given time step.

Details

The function computes the absolute error between predicted and target cases (`C_err_pred`) and the deviation of the predicted reproduction number from `R_target` (`R_err_pred`). If the predicted cases exceed `C_target_pen`, the additional penalty `ovp` is applied. The final reward combines these penalties with the intervention cost taken from `actions`.
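
The calculation described above can be illustrated with a minimal sketch. It is not the implementation of `reward_fun`: the helper `reward_sketch`, its scalar arguments `C_pred`, `R_pred` and `npi_cost`, and in particular the way the terms are combined (a negative sum in which only the case error is weighted by `alpha`) are assumptions made for illustration. Only the absolute case error, the reproduction number deviation, the threshold penalty and the intervention cost come from the description.

```r
# Hypothetical sketch; not the package's reward_fun().
# C_pred, R_pred and npi_cost stand in for values that the real function
# would read from episimdata and actions at time step ii and scenario jj.
reward_sketch <- function(C_pred, R_pred, npi_cost,
                          alpha, ovp, C_target, C_target_pen, R_target) {
  # Absolute error between predicted and target cases.
  C_err_pred <- abs(C_pred - C_target)
  # Absolute deviation of the reproduction number from its target.
  R_err_pred <- abs(R_pred - R_target)
  # Overshoot penalty: applied only when predicted cases exceed the threshold.
  overshoot <- if (C_pred > C_target_pen) ovp else 0
  # Assumed combination: every term is a cost, so the reward is
  # the negative of their sum.
  -(alpha * C_err_pred + R_err_pred + overshoot + npi_cost)
}

# Predicted cases below the penalty threshold: ovp is not applied.
below <- reward_sketch(C_pred = 120, R_pred = 1.5, npi_cost = 2,
                       alpha = 0.5, ovp = 50, C_target = 100,
                       C_target_pen = 150, R_target = 1)

# Predicted cases above the penalty threshold: ovp is applied.
above <- reward_sketch(C_pred = 160, R_pred = 1.5, npi_cost = 2,
                       alpha = 0.5, ovp = 50, C_target = 100,
                       C_target_pen = 150, R_target = 1)
```

Under these assumptions the reward is always non-positive, and crossing `C_target_pen` lowers it by a fixed step of `ovp` on top of the growing case error.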