Compute the Reward for Epidemic Simulation
`reward_fun()` calculates the reward based on case prediction errors, reproduction number deviation, and intervention costs.
Arguments
- episimdata
A data frame containing epidemic simulation data.
- actions
A matrix containing intervention actions, with a column specifying the cost of non-pharmaceutical interventions (NPI).
- ii
The current time step in the simulation.
- jj
The index of the action scenario being evaluated.
- alpha
A penalty factor for case prediction errors.
- ovp
A penalty value applied if the predicted cases exceed `C_target_pen`.
- C_target
The target number of cases for the given day.
- C_target_pen
The case threshold above which the penalty `ovp` is applied.
- R_target
The target effective reproduction number.
Details
The function computes the absolute error between predicted and target cases (`C_err_pred`) and the absolute deviation of the predicted reproduction number from `R_target` (`R_err_pred`). If the predicted cases exceed `C_target_pen`, an additional penalty (`ovp`) is applied. The final reward combines these penalties with the cost of the intervention being evaluated.
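The computation described above can be sketched roughly as follows. This is an illustrative sketch, not the package's implementation: the function name `reward_sketch`, the column names `C_pred`, `R_pred`, and `cost_of_NPI`, the use of row `ii` for the prediction, and the way the terms are weighted and summed (including the negative sign) are all assumptions made for the example.

```r
# Hypothetical sketch of the reward logic; NOT the actual reward_fun() source.
reward_sketch <- function(episimdata, actions, ii, jj, alpha, ovp,
                          C_target, C_target_pen, R_target) {
  # Assumed columns: "C_pred" (predicted cases), "R_pred" (predicted
  # reproduction number). The real column names may differ.
  C_pred <- episimdata[ii, "C_pred"]
  R_pred <- episimdata[ii, "R_pred"]

  # Absolute errors against the targets, as described in Details.
  C_err_pred <- abs(C_pred - C_target)
  R_err_pred <- abs(R_pred - R_target)

  # Extra penalty only when predicted cases exceed the penalty threshold.
  over_penalty <- if (C_pred > C_target_pen) ovp else 0

  # Assumed column name for the NPI cost of action scenario jj.
  npi_cost <- actions[jj, "cost_of_NPI"]

  # Assumed combination: reward is the negative of the summed penalties.
  -(alpha * C_err_pred + R_err_pred + over_penalty + npi_cost)
}

# Toy inputs (made up for illustration).
episimdata <- data.frame(C_pred = c(80, 120, 260), R_pred = c(0.9, 1.1, 1.4))
actions <- matrix(c(0, 5, 20), ncol = 1, dimnames = list(NULL, "cost_of_NPI"))

# Below the penalty threshold: no ovp term.
r_low <- reward_sketch(episimdata, actions, ii = 2, jj = 2, alpha = 0.1,
                       ovp = 50, C_target = 100, C_target_pen = 200,
                       R_target = 1)

# Above the penalty threshold: ovp is added to the penalties.
r_high <- reward_sketch(episimdata, actions, ii = 3, jj = 3, alpha = 0.1,
                        ovp = 50, C_target = 100, C_target_pen = 200,
                        R_target = 1)
```

Under these assumptions, `r_high` is lower than `r_low` because the third time step overshoots `C_target_pen` and uses a costlier intervention.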