This function computes a reward value based on epidemiological simulation data, penalties for exceeding target thresholds, and the cost of non-pharmaceutical interventions (NPIs).

Usage

reward_fun_wd(episimdata, episettings, actions, ii, jj)

Arguments

episimdata

A data frame containing epidemiological simulation data. The function expects the following columns: "C" for cases, "Re" for the reproduction number, and "Deaths" for deaths.

actions

A data frame containing the costs of non-pharmaceutical interventions (NPIs). It expects a column named "cost_of_NPI".

ii

An integer specifying the row index in episimdata to use for reward calculation.

jj

An integer specifying the row index in actions to use for cost retrieval.

alpha

A numeric value representing the weight applied to case error in the reward calculation.

alpha_d

A numeric value representing the weight applied to death error in the reward calculation.

ovp

A numeric value for the penalty applied when cases exceed C_target_pen.

dovp

A numeric value for the penalty applied when deaths exceed D_target_pen.

C_target

A numeric target for the number of cases.

C_target_pen

A numeric threshold for the penalty on cases.

D_target

A numeric target for the number of deaths.

D_target_pen

A numeric threshold for the penalty on deaths.
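As a rough illustration of the input shapes described above, the following sketch builds toy data frames with the documented columns and shows how ii and jj select rows. All values are invented, and the objects are hypothetical stand-ins rather than data shipped with the package.

```r
# Toy inputs with the columns named in the argument descriptions.
# All values are invented for illustration only.
episimdata <- data.frame(
  C      = c(150, 130, 110),  # cases
  Re     = c(1.2, 1.0, 0.9),  # reproduction number
  Deaths = c(9, 8, 6)         # deaths
)

actions <- data.frame(
  cost_of_NPI = c(0, 15)      # one row per action; the meaning of each row is assumed
)

ii <- 1  # row of episimdata used for the reward calculation
jj <- 2  # row of actions used for cost retrieval

cases_ii  <- episimdata[ii, "C"]
deaths_ii <- episimdata[ii, "Deaths"]
npi_cost  <- actions[jj, "cost_of_NPI"]
```

Here ii and jj are ordinary 1-based row indices, so ii = 1 picks the first simulated time step and jj = 2 picks the second action.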

Value

A numeric value representing the calculated reward.

Details

The reward is computed as:

$$reward = -\alpha \cdot |C - C_{target}| - penalty_{C} - NPI_{cost} - \alpha_d \cdot |Deaths - D_{target}| - penalty_{D}$$

Where:

  • \(penalty_{C}\) is applied if cases exceed C_target_pen.

  • \(penalty_{D}\) is applied if deaths exceed D_target_pen.

  • \(NPI_{cost}\) is retrieved from the cost_of_NPI column of the actions data frame at row jj.
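The formula can be sketched as a small standalone function. This is a minimal transcription under stated assumptions, not the package's implementation: the name reward_sketch and its explicit arguments are hypothetical, and it assumes that each penalty equals the full ovp or dovp value once the corresponding threshold is strictly exceeded, and is zero otherwise.

```r
# Hypothetical helper that transcribes the documented formula.
# It is NOT reward_fun_wd(); the name and argument list are assumptions.
reward_sketch <- function(cases, deaths, npi_cost,
                          alpha, alpha_d, ovp, dovp,
                          C_target, C_target_pen, D_target, D_target_pen) {
  # Assumption: a flat penalty applies only when the threshold is strictly exceeded.
  penalty_C <- if (cases > C_target_pen) ovp else 0
  penalty_D <- if (deaths > D_target_pen) dovp else 0

  # Weighted absolute errors, penalties, and NPI cost all reduce the reward.
  -alpha * abs(cases - C_target) - penalty_C - npi_cost -
    alpha_d * abs(deaths - D_target) - penalty_D
}

# Invented inputs: 150 cases, 9 deaths, NPI cost of 15.
r <- reward_sketch(cases = 150, deaths = 9, npi_cost = 15,
                   alpha = 0.5, alpha_d = 0.7, ovp = 50, dovp = 30,
                   C_target = 120, C_target_pen = 140,
                   D_target = 7, D_target_pen = 12)
```

With these invented inputs the case error contributes 0.5 × 30 = 15, the case penalty 50 (since 150 > 140), the NPI cost 15, the death error 0.7 × 2 = 1.4, and the death penalty 0 (since 9 does not exceed 12), giving a reward of -81.4.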

Examples

reward_fun_wd(episimdata, alpha = 0.5, alpha_d = 0.7, ovp = 50, dovp = 30,
              C_target = 120, C_target_pen = 140, D_target = 7, D_target_pen = 12,
              actions = actions, ii = 1, jj = 2)
#> Error in reward_fun_wd(episimdata, alpha = 0.5, alpha_d = 0.7, ovp = 50,
#>     dovp = 30, C_target = 120, C_target_pen = 140, D_target = 7,
#>     D_target_pen = 12, actions = actions, ii = 1, jj = 2):
#>   unused arguments (alpha = 0.5, alpha_d = 0.7, ovp = 50, dovp = 30,
#>   C_target = 120, C_target_pen = 140, D_target = 7, D_target_pen = 12)