Data transformation functions
BEAVARs.TAtrans — Method
hf_ta = BEAVARs.TAtrans(dataHF_TA,varList_HF,trans_dictA)
Applies transformations to a TimeArray based on a transformation dictionary.
Arguments
dataHF_TA: TimeArray of the high-frequency data
varList_HF: list of variable names (Symbols) in the high-frequency data
trans_dictA: dictionary of transformations for each variable
Returns
hf_ta: TimeArray with transformed high-frequency data
Description
This function applies specified transformations to each variable in a TimeArray based on a provided transformation dictionary. The transformations are defined as follows:
- If the transformation value is 1, the variable is left unchanged.
- If the transformation value is 2, the variable is divided by 100.
- If the transformation value is 3, the natural logarithm of the variable is taken.
- If the transformation value is 7, the percentage change of the variable is calculated.
- If the transformation value is 8, the variable is multiplied by 100.
- If the transformation value is 9, the exponential of the variable is taken.
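The mapping from codes to operations listed above can be sketched in plain Julia. This is an illustration only, not the BEAVARs implementation: the helper name `apply_code` is hypothetical, and the formula used for code 7 (percentage change computed as 100 * (x_t / x_{t-1} - 1), with the first entry left as NaN) is an assumption, since the exact definition is not stated here.

```julia
# Hypothetical helper, not part of BEAVARs: apply one transformation code to a vector.
function apply_code(x::AbstractVector{<:Real}, code::Integer)
    if code == 1
        return float.(x)              # unchanged (returned as a floating-point copy)
    elseif code == 2
        return x ./ 100               # divide by 100
    elseif code == 3
        return log.(x)                # natural logarithm
    elseif code == 7
        # Assumption: percentage change as 100 * (x_t / x_{t-1} - 1).
        # The first entry has no predecessor, so it is left as NaN.
        out = fill(NaN, length(x))
        for t in 2:length(x)
            out[t] = 100 * (x[t] / x[t - 1] - 1)
        end
        return out
    elseif code == 8
        return x .* 100               # multiply by 100
    elseif code == 9
        return exp.(x)                # exponential
    else
        throw(ArgumentError("unknown transformation code $code"))
    end
end

x = [100.0, 110.0, 121.0]
scaled = apply_code(x, 2)             # each value divided by 100
growth = apply_code(x, 7)             # NaN first, then roughly 10 percent per step
```

Any code outside the six listed values raises an error in this sketch; how TAtrans itself treats unknown codes is not documented here.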
Example
Consider a TimeArray with variables A, B, and C. If the transformation dictionary specifies that A should be left unchanged (1), B should be divided by 100 (2), and C should be logged (3), the function applies these transformations and returns a new TimeArray with the transformed variables. Note that in the REPL output below the returned TimeArray has five rows and starts at 2020-02-01, one observation fewer than the six-row input.
dataHF_TA = TimeArray(Date(2020,1,1):Month(1):Date(2020,6,1), rand(6,3), [:A, :B, :C])
trans_dictA = Dict(:A => 1, :B => 2, :C => 3)
varList_HF = [:A, :B, :C]
hf_ta = BEAVARs.TAtrans(dataHF_TA, varList_HF, trans_dictA)
julia> dataHF_TA = TimeArray(Date(2020,1,1):Month(1):Date(2020,6,1), rand(6,3), [:A, :B, :C])
6×3 TimeArray{Float64, 2, Date, Matrix{Float64}} 2020-01-01 to 2020-06-01
┌────────────┬──────────┬──────────┬──────────┐
│            │        A │        B │        C │
├────────────┼──────────┼──────────┼──────────┤
│ 2020-01-01 │   0.3502 │  0.38563 │ 0.626134 │
│ 2020-02-01 │ 0.854189 │ 0.458228 │ 0.457359 │
│ 2020-03-01 │ 0.404973 │ 0.388638 │ 0.605352 │
│ 2020-04-01 │ 0.575055 │ 0.858383 │ 0.389653 │
│ 2020-05-01 │ 0.445191 │  0.54553 │ 0.354785 │
│ 2020-06-01 │ 0.174958 │ 0.672888 │ 0.844214 │
└────────────┴──────────┴──────────┴──────────┘
julia> trans_dictA = Dict(:A => 1, :B => 2, :C => 3)
Dict{Symbol, Int64} with 3 entries:
:A => 1
:B => 2
:C => 3
julia> varList_HF = [:A, :B, :C]
3-element Vector{Symbol}:
:A
:B
:C
julia> hf_ta = BEAVARs.TAtrans(dataHF_TA, varList_HF, trans_dictA)
5×3 TimeArray{Float64, 2, Date, Matrix{Float64}} 2020-02-01 to 2020-06-01
┌────────────┬──────────┬────────────┬───────────┐
│            │        A │          B │         C │
├────────────┼──────────┼────────────┼───────────┤
│ 2020-02-01 │ 0.854189 │ 0.00458228 │ -0.782287 │
│ 2020-03-01 │ 0.404973 │ 0.00388638 │ -0.501945 │
│ 2020-04-01 │ 0.575055 │ 0.00858383 │ -0.942498 │
│ 2020-05-01 │ 0.445191 │  0.0054553 │  -1.03624 │
│ 2020-06-01 │ 0.174958 │ 0.00672888 │ -0.169349 │
└────────────┴──────────┴────────────┴───────────┘
BEAVARs.pseudo_oos — Method
BEAVARs.pseudo_oos(fdataHF_tab,fdataLF_tab,pseudoHF_beg_date,pseudoHF_end_date,ragged_beg_date,ragged_end_date,pubDelay,aggMix,model_type, set_strct, hyp_strct)
Generates a dictionary of LoopSetup structures for pseudo out-of-sample forecasting with ragged-edge data.
Arguments
fdataHF_tab: TimeArray of the full high-frequency data
fdataLF_tab: TimeArray of the full low-frequency data
pseudoHF_beg_date: Date of the beginning of the pseudo out-of-sample forecasting (high-frequency)
pseudoHF_end_date: Date of the end of the pseudo out-of-sample forecasting (high-frequency)
ragged_beg_date: Date of the beginning of the ragged edge (high-frequency)
ragged_end_date: Date of the end of the ragged edge (high-frequency)
pubDelay: publication delay in months between high-frequency and low-frequency data
aggMix: aggregation mix parameter (0=growth rates, 1=levels)
model_type: model type, output of makeSetup function
set_strct: BVARmodelSetup structure
hyp_strct: BVARmodelHypers structure
Returns
vint_dict: dictionary of LoopSetup structures for each pseudo out-of-sample date
Description
This function creates a dictionary of LoopSetup structures for pseudo out-of-sample forecasting with ragged-edge data. For each date in the pseudo out-of-sample range, it creates a balanced dataset up to that date, applies the ragged edge by setting the appropriate high-frequency data points to NaN, and constructs the corresponding LoopSetup structure.
High and low-frequency datasets
The full high- and low-frequency datasets, fdataHF_tab and fdataLF_tab, are truncated incrementally, producing one dataset per pseudo out-of-sample date. pseudoHF_beg_date and pseudoHF_end_date give the first and last dates for which a new dataset is created.
Ragged edge
The user defines the ragged edge in the high-frequency dataset by supplying the dates on which the ragged edge starts and ends, ragged_beg_date and ragged_end_date, respectively. ragged_beg_date is the first date after the end of the balanced dataset, that is, the first row in which at least one variable is NaN. ragged_end_date is the last date before the first row in which every variable is NaN.
Example
Consider the following array:
┌────────────┬───────────┬───────────┬──────────┐
│            │         A │         B │        C │
├────────────┼───────────┼───────────┼──────────┤
│ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │
│ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │
│ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │
│ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │
│ 2018-05-01 │  0.740803 │  0.843847 │ 0.695057 │
│ 2018-06-01 │  0.295119 │  0.802446 │ 0.992353 │
│ 2018-07-01 │  0.163745 │  0.331479 │ 0.743982 │
│ 2018-08-01 │  0.541091 │  0.184965 │ 0.375754 │
│ 2018-09-01 │  0.254278 │  0.327575 │ 0.420596 │
│ 2018-10-01 │  0.948557 │       NaN │ 0.823303 │
│ 2018-11-01 │  0.256987 │       NaN │      NaN │
│ 2018-12-01 │       NaN │       NaN │      NaN │
└────────────┴───────────┴───────────┴──────────┘
The ragged edge pattern starts at 2018-10-01 and ends at 2018-11-01.
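As a cross-check, the two ragged-edge dates can be read off the NaN pattern programmatically. The following is a minimal sketch under stated assumptions: `ragged_dates` is a hypothetical helper that is not part of BEAVARs, it works on a plain date vector and matrix rather than a TimeArray, and it takes the ragged edge to run from the first row containing any NaN to the last row before the first all-NaN row.

```julia
using Dates

# Hypothetical helper, not part of BEAVARs: locate the ragged edge in a data matrix.
function ragged_dates(dates::AbstractVector{Date}, X::AbstractMatrix{<:Real})
    has_nan = [any(isnan, row) for row in eachrow(X)]   # row has at least one NaN
    all_nan = [all(isnan, row) for row in eachrow(X)]   # row is entirely NaN
    first_ragged = findfirst(has_nan)
    first_ragged === nothing && return nothing          # balanced data, no ragged edge
    first_empty = findfirst(all_nan)
    # If no row is entirely NaN, the ragged edge runs to the last observation.
    last_ragged = first_empty === nothing ? length(dates) : first_empty - 1
    return dates[first_ragged], dates[last_ragged]
end

# Same NaN pattern as the array above; the values themselves are random placeholders.
dates = collect(Date(2018, 1, 1):Month(1):Date(2018, 12, 1))
X = rand(12, 3)
X[10:12, 2] .= NaN
X[11:12, 3] .= NaN
X[12, 1] = NaN

edge = ragged_dates(dates, X)   # (Date(2018, 10, 1), Date(2018, 11, 1))
```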
Suppose we want to estimate the model in a pseudo-out-of-sample fashion for the dates 2018-05-01, 2018-06-01 and 2018-07-01. The function will then create the following three arrays:
For `2018-05-01`:
┌────────────┬───────────┬───────────┬──────────┐
│            │         A │         B │        C │
├────────────┼───────────┼───────────┼──────────┤
│ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │
│ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │
│ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │
│ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │
│ 2018-05-01 │  0.740803 │       NaN │ 0.695057 │
│ 2018-06-01 │  0.295119 │       NaN │      NaN │
└────────────┴───────────┴───────────┴──────────┘
For `2018-06-01`:
┌────────────┬───────────┬───────────┬──────────┐
│            │         A │         B │        C │
├────────────┼───────────┼───────────┼──────────┤
│ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │
│ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │
│ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │
│ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │
│ 2018-05-01 │  0.740803 │  0.843847 │ 0.695057 │
│ 2018-06-01 │  0.295119 │       NaN │ 0.992353 │
│ 2018-07-01 │  0.163745 │       NaN │      NaN │
└────────────┴───────────┴───────────┴──────────┘
For `2018-07-01`:
┌────────────┬───────────┬───────────┬──────────┐
│            │         A │         B │        C │
├────────────┼───────────┼───────────┼──────────┤
│ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │
│ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │
│ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │
│ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │
│ 2018-05-01 │  0.740803 │  0.843847 │ 0.695057 │
│ 2018-06-01 │  0.295119 │  0.802446 │ 0.992353 │
│ 2018-07-01 │  0.163745 │       NaN │ 0.743982 │
│ 2018-08-01 │  0.541091 │       NaN │      NaN │
└────────────┴───────────┴───────────┴──────────┘
Each array preserves the ragged edge pattern of the data.
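The idea of carrying the ragged-edge pattern over to a shorter sample can be sketched in plain Julia. This is only an illustration of the principle, not the code used inside pseudo_oos: `apply_ragged` is a hypothetical helper, it works on a plain matrix instead of a TimeArray, and the data are placeholder values 1 to 36 rather than the numbers shown above.

```julia
# Hypothetical helper, not part of BEAVARs: keep the rows up to the end of the
# balanced sample plus the ragged rows, and blank out the entries flagged in `mask`.
function apply_ragged(X::AbstractMatrix{<:Real}, mask::AbstractMatrix{Bool},
                      last_balanced::Integer)
    k = size(mask, 1)                                    # number of ragged rows
    size(mask, 2) == size(X, 2) ||
        throw(ArgumentError("mask and data must have the same number of columns"))
    last_balanced + k <= size(X, 1) ||
        throw(ArgumentError("not enough rows for the ragged edge"))
    V = float.(X[1:last_balanced + k, :])                # truncated copy; X is untouched
    for i in 1:k, j in axes(mask, 2)
        if mask[i, j]
            V[last_balanced + i, j] = NaN
        end
    end
    return V
end

X = reshape(collect(1.0:36.0), 12, 3)    # placeholder 12×3 dataset
mask = Bool[0 1 0; 0 1 1]                # pattern of the two ragged rows: B, then B and C
V = apply_ragged(X, mask, 4)             # balanced through row 4, ragged rows 5 and 6
```

With `last_balanced = 4` the result has the same shape and NaN positions as the first of the three arrays above. For a dataset that already contains the ragged edge, the mask could be obtained as `isnan.(X[10:11, :])`.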