Data transformation functions

BEAVARs.TAtrans (Method)
hf_ta = BEAVARs.TAtrans(dataHF_TA,varList_HF,trans_dictA)

Applies transformations to a TimeArray based on a transformation dictionary

Arguments

dataHF_TA: TimeArray of the high-frequency data
varList_HF: list of variable names (Symbols) in the high-frequency data
trans_dictA: dictionary of transformations for each variable

Returns

hf_ta: TimeArray with transformed high-frequency data

Description

This function applies specified transformations to each variable in a TimeArray based on a provided transformation dictionary. The transformations are defined as follows:

  • If the transformation value is 1, the variable is left unchanged.
  • If the transformation value is 2, the variable is divided by 100.
  • If the transformation value is 3, the natural logarithm of the variable is taken.
  • If the transformation value is 7, the percentage change of the variable is calculated.
  • If the transformation value is 8, the variable is multiplied by 100.
  • If the transformation value is 9, the exponential of the variable is taken.
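As a rough illustration of the code-to-operation mapping listed above, the sketch below applies the element-wise codes to a plain vector. `apply_code` is a hypothetical helper written for this illustration, not a BEAVARs function. Code 7 is left out because the exact percentage-change formula is not stated here.

```julia
# Hypothetical helper (not part of BEAVARs): maps a transformation code to an
# element-wise operation on a plain numeric vector.
function apply_code(x::AbstractVector{<:Real}, code::Integer)
    if code == 1
        return float.(x)        # 1: leave unchanged
    elseif code == 2
        return x ./ 100         # 2: divide by 100
    elseif code == 3
        return log.(x)          # 3: natural logarithm
    elseif code == 8
        return x .* 100         # 8: multiply by 100
    elseif code == 9
        return exp.(x)          # 9: exponential
    else
        # Code 7 (percentage change) and any other code are outside this sketch.
        throw(ArgumentError("code $code is not covered by this sketch"))
    end
end

# Assumed toy data and codes, mirroring the Dict(:A => 1, :B => 2, :C => 3) style.
codes  = Dict(:A => 1, :B => 2, :C => 3)
series = Dict(:A => [0.5, 1.0], :B => [50.0, 25.0], :C => [1.0, exp(1.0)])
transformed = Dict(name => apply_code(series[name], codes[name]) for name in keys(codes))
```

Keeping the mapping in a single function makes it easy to see which codes are supported and to fail loudly on any code the sketch does not handle.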

Example

Consider a TimeArray with variables A, B, and C. If the transformation dictionary specifies that A is left unchanged (1), B is divided by 100 (2), and C is logged (3), the function applies these transformations and returns a new TimeArray with the transformed variables. Note that in the output shown below the returned TimeArray starts at 2020-02-01, so it has one row fewer than the input.

dataHF_TA = TimeArray(Date(2020,1,1):Month(1):Date(2020,6,1), rand(6,3), [:A, :B, :C])
trans_dictA = Dict(:A => 1, :B => 2, :C => 3)
varList_HF = [:A, :B, :C]
hf_ta = BEAVARs.TAtrans(dataHF_TA, varList_HF, trans_dictA)
julia> dataHF_TA = TimeArray(Date(2020,1,1):Month(1):Date(2020,6,1), rand(6,3), [:A, :B, :C])
6×3 TimeArray{Float64, 2, Date, Matrix{Float64}} 2020-01-01 to 2020-06-01

┌────────────┬──────────┬──────────┬──────────┐
│            │ A        │ B        │ C        │
├────────────┼──────────┼──────────┼──────────┤
│ 2020-01-01 │   0.3502 │  0.38563 │ 0.626134 │
│ 2020-02-01 │ 0.854189 │ 0.458228 │ 0.457359 │
│ 2020-03-01 │ 0.404973 │ 0.388638 │ 0.605352 │
│ 2020-04-01 │ 0.575055 │ 0.858383 │ 0.389653 │
│ 2020-05-01 │ 0.445191 │  0.54553 │ 0.354785 │
│ 2020-06-01 │ 0.174958 │ 0.672888 │ 0.844214 │
└────────────┴──────────┴──────────┴──────────┘

julia> trans_dictA = Dict(:A => 1, :B => 2, :C => 3)
Dict{Symbol, Int64} with 3 entries:
  :A => 1
  :B => 2
  :C => 3

julia> varList_HF = [:A, :B, :C]
3-element Vector{Symbol}:
 :A
 :B
 :C

julia> hf_ta = BEAVARs.TAtrans(dataHF_TA, varList_HF, trans_dictA)
5×3 TimeArray{Float64, 2, Date, Matrix{Float64}} 2020-02-01 to 2020-06-01
┌────────────┬──────────┬────────────┬───────────┐
│            │ A        │ B          │ C         │
├────────────┼──────────┼────────────┼───────────┤
│ 2020-02-01 │ 0.854189 │ 0.00458228 │ -0.782287 │
│ 2020-03-01 │ 0.404973 │ 0.00388638 │ -0.501945 │
│ 2020-04-01 │ 0.575055 │ 0.00858383 │ -0.942498 │
│ 2020-05-01 │ 0.445191 │  0.0054553 │  -1.03624 │
│ 2020-06-01 │ 0.174958 │ 0.00672888 │ -0.169349 │
└────────────┴──────────┴────────────┴───────────┘
BEAVARs.pseudo_oos (Method)
BEAVARs.pseudo_oos(fdataHF_tab,fdataLF_tab,pseudoHF_beg_date,pseudoHF_end_date,ragged_beg_date,ragged_end_date,pubDelay,aggMix,model_type, set_strct, hyp_strct)

Generate a dictionary of LoopSetup structures for pseudo out-of-sample forecasting with ragged-edge data

Arguments

fdataHF_tab: TimeArray of the full high-frequency data
fdataLF_tab: TimeArray of the full low-frequency data
pseudoHF_beg_date: Date of the beginning of the pseudo out-of-sample forecasting (high-frequency)
pseudoHF_end_date: Date of the end of the pseudo out-of-sample forecasting (high-frequency)
ragged_beg_date: Date of the beginning of the ragged edge (high-frequency)
ragged_end_date: Date of the end of the ragged edge (high-frequency)
pubDelay: publication delay in months between high-frequency and low-frequency data
aggMix: aggregation mix parameter (0=growth rates, 1=levels)
model_type: model type, output of makeSetup function
set_strct: BVARmodelSetup structure
hyp_strct: BVARmodelHypers structure

Returns

vint_dict: dictionary of LoopSetup structures for each pseudo out-of-sample date

Description

This function creates a dictionary of LoopSetup structures for pseudo out-of-sample forecasting with ragged-edge data. For each date in the pseudo out-of-sample range, it creates a balanced dataset up to that date, applies the ragged edge by setting the appropriate high-frequency data points to NaN, and constructs the corresponding LoopSetup structure.

High- and low-frequency datasets

The full high- and low-frequency datasets, fdataHF_tab and fdataLF_tab, are truncated incrementally to build the dataset for each pseudo out-of-sample date. pseudoHF_beg_date and pseudoHF_end_date give the first and last dates for which a new dataset is created.

Ragged edge

The user defines the ragged edge in the high-frequency dataset by supplying the dates on which it starts and ends, ragged_beg_date and ragged_end_date, respectively. ragged_beg_date is the first date after the end of the balanced dataset, that is, the first row in which at least one variable is NaN. ragged_end_date is the last date before the first row in which all variables are NaN.

Example

Consider the following array:

┌────────────┬───────────┬───────────┬──────────┐
│            │ A         │ B         │ C        │
├────────────┼───────────┼───────────┼──────────┤
│ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │
│ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │
│ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │
│ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │
│ 2018-05-01 │  0.740803 │  0.843847 │ 0.695057 │
│ 2018-06-01 │  0.295119 │  0.802446 │ 0.992353 │
│ 2018-07-01 │  0.163745 │  0.331479 │ 0.743982 │
│ 2018-08-01 │  0.541091 │  0.184965 │ 0.375754 │
│ 2018-09-01 │  0.254278 │  0.327575 │ 0.420596 │
│ 2018-10-01 │  0.948557 │    NaN    │ 0.823303 │
│ 2018-11-01 │  0.256987 │    NaN    │    NaN   │
│ 2018-12-01 │    NaN    │    NaN    │    NaN   │
└────────────┴───────────┴───────────┴──────────┘

The ragged edge pattern starts at 2018-10-01 and ends at 2018-11-01.
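The two dates for the array above can also be recovered programmatically. The following is a minimal sketch that assumes the reading used in this example: the ragged edge begins at the first row with any NaN and ends just before the first row in which every variable is NaN. `ragged_edge_dates` is a hypothetical helper, not a BEAVARs function, and it works on a plain matrix rather than a TimeArray.

```julia
using Dates

# Hypothetical helper (not part of BEAVARs): finds the ragged-edge dates of a
# data matrix X whose rows correspond to `dates`.
function ragged_edge_dates(dates::AbstractVector{Date}, X::AbstractMatrix{<:Real})
    has_nan = [any(isnan, row) for row in eachrow(X)]   # rows with at least one NaN
    all_nan = [all(isnan, row) for row in eachrow(X)]   # rows that are entirely NaN
    i_beg = findfirst(has_nan)
    i_beg === nothing && return nothing                 # fully balanced: no ragged edge
    i_all = findfirst(all_nan)
    # If no row is entirely NaN, the ragged edge runs to the end of the sample.
    i_end = i_all === nothing ? length(dates) : i_all - 1
    return (ragged_beg_date = dates[i_beg], ragged_end_date = dates[i_end])
end

# Last five rows of the example array (2018-08-01 to 2018-12-01).
dates = collect(Date(2018, 8, 1):Month(1):Date(2018, 12, 1))
X = [0.541091 0.184965 0.375754;
     0.254278 0.327575 0.420596;
     0.948557 NaN      0.823303;
     0.256987 NaN      NaN;
     NaN      NaN      NaN]

edge = ragged_edge_dates(dates, X)
```

For this input, `edge.ragged_beg_date` is 2018-10-01 and `edge.ragged_end_date` is 2018-11-01, matching the dates stated above.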

Suppose that we want to estimate the model in a pseudo-out-of-sample fashion for the dates 2018-05-01, 2018-06-01 and 2018-07-01. The function will then create the following three arrays:

For `2018-05-01`:                                   For `2018-06-01`:                                   For `2018-07-01`:
┌────────────┬───────────┬───────────┬──────────┐   ┌────────────┬───────────┬───────────┬──────────┐   ┌────────────┬───────────┬───────────┬──────────┐   
│            │ A         │ B         │ C        │   │            │ A         │ B         │ C        │   │            │ A         │ B         │ C        │
├────────────┼───────────┼───────────┼──────────┤   ├────────────┼───────────┼───────────┼──────────┤   ├────────────┼───────────┼───────────┼──────────┤
│ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │   │ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │   │ 2018-01-01 │  0.113132 │  0.643878 │ 0.712192 │
│ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │   │ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │   │ 2018-02-01 │  0.508802 │  0.187696 │ 0.926146 │
│ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │   │ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │   │ 2018-03-01 │ 0.0694366 │  0.995718 │  0.92898 │
│ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │   │ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │   │ 2018-04-01 │  0.683238 │  0.069832 │ 0.354071 │
│ 2018-05-01 │  0.740803 │    NaN    │ 0.695057 │   │ 2018-05-01 │  0.740803 │  0.843847 │ 0.695057 │   │ 2018-05-01 │  0.740803 │  0.843847 │ 0.695057 │
│ 2018-06-01 │  0.295119 │    NaN    │    NaN   │   │ 2018-06-01 │  0.295119 │    NaN    │ 0.992353 │   │ 2018-06-01 │  0.295119 │  0.802446 │ 0.992353 │
└────────────┴───────────┴───────────┴──────────┘   │ 2018-07-01 │  0.163745 │    NaN    │    NaN   │   │ 2018-07-01 │  0.163745 │    NaN    │ 0.743982 │   
                                                    └────────────┴───────────┴───────────┴──────────┘   │ 2018-08-01 │  0.541091 │    NaN    │    NaN   │
                                                                                                        └────────────┴───────────┴───────────┴──────────┘

Each array preserves the ragged-edge pattern of the data.
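The construction of the three arrays above can be sketched as follows. This is an illustration inferred from the tables in this example, not the BEAVARs implementation: it assumes that each pseudo out-of-sample date is the first ragged row of its array, and that the NaN pattern of the original ragged edge is re-imposed on the last rows. `make_vintage`, `mask` and `vintages` are hypothetical names, and the result holds plain matrices rather than LoopSetup structures.

```julia
using Dates

# Hypothetical helper (not part of BEAVARs): builds one array whose first
# ragged row is date `d`, re-imposing the NaN pattern `mask` on its last rows.
function make_vintage(dates::AbstractVector{Date}, X::AbstractMatrix{Float64},
                      mask::AbstractMatrix{Bool}, d::Date)
    i_beg = findfirst(==(d), dates)
    i_beg === nothing && throw(ArgumentError("date $d is not in the sample"))
    i_end = i_beg + size(mask, 1) - 1
    i_end <= length(dates) || throw(ArgumentError("not enough data after $d"))
    V = X[1:i_end, :]                       # range indexing returns a copy
    for j in axes(mask, 2), i in axes(mask, 1)
        mask[i, j] && (V[i_beg + i - 1, j] = NaN)
    end
    return dates[1:i_end], V
end

# Balanced part of the example array, 2018-01-01 to 2018-08-01.
dates = collect(Date(2018, 1, 1):Month(1):Date(2018, 8, 1))
X = [0.113132  0.643878 0.712192;
     0.508802  0.187696 0.926146;
     0.0694366 0.995718 0.92898;
     0.683238  0.069832 0.354071;
     0.740803  0.843847 0.695057;
     0.295119  0.802446 0.992353;
     0.163745  0.331479 0.743982;
     0.541091  0.184965 0.375754]

# NaN pattern of the two ragged rows (2018-10-01 and 2018-11-01) for A, B, C.
mask = [false true false;
        false true true]

vintages = Dict(d => make_vintage(dates, X, mask, d)
                for d in Date(2018, 5, 1):Month(1):Date(2018, 7, 1))
```

Storing the pattern as a Boolean mask keeps it independent of the data values, so the same mask can be applied to every date in the pseudo out-of-sample range.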
