R/gamBiCopSelect.R
gamBiCopSelect.RdThis function selects an appropriate bivariate copula family for given
bivariate copula data using one of a range of methods. The corresponding
parameter estimates are obtained by maximum penalized likelihood estimation,
where each Newton-Raphson iteration is reformulated as a generalized ridge
regression solved using the mgcv package.
gamBiCopSelect(
udata,
lin.covs = NULL,
smooth.covs = NULL,
familyset = NA,
rotations = TRUE,
familycrit = "AIC",
level = 0.05,
edf = 1.5,
tau = TRUE,
method = "FS",
tol.rel = 0.001,
n.iters = 10,
parallel = FALSE,
verbose = FALSE,
select.once = TRUE,
...
)A matrix or data frame containing the model responses, (u1,u2) in [0,1]x[0,1]
A matrix or data frame containing the parametric (i.e., linear) covariates.
A matrix or data frame containing the non-parametric (i.e., smooth) covariates.
(Similar to BiCopSelect from the
VineCopula package)
Vector of bivariate copula families to select from.
If familyset = NA (default), selection
among all possible families is performed. Coding of bivariate copula
families:
1 Gaussian,
2 Student t,
5 Frank,
301 Double Clayton type I (standard and rotated 90 degrees),
302 Double Clayton type II (standard and rotated 270 degrees),
303 Double Clayton type III (survival and rotated 90 degrees),
304 Double Clayton type IV (survival and rotated 270 degrees),
401 Double Gumbel type I (standard and rotated 90 degrees),
402 Double Gumbel type II (standard and rotated 270 degrees),
403 Double Gumbel type III (survival and rotated 90 degrees),
404 Double Gumbel type IV (survival and rotated 270 degrees).
If TRUE, all rotations of the families in familyset
are included.
Character indicating the criterion for bivariate copula
selection. Possible choices: familycrit = 'AIC' (default) or
'BIC', as in BiCopSelect from the
VineCopula package.
Numerical; significance level of the test for removing individual
predictors (default: level = 0.05).
Numerical; if the estimated EDF for individual predictors is
smaller than edf but the predictor is still significant, then
it is set as linear (default: edf = 1.5).
FALSE for a calibration function specified for
the Copula parameter or TRUE (default) for a calibration function
specified for Kendall's tau.
'FS' for Fisher-scoring (default) and
'NR' for Newton-Raphson.
Relative tolerance for 'FS'/'NR' algorithm.
Maximal number of iterations for
'FS'/'NR' algorithm.
TRUE for a parallel estimation across copula families.
TRUE prints informations during the estimation.
if TRUE the GAM structure is only selected once,
for the family that appears first in familyset.
Additional parameters to be passed to gam
gamBiCopFit returns a list consisting of
S4 gamBiCop-class object.
'FS' for Fisher-scoring and
'NR' for Newton-Raphson.
relative tolerance for 'FS'/'NR' algorithm.
maximal number of iterations for
'FS'/'NR' algorithm.
the estimation procedure's trace.
0 if the algorithm converged and 1 otherwise.
gamBiCop and gamBiCopFit.
require(copula)
set.seed(0)
## Simulation parameters (sample size, correlation between covariates,
## Student copula with 4 degrees of freedom)
n <- 5e2
rho <- 0.9
fam <- 2
par2 <- 4
## A calibration surface depending on four variables
eta0 <- 1
calib.surf <- list(
calib.lin <- function(t, Ti = 0, Tf = 1, b = 2) {
return(-2 + 4 * t)
},
calib.quad <- function(t, Ti = 0, Tf = 1, b = 8) {
Tm <- (Tf - Ti) / 2
a <- -(b / 3) * (Tf^2 - 3 * Tf * Tm + 3 * Tm^2)
return(a + b * (t - Tm)^2)
},
calib.sin <- function(t, Ti = 0, Tf = 1, b = 1, f = 1) {
a <- b * (1 - 2 * Tf * pi / (f * Tf * pi +
cos(2 * f * pi * (Tf - Ti))
- cos(2 * f * pi * Ti)))
return((a + b) / 2 + (b - a) * sin(2 * f * pi * (t - Ti)) / 2)
},
calib.exp <- function(t, Ti = 0, Tf = 1, b = 2, s = Tf / 8) {
Tm <- (Tf - Ti) / 2
a <- (b * s * sqrt(2 * pi) / Tf) * (pnorm(0, Tm, s) - pnorm(Tf, Tm, s))
return(a + b * exp(-(t - Tm)^2 / (2 * s^2)))
}
)
## 6-dimensional matrix X of covariates
covariates.distr <- mvdc(normalCopula(rho, dim = 6),
c("unif"), list(list(min = 0, max = 1)),
marginsIdentical = TRUE
)
X <- rMvdc(n, covariates.distr)
colnames(X) <- paste("x", 1:6, sep = "")
## U in [0,1]x[0,1] depending on the four first columns of X
U <- condBiCopSim(fam, function(x1, x2, x3, x4) {
eta0 + sum(mapply(function(f, x)
f(x), calib.surf, c(x1, x2, x3, x4)))
}, X[, 1:4], par2 = 4, return.par = TRUE)
if (FALSE) { # \dontrun{
## Selection using AIC (about 30sec on single core)
## Use parallel = TRUE to speed-up....
system.time(best <- gamBiCopSelect(U$data, smooth.covs = X))
print(best$res)
EDF(best$res) ## The first function is linear
## Plot only the smooth component
par(mfrow = c(2, 2))
plot(best$res)
} # }