R/gamBiCopSelect.R
gamBiCopSelect.Rd
This function selects an appropriate bivariate copula family for given
bivariate copula data using one of a range of methods. The corresponding
parameter estimates are obtained by maximum penalized likelihood estimation,
where each Newton-Raphson iteration is reformulated as a generalized ridge
regression solved using the mgcv
package.
gamBiCopSelect(
udata,
lin.covs = NULL,
smooth.covs = NULL,
familyset = NA,
rotations = TRUE,
familycrit = "AIC",
level = 0.05,
edf = 1.5,
tau = TRUE,
method = "FS",
tol.rel = 0.001,
n.iters = 10,
parallel = FALSE,
verbose = FALSE,
select.once = TRUE,
...
)
A matrix or data frame containing the model responses, (u1,u2) in [0,1]x[0,1]
A matrix or data frame containing the parametric (i.e., linear) covariates.
A matrix or data frame containing the non-parametric (i.e., smooth) covariates.
(Similar to BiCopSelect
from the
VineCopula
package)
Vector of bivariate copula families to select from.
If familyset = NA
(default), selection
among all possible families is performed. Coding of bivariate copula
families:
1
Gaussian,
2
Student t,
5
Frank,
301
Double Clayton type I (standard and rotated 90 degrees),
302
Double Clayton type II (standard and rotated 270 degrees),
303
Double Clayton type III (survival and rotated 90 degrees),
304
Double Clayton type IV (survival and rotated 270 degrees),
401
Double Gumbel type I (standard and rotated 90 degrees),
402
Double Gumbel type II (standard and rotated 270 degrees),
403
Double Gumbel type III (survival and rotated 90 degrees),
404
Double Gumbel type IV (survival and rotated 270 degrees).
If TRUE
, all rotations of the families in familyset
are included.
Character indicating the criterion for bivariate copula
selection. Possible choices: familycrit = 'AIC'
(default) or
'BIC'
, as in BiCopSelect
from the
VineCopula
package.
Numerical; significance level of the test for removing individual
predictors (default: level = 0.05
).
Numerical; if the estimated EDF for individual predictors is
smaller than edf
but the predictor is still significant, then
it is set as linear (default: edf = 1.5
).
FALSE
for a calibration function specified for
the Copula parameter or TRUE
(default) for a calibration function
specified for Kendall's tau.
'FS'
for Fisher-scoring (default) and
'NR'
for Newton-Raphson.
Relative tolerance for 'FS'
/'NR'
algorithm.
Maximal number of iterations for
'FS'
/'NR'
algorithm.
TRUE
for a parallel estimation across copula families.
TRUE
prints informations during the estimation.
if TRUE
the GAM structure is only selected once,
for the family that appears first in familyset
.
Additional parameters to be passed to gam
gamBiCopFit
returns a list consisting of
S4 gamBiCop-class
object.
'FS'
for Fisher-scoring and
'NR'
for Newton-Raphson.
relative tolerance for 'FS'
/'NR'
algorithm.
maximal number of iterations for
'FS'
/'NR'
algorithm.
the estimation procedure's trace.
0
if the algorithm converged and 1
otherwise.
gamBiCop
and gamBiCopFit
.
require(copula)
set.seed(0)
## Simulation parameters (sample size, correlation between covariates,
## Student copula with 4 degrees of freedom)
n <- 5e2
rho <- 0.9
fam <- 2
par2 <- 4
## A calibration surface depending on four variables
eta0 <- 1
calib.surf <- list(
calib.lin <- function(t, Ti = 0, Tf = 1, b = 2) {
return(-2 + 4 * t)
},
calib.quad <- function(t, Ti = 0, Tf = 1, b = 8) {
Tm <- (Tf - Ti) / 2
a <- -(b / 3) * (Tf^2 - 3 * Tf * Tm + 3 * Tm^2)
return(a + b * (t - Tm)^2)
},
calib.sin <- function(t, Ti = 0, Tf = 1, b = 1, f = 1) {
a <- b * (1 - 2 * Tf * pi / (f * Tf * pi +
cos(2 * f * pi * (Tf - Ti))
- cos(2 * f * pi * Ti)))
return((a + b) / 2 + (b - a) * sin(2 * f * pi * (t - Ti)) / 2)
},
calib.exp <- function(t, Ti = 0, Tf = 1, b = 2, s = Tf / 8) {
Tm <- (Tf - Ti) / 2
a <- (b * s * sqrt(2 * pi) / Tf) * (pnorm(0, Tm, s) - pnorm(Tf, Tm, s))
return(a + b * exp(-(t - Tm)^2 / (2 * s^2)))
}
)
## 6-dimensional matrix X of covariates
covariates.distr <- mvdc(normalCopula(rho, dim = 6),
c("unif"), list(list(min = 0, max = 1)),
marginsIdentical = TRUE
)
X <- rMvdc(n, covariates.distr)
colnames(X) <- paste("x", 1:6, sep = "")
## U in [0,1]x[0,1] depending on the four first columns of X
U <- condBiCopSim(fam, function(x1, x2, x3, x4) {
eta0 + sum(mapply(function(f, x)
f(x), calib.surf, c(x1, x2, x3, x4)))
}, X[, 1:4], par2 = 4, return.par = TRUE)
if (FALSE) { # \dontrun{
## Selection using AIC (about 30sec on single core)
## Use parallel = TRUE to speed-up....
system.time(best <- gamBiCopSelect(U$data, smooth.covs = X))
print(best$res)
EDF(best$res) ## The first function is linear
## Plot only the smooth component
par(mfrow = c(2, 2))
plot(best$res)
} # }