Essays on specification tests for conditional hazard and distribution models

  • Authors: Rui Cui
  • Thesis supervisors: Miguel Angel Delgado González (thesis supervisor)
  • Defense: At the Universidad Carlos III de Madrid (Spain) in 2019
  • Language: Spanish
  • Thesis committee: Winfried Stute (chair), Carlos Velasco Gómez (secretary), Javier Hidalgo (member)
  • Doctoral program: Programa de Doctorado en Economía por la Universidad Carlos III de Madrid
  • Abstract
    • In this thesis, we discuss the application of principal component analysis (PCA) to specification testing for statistical models. Model checking is essential because inferences from an incorrectly specified model can be very misleading. However, when we apply a model checking technique, say a test statistic, knowing only whether the null model should be rejected provides little information. When the model fails, we also want to know which particular aspects of the data are responsible for the rejection. This raises the question of whether the information in the test can be partitioned into pieces, each of which measures a distinctive aspect of the data. If so, we also study the significance of each piece. This is much more informative: it gives a detailed picture of the nature of the deviation and may suggest a natural alternative. Such a partition can be obtained through principal component decomposition (PCD), the orthogonal decomposition of the test statistic, whose pieces are called its principal components (PCs). These PCs play an important role in testing problems: they serve as "special experts" for detecting certain deviations, and in many cases one may expect the main source of deviations to come only from the first few.

      This thesis provides two PCD methods aimed at improving the efficiency of specification tests for conditional hazard and distribution models. Both methods are applicable to a general class of models, e.g., transformation models; however, we demonstrate them in hazard and distribution regression models. The first PCD method deals with testing a composite hypothesis, where the null hypothesis depends on unknown parameters. It is introduced in chapter 1, with a particular interest in testing the parametric part of the Cox Proportional Hazard Model. The second, which we call the conditional PCD method, is applicable to the goodness-of-fit problem for conditional models; we consider the conditional hazard model in chapter 2 and the conditional distribution model in chapter 3.

      Chapter 1:

      Chapter 1 considers a specification test for the parametric part of proportional hazard models; that is, we maintain the proportionality assumption and focus on the specification of the covariate effect in the Cox model. A commonly used method for model checking of hazard models uses the martingale residuals, which come from the Doob-Meyer decomposition of the counting process and provide a basis for goodness-of-fit techniques for general hazard-based models. For the Cox model, Lin, Wei and Ying (1993) suggested an important class of goodness-of-fit tests based on the CUSUM process of the martingale residuals, including an omnibus test that is consistent against any misspecification and several special cases that investigate different features of the Cox model. To test the specification of the parametric part, they constructed an omnibus Kolmogorov test based on a special case of the CUSUM martingale residual process. The purpose of this chapter is to derive the PCA of their test. Since different PCs detect different departures from the null hypothesis, the obtained PCs can be used to design more powerful smooth and directional tests, and therefore complement existing proposals.

      The functional PCD method has been widely applied in different testing problems to obtain the decomposition of the test statistic. However, the decomposition requires the eigenvalues and eigenfunctions of the covariance kernel of the empirical process. Although all stochastic processes in a particular space admit PCDs, not all of them have closed forms for the eigenvalues and eigenfunctions. Only a few processes with special covariance kernels have an analytical solution to the eigenproblem, such as the Brownian Motion, the Brownian Bridge and martingales. Hence, in order to apply the PCD method, one needs to seek out these special processes. For example, Durbin and Knott (1972) discussed the specification test for the distribution function using the standard empirical process. In this case, Donsker's theorem provides a Brownian Bridge limit, which makes PCD possible. Apart from the distribution model, the Brownian Motion and Brownian Bridge, together with their transformations, have been derived as limits in various model specification testing problems, which indicates a wide range of application for PCD. In the hazard model framework, the Doob-Meyer decomposition, by taking the difference between the counting process and its compensator, yields a martingale. Therefore, test statistics and further decompositions can be obtained based on the PCD of the martingale, e.g., Anh and Stute (2012).
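
      As an illustration of the Brownian Bridge case, the following is a minimal sketch for a simple (fully specified) null hypothesis, not the thesis's own procedure. It uses the well-known closed-form eigenpairs of the Brownian Bridge kernel, eigenvalues 1/(j²π²) and eigenfunctions √2 sin(jπt), under which the Cramér-von Mises statistic equals the weighted sum of squared components z_j = √(2/n) Σ_i cos(jπu_i). The function names, sample size and truncation level are assumptions chosen for the example.

```python
import numpy as np

def cvm_statistic(u):
    """Cramér-von Mises statistic for H0: u_i ~ Uniform(0, 1) (computational form)."""
    u = np.sort(np.asarray(u, dtype=float))
    n = u.size
    i = np.arange(1, n + 1)
    return 1.0 / (12.0 * n) + np.sum((u - (2.0 * i - 1.0) / (2.0 * n)) ** 2)

def cvm_components(u, n_components):
    """Components z_j = sqrt(2/n) * sum_i cos(j*pi*u_i) for j = 1..n_components."""
    u = np.asarray(u, dtype=float)
    j = np.arange(1, n_components + 1)[:, None]
    return np.sqrt(2.0 / u.size) * np.cos(j * np.pi * u[None, :]).sum(axis=1)

# Hypothetical data: u_i = F0(x_i) already transformed by the null distribution.
rng = np.random.default_rng(0)
u = rng.uniform(size=200)

n_terms = 5000  # truncation level of the infinite sum (an assumption)
z = cvm_components(u, n_terms)
weights = 1.0 / (np.arange(1, n_terms + 1) ** 2 * np.pi ** 2)

w2 = cvm_statistic(u)
w2_from_components = np.sum(weights * z ** 2)  # truncated weighted sum of squares
print(w2, w2_from_components)
```

      Under the null, each z_j has mean zero and unit variance, so the first few components can be examined individually or combined into a smooth test. The rapidly decaying weights show why an omnibus statistic pays little attention to later components.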

      Even if we have a process with available eigensolutions, another problem arises when the hypothesis is composite, in which case the empirical process after estimation differs significantly from the process evaluated at the true parameter values. The limit distribution of the process after estimation becomes complicated since it depends on the true value of the parameter, the parametric form of the identifier and the particular estimator. As for the covariance of the limit process, estimation usually causes a shift and destroys the previous special covariance structure. A lot of work has been done to deal with the estimation effect and to understand the nature of the process after estimation. Khmaladze (1981) proposed a martingale transformation method, which removes the estimation effect and makes the resulting test asymptotically distribution-free. This method has been applied in various model specification testing problems; see, for example, Koul and Stute (1999) for time series models, Delgado and Stute (2008) for conditional distributions, and Marzec and Marzec (1997) for conditional hazards. Rather than removing the estimation effect as Khmaladze did, Durbin, Knott and Taylor (1975, DKT hereafter) confronted the problem directly and investigated the nature of the estimation effect. Following Durbin and Knott (1972), they developed a PCD method for the empirical process after estimation, based on the creative idea of constructing the eigenfunctions of the process after estimation as linear combinations of the known eigenfunctions before estimation. The components of the Cramér-von Mises statistic, which are standard and asymptotically chi-square distributed, are then used for goodness-of-fit testing. Other papers that applied this method to deal with the estimation effect are Stute (1997) and Anh and Stute (2012), for testing parametric mean regression models and parametric hazard models, respectively.

      Although DKT's idea is simple and provides a powerful technique for goodness-of-fit testing of a composite hypothesis, it has not drawn much attention, judging by how few applications exist. This might be due to a serious limitation of DKT's method: it only works if the unknown parameters are finite-dimensional and are estimated by an asymptotically efficient estimator. Therefore the existing method is not suitable for nonparametric or semiparametric models, or for models in which efficient estimation is not available.
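
      To make the covariance shift caused by estimation concrete, consider the classical i.i.d. setting with a parametric distribution function F(x; θ) and an asymptotically efficient estimator of θ. A standard result in this literature, stated here only as background and not taken from the thesis, is that the limit of the estimated empirical process has the Brownian Bridge covariance minus a correction term. The symbols ŷ, g and ℐ (the Fisher information matrix) are notation introduced for this illustration.

```latex
\[
  \operatorname{Cov}\bigl(\hat y(s), \hat y(t)\bigr)
  = \min(s,t) - st - g(s)^{\top} \mathcal{I}^{-1} g(t),
  \qquad
  g(t) = \frac{\partial F(x;\theta)}{\partial \theta}\Big|_{x = F^{-1}(t;\theta)} .
\]
```

      The extra term depends on the model and on θ, so the eigenpairs of the Brownian Bridge kernel no longer solve the eigenproblem. This is the difficulty that a PCD method for composite hypotheses must address.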

      In chapter 1, motivated by the goodness-of-fit problem of the Cox model, which involves estimation of some finite-dimensional parameters and a nonparametric function, we follow DKT's idea and extend their method to accommodate any root-n-consistent estimation of both parametric and nonparametric functions. We follow the suggestion of Lin, Wei and Ying (1993) and consider the CUSUM martingale residual process. The estimated process contains an estimate of a finite-dimensional regression parameter and of a nonparametric function, hence DKT's method does not work. In contrast, the PCD method we propose focuses on the covariance kernel rather than on the Fourier coefficients as in DKT, and provides a more general argument that accommodates any root-n-consistent estimator of both parametric and nonparametric functions. Hence it is applicable to the Cox model and has a much larger range of application than DKT's method. In the end, as expected, the limit Cramér-von Mises statistic can be decomposed into a weighted sum of independent chi-square components. Different types of tests can be constructed based on these components to improve efficiency. The finite sample performance of the proposed tests is illustrated in a Monte Carlo experiment.

      Chapter 2:

      As mentioned in chapter 1, the martingale residuals provide a basis for specification tests in hazard models. For checking the Cox model, Lin, Wei and Ying (1993) proposed a class of goodness-of-fit tests based on the CUSUM process of martingale residuals. In this chapter, we propose new goodness-of-fit tests for the Cox model based on some components of the CUSUM martingale process. These components are obtained through a conditional PCD method, which fills the gap of PCA in the conditional model testing problem and is the main contribution of this chapter. The components of the CUSUM martingale process play a role similar to that of the traditional PCs of the empirical process, and hence behave as building blocks for goodness-of-fit tests. The difference is that these components are stochastic processes rather than random variables; to be more precise, we therefore call them component processes. Specifically, the component processes are obtained in two steps: (i) derive the PCD of the individual martingale process conditional on the covariables; (ii) cumulatively sum the PCs obtained in the first step with respect to the observations of the covariables. It turns out that the CUSUM martingale process can be decomposed into a sum of the component processes with decreasing weights. Since the PCD is in the time domain, the obtained components are sensitive to certain deviations from the constant hazard ratio implied by the proportional hazard assumption; in particular, higher-frequency deviations are reflected more in later components. The omnibus test, which is based on the original CUSUM martingale process, down-weights the later components heavily and thus has low power against high-frequency deviations. We propose new goodness-of-fit tests, including tests based on each estimated component process and a Bonferroni test, which offset the power loss and outperform the omnibus test. Smooth tests based on reweighted sums of a few components are also constructed. The finite sample performance of the tests is illustrated by means of a Monte Carlo experiment.

      The conditional PCD method in this chapter is applicable to any regression model that has a martingale interpretation, including conditional hazard models and transformation models. It also works for conditional distribution models, where the empirical process has a Brownian Bridge covariance structure. However, we focus on the Cox model in chapter 2 and discuss the application to conditional distribution models in the next chapter.

      Chapter 3:

      In this chapter, we propose new specification tests for parametric conditional distribution models. The covariable can be a multidimensional vector and its distribution is unspecified. In the nonparametric testing literature, for example, Andrews (1997) proposed an omnibus test, which is consistent against all possible deviations, based on a CUSUM process of single event processes. His test is not distribution-free and is implemented by a parametric bootstrap. For the same question, Delgado and Stute (2008) provided a class of asymptotically distribution-free tests based on the PCD of the multivariate empirical process. They used the Rosenblatt transformation to obtain independence between components of the empirical process and then applied Khmaladze's martingale transformation to remove the effect caused by estimation. The distribution-free property together with the independence structure makes PCA for the conditional model possible.

      In this chapter, we conduct PCA for testing conditional distributions in a different way. We propose new goodness-of-fit tests based on some components of the CUSUM process in Andrews (1997). These components are obtained through the conditional PCD method, which was used in chapter 2 for testing conditional hazard models. As already mentioned, the obtained component processes provide a basis for goodness-of-fit tests. In fact, the CUSUM process can be decomposed into a sum of its component processes with decreasing weights. Since the PCD is in the response variable domain, the obtained components are sensitive to certain deviations from the specified distribution; in particular, higher-frequency deviations are reflected more in later components. The omnibus test, which is based on the original CUSUM process, down-weights the later components heavily and thus has low power against high-frequency deviations. We propose new goodness-of-fit tests, including tests based on each estimated component process and a Bonferroni test, which offset the power loss and outperform the omnibus test. Smooth tests based on reweighted sums of a few components are also constructed. The finite sample performance of the tests is illustrated in a Monte Carlo experiment, where we focus on conditional normal models. The results show a clear pattern of how these components behave as "special experts": the test based on the first component is sensitive to a mean shift, the test based on the second component is sensitive to a variance shift, and similar patterns hold for the third and fourth components when testing against skewness and kurtosis shifts.
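
      The two-step construction and the "special experts" pattern can be sketched as follows. This is a hedged illustration under strong simplifying assumptions, not the thesis's implementation: the null model Y | X ~ N(X, 1) is hypothetical, its parameters are treated as known (so no estimation effect arises), the covariable is scalar, and the j-th component process is taken to be the CUSUM in x of the scores √2 cos(jπU_i), where U_i is the conditional distribution function evaluated at Y_i. All function names are made up for the example.

```python
import math
import numpy as np

def normal_cdf(x):
    """Standard normal distribution function, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(x) / math.sqrt(2.0)))

def component_processes(y, x, cond_cdf, n_components=4):
    """Row j-1 holds the j-th component process at the sorted covariate values."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    n = y.size
    u = cond_cdf(y, x)          # step (i): conditional transform, uniform under H0
    order = np.argsort(x)       # step (ii): cumulate over increasing x
    j = np.arange(1, n_components + 1)[:, None]
    scores = np.sqrt(2.0) * np.cos(j * np.pi * u[order][None, :])
    return np.cumsum(scores, axis=1) / np.sqrt(n)

def sup_statistics(y, x, cond_cdf, n_components=4):
    """Kolmogorov-type statistic (sup of absolute value) for each component."""
    return np.abs(component_processes(y, x, cond_cdf, n_components)).max(axis=1)

def null_cdf(y, x):
    """Hypothetical null model: Y | X ~ N(X, 1) with known parameters."""
    return normal_cdf(y - x)

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
eps = rng.normal(size=n)

ks_mean_shift = sup_statistics(x + 0.5 + eps, x, null_cdf)  # mean misspecified
ks_var_shift = sup_statistics(x + 2.0 * eps, x, null_cdf)   # variance misspecified
print(ks_mean_shift)
print(ks_var_shift)
```

      In this sketch the first statistic dominates under the mean shift and the second under the variance shift, because cos(πu) is monotone in u while cos(2πu) is symmetric about 1/2. Critical values, the Bonferroni combination and the treatment of estimated parameters are deliberately left out.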

      In fact, the two PCD methods, one for composite hypotheses and one for conditional hypotheses, can be used together. Their relationship is most easily seen through their applications to the Cox model. Roughly speaking, the two PCD methods decompose the bivariate CUSUM process of martingale residuals in different directions. The first method carries out the decomposition in the covariable domain, while the conditional PCD method decomposes in the time domain. In the Cox case, the two methods complement each other: by obtaining the PCD in both directions, we are able to examine possible deviations in both aspects.

      In summary, the structure of the thesis is as follows. In chapter 1, we introduce the first PCD method, which works for testing composite hypotheses and extends the existing method to a larger range of application. In chapters 2 and 3, we propose a conditional PCD method, which has not previously been used in any testing problem, and discuss its application to conditional hazard and conditional distribution models, respectively.

