Published by the Foundation for Open Access Statistics Editors-in-chief: Bettina Grün, Torsten Hothorn, Rebecca Killick, Edzer Pebesma, Achim Zeileis    ISSN 1548-7660; CODEN JSSOBK
Authors: Dimitrina S. Dimitrova, Vladimir K. Kaishev, Senren Tan
Title: Computing the Kolmogorov-Smirnov Distribution When the Underlying CDF is Purely Discrete, Mixed, or Continuous
Abstract: The distribution of the Kolmogorov-Smirnov (KS) test statistic has been widely studied under the assumption that the underlying theoretical cumulative distribution function (CDF), F(x), is continuous. However, many real-life applications require fitting discrete or mixed distributions. Due to inherent difficulties, the distribution of the KS statistic when F(x) has jump discontinuities has been studied to a much lesser extent, and no exact and efficient computational methods have been proposed in the literature. In this paper, we provide a fast and accurate method to compute the (complementary) CDF of the KS statistic when F(x) is discontinuous, and thus obtain exact p values of the KS test. Our approach is to express the complementary CDF through the rectangle probability for uniform order statistics, and to compute it using the fast Fourier transform (FFT). We also provide a C++ and an R implementation of the proposed method, which fills the existing gap in statistical software. In addition, we give a useful extension of Schmid's asymptotic formula for the distribution of the KS statistic, relaxing his requirement for F(x) to be increasing between jumps and thus allowing for any general mixed or purely discrete F(x). The numerical performance of the proposed FFT-based method, implemented both in C++ and in the R package KSgeneral, available from, is illustrated when F(x) is mixed, purely discrete, and continuous. The performance of the general asymptotic formula is also studied.
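
To make the quantity discussed in the abstract concrete, the following is a minimal Python sketch of the KS statistic and its complementary CDF, P(D_n >= q), when F(x) is purely discrete. It is not the paper's FFT-based method and does not use the KSgeneral package: the p value is only approximated by plain Monte Carlo simulation, and the function names `ks_statistic_discrete` and `mc_p_value`, the example distribution, and the sample are all hypothetical, chosen for illustration. The sketch assumes every sample value lies in the support of F.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def ks_statistic_discrete(sample, support, probs):
    """Return D_n = sup_x |F_n(x) - F(x)| for a purely discrete F.

    Assumes `support` is sorted and every sample value lies in it, so both
    step functions jump only at support points and the supremum is attained
    at one of them.
    """
    n = len(sample)
    cdf = list(accumulate(probs))  # F evaluated at each support point
    xs = sorted(sample)
    # bisect_right(xs, s) counts sample values <= s, i.e. n * F_n(s).
    return max(abs(bisect_right(xs, s) / n - f) for s, f in zip(support, cdf))


def mc_p_value(d_obs, n, support, probs, reps=5000, seed=1):
    """Monte Carlo estimate of P(D_n >= d_obs) under the null hypothesis.

    Illustrative stand-in only: the paper computes this probability exactly.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sim = rng.choices(support, weights=probs, k=n)
        # Small tolerance so floating-point ties count as ">=".
        if ks_statistic_discrete(sim, support, probs) >= d_obs - 1e-12:
            hits += 1
    return hits / reps


# Hypothetical example: a four-point distribution and a sample of size 10.
support = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]
sample = [0, 0, 1, 1, 2, 3, 3, 3, 3, 3]

d_obs = ks_statistic_discrete(sample, support, probs)
p_hat = mc_p_value(d_obs, len(sample), support, probs)
print(f"D_n = {d_obs:.4f}, estimated p value = {p_hat:.4f}")
```

Simulation of this kind has Monte Carlo error and becomes slow for small p values, which is one reason an exact computational method is of interest.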

Page views: 1442. Submitted: 2016-10-20. Published: 2020-10-07.
KSgeneral_1.0.0.tar.gz: R source package (69KB)
Replication materials (6MB)

DOI: 10.18637/jss.v095.i10

This work is licensed under the following licenses:
Paper: Creative Commons Attribution 3.0 Unported License
Code: GNU General Public License (at least one of version 2 or version 3) or a GPL-compatible license.