Empirical distribution function

The green curve, which asymptotically approaches heights of 0 and 1 without reaching them, is the true cumulative distribution function of the standard normal distribution. The grey hash marks represent the observations in a particular sample drawn from that distribution, and the horizontal steps of the blue step function (including the leftmost point in each step but not including the rightmost point) form the empirical distribution function of that sample.

In statistics, an empirical distribution function (commonly also called an empirical cumulative distribution function, eCDF) is the distribution function associated with the empirical measure of a sample.[1] This cumulative distribution function is a step function that jumps up by 1/n at each of the n data points. Its value at any specified value of the measured variable is the fraction of observations of the measured variable that are less than or equal to the specified value.

The empirical distribution function is an estimate of the cumulative distribution function that generated the points in the sample. It converges with probability 1 to that underlying distribution, according to the Glivenko–Cantelli theorem. A number of results exist to quantify the rate of convergence of the empirical distribution function to the underlying cumulative distribution function.

Definition

Let (X1, …, Xn) be independent, identically distributed real random variables with the common cumulative distribution function F(t). Then the empirical distribution function is defined as[2]

$$\widehat{F}_n(t) = \frac{\text{number of elements in the sample} \le t}{n} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}_{X_i \le t},$$

where $\mathbf{1}_A$ is the indicator of event A. For a fixed t, the indicator $\mathbf{1}_{X_i \le t}$ is a Bernoulli random variable with parameter p = F(t); hence $n\widehat{F}_n(t)$ is a binomial random variable with mean nF(t) and variance nF(t)(1 − F(t)). This implies that $\widehat{F}_n(t)$ is an unbiased estimator for F(t).
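The definition translates directly into code. The following is a minimal sketch using only the Python standard library; the helper name `make_ecdf` and the sample values are illustrative choices, not part of any library.

```python
from bisect import bisect_right

def make_ecdf(sample):
    """Return the function t -> (number of observations <= t) / n."""
    data = sorted(sample)
    n = len(data)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def ecdf(t):
        # bisect_right returns how many sorted values are <= t
        return bisect_right(data, t) / n

    return ecdf

# Illustrative sample with one tied value (0.4 appears twice)
sample = [2.1, 0.4, 3.3, 0.4, 1.7]
F_hat = make_ecdf(sample)
print(F_hat(0.0), F_hat(0.4), F_hat(2.0), F_hat(3.3))  # 0.0 0.4 0.6 1.0
```

Sorting once and using binary search makes each evaluation cost O(log n). The tied value produces a single jump of 2/n, which matches the "fraction of observations less than or equal to t" reading of the definition.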

However, in some textbooks, the definition is given as

$$\widehat{F}_n(t) = \frac{1}{n+1} \sum_{i=1}^{n} \mathbf{1}_{X_i \le t}.$$[3][4]

Asymptotic properties

Since the ratio (n + 1)/n approaches 1 as n goes to infinity, the asymptotic properties of the two definitions that are given above are the same.
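As a short worked check of this claim, the means of the two estimators follow from the binomial mean nF(t) stated in the definition. Here $\widetilde{F}_n$ is a label introduced only for this illustration, denoting the textbook variant with denominator n + 1:

$$\operatorname{E}\big[\widehat{F}_n(t)\big] = \frac{1}{n}\, nF(t) = F(t), \qquad \operatorname{E}\big[\widetilde{F}_n(t)\big] = \frac{1}{n+1}\, nF(t) = \frac{n}{n+1}\, F(t).$$

The variant is therefore biased downward by F(t)/(n + 1), a quantity that vanishes as n → ∞.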

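By the strong law of large numbers, the estimator $\widehat{F}_n(t)$ converges to F(t) as n → ∞ almost surely, for every value of t:[2]

$$\widehat{F}_n(t) \ \xrightarrow{\text{a.s.}}\ F(t);$$

thus the estimator $\widehat{F}_n(t)$ is consistent. This expression asserts the pointwise convergence of the empirical distribution function to the true cumulative distribution function. There is a stronger result, called the Glivenko–Cantelli theorem, which states that the convergence in fact happens uniformly over t:[5]

$$\|\widehat{F}_n - F\|_\infty \equiv \sup_{t \in \mathbb{R}} \big|\widehat{F}_n(t) - F(t)\big| \ \xrightarrow{\text{a.s.}}\ 0.$$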

The sup-norm in this expression is called the Kolmogorov–Smirnov statistic for testing the goodness-of-fit between the empirical distribution and the assumed true cumulative distribution function F. Other norm functions may be reasonably used here instead of the sup-norm. For example, the L2-norm gives rise to the Cramér–von Mises statistic.
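The sup-norm distance can be computed exactly from the sorted sample, because the empirical distribution function is constant between observations. The sketch below assumes the hypothesised F is continuous and uses the standard normal as an example; the function names `normal_cdf` and `ks_statistic` are illustrative.

```python
import math

def normal_cdf(t):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def ks_statistic(sample, cdf):
    """Return sup_t |F_hat_n(t) - F(t)| for a continuous CDF F."""
    data = sorted(sample)
    n = len(data)
    d = 0.0
    for i, x in enumerate(data, start=1):
        f = cdf(x)
        # At the i-th order statistic the step function rises from
        # (i - 1)/n to i/n, so the largest gap occurs on one side of a jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

d_two_points = ks_statistic([-1.0, 1.0], normal_cdf)
print(d_two_points)
```

Checking both sides of each jump is what makes this the supremum rather than just the maximum over the sample points.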

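The asymptotic distribution can be further characterized in several different ways. First, the central limit theorem states that pointwise, $\widehat{F}_n(t)$ has an asymptotically normal distribution with the standard $\sqrt{n}$ rate of convergence:[2]

$$\sqrt{n}\,\big(\widehat{F}_n(t) - F(t)\big) \ \xrightarrow{d}\ \mathcal{N}\big(0,\ F(t)(1 - F(t))\big).$$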

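This result is extended by Donsker's theorem, which asserts that the empirical process $\sqrt{n}\,(\widehat{F}_n - F)$, viewed as a function indexed by $t \in \mathbb{R}$, converges in distribution in the Skorokhod space $D[-\infty, +\infty]$ to the mean-zero Gaussian process $G_F = B \circ F$, where B is the standard Brownian bridge.[5] The covariance structure of this Gaussian process is

$$\operatorname{E}\big[G_F(t_1)\, G_F(t_2)\big] = F(t_1 \wedge t_2) - F(t_1)\, F(t_2).$$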

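The uniform rate of convergence in Donsker's theorem can be quantified by the result known as the Hungarian embedding:[6]

$$\limsup_{n \to \infty} \frac{\sqrt{n}}{\ln^2 n}\, \Big\| \sqrt{n}\,(\widehat{F}_n - F) - G_{F,n} \Big\|_\infty < \infty, \quad \text{a.s.},$$

where $G_{F,n}$ is a suitably constructed sequence of Gaussian processes, each distributed as $G_F$.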

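Alternatively, the rate of convergence of $\sqrt{n}\,(\widehat{F}_n - F)$ can also be quantified in terms of the asymptotic behavior of the sup-norm of this expression. A number of results exist in this vein; for example, the Dvoretzky–Kiefer–Wolfowitz inequality provides a bound on the tail probabilities of $\sqrt{n}\,\|\widehat{F}_n - F\|_\infty$:[6]

$$\Pr\Big(\sqrt{n}\,\|\widehat{F}_n - F\|_\infty > z\Big) \le 2 e^{-2 z^2}.$$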

In fact, Kolmogorov has shown that if the cumulative distribution function F is continuous, then the expression $\sqrt{n}\,\|\widehat{F}_n - F\|_\infty$ converges in distribution to $\|B\|_\infty$, which has the Kolmogorov distribution that does not depend on the form of F.
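To see how the Dvoretzky–Kiefer–Wolfowitz bound relates to the Kolmogorov limit, the sketch below evaluates the standard alternating series for the Kolmogorov distribution function, K(x) = 1 − 2 Σ_{k≥1} (−1)^(k−1) exp(−2k²x²), which is not stated in the text above and is brought in here as a known formula. The function names and the truncation at 100 terms are illustrative assumptions; the truncated series is unreliable for x very close to 0.

```python
import math

def kolmogorov_cdf(x, terms=100):
    """Approximate K(x) = P(sup |B| <= x) by a truncated alternating series."""
    if x <= 0:
        return 0.0
    tail = 2.0 * sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
        for k in range(1, terms + 1)
    )
    # Clamp to [0, 1] to guard against truncation error at small x
    return min(1.0, max(0.0, 1.0 - tail))

def dkw_tail_bound(z):
    """Finite-sample bound 2 * exp(-2 z^2) on P(sqrt(n) * sup-distance > z)."""
    return min(1.0, 2.0 * math.exp(-2.0 * z * z))

for z in (0.8, 1.0, 1.358):
    print(z, 1.0 - kolmogorov_cdf(z), dkw_tail_bound(z))
```

The bound 2·exp(−2z²) is exactly the first term of the series for the limiting tail probability, so for moderate and large z the finite-sample bound and the asymptotic tail are close.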

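Another result, which follows from the law of the iterated logarithm, is that[6]

$$\limsup_{n \to \infty} \frac{\sqrt{n}\,\|\widehat{F}_n - F\|_\infty}{\sqrt{2 \ln \ln n}} \le \frac{1}{2}, \quad \text{a.s.}$$

and

$$\liminf_{n \to \infty} \sqrt{2 n \ln \ln n}\ \|\widehat{F}_n - F\|_\infty = \frac{\pi}{2}, \quad \text{a.s.}$$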

Confidence intervals

Empirical CDF, CDF and confidence interval plots for various sample sizes of normal distribution
Empirical CDF, CDF and confidence interval plots for various sample sizes of Cauchy distribution
Empirical CDF, CDF and confidence interval plots for various sample sizes of triangle distribution

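By the Dvoretzky–Kiefer–Wolfowitz inequality, the interval that contains the true CDF, F(x), with probability at least 1 − α is specified as

$$\widehat{F}_n(x) - \varepsilon \le F(x) \le \widehat{F}_n(x) + \varepsilon, \quad \text{where } \varepsilon = \sqrt{\frac{\ln(2/\alpha)}{2n}}.$$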

Using these bounds, the empirical CDF, the true CDF and the confidence intervals can be plotted for different distributions with any of the statistical implementations listed below.
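The band itself needs nothing beyond the sample size and the chosen level. The following is a minimal sketch with the standard library, using the half-width ε = sqrt(ln(2/α)/(2n)) that results from setting the Dvoretzky–Kiefer–Wolfowitz tail bound equal to α. The name `dkw_band` is illustrative, and clipping the band to [0, 1] is an optional convenience added here, justified only by the fact that any CDF lies in that range.

```python
import math
from bisect import bisect_right

def dkw_band(sample, alpha=0.05):
    """Return (eps, band), where band(t) gives the (lower, upper) limits at t."""
    data = sorted(sample)
    n = len(data)
    if n == 0:
        raise ValueError("sample must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Half-width obtained by solving 2 * exp(-2 * n * eps^2) = alpha for eps
    eps = math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

    def band(t):
        f = bisect_right(data, t) / n  # empirical CDF at t
        return max(0.0, f - eps), min(1.0, f + eps)

    return eps, band

# Illustrative data: the integers 0..99, so n = 100
eps, band = dkw_band(list(range(100)), alpha=0.05)
print(eps)
print(band(49))
```

The half-width does not depend on t or on the underlying distribution, so the same ε applies to the normal, Cauchy and triangular examples shown in the figures; it shrinks at the rate 1/sqrt(n).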

Statistical implementation

A non-exhaustive list of software implementations of the empirical distribution function includes:

  • R: the ecdf function computes an empirical cumulative distribution function, with several methods for plotting, printing and computing with such an “ecdf” object.
  • MATLAB: the empirical cumulative distribution function (cdf) plot.
  • JMP from SAS: the CDF plot creates a plot of the empirical cumulative distribution function.
  • Minitab: create an empirical CDF.
  • Mathwave: fit a probability distribution to the data.
  • Dataplot: plot an empirical CDF.
  • SciPy: the scipy.stats.ecdf function.
  • Statsmodels: the statsmodels.distributions.empirical_distribution.ECDF class.
  • Matplotlib: the matplotlib.pyplot.ecdf function (new in version 3.8.0).[7]
  • Seaborn: the seaborn.ecdfplot function.
  • Plotly: the plotly.express.ecdf function.
  • Excel: plot an empirical CDF.
  • ArviZ: the az.plot_ecdf function.

References

  1. ^ Dekking, Michel (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How. London: Springer. p. 219. ISBN 978-1-85233-896-1. OCLC 262680588.
  2. ^ a b c van der Vaart, A.W. (1998). Asymptotic statistics. Cambridge University Press. p. 265. ISBN 0-521-78450-6.
  3. ^ Coles, S. (2001) An Introduction to Statistical Modeling of Extreme Values. Springer, p. 36, Definition 2.4. ISBN 978-1-4471-3675-0.
  4. ^ Madsen, H.O., Krenk, S., Lind, S.C. (2006). Methods of Structural Safety. Dover Publications. pp. 148–149. ISBN 0486445976.
  5. ^ a b van der Vaart, A.W. (1998). Asymptotic statistics. Cambridge University Press. p. 266. ISBN 0-521-78450-6.
  6. ^ a b c van der Vaart, A.W. (1998). Asymptotic statistics. Cambridge University Press. p. 268. ISBN 0-521-78450-6.
  7. ^ "What's new in Matplotlib 3.8.0 (Sept 13, 2023) — Matplotlib 3.8.3 documentation".
