Multivariate statistics

Multivariate statistics is a subdivision of statistics encompassing the simultaneous observation and analysis of more than one outcome variable, i.e., multivariate random variables. The field is concerned with the aims and background of each form of multivariate analysis and with how these forms relate to one another. The practical application of multivariate statistics to a particular problem may involve several types of univariate and multivariate analyses in order to understand the relationships between variables and their relevance to the problem being studied.

In addition, multivariate statistics is concerned with multivariate probability distributions, in terms of both

  • how these can be used to represent the distributions of observed data;
  • how they can be used as part of statistical inference, particularly where several different quantities are of interest to the same analysis.

Certain types of problems involving multivariate data, for example simple linear regression and multiple regression, are not usually considered to be special cases of multivariate statistics because the analysis is dealt with by considering the (univariate) conditional distribution of a single outcome variable given the other variables.

Multivariate analysis

Multivariate analysis (MVA) is based on the principles of multivariate statistics. Typically, MVA is used to address situations where multiple measurements are made on each experimental unit and the relations among these measurements and their structures are important.[1] A modern, overlapping categorization of MVA includes:[1]

  • Normal and general multivariate models and distribution theory
  • The study and measurement of relationships
  • Probability computations of multidimensional regions
  • The exploration of data structures and patterns

Multivariate analysis can be complicated by the desire to include physics-based analysis to calculate the effects of variables for a hierarchical "system-of-systems". Often, studies that wish to use multivariate analysis are stalled by the dimensionality of the problem. These concerns are often eased through the use of surrogate models, highly accurate approximations of the physics-based code. Since surrogate models take the form of an equation, they can be evaluated very quickly. This becomes an enabler for large-scale MVA studies: while a Monte Carlo simulation across the design space is difficult with physics-based codes, it becomes trivial when evaluating surrogate models, which often take the form of response-surface equations.
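The surrogate-model idea can be sketched in a few lines. The example below is a minimal illustration, not a description of any particular tool: `expensive_model` is a hypothetical stand-in for a slow physics-based code, and it is deliberately chosen to be exactly quadratic so that the fitted response surface reproduces it; a real code would only be approximated. The design grid, the sample size and all function names are assumptions made for the example.

```python
import random

def expensive_model(x, y):
    """Hypothetical stand-in for a slow physics-based code (exactly quadratic here)."""
    return 1.0 + 2.0 * x - y + 0.5 * x * y + x * x + 0.25 * y * y

def features(x, y):
    """Terms of a full second-order response surface in two inputs."""
    return [1.0, x, y, x * y, x * x, y * y]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_response_surface(points):
    """Least-squares fit via the normal equations (X'X) beta = X'y."""
    rows = [features(x, y) for x, y in points]
    targets = [expensive_model(x, y) for x, y in points]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * t for r, t in zip(rows, targets)) for a in range(k)]
    return solve(xtx, xty)

def surrogate(beta, x, y):
    """Evaluate the fitted response-surface equation."""
    return sum(b * f for b, f in zip(beta, features(x, y)))

# Nine runs of the "expensive" code on a 3 x 3 grid are enough to fit six coefficients.
design = [(x, y) for x in (-1.0, 0.0, 1.0) for y in (-1.0, 0.0, 1.0)]
beta = fit_response_surface(design)

# Monte Carlo over the design space now only evaluates the cheap equation.
rng = random.Random(0)
samples = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(10000)]
mc_mean = sum(surrogate(beta, x, y) for x, y in samples) / len(samples)
```

Only the nine design points call `expensive_model`; the 10,000 Monte Carlo draws use the fitted equation. In practice the fit would normally be delegated to a numerical library, and the surrogate's accuracy would need to be checked against held-out runs of the original code.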

Types of analysis

Many different models are used in MVA, each with its own type of analysis:

  1. Multivariate analysis of variance (MANOVA) extends the analysis of variance to cover cases where there is more than one dependent variable to be analyzed simultaneously; see also Multivariate analysis of covariance (MANCOVA).
  2. Multivariate regression attempts to determine a formula that can describe how elements in a vector of variables respond simultaneously to changes in others. For linear relations, regression analyses here are based on forms of the general linear model. Some suggest that multivariate regression is distinct from multivariable regression; however, that distinction is debated and is not applied consistently across scientific fields.[2]
  3. Principal components analysis (PCA) creates a new set of orthogonal variables that contain the same information as the original set. It rotates the axes of variation to give a new set of orthogonal axes, ordered so that they summarize decreasing proportions of the variation.
  4. Factor analysis is similar to PCA but allows the user to extract a specified number of synthetic variables, fewer than the original set, leaving the remaining unexplained variation as error. The extracted variables are known as latent variables or factors; each one may be supposed to account for covariation in a group of observed variables.
  5. Canonical correlation analysis finds linear relationships between two sets of variables; it is the generalised (i.e. canonical) version of bivariate[3] correlation.
  6. Redundancy analysis (RDA) is similar to canonical correlation analysis but allows the user to derive a specified number of synthetic variables from one set of (explanatory) variables that explain as much variance as possible in another (response) set. It is a multivariate analogue of regression.[4]
  7. Correspondence analysis (CA), or reciprocal averaging, finds (like PCA) a set of synthetic variables that summarise the original set. The underlying model assumes chi-squared dissimilarities among records (cases).
  8. Canonical (or "constrained") correspondence analysis (CCA) summarises the joint variation in two sets of variables (like redundancy analysis); it combines correspondence analysis and multivariate regression analysis. The underlying model assumes chi-squared dissimilarities among records (cases).
  9. Multidimensional scaling comprises various algorithms to determine a set of synthetic variables that best represent the pairwise distances between records. The original method is principal coordinates analysis (PCoA; based on PCA).
  10. Discriminant analysis, or canonical variate analysis, attempts to establish whether a set of variables can be used to distinguish between two or more groups of cases.
  11. Linear discriminant analysis (LDA) computes a linear predictor from two sets of normally distributed data to allow for classification of new observations.
  12. Clustering systems assign objects into groups (called clusters) so that objects (cases) from the same cluster are more similar to each other than objects from different clusters.
  13. Recursive partitioning creates a decision tree that attempts to correctly classify members of the population based on a dichotomous dependent variable.
  14. Artificial neural networks extend regression and clustering methods to non-linear multivariate models.
  15. Statistical graphics such as tours, parallel coordinate plots, and scatterplot matrices can be used to explore multivariate data.
  16. Simultaneous equations models involve more than one regression equation, with different dependent variables, estimated together.
  17. Vector autoregression involves simultaneous regressions of various time series variables on their own and each other's lagged values.
  18. Principal response curves analysis (PRC) is a method based on RDA that allows the user to focus on treatment effects over time by correcting for changes in control treatments over time.[5]
  19. Iconography of correlations consists of replacing a correlation matrix with a diagram in which the “remarkable” correlations are represented by a solid line (positive correlation) or a dotted line (negative correlation).
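As a rough illustration of item 3 in the list above, the first principal component can be found as the leading eigenvector of the sample covariance matrix. The sketch below uses plain power iteration, which is only one of several ways to compute it; the toy data set, the starting vector and the function names are assumptions made for the example, and a library routine would normally be used instead.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
            for a in range(p)]

def leading_component(cov, iterations=200):
    """Power iteration: returns the leading unit eigenvector and its eigenvalue.

    Assumes the starting vector is not orthogonal to the leading eigenvector.
    """
    p = len(cov)
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the variance along the component.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    return v, eigenvalue

# Toy data: two positively correlated variables.
data = [[2.0, 1.0], [-2.0, -1.0], [1.0, 2.0], [-1.0, -2.0]]
cov = covariance_matrix(data)
direction, variance = leading_component(cov)
total_variance = sum(cov[j][j] for j in range(len(cov)))
explained = variance / total_variance  # share of total variation on the first axis
```

For this toy data the first axis points along the diagonal of the two variables and carries nine tenths of the total variance, which matches the description of PCA as a rotation to axes ordered by decreasing variation.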

Dealing with incomplete data

In experimentally acquired data, the values of some components of a data point are often missing. Rather than discarding the whole data point, it is common to "fill in" values for the missing components, a process called "imputation".[6]
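The simplest form of imputation replaces each missing value with the mean of the observed values of the same variable. The sketch below shows only that basic idea; the function name and the use of `None` to mark a missing component are assumptions made for the example, and more principled methods are generally preferred because mean imputation understates variability.

```python
from statistics import fmean

def impute_column_means(rows):
    """Return a copy of rows with each None replaced by its column's observed mean.

    Raises statistics.StatisticsError if a column has no observed values.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(fmean(observed))
    return [[means[j] if value is None else value for j, value in enumerate(row)]
            for row in rows]

# Four data points with three components each; None marks a missing component.
data = [
    [1.0, 10.0, None],
    [2.0, None, 6.0],
    [3.0, 14.0, 8.0],
    [None, 12.0, 10.0],
]
completed = impute_column_means(data)
```

All four data points are retained: the missing entries are filled with the column means 2.0, 12.0 and 8.0 respectively, whereas discarding incomplete points would have left only one.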

Important probability distributions

There is a set of probability distributions used in multivariate analyses that plays a similar role to the corresponding set of distributions that are used in univariate analysis when the normal distribution is appropriate to a dataset. These multivariate distributions are:

  • Multivariate normal distribution
  • Wishart distribution
  • Multivariate Student-t distribution

The Inverse-Wishart distribution is important in Bayesian inference, for example in Bayesian multivariate linear regression. Additionally, Hotelling's T-squared distribution is a multivariate distribution, generalising Student's t-distribution, that is used in multivariate hypothesis testing.
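To make the role of Hotelling's T-squared concrete, the sketch below computes the one-sample statistic T² = n (x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀) for bivariate data, together with the scaled value (n − p) / (p (n − 1)) · T², which follows an F distribution with p and n − p degrees of freedom under the null hypothesis when the data are multivariate normal. The data, the hypothesised mean and the function name are assumptions made for the example, and the code is restricted to p = 2 so that the matrix inverse can be written in closed form.

```python
def hotelling_t2_one_sample(sample, mu0):
    """One-sample Hotelling T^2 and its F-scaled value for bivariate data (p = 2)."""
    n = len(sample)
    p = 2
    mean = [sum(row[j] for row in sample) / n for j in range(p)]

    def cov(a, b):
        # Unbiased sample covariance between components a and b (divisor n - 1).
        return sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in sample) / (n - 1)

    s00, s01, s11 = cov(0, 0), cov(0, 1), cov(1, 1)
    det = s00 * s11 - s01 * s01
    d0, d1 = mean[0] - mu0[0], mean[1] - mu0[1]
    # Quadratic form d' S^{-1} d using the closed-form inverse of a 2 x 2 matrix.
    quad = (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det
    t2 = n * quad
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat

sample = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
t2, f_stat = hotelling_t2_one_sample(sample, mu0=[2.0, 2.0])
```

With a single variable (p = 1) the same formula reduces to the square of the usual one-sample t statistic, which is the sense in which the distribution generalises Student's t. Turning `f_stat` into a p-value requires the F distribution function, which is not computed here.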

History

Anderson's 1958 textbook, An Introduction to Multivariate Statistical Analysis,[7] educated a generation of theorists and applied statisticians; Anderson's book emphasizes hypothesis testing via likelihood ratio tests and the properties of power functions: admissibility, unbiasedness and monotonicity.[8][9]

MVA was formerly discussed mainly in the context of statistical theory, because the size and complexity of the underlying datasets made it computationally expensive. With the dramatic growth of computational power, MVA now plays an increasingly important role in data analysis and has wide application in omics fields.

Software and tools

A large number of software packages and other tools are available for multivariate analysis; CRAN, for example, has details on the R packages available for multivariate data analysis.[10]

References

  1. Olkin, I.; Sampson, A. R. (2001-01-01), "Multivariate Analysis: Overview", in Smelser, Neil J.; Baltes, Paul B. (eds.), International Encyclopedia of the Social & Behavioral Sciences, Pergamon, pp. 10240–10247, ISBN 9780080430768, retrieved 2019-09-02
  2. Hidalgo, B; Goodman, M (2013). "Multivariate or multivariable regression?". Am J Public Health. 103 (1): 39–40. doi:10.2105/AJPH.2012.300897. PMC 3518362. PMID 23153131.
  3. Unsophisticated analysts of bivariate Gaussian problems may find useful a crude but accurate method of gauging probability by simply taking the sum S of the N residuals' squares, subtracting the sum Sm at minimum, dividing this difference by Sm, multiplying the result by (N - 2) and taking the inverse anti-ln of half that product.
  4. Van Den Wollenberg, Arnold L. (1977). "Redundancy analysis an alternative for canonical correlation analysis". Psychometrika. 42 (2): 207–219. doi:10.1007/BF02294050.
  5. ter Braak, Cajo J.F. & Šmilauer, Petr (2012). Canoco reference manual and user's guide: software for ordination (version 5.0), p. 292. Microcomputer Power, Ithaca, NY.
  6. Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. Chapman & Hall/CRC. ISBN 978-1-4398-2186-2.
  7. Anderson, T. W. (1958). An Introduction to Multivariate Statistical Analysis. New York: Wiley. ISBN 0471026409; 2e (1984) ISBN 0471889873; 3e (2003) ISBN 0471360910.
  8. Sen, Pranab Kumar; Anderson, T. W.; Arnold, S. F.; Eaton, M. L.; Giri, N. C.; Gnanadesikan, R.; Kendall, M. G.; Kshirsagar, A. M.; et al. (June 1986). "Review: Contemporary Textbooks on Multivariate Statistical Analysis: A Panoramic Appraisal and Critique". Journal of the American Statistical Association. 81 (394): 560–564. doi:10.2307/2289251. ISSN 0162-1459. JSTOR 2289251. (Pages 560–561)
  9. Schervish, Mark J. (November 1987). "A Review of Multivariate Analysis". Statistical Science. 2 (4): 396–413. doi:10.1214/ss/1177013111. ISSN 0883-4237. JSTOR 2245530.
  10. CRAN has details on the packages available for multivariate data analysis.

Further reading

  • Johnson, Richard A.; Wichern, Dean W. (2007). Applied Multivariate Statistical Analysis (Sixth ed.). Prentice Hall. ISBN 978-0-13-187715-3.
  • Mardia, K. V.; Kent, J. T.; Bibby, J. M. (1979). Multivariate Analysis. Academic Press. ISBN 978-0124712522. (M.A. level "likelihood" approach)
  • Sen, A.; Srivastava, M. (2011). Regression Analysis: Theory, Methods, and Applications (4th printing). Berlin: Springer-Verlag.
  • Cook, Swayne (2007). Interactive Graphics for Data Analysis.
  • Malakooti, B. (2013). Operations and Production Systems with Multiple Objectives. John Wiley & Sons.
  • Anderson, T. W. (1958). An Introduction to Multivariate Statistical Analysis. New York: Wiley.
  • Feinstein, A. R. (1996) Multivariable Analysis. New Haven, CT: Yale University Press.
  • Hair, J. F. Jr. (1995) Multivariate Data Analysis with Readings, 4th ed. Prentice-Hall.
  • Schafer, J. L. (1997) Analysis of Incomplete Multivariate Data. CRC Press. (Advanced)
  • Sharma, S. (1996) Applied Multivariate Techniques. Wiley. (Informal, applied)
  • Izenman, Alan J. (2008). Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning. Springer Texts in Statistics. New York: Springer-Verlag. ISBN 9780387781884.
  • Tinsley, Howard E. A.; Brown, Steven D., eds. (2000). Handbook of Applied Multivariate Statistics and Mathematical Modeling. Academic Press. doi:10.1016/B978-0-12-691360-6.X5000-9. ISBN 978-0-12-691360-6.
