Share to: share facebook share twitter share wa share telegram print page

Binary data

Binary data is data whose unit can take on only two possible states. These are often labelled as 0 and 1 in accordance with the binary numeral system and Boolean algebra.

Binary data occurs in many different technical and scientific fields, where it can be called by different names including bit (binary digit) in computer science, truth value in mathematical logic and related domains and binary variable in statistics.

Mathematical and combinatoric foundations

A discrete variable that can take only one state contains zero information, and 2 is the next natural number after 1. That is why the bit, a variable with only two possible values, is a standard primary unit of information.

A collection of n bits may have 2n states: see binary number for details. Number of states of a collection of discrete variables depends exponentially on the number of variables, and only as a power law on number of states of each variable. Ten bits have more (1024) states than three decimal digits (1000). 10k bits are more than sufficient to represent an information (a number or anything else) that requires 3k decimal digits, so information contained in discrete variables with 3, 4, 5, 6, 7, 8, 9, 10... states can be ever superseded by allocating two, three, or four times more bits. So, the use of any other small number than 2 does not provide an advantage.

A Hasse diagram: representation of a Boolean algebra as a directed graph

Moreover, Boolean algebra provides a convenient mathematical structure for collection of bits, with a semantic of a collection of propositional variables. Boolean algebra operations are known as "bitwise operations" in computer science. Boolean functions are also well-studied theoretically and easily implementable, either with computer programs or by so-named logic gates in digital electronics. This contributes to the use of bits to represent different data, even those originally not binary.

In statistics

In statistics, binary data is a statistical data type consisting of categorical data that can take exactly two possible values, such as "A" and "B", or "heads" and "tails". It is also called dichotomous data, and an older term is quantal data.[1] The two values are often referred to generically as "success" and "failure".[1] As a form of categorical data, binary data is nominal data, meaning the values are qualitatively different and cannot be compared numerically. However, the values are frequently represented as 1 or 0, which corresponds to counting the number of successes in a single trial: 1 (success…) or 0 (failure); see § Counting.

Often, binary data is used to represent one of two conceptually opposed values, e.g.:

  • the outcome of an experiment ("success" or "failure")
  • the response to a yes–no question ("yes" or "no")
  • presence or absence of some feature ("is present" or "is not present")
  • the truth or falsehood of a proposition ("true" or "false", "correct" or "incorrect")

However, it can also be used for data that is assumed to have only two possible values, even if they are not conceptually opposed or conceptually represent all possible values in the space. For example, binary data is often used to represent the party choices of voters in elections in the United States, i.e. Republican or Democratic. In this case, there is no inherent reason why only two political parties should exist, and indeed, other parties do exist in the U.S., but they are so minor that they are generally simply ignored. Modeling continuous data (or categorical data of more than 2 categories) as a binary variable for analysis purposes is called dichotomization (creating a dichotomy). Like all discretization, it involves discretization error, but the goal is to learn something valuable despite the error: treating it as negligible for the purpose at hand, but remembering that it cannot be assumed to be negligible in general.

Binary variables

A binary variable is a random variable of binary type, meaning with two possible values. Independent and identically distributed (i.i.d.) binary variables follow a Bernoulli distribution, but in general binary data need not come from i.i.d. variables. Total counts of i.i.d. binary variables (equivalently, sums of i.i.d. binary variables coded as 1 or 0) follow a binomial distribution, but when binary variables are not i.i.d., the distribution need not be binomial.

Counting

Like categorical data, binary data can be converted to a vector of count data by writing one coordinate for each possible value, and counting 1 for the value that occurs, and 0 for the value that does not occur.[2] For example, if the values are A and B, then the data set A, A, B can be represented in counts as (1, 0), (1, 0), (0, 1). Once converted to counts, binary data can be grouped and the counts added. For instance, if the set A, A, B is grouped, the total counts are (2, 1): 2 A's and 1 B (out of 3 trials).

Since there are only two possible values, this can be simplified to a single count (a scalar value) by considering one value as "success" and the other as "failure", coding a value of the success as 1 and of the failure as 0 (using only the coordinate for the "success" value, not the coordinate for the "failure" value). For example, if the value A is considered "success" (and thus B is considered "failure"), the data set A, A, B would be represented as 1, 1, 0. When this is grouped, the values are added, while the number of trial is generally tracked implicitly. For example, A, A, B would be grouped as 1 + 1 + 0 = 2 successes (out of trials). Going the other way, count data with is binary data, with the two classes being 0 (failure) or 1 (success).

Counts of i.i.d. binary variables follow a binomial distribution, with the total number of trials (points in the grouped data).

Regression

Regression analysis on predicted outcomes that are binary variables is known as binary regression; when binary data is converted to count data and modeled as i.i.d. variables (so they have a binomial distribution), binomial regression can be used. The most common regression methods for binary data are logistic regression, probit regression, or related types of binary choice models.

Similarly, counts of i.i.d. categorical variables with more than two categories can be modeled with a multinomial regression. Counts of non-i.i.d. binary data can be modeled by more complicated distributions, such as the beta-binomial distribution (a compound distribution). Alternatively, the relationship can be modeled without needing to explicitly model the distribution of the output variable using techniques from generalized linear models, such as quasi-likelihood and a quasibinomial model; see Overdispersion § Binomial.

In computer science

A binary image of a QR code, representing 1 bit per pixel, as opposed to a typical 24-bit true color image.

In modern computers, binary data refers to any data represented in binary form rather than interpreted on a higher level or converted into some other form. At the lowest level, bits are stored in a bistable device such as a flip-flop. While most binary data has symbolic meaning (except for don't cares) not all binary data is numeric. Some binary data corresponds to computer instructions, such as the data within processor registers decoded by the control unit along the fetch-decode-execute cycle. Computers rarely modify individual bits for performance reasons. Instead, data is aligned in groups of a fixed number of bits, usually 1 byte (8 bits). Hence, "binary data" in computers are actually sequences of bytes. On a higher level, data is accessed in groups of 1 word (4 bytes) for 32-bit systems and 2 words for 64-bit systems.

In applied computer science and in the information technology field, the term binary data is often specifically opposed to text-based data, referring to any sort of data that cannot be interpreted as text. The "text" vs. "binary" distinction can sometimes refer to the semantic content of a file (e.g. a written document vs. a digital image). However, it often refers specifically to whether the individual bytes of a file are interpretable as text (see character encoding) or cannot so be interpreted. When this last meaning is intended, the more specific terms binary format and text(ual) format are sometimes used. Semantically textual data can be represented in binary format (e.g. when compressed or in certain formats that intermix various sorts of formatting codes, as in the doc format used by Microsoft Word); contrarily, image data is sometimes represented in textual format (e.g. the X PixMap image format used in the X Window System).

1 and 0 are nothing but just two different voltage levels. You can make the computer understand 1 for higher voltage and 0 for lower voltage. There are many different ways to store two voltage levels. If you have seen floppy, then you will find a magnetic tape that has a coating of ferromagnetic material, this is a type of paramagnetic material that has domains aligned in a particular direction to give a remnant magnetic field even after removal of currents through materials or magnetic field. During loading of data in the magnetic tape, the magnetic field is passed in one direction to call the saved orientation of the domain 1 and for the magnetic field is passed in another direction, then the saved orientation of the domain is 0. In this way, generally, 1 and 0 data are stored.[3]

See also

References

  1. ^ a b Collett 2002, p. 1.
  2. ^ Agresti, Alan (2012). "1.2.2 Multinomial Distribution". Categorical Data Analysis (3rd ed.). Wiley. p. 6. ISBN 978-0470463635.
  3. ^ Gul, Najam (2022-08-18). "How do different types of Data get stored in form of 0 and 1?". Curiosity Tea. Retrieved 2023-01-05.
  • Collett, David (2002). Modelling Binary Data (Second ed.). CRC Press. ISBN 9781420057386.

Read other articles:

Laurenzana Entidad subnacional LaurenzanaLocalización de Laurenzana en Italia Coordenadas 40°27′00″N 15°58′00″E / 40.45, 15.966666666667Capital LaurenzanaIdioma oficial ItalianoEntidad Comuna de Italia • País Italia • Región Basilicata • Provincia PotenzaMunicipios limítrofes Anzi, Calvello, Castelmezzano, Corleto Perticara, Pietrapertosa, ViggianoSuperficie   • Total 95 km²Altitud   • Media 850 m s. n. m.Población&#...

This article includes a list of general references, but it lacks sufficient corresponding inline citations. Please help to improve this article by introducing more precise citations. (November 2021) (Learn how and when to remove this template message)Official logotype of the engineering duty officer community An engineering duty officer (EDO) is a restricted line officer in the United States Navy, involved with the design, acquisition, construction, repair, maintenance, conversion, overhaul a...

2019 single by James Blake featuring Travis Scott and Metro BoominMile HighSingle by James Blake featuring Travis Scott and Metro Boominfrom the album Assume Form Released17 January 2019GenreTrapLength3:13LabelPolydorSongwriter(s) James Blake Jacques Webster II Leland Wayne Producer(s) James Blake Metro Boomin Dre Moon Wavy Dan Foat James Blake singles chronology Don't Miss It (2018) Mile High (2019) Lullaby for My Insomniac (2019) Travis Scott singles chronology Yosemite(2018) Mile H...

Schlacht bei Siegburg Teil von: Erster Koalitionskrieg Datum 1. Juni 1796 Ort Siegburg Ausgang Französischer Sieg Konfliktparteien Frankreich 1804 Frankreich Habsburgermonarchie Österreich Befehlshaber Frankreich 1804 Obergeneral Jean-Baptiste Kléber Habsburgermonarchie Prinz August von Württemberg Truppenstärke ca. 20.000 ca. 7.000 Verluste n.B. 2400 Schlachten und Belagerungen des Ersten Koalitionskrieges (1792–1797) 1792 Verdun – Thionville – Valmy – Lille – Mainz ...

بيوضة الغربية  - منطقة سكنية -  تقسيم إداري البلد الأردن  المحافظة محافظة البلقاء لواء لواء قصبة السلط قضاء قضاء العارضة السكان التعداد السكاني 461 نسمة (إحصاء 2015)   • الذكور 227   • الإناث 234   • عدد الأسر 94 معلومات أخرى التوقيت ت ع م+02:00  تعديل مصدري - تعدي...

Iranian poet and scholar This article includes a list of general references, but it lacks sufficient corresponding inline citations. Please help to improve this article by introducing more precise citations. (May 2017) (Learn how and when to remove this template message) Mohammad-Taqi Baharمحمدتقی بهارBorn10 December 1886Mashhad, IranDied22 April 1951(1951-04-22) (aged 64)Tehran, IranOccupationPoet, politician and journalistLiterary movementPersian literatureNotable worksTāri...

أرمني يهودي. تاريخ اليهود في أرمينياالتعداد الكليالتعداد 300–500[1]مناطق الوجود المميزةالبلد أرمينيا اللغات العبرية، الأرمنيةالدين اليهودية، أقلية ملحدة ومسيحيةالمجموعات العرقية المرتبطةفرع من يهود مجموعات ذات علاقة يهود آخرون (أشكناز، سفارديون، مزراحيون)، يهود ألم...

This article is about the local government area. For the regional town, see Cowra, New South Wales. Local government area in New South Wales, AustraliaCowra ShireNew South WalesShire offices in CowraLocation in New South WalesCoordinates33°49′S 148°44′E / 33.817°S 148.733°E / -33.817; 148.733Population 12,460 (2016 census)[1] 12,767 (2018 est.)[2] • Density4.434/km2 (11.484/sq mi)Established1980Area2,810 km2 (1,084.9&#...

هذه المقالة يتيمة إذ تصل إليها مقالات أخرى قليلة جدًا. فضلًا، ساعد بإضافة وصلة إليها في مقالات متعلقة بها. (أكتوبر 2022) ديف فيلب معلومات شخصية الميلاد 8 يوليو 1960 (63 سنة)  مركز اللعب حارس مرمى  الجنسية المملكة المتحدة  معلومات النادي النادي الحالي St Blazey A.F.C. [الإنجليزي...

Airport in Russia This article has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these template messages) This article may need to be rewritten to comply with Wikipedia's quality standards. You can help. The talk page may contain suggestions. (February 2014) This article needs to be updated. Please help update this article to reflect recent events or newly available information. (June 2022) (Learn how and when to remove this te...

Masahiro Usui碓井 将大PekerjaanAktorTahun aktif2007—sekarangSitus webhttp://www.watanabepro.co.jp/usuimasahiro/ Masahiro Usui (碓井 将大code: ja is deprecated , Usui Masahiro, lahir 3 Desember 1991) adalah seorang aktor asal Jepang. Dia mulai berkarier di dunia film sejak tahun 2007, dan dia dikenal dengan peran-perannya dalam serial tokusatsu dan drama: sebagai Hant Jou / Go-on Green dalam serial Super Sentai Engine Sentai Go-onger. Dia juga sering memainkan peran-peran kecil ...

Бетон — термін, який має кілька значень. Ця сторінка значень містить посилання на статті про кожне з них.Якщо ви потрапили сюди за внутрішнім посиланням, будь ласка, поверніться та виправте його так, щоб воно вказувало безпосередньо на потрібну статтю.@ пошук посилань сам...

Ini adalah nama Batak Toba, marganya adalah Sinaga. Reynhard SinagaFoto Reynhard yang dirilis oleh kepolisian InggrisLahirReynhard Tambos Maruli Tua Sinaga[1]19 Februari 1983 (umur 40)Jambi, IndonesiaAlmamaterUniversitas IndonesiaUniversitas ManchesterUniversitas LeedsHukuman kriminalPenjara seumur hidup (88 hukuman seumur hidup bersamaan)Orang tuaSaibun Sinaga (ayah)Normawati Silaen (ibu)AlasanPemerkosaan, serangan seksual, percobaan pemerkosaan, dan serangan seksual dengan pene...

Railway station in Hamburg, Germany ReeperbahnHamburg S-Bahn stationReeperbahn stationGeneral informationLocationReeperbahn20359 Hamburg, GermanyCoordinates53°32′58″N 9°57′21″E / 53.54944°N 9.95583°E / 53.54944; 9.95583Owned byHamburger HochbahnPlatforms1 island platformTracks2 side tracksConstructionStructure typeUndergroundPlatform levels2AccessibleNoOther informationStation codeS-Bahn: ds100: ARESDB station code: 5167Type: HpCategory: 4[1]Fare zo...

Edward R. Roybal Edward Ross Roybal (* 10. Februar 1916 in Albuquerque, New Mexico; † 24. Oktober 2005 in Pasadena, Kalifornien) war ein US-amerikanischer Politiker. Er vertrat den Bundesstaat Kalifornien von 1963 bis 1993 im US-Repräsentantenhaus als erster kalifornischer „Hispanic“ im Kongress seit 1879. Dort trat er vor allem für die Rechte der ethnischen Minderheiten in den USA ein. Edward Roybal wurde in New Mexico geboren und zog 1922 mit seinen Eltern nach Los Angeles, wo er di...

Азербайджано-ангольские отношения Ангола Азербайджан Посольство Анголы в Азербайджане Посол Аугусто да Силва Кунья[1] Адрес Москва, ул.Улофа Пальме, 6 Посольство Азербайджана в Анголе Адрес ЮАР Прочее Установлены 1 декабря 1994 Товарооборот 447 тыс долл. (2022)  Медиафай...

Abu Bakr al-Baghdadi أبو بكر البغداديإبراهيم عواد إبراهيم علي محمد البدري السامرائيAl-Baghdadi di kamp Bucca Irak 2004.Khalifah Negara Islam Irak dan Syam ke-1[1]Masa jabatan7 April 2013 – 31 Oktober 2019PenggantiAbu Ibrahim al-Hashimi al-QurashiAmir Negara Islam IrakMasa jabatan18 April 2010 – 7 April 2013PendahuluAbu Omar al-BaghdadiPenggantidirinya sendiri (sebagai Khalifah) Informasi pribadiLahirIbrahim Aw...

لأماكن أخرى بنفس الاسم، انظر الجرب (توضيح). قرية الجرب  - قرية -  تقسيم إداري البلد  اليمن المحافظة محافظة المحويت المديرية مديرية ملحان العزلة عزلة جبع السكان التعداد السكاني 2004 السكان 178   • الذكور 83   • الإناث 95   • عدد الأسر 46   • عدد المساكن 47 معلوما...

Vitthhal-Rukmini Temple in Pandharpur, Maharashtra, India This article needs additional citations for verification. Please help improve this article by adding citations to reliable sources. Unsourced material may be challenged and removed.Find sources: Vithoba Temple – news · newspapers · books · scholar · JSTOR (April 2015) (Learn how and when to remove this template message) The neutrality of the style of writing in this article is disputed. Please d...

National Football League franchise in New Orleans, Louisiana New Orleans Saints Current seasonEstablished November 1, 1966; 57 years ago (1966-11-01)[1]First season: 1967Play in Caesars SuperdomeNew Orleans, LouisianaHeadquartered in Metairie, Louisiana New Orleans Saints logoNew Orleans Saints wordmarkLogoWordmarkLeague/conference affiliations National Football League (1967–present) Eastern Conference (1967–1969) Capitol Division (1967; 1969) Century Divisio...

Kembali kehalaman sebelumnya