Binary Data Basics
– A discrete variable with only one state contains zero information.
– The bit, with two possible values, is the standard unit of information.
– The number of states in a collection of n bits is 2^n.
– The number of states in a collection of discrete variables depends exponentially on the number of variables.
– Ten bits have more states (1024) than three decimal digits (1000).
– Binary data consists of categorical data with two possible values.
– Binary data is often used to represent the outcome of an experiment or a yes-no question.
– Binary data is nominal data: its two values are qualitatively different and cannot be compared numerically.
– Binary data can also represent the presence or absence of a feature.
– Binary data can also represent data assumed to have only two possible values, such as voters' party choices in an election where only two parties are considered.
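The counting claims above can be checked with a few lines of Python. This is a minimal sketch: the function names `state_count` and `information_bits` are illustrative, and the information measure assumes all states are equally likely.

```python
import math


def state_count(num_variables: int, states_per_variable: int = 2) -> int:
    """Number of joint states of `num_variables` discrete variables."""
    return states_per_variable ** num_variables


def information_bits(num_states: int) -> float:
    """Information, in bits, carried by one of `num_states` equally likely states."""
    return math.log2(num_states)


print(state_count(10))      # ten bits: 1024
print(state_count(3, 10))   # three decimal digits: 1000
print(information_bits(1))  # a single-state variable: 0.0
print(information_bits(2))  # one bit: 1.0
```

The exponential growth is visible directly: each added bit doubles the result of `state_count`.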
Binary Variables
– Binary variables have two possible values.
– Each of a set of independent and identically distributed (i.i.d.) binary variables follows a Bernoulli distribution.
– Total counts of i.i.d. binary variables follow a binomial distribution.
– Binary data need not come from i.i.d. variables.
– The distribution of binary variables may not be binomial if they are not i.i.d.
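As an illustration of the i.i.d. case, the sketch below evaluates the Bernoulli and binomial probability mass functions with the standard library only. The function names and the parameter `p` (the probability of the value coded as 1) are assumptions introduced for this example.

```python
from math import comb


def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for a single binary variable with P(X = 1) = p."""
    if x not in (0, 1):
        raise ValueError("a binary variable takes only the values 0 and 1")
    return p if x == 1 else 1.0 - p


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k), where K is the total count of 1s among n i.i.d. binary variables."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


# Distribution of the number of 1s among three fair binary variables.
print([binomial_pmf(k, 3, 0.5) for k in range(4)])  # [0.125, 0.375, 0.375, 0.125]
```

With `n = 1` the binomial distribution reduces to the Bernoulli distribution. If the variables are not i.i.d., `binomial_pmf` no longer describes their total.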
Counting and Conversion of Binary Data
– A binary observation can be converted to count data by recording a count of 1 for the value that occurred and 0 for the value that did not.
– Grouping binary data allows for counting the occurrences of each value.
– Binary data can be simplified to a single count by considering one value as success and the other as failure.
– Conversely, count data with n = 1 (a single trial) is binary data.
– Counts of i.i.d. binary variables follow a binomial distribution.
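The conversions described above can be sketched as follows. The sample responses and the helper names `to_indicator` and `to_count_pair` are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter


def to_indicator(values, success):
    """Code each observation as 1 if it equals `success`, otherwise 0."""
    return [1 if value == success else 0 for value in values]


def to_count_pair(values, success):
    """Summarise binary data as a (successes, failures) pair of counts."""
    successes = sum(to_indicator(values, success))
    return successes, len(values) - successes


responses = ["yes", "no", "yes", "yes", "no"]  # hypothetical yes-no answers

print(to_indicator(responses, "yes"))   # [1, 0, 1, 1, 0]
print(Counter(responses))               # occurrences of each value
print(to_count_pair(responses, "yes"))  # (3, 2)
```

Because the number of trials is known, the success count alone (here 3 out of 5) carries the same information as the pair.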
Binary Regression
– Binary regression is regression analysis in which the predicted outcome is a binary variable.
– Binomial regression can be used when binary data is converted to count data.
– Logistic regression and probit regression are common methods for binary regression.
– Multinomial regression models counts of i.i.d. categorical variables with more than two categories.
– Non-i.i.d. binary data can be modeled using more complex distributions like the beta-binomial distribution.
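To make the idea of binary regression concrete, here is a minimal sketch of logistic regression with one predictor, fitted by plain gradient ascent on the log-likelihood. The data (hours studied against a pass/fail outcome), the learning rate and the step count are all made up for illustration. Practical work would normally rely on a statistics library rather than this hand-written loop.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient ascent
    on the mean log-likelihood. Returns the pair (w, b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals between observed outcomes and current predicted probabilities.
        residuals = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w += learning_rate * grad_w
        b += learning_rate * grad_b
    return w, b


# Hypothetical data: hours studied and whether the exam was passed (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(hours, passed)
print(f"estimated P(pass | 4 hours)   = {sigmoid(w * 4.0 + b):.2f}")
print(f"estimated P(pass | 0.5 hours) = {sigmoid(w * 0.5 + b):.2f}")
```

Probit regression follows the same pattern, with the logistic function replaced by the cumulative distribution function of the standard normal distribution.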
Binary Representation and Formats
– In digital hardware, 1 and 0 correspond to two different voltage levels.
– Typically, a computer interprets the higher voltage as 1 and the lower voltage as 0.
– The two states can be stored in many different ways.
– Magnetic tapes with a coating of ferromagnetic material can store 1 and 0 data.
– The orientation of magnetic domains determines whether it is interpreted as 1 or 0.
– Textual data can be represented in binary format, such as compressed or formatted files.
– Image data can sometimes be represented in textual format, like the X PixMap image format.
– In computing, "binary data" often refers specifically to data kept in binary form rather than interpreted at a higher level, such as text.
– Textual formats may include formatting codes and other text-related elements.
– The choice between binary and textual formats depends on the nature of the data.
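The difference between a textual and a binary representation of the same value can be sketched as follows. The choice of the number 1000, ASCII for the text form, and a big-endian unsigned 16-bit integer for the binary form are assumptions made for this example.

```python
import struct


def to_bit_string(data: bytes) -> str:
    """Render each byte as eight binary digits, separated by spaces."""
    return " ".join(f"{byte:08b}" for byte in data)


number = 1000

as_text = str(number).encode("ascii")  # four character codes: '1', '0', '0', '0'
as_binary = struct.pack(">H", number)  # two bytes, big-endian unsigned 16-bit

print(to_bit_string(as_text))    # 00110001 00110000 00110000 00110000
print(to_bit_string(as_binary))  # 00000011 11101000
```

Both forms are ultimately sequences of bits. The textual form stores the codes of the characters that spell the number, whereas the binary form stores the number's value directly and can only be read back by a program that knows the layout.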
Binary data is data whose unit can take on only two possible states. These are often labelled as 0 and 1 in accordance with the binary numeral system and Boolean algebra.
Binary data occurs in many different technical and scientific fields, where it goes by different names, including bit (binary digit) in computer science, truth value in mathematical logic and related domains, and binary variable in statistics.