Glossary Term

Binary data

Binary Data Basics
- A discrete variable with only one state contains zero information.
- The bit, with two possible values, is the standard unit of information.
- A collection of n bits has 2^n possible states.
- More generally, the number of states in a collection of discrete variables grows exponentially with the number of variables.
- Ten bits have more states (1024) than three decimal digits (1000).
- Binary data is categorical data with two possible values.
- Binary data is often used to represent the outcome of an experiment or the answer to a yes-no question.
- Binary data is nominal data: the two values are qualitatively different and cannot be compared numerically.
- Binary data can also represent the presence or absence of a feature.
- Binary data can represent any choice between two alternatives, such as a voter's party choice in an election between two parties.

Binary Variables
- Binary variables have two possible values.
- Independent and identically distributed (i.i.d.) binary variables each follow a Bernoulli distribution.
- The total count of one value among i.i.d. binary variables follows a binomial distribution.
- Binary data need not come from i.i.d. variables.
- If the variables are not i.i.d., the distribution of their counts need not be binomial.

Counting and Conversion of Binary Data
- Binary data can be converted to count data by treating each observation as a vector of counts, with 1 for the value that occurs and 0 for the value that does not.
- Grouping binary data allows the occurrences of each value to be counted.
- Because the two counts sum to the number of observations, binary data can be simplified to a single count by treating one value as success and the other as failure.
- Count data with n = 1 is binary data.
- Counts of i.i.d. binary variables follow a binomial distribution.

Binary Regression
- Binary regression is regression analysis in which the predicted outcome is a binary variable.
- Binomial regression can be used when binary data is converted to count data.
- Logistic regression and probit regression are common methods for binary regression.
- Multinomial regression models counts of i.i.d. categorical variables with more than two categories.
- Non-i.i.d. binary data can be modeled using more complex distributions, such as the beta-binomial distribution.

Binary Representation and Formats
- In computer hardware, 1 and 0 are commonly represented by two different voltage levels.
- By the usual convention, the higher voltage is read as 1 and the lower voltage as 0.
- Different physical methods can be used to store the two states.
- Magnetic tape with a coating of ferromagnetic material can store 1 and 0 data.
- The orientation of the magnetic domains determines whether a region is read as 1 or 0.
- Textual data can be stored in a binary format, for example in compressed files or in formats that intermix formatting codes.
- Image data can sometimes be represented in a textual format, such as the X PixMap image format.
- In computing, "binary" data usually means data kept in raw binary form rather than interpreted at a higher level, such as text.
- Textual formats may include formatting codes and other text-related elements.
- The choice between binary and textual formats depends on the nature of the data.
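
The counting facts in the basics and conversion sections above can be checked with a short Python sketch. It is illustrative only: the function names (num_states, to_count_vector, group_counts, binomial_pmf) and the sample yes/no answers are assumptions made for this example, not part of any particular library.

```python
from collections import Counter
from math import comb


def num_states(n_bits: int) -> int:
    """Number of distinct states of n_bits binary variables: 2^n."""
    return 2 ** n_bits


def to_count_vector(value, categories):
    """One observation as a vector of counts: 1 for the value that occurs, 0 otherwise."""
    return [1 if value == c else 0 for c in categories]


def group_counts(data, categories):
    """Group binary data and count the occurrences of each value."""
    counts = Counter(data)
    return {c: counts[c] for c in categories}


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(k successes in n i.i.d. Bernoulli(p) trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


if __name__ == "__main__":
    # Ten bits versus three decimal digits.
    print(num_states(10), 10 ** 3)

    # Hypothetical yes/no answers, made up for illustration.
    answers = ["yes", "no", "yes", "yes", "no"]
    categories = ("yes", "no")

    print([to_count_vector(a, categories) for a in answers])

    grouped = group_counts(answers, categories)
    print(grouped)

    # Treat "yes" as success: a single count summarizes the data,
    # because the failures are len(answers) minus the successes.
    successes = grouped["yes"]
    print(successes, len(answers) - successes)

    # With n = 1 the binomial distribution reduces to the Bernoulli case.
    print(binomial_pmf(1, 1, 0.3), binomial_pmf(0, 1, 0.3))
```

The binomial probabilities apply only under the i.i.d. assumption stated above. For data that is not i.i.d., the counts can still be computed the same way, but a different distribution, such as the beta-binomial, would be needed to model them.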