Originally written on July 10, 2014 by Tom Igoe

Last modified on August 27, 2016 by Tom Igoe

*Adapted from Variables*

This tutorial explains how computer programs organize information in computer memory using variables. All computer programming languages use variables to manage memory, so it’s useful to understand this no matter what programming language or computer you’re using. Although the following was written with microcontrollers and physical computing applications in mind, it applies to programming in general.

The programming language examples below use a syntax based on the programming language C. That same syntax is used by many other languages, including Arduino (which is based on C and C++), Java, Processing (which is written in Java), JavaScript, and others.

A computer’s memory is basically a matrix of switches, laid out in a regular grid, not unlike the switches you see on the back of a lot of electronic gear:

Each switch represents the smallest unit of memory, a **bit.** If the switch is on, the bit’s value is 1. If it’s off, the value is 0. Each bit has an address in the grid. We can envision a grid that represents that memory like this:

bit0 | bit1 | bit2 | bit3 | bit4 | bit5 | bit6 | bit7 |

bit8 | bit9 | bit10 | bit11 | bit12 | bit13 | bit14 | bit15 |

– | – | – | – | – | – | – | – |

– | – | – | – | – | – | – | – |

– | – | – | – | – | – | – | – |

– | – | – | – | – | – | – | – |

– | – | – | – | – | – | – | – |

– | – | – | – | – | – | – | – |

All data that’s stored in computer memory is stored in these arrays of bits.

**So if a bit can be only 0 or 1, how do we get values greater than 1?**

When you count normally, you count in groups of ten. This is because you have ten fingers. So to represent two groups of ten, you write “20”, meaning “2 tens and 0 ones”. This counting system is called **base ten,** or **decimal notation**. Each digit place in base ten represents a power of ten: 100 is 10^{2}, 1000 is 10^{3}, etc.

Now, imagine you had only two fingers. You might count in groups of two. This is called **base two**, or **binary notation**. So two, for which you write “2” in base ten, would be “10” in base two, meaning one group of two and 0 ones. Each digit place in base two represents a power of two: 100 is 2^{2}, or 4 in base ten, 1000 is 2^{3}, or 8 in base ten, and so forth.

Any number you represent in decimal notation can be converted into binary notation by simply regrouping it in groups of two. Once you’ve got the number in binary form, you can store it in computer memory, letting each binary digit fill a bit of memory. So the number 238 in decimal notation would be 11101110 in binary notation. The bits in memory used to store 238 would look like this:

1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 |
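The regrouping described above can be sketched in C as repeated division by two, where each remainder becomes one binary digit. This is only a minimal illustration: the function name `toBinary8` and the fixed width of eight digits are assumptions made for this example, not part of any library.

```c
// Write the lowest 8 binary digits of value into out as text,
// most significant digit first. out must have room for 9 chars
// (8 digits plus the '\0' that ends the string).
void toBinary8(unsigned int value, char out[9]) {
  for (int place = 7; place >= 0; place--) {
    // the remainder after dividing by two is the current lowest digit
    out[place] = (value % 2) ? '1' : '0';
    // regroup what is left into groups of two for the next digit
    value = value / 2;
  }
  out[8] = '\0';
}
```

Calling `toBinary8(238, buffer)` with a nine-character `buffer` fills it with the digits `11101110`, matching the row of bits above.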

Programming languages organize computer memory by breaking the grid of bits up into smaller chunks and labeling them with names. Those names are called variables, and they refer to a location in the computer’s memory. When you ask for the value of a variable, you’re asking what the states of the switches in that location in memory are.

If you think of your program as a set of instructions, then variables are the words that you use to describe what those instructions act upon.

For example:

When the user has pushed the button two times...

For this you need a variable called buttonPushed, and you need to check when it’s equal to 2:

if (buttonPushed == 2)

When you want to store something in the computer’s memory, like the number of times a button’s been pushed, you give it a name and a **data type,** which states how much memory you intend to use. You usually give it an initial value as well. This is called **declaring the variable**, and it looks like this:

    int sensorValue = 234;
    byte buttonPushed = 15;
    long timeSinceStart = 10324;
    boolean isOpen = false;

Every variable has a **data type**. The data type of a variable determines how much of the computer’s memory the variable will occupy. Different programming languages have different data types. The examples above use data types from the C programming language that Arduino uses.

The first one, `int sensorValue`, is an integer data type. On 8-bit Arduino boards such as the Uno, ints take up 16 bits, so they can contain 2^{16} different values (on many other platforms, an int in C is 32 bits). Ints can only contain integers, but they can be positive or negative, so the variable sensorValue above could range from -32,768 to 32,767. That’s a range from -2^{15} to 2^{15} - 1, with one bit used to store the plus or minus sign.

The second, `buttonPushed`, is a `byte` data type. Bytes take up 8 bits, and can therefore store 2^{8} or 256 different values, from 0 to 255. Bytes in Arduino are **unsigned**, meaning that they can’t hold negative numbers.

The third, `timeSinceStart`, is a `long` integer type, for storing very large values. In this instance, it might be storing the number of milliseconds since your program started, which can get big very quickly. Long ints are signed, and can range from -2,147,483,648 to 2,147,483,647. That’s 2^{32} possible values.

The fourth, `isOpen`, is a boolean variable. Booleans can only be true or false, and ideally take up just one bit in memory (though most programming languages use a whole byte for convenience).
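To check these sizes on a computer other than an Arduino, here is a small sketch in standard C. It assumes the fixed-width types from `<stdint.h>` as stand-ins for the Arduino types, because a plain `int` on a desktop computer is usually 32 bits rather than 16. The helper `bitsIn` is a name made up for this example.

```c
#include <stdint.h>
#include <stdbool.h>

int16_t sensorValue = 234;       // 16 bits, like an int on an Arduino Uno
uint8_t buttonPushed = 15;       // 8 bits, like Arduino's byte
int32_t timeSinceStart = 10324;  // 32 bits, like a long on an Arduino Uno
bool isOpen = false;             // true or false

// Number of bits occupied by something sizeInBytes bytes long
// (assumes 8-bit bytes).
unsigned int bitsIn(unsigned int sizeInBytes) {
  return sizeInBytes * 8;
}
```

For example, `bitsIn(sizeof(sensorValue))` gives 16, and the constants `INT16_MIN` and `INT16_MAX` from `<stdint.h>` hold the limits -32,768 and 32,767 mentioned above.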

The variables above might look like this in the computer’s memory, if each cell were a bit of memory:

- sensorValue (blue): rows 1 and 2 (16 bits)
- buttonPushed (pink): row 3 (8 bits)
- timeSinceStart (orange): rows 4 through 7 (32 bits)
- isOpen (green): the first bit of row 8

0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 |

0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |

0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |

0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 |

0 | – | – | – | – | – | – | – |

When you declare a variable, the microcontroller picks the next available address and sets aside as many bits as are needed for the data type you declare. If you declare a byte, for example, it sets aside 8 bits. An integer gets 16 bits. A string gets one byte (eight bits) for every character of the string, and a byte to end the string.
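The extra byte at the end of a string can be seen with a short C sketch. The variable `greeting` and the helper `countChars` are names invented for this illustration; `countChars` does the same job as the standard `strlen()` function.

```c
#include <stddef.h>

// A five-character string: five bytes for the letters,
// plus one byte for the '\0' that ends the string.
char greeting[] = "hello";

// Count the characters that come before the terminating '\0'.
size_t countChars(const char *text) {
  size_t count = 0;
  while (text[count] != '\0') {
    count++;
  }
  return count;
}
```

Here `countChars(greeting)` is 5, while `sizeof(greeting)` is 6, because the memory set aside includes the byte that ends the string.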

The variable types above are all for whole numbers, or integers. But how do you store a number like 3.1415 or 2.7183 or other fractional numbers? These are called **floating-point numbers** in the programming world, and they’re stored in a special data type called a **float**. In Arduino, floats are 32-bit numbers, stored in 4 bytes of memory. Some of those bits are used to store the position of the decimal point.

How do you know what data type to choose when declaring variables? It depends on two factors: what you’re going to use the variables for, and what functions you plan to use on them. First, consider how large the numbers you might store are likely to be. For example, if you’re counting button pushes, you’re unlikely to get more than a few hundred in a few minutes, so an int or a byte might be fine. But a number like the count of milliseconds since some past event could get very large, so you might need a long int.

Different built-in functions of a programming language will require different data types as parameters, so when you can, use data types that match the functions you plan to use. For example, if you were using a variable to store the results of Arduino’s `millis()` function, you should use an unsigned long, because `millis()` returns that data type.
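To get a feel for why the size of the type matters when counting milliseconds, here is a rough sketch. The function name `secondsUntilFull` is made up for this example, and it simply assumes a counter that goes up by one every millisecond.

```c
#include <stdint.h>

// Whole seconds a millisecond counter can run before it reaches
// the largest value its data type can hold.
uint32_t secondsUntilFull(uint32_t largestValue) {
  return largestValue / 1000;
}
```

With the largest value of a 16-bit signed int (`INT16_MAX`), the counter is full after 32 seconds. With the largest value of a 32-bit unsigned type (`UINT32_MAX`), it lasts about 49 days.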

When you add, subtract, multiply, or divide with variables in a computer program, the results you get depend on the variable types you used. For example, if you ran the code below:

    int voltage = 5;
    int divider = 2;
    int newVoltage = voltage / divider;

You might think that newVoltage = 2.5, right? Wrong. Because you used ints, the fractional part is gone, so the result would be 2. Here’s another:
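If you do want the fractional part, one common approach is to convert one of the values to a float before dividing. The sketch below compares the two; the function names are made up for this example.

```c
// Integer division: the fractional part is discarded.
int divideAsInts(int voltage, int divider) {
  return voltage / divider;
}

// Converting one operand to float first makes the division
// happen in floating point, so the fraction is kept.
float divideAsFloats(int voltage, int divider) {
  return (float)voltage / divider;
}
```

With 5 and 2 as inputs, `divideAsInts` returns 2 and `divideAsFloats` returns 2.5.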

    byte buttonPushes = 254;
    buttonPushes = buttonPushes + 4;

After this, you’d expect that buttonPushes = 258, right? Wrong again! Because you used a byte, you can’t store a value larger than 255, so when the result is larger than that, the number rolls over to the lowest possible value again. The result would be 2.

Wait, what?

Look at it this way. The highest value you can store in a byte is 255. Therefore, if you try to store 256, it rolls over to 0. 257 rolls over to 1. And 258 rolls over to 2. So 254 + 4 in a byte variable yields 2. If you used an int instead of a byte, then you’d get the result you expect (258) because an int can hold values larger than 255.
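The same rollover can be reproduced in standard C using `uint8_t` from `<stdint.h>`, which is assumed here as the equivalent of Arduino’s `byte`. The function name `addToByte` is invented for this sketch.

```c
#include <stdint.h>

// Add to an 8-bit unsigned value. The sum is worked out as an int,
// then cut back down to 8 bits when it is converted to uint8_t,
// so anything above 255 wraps around (the sum modulo 256).
uint8_t addToByte(uint8_t buttonPushes, uint8_t amount) {
  return (uint8_t)(buttonPushes + amount);
}
```

Here `addToByte(254, 4)` returns 2, and `addToByte(255, 1)` returns 0.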

There are three notation systems used most commonly in programming languages to represent numbers: **binary** (base two), **decimal** (base ten), and **hexadecimal** (base sixteen). In hexadecimal notation, the letters A through F represent the decimal numbers 10 through 15. Furthermore, there is a system of notation called ASCII (American Standard Code for Information Interchange), which represents most alphanumeric characters from the Roman alphabet as number values. More on ASCII can be found in the pages on serial communication. For more, see this online table representing the decimal numbers 0 to 255 in decimal, binary, hexadecimal, and ASCII. While you can work mostly in decimal notation, there are times when it’s more convenient to represent numbers in bases other than base 10.

Here’s a chart showing a few number values in the different bases, and the different notation forms:

Decimal value | Hexadecimal | Binary |

3 | `0x03` | `0b11` |

12 | `0x0C` | `0b1100` |

45 | `0x2D` | `0b101101` |

234 | `0xEA` | `0b11101010` |

1000 | `0x3E8` | `0b1111101000` |

Because the values are all bits in the computer’s memory, you can use all of these notation systems interchangeably. Here are a few examples:

    if (colorValue == 0xFF) {
      // the color value is 255
    }

    // add 5 to 0x90. Result will be 0x95:
    int channelNumber = 5;
    int midiCommand = 0x90 + channelNumber;
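One way to convince yourself that the notations describe the same values is to read the same number from text written in different bases. This sketch uses the standard C function `strtol()`; the wrapper name `parseInBase` is made up for this example.

```c
#include <stdlib.h>

// Read a string of digits written in the given base
// (2, 10, or 16 here) and return it as a plain number.
long parseInBase(const char *digits, int base) {
  return strtol(digits, NULL, base);
}
```

`parseInBase("11101010", 2)`, `parseInBase("EA", 16)`, and `parseInBase("234", 10)` all return the same value, 234.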

Variables are **local** to a particular function in your code if they are declared in that function. Local variables can’t be used by functions outside the one that declares them, and the memory space allotted to them is released when the function ends. Variables are **global** when they are declared at the beginning of a program, outside all functions. Global variables are accessible to all functions in the program, and their value is maintained for the duration of the program. Usually you use global variables for values that will need to be kept in memory for future use by other functions, and local variables when you know the value won’t be used outside that function. In general, it’s better to default to local variables when you can, to manage memory more efficiently. Here’s a typical example:

    int oldButtonPush = 0;    // global variable

    void setup() {
      Serial.begin(9600);
    }

    void loop() {
      int buttonPush = digitalRead(3);    // local variable
      if (buttonPush != oldButtonPush) {
        // the button changed. Do something here
        // then store the current button push state
        // in the global variable for the next time
        // through the loop:
        oldButtonPush = buttonPush;
      }
    }

In this example, the variable `buttonPush` is local to the loop function. You couldn’t read it in the setup, or any other function. The variable `oldButtonPush`, on the other hand, is global, and can be read by any function. In the example above, the local variable is used to read the latest state of a digital input, and then later, the value is put into the global variable so that you can get a new reading and compare it to the old one.
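The same pattern can be tried without a board. In this sketch, the function `checkButton` stands in for one pass through the loop, and its `reading` parameter plays the role of the value from `digitalRead(3)`. Both names, and the `changeCount` counter, are assumptions added for illustration.

```c
int oldButtonPush = 0;  // global: keeps its value between calls
int changeCount = 0;    // global: how many changes have been seen

// Stand-in for one pass through loop().
void checkButton(int reading) {
  int buttonPush = reading;  // local: exists only during this call
  if (buttonPush != oldButtonPush) {
    changeCount++;
    // store the current state in the global variable
    // so the next call can compare against it
    oldButtonPush = buttonPush;
  }
}
```

Calling `checkButton(1)`, `checkButton(1)`, and then `checkButton(0)` leaves `changeCount` at 2, because only the first and third calls see a reading that differs from the stored one.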

In addition to variables, most programming languages also include **constants**, which are simply named values that don’t change while the program runs. They’re a useful way to label numbers that get used repeatedly within your program, and to change them in one place. For example, imagine you’re writing a program that runs a servo motor. Servo motors have a minimum and maximum pulse width that doesn’t change, although each servo’s minimum and maximum might be somewhat different. Rather than change every occurrence of the minimum and maximum numbers in the program, we make them constants, so we only have to change the number in one place.

You don’t have to use constants in your programs, but they’re handy to know about, and you will encounter them in other people’s programs.

In C and therefore in Arduino, there are two ways you can declare constants. You can use the const keyword, like so:

    const int LEDpin = 3;
    const int sensorMax = 253;

Or you can use **define**:

    #define LEDPin 3
    #define sensorMax 253

Defines are always preceded by a #, and don’t have a semicolon at the end of the line. Defines usually come at the beginning of the program. They actually work a bit like aliases. What happens is that you define a number as a name, and before compiling, the preprocessor finds all occurrences of that name in the program and replaces them with the number. This way, defines don’t take up any memory, but you get all the convenience of a named constant. However, the Arduino core libraries already contain many defines, and a define of your own with the same name can conflict with them, so it’s preferable to use const instead of #define for constants.
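Here is a small sketch showing both kinds of constant side by side. The names `SENSOR_MAX_DEFINE`, `sensorMaxConst`, and `percentOfMax` are invented for this example, and the calculation is just one plausible use of a sensor maximum.

```c
#define SENSOR_MAX_DEFINE 253    // replaced by the text 253 before compiling
const int sensorMaxConst = 253;  // a typed, read-only variable

// Scale a sensor reading to a whole-number percentage of the maximum.
int percentOfMax(int reading) {
  return (reading * 100) / sensorMaxConst;
}
```

If the sensor’s maximum turns out to be different, only the one line that sets `sensorMaxConst` needs to change. A reading of 253 gives 100, and a reading of 126 gives 49, since the division is done with ints.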

For more on variables in Arduino, see the variable reference page.