Big Endian vs. Little Endian: Key Comparisons

In endianness, big endian stores the most significant byte first, while little endian stores the least significant byte first.

December 6, 2023

  • Big-endian is defined as a byte order wherein the most significant byte (or ‘big end’) of a multibyte data value is stored at the lowest memory address.
  • Little-endian is defined as a byte order wherein the least significant byte (or ‘little end’) of a multibyte data value is stored at the lowest memory address.
  • This article covers the definition of endianness and the key comparisons between big-endian and little-endian.

What Is Endianness?

Big-endian is a byte order wherein the most significant byte (or ‘big end’) of a multibyte data value is stored at the lowest memory address. On the other hand, little endian is a byte order wherein the least significant byte (or ‘little end’) of a multibyte data value is stored at the lowest memory address.

But what exactly does endianness mean? Let’s find out.

Origin of the term endianness

Israeli-American computer scientist Danny Cohen was the first to use the terms ‘big endian’ and ‘little endian’ in the context of computer science. His paper ‘On Holy Wars and a Plea for Peace,’ published in 1980, adopted ‘endianness’ as a term for data ordering from the writings of Jonathan Swift, the author of Gulliver’s Travels.

In Swift’s 1726 novel, one plotline describes a conflict among the Lilliputians based on how they shelled their boiled eggs. One sect cracked their eggs from the ‘big end’ while the other did so from the ‘little end.’ In the appendix of his 1980 note, Cohen explicitly credited this story as the inspiration for the terminology.

With that fun fact out of the way, let’s dive into the technicalities.

Endianness definition

Computers only truly ‘understand’ machine language, also known as binary, which consists of only 0s and 1s. A ‘bit’ is a single 0 or 1, and eight bits together make up a ‘byte.’ These simple concepts are built upon to create complex computing systems.

Now, we finally have enough context to start defining endianness. Some data, such as a single ASCII character (an English vowel like ‘a,’ for instance), can be represented using a single byte. However, most data requires multiple bytes for representation. Endianness is the concept that defines the order in which computers store and read those bytes.

Just like humans read different languages in different orders (think English from left to right and Arabic from right to left), different computers read bytes differently, and this phenomenon is defined as endianness. Now, if you know only one language and so does everyone around you, left to right or right to left is never a problem. This is the case with computers, too. A computer doesn’t have issues if it never shares the data stored in its memory with another computer because every computer ensures internal consistency for its own data.

However, computers not connected to the internet are a technological rarity nowadays, and devices share more data with each other than ever before. Naturally, this data needs to be read in the same order across devices because if some computers read bytes from left to right and others do it from right to left, cross-platform communication issues will arise.

Endianness ensures that a computer reads the bytes stored in memory in a specific, known order. Two byte orders define endianness: big endian and little endian. In big-endian, the ‘big end’ of the value is stored first: when a multibyte value is written to memory, the byte at the lowest memory address is the most significant one. Conversely, little endian stores the little end of the value first: the byte at the lowest memory address is the least significant one.

Endianness example

To understand endianness better, let’s take a number that requires multiple bytes and examine the big-endian and little-endian ways of representing it.

Big Endian and Little Endian Binary Arranged in Reading Order

Source: freeCodeCamp

The decimal number 9,499,938 (0x90F522 in hexadecimal) requires three bytes to be represented in binary format.

In the image above, the 0b at the beginning is a notation signifying that the number that follows is binary. Without the notation, some readers might not realize the difference between the decimal number 1,100 (one thousand one hundred) and binary 1100.

Also worth noting is that bit ordering does not fluctuate; the order of the 1s and 0s within a byte stays constant regardless of endianness. It is the order of the bytes that changes. This also means that if only a single byte is transmitted, no issues arise, as there is only one possible ordering for one byte.
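On a real machine, you can observe this byte order directly by viewing a multibyte integer through a char pointer, a technique this article returns to later. Below is a minimal C sketch; it assumes a 32-bit unsigned int, which adds a leading 0x00 byte ahead of the three-byte value 0x90F522 (9,499,938):

#include <stdio.h>

int main(void) {
    unsigned int n = 9499938;                   /* 0x0090F522 in a 32-bit int */
    unsigned char *bytes = (unsigned char *)&n; /* view the int as raw bytes */

    /* print the bytes in increasing memory-address order */
    for (size_t i = 0; i < sizeof n; i++)
        printf("offset %zu: 0x%02X\n", i, (unsigned)bytes[i]);

    /* a big-endian machine prints 00 90 F5 22; */
    /* a little-endian machine prints 22 F5 90 00 */
    return 0;
}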

Endianness in the real world

The endianness of a computer is generally an arbitrary decision made at the design level by the semiconductor vendor. However, this decision has a long-term effect on product lines. For instance, many modern computers are little-endian, while most mainframe computers are big-endian.

As vendors upgrade their technologies, they almost always maintain the existing endianness for backward compatibility. For instance, the designers of the Intel 8086, the original member of the x86 family, chose little-endian byte order in the 1970s, and the x86 line still uses it today.


What Is Big Endian?

Big-endian is a byte-ordering system used in computer architecture to determine how multibyte data values, such as floating-point numbers or integers, are stored in memory. In big-endian, the most significant byte carrying the highest order bits is stored at the lowest memory address. The less significant bytes go to higher memory addresses for storage.

Big-endian is associated with older computer architectures and with network protocols; it remains relevant in computing contexts such as compatibility and cross-platform data interchange.


What Is Little Endian?

Like big endian, little endian is a byte-ordering system used in computer architecture to define how multibyte data values are arranged in memory. Here, the least significant byte that carries the lowest order bits goes to the lowest memory address for storage. The more significant bytes are stored at higher memory addresses.

Little-endian enjoys greater prominence in modern computing, including x86 and x86-64 architectures, as well as in software applications. It plays a critical role in data interchange, portability, and compatibility, especially in scenarios involving cross-platform development.
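A common way to check which convention a given machine follows is to store a known multibyte value and inspect the byte at the lowest memory address. A minimal C sketch, assuming a two-byte unsigned short:

#include <stdio.h>

int main(void) {
    unsigned short probe = 0x0001;
    unsigned char first = *(unsigned char *)&probe; /* byte at the lowest address */

    if (first == 0x01)
        printf("little-endian: least significant byte stored first\n");
    else
        printf("big-endian: most significant byte stored first\n");
    return 0;
}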


Big Endian vs. Little Endian: Key Differences

The contrast between big-endian and little-endian is subtle. The two byte-ordering systems are opposites, yet both are expressions of the same underlying concept of endianness.

Abstract concepts vs. raw data

To better understand the key comparisons between big-endian and little-endian, let’s start with the fundamental aspect, which is the difference between a real number and its representation in the form of raw data.

A number is abstract. It can be used for various purposes, such as maintaining a count of something. Think of a ‘dozen,’ signified by the number 12. Whether it is a dozen eggs or a dozen people, the idea of ‘12’ doesn’t change. Even across languages (for instance, ‘doce’ in Spanish) and other representation schemas (‘C’ in hexadecimal code), the concept stays the same.

In contrast, data is a physical concept. For instance, human writing on a piece of paper or a raw sequence of bytes stored in computer memory would be data. Unlike numbers, data has no intrinsic meaning and must be interpreted by the reader. If the reader is a human, this is rarely a problem as we can infer meaning from context. But if the reader is a computer, it might face a problem. After all, computers store raw data without understanding abstract concepts.

All information in a computer is ultimately broken down into 0s and 1s and stored in memory. When it’s time to read back the 0s and 1s, the computer does so and attempts to recreate an abstract concept using the raw data. Here, the assumptions made are vital; without them, the 0s and 1s would mean nothing.

This would not be a problem if a rule existed stating that all computers, platforms, and programs must use the same language. However, since that is not the case, different types of computers will interpret data differently.

When numbers turn into data

While there is a lot of scope for confusion among computers when it comes to data interpretation, most systems have a common starting point by agreeing on some data formats.

  • A bit can hold only one of two values: 1 (on) or 0 (off).
  • A byte consists of eight bits in a sequence.
  • Within a byte, bits are numbered from right to left, from Bit 0 to Bit 7 (totaling eight).
  • Bit 0 is the least significant and rightmost bit, while Bit 7 is the most significant and leftmost.

These basic format agreements serve as building blocks for data exchange. If computers read and stored data only one byte at a time, most of the confusion would disappear (and computers would become much less useful than they are today; this is just a hypothetical example).

A byte is a byte no matter what machine you’re on, and the ordering of bits within a byte stays constant across machines. Problems arise because not all data is ‘single-byte data’ like ASCII text. Much of our data today is stored using multiple bytes, like floating-point numbers or integers. This is where everything breaks down because no universal agreement on storing these sequences exists.

Let’s look at an example

Byte value

Consider a sequence of 4 bytes — E, F, G, and H — with each byte being made up of 8 bits and having a specific hex value.

Byte Name    E      F      G      H
Location     0      1      2      3
Value (hex)  0x12   0x34   0x56   0x78

In this example, ‘E’ is an entire byte with a hex value 0x12 and a binary value of 00010010. If ‘E’ is interpreted numerically, it would be the number ‘18’. However, no rule states that this value must be interpreted numerically; it can be interpreted as an ASCII character or even something entirely different.
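As a quick illustration of how interpretation changes the meaning of the same raw byte, here is a small C sketch. It uses the value 0x41 rather than 0x12, since 0x12 maps to a non-printable ASCII control character:

#include <stdio.h>

int main(void) {
    unsigned char b = 0x41;           /* one raw byte */
    printf("as a number: %d\n", b);   /* prints 65 */
    printf("as ASCII text: %c\n", b); /* prints A */
    return 0;
}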

Pointers

In programming languages such as C, a pointer is a number that references a memory location. Programmers use pointers to interpret the data stored at a particular location. For instance, when a C programmer casts a pointer to a certain type (such as a char * or int *), it tells the computer how to interpret the data at that location.

Look at the declaration below.

void *p = 0; // p is a pointer to an unknown data type
             // p is a NULL pointer — do not dereference
char *c;     // c is a pointer to a char, usually a single byte

Here, data cannot be extracted from ‘p’ as its type is unknown. The ‘p’ might be pointing at a letter, a number, an image, or something different. There is no way to know how many bytes to read or how to interpret them.

Now, check this out.

c = (char *)p;

This statement directs ‘c’ to point to the same location as ‘p’ and to interpret the data there as a single character. Here, ‘c’ would point to memory location 0, which holds byte ‘E.’ If the value at ‘c’ (that is, *c) were printed, it would be ‘E,’ or hex 0x12, as ‘E’ is a whole byte.

This example is platform-agnostic because, as mentioned earlier, all computers agree on defining a single byte. If a pointer is set to a single byte (here, ‘char *’), one byte at a time is read while walking through memory. The endianness of the computer would not matter here as any memory location can be examined, and every (modern) computer would return the same information.

Now, let’s understand the problem

A single byte can only represent 256 values, from 0 to 255. Therefore, most data types today span multiple bytes, such as floating-point numbers and long integers. The problem arises when a computer attempts to read more than one byte at a time: when multibyte data is read, where should the most significant byte appear?

In a big-endian machine, the big end of the data is stored first: with multiple bytes, the most significant byte occupies the lowest address. On the other hand, little-endian machines store data little-end first: the byte at the lowest address is the least significant one.

Now, the four bytes (E F G H) are stored the same way whether the machine is big endian or little endian, i.e., ‘E’ is in memory location 0, ‘F’ is in memory location 1, and so on. This arrangement can be created because single bytes are machine-agnostic: walking through memory one byte at a time and setting the needed values works on any machine:

c = (char *)0; // point to location 0 (this will not work on a real machine)
*c = 0x12;     // set E’s value
c = (char *)1; // point to location 1
*c = 0x34;     // set F’s value
…              // repeat for G and H

Thus, bytes E, F, G, and H have been set up in locations 0, 1, 2, and 3.

Data interpretation

Finally, we’ve set enough context to look at an example with multibyte data. Here, we use ‘short int,’ a two-byte (16-bit) number ranging from 0 to 65535 if unsigned.

short *s;       // pointer to a short int (2 bytes)
s = (short *)0; // point to location 0; *s is the value

Here, ‘s’ points to a short and looks at byte location 0, containing ‘E.’

What happens when the value at ‘s’ is read?

In a big-endian machine,

  • A short is two bytes, and when read off by the machine, it is interpreted as ‘location s is address 0 (E, or 0x12)’ and ‘location s + 1 is address 1 (F, or 0x34)’.
  • Since the first byte is the most significant, the number must be 256 * byte 0 + byte 1, i.e., 256*E + F, or 0x1234. The first byte is multiplied by 256 (2^8) because it is shifted over by 8 bits.

In a little-endian machine,

  • A short is two bytes, which will be read off like in the big-endian machine: ‘location s is 0x12’ and ‘location s + 1 is 0x34’.
  • Here, the first byte is the least significant, so the value of the short is byte 0 + 256 * byte 1, i.e., E + 256*F, which is 0x3412.

The problem does not lie in interpreting the short or byte locations. Location 0 and location 1 mean the same on both machines, and a short is two bytes in both cases. Both machines start from location ‘s’ and read the memory while going upward.

And yet, the big-endian machine interprets ‘s’ as 0x1234, and the little-endian machine interprets the same as 0x3412. The same data results in two completely different numbers.
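Here is a runnable version of this experiment, sketched with a byte array standing in for raw memory locations (since pointing at address 0 does not work on a real machine) and memcpy used to reinterpret the first two bytes as a short:

#include <stdio.h>
#include <string.h>

int main(void) {
    /* bytes E, F, G, H set up at consecutive locations, as above */
    unsigned char mem[4] = {0x12, 0x34, 0x56, 0x78};

    /* read the first two bytes back as one short */
    unsigned short s;
    memcpy(&s, mem, sizeof s);

    /* a big-endian machine prints 0x1234; a little-endian machine prints 0x3412 */
    printf("0x%04X\n", (unsigned)s);
    return 0;
}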

UNIX vs. NUXI

The NUXI Problem is a good example of the issues faced with byte order. In a nutshell, it describes a phenomenon wherein UNIX, when stored on a big-endian machine, might show up as NUXI when transferred to a little-endian machine.

Imagine four bytes (U, N, I, and X) stored as two shorts: UN and IX. Just like E, F, G, and H in the example above, each letter here is an entire byte. To store the two shorts, the following would be written:

short *s;       // pointer used to set shorts
s = (short *)0; // point to location 0
*s = UN;        // store first short: U * 256 + N (imaginary code)
s = (short *)2; // point to the next location (byte location 2)
*s = IX;        // store second short: I * 256 + X

Regardless of the platform, if ‘UN’ is stored on a machine, and the machine is then asked to read it back, it would be ‘UN.’ However, if ‘char *’ is used to move through the memory one byte at a time, the order could vary.

On a big-endian machine, the order would be:

Byte       U   N   I   X
Location   0   1   2   3

Here, U is the more significant byte (the ‘big end’) of ‘UN’ and is therefore stored first. Similarly, in ‘IX,’ I is the more significant byte and is stored first.

However, the byte order would be different on a little-endian machine:

Byte       N   U   X   I
Location   0   1   2   3

Here, ‘N’ is the less significant byte (the ‘little end’) of ‘UN’ and is stored first. The bytes may be stored backward in memory, so to speak, but the little-endian machine interprets them accurately when reading them back (as machines are internally consistent).

The NUXI problem arises if data is exchanged between a big-endian and a little-endian machine.
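The same experiment can be sketched in runnable C, again with an array standing in for raw memory addresses:

#include <stdio.h>
#include <string.h>

int main(void) {
    /* store "UNIX" as two shorts: 'U' * 256 + 'N' and 'I' * 256 + 'X' */
    unsigned short pair[2] = { 'U' * 256 + 'N', 'I' * 256 + 'X' };

    /* walk the same memory one byte at a time, as with char * above */
    unsigned char bytes[4];
    memcpy(bytes, pair, sizeof bytes);
    printf("%c%c%c%c\n", bytes[0], bytes[1], bytes[2], bytes[3]);

    /* a big-endian machine prints UNIX; a little-endian machine prints NUXI */
    return 0;
}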

Data exchange

So, does all this mean data cannot be exchanged between a big-endian and a little-endian machine? Not at all; it simply means the byte-order differences must be addressed explicitly.

One solution is adopting a common format for network communication, often called ‘network order,’ which is big-endian. Conversion functions in the ‘hton’ (host-to-network) and ‘ntoh’ (network-to-host) families ensure data consistency across platforms, even on big-endian machines, where they simply change nothing.
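A minimal sketch of that round trip, using the POSIX functions htons (host-to-network short) and ntohs (network-to-host short):

#include <stdio.h>
#include <arpa/inet.h> /* htons and ntohs on POSIX systems */

int main(void) {
    unsigned short host_value = 0x1234;

    /* the sender converts to network (big-endian) order before transmitting */
    unsigned short wire_value = htons(host_value);

    /* the receiver converts back into its own host order */
    unsigned short received = ntohs(wire_value);

    printf("0x%04X -> 0x%04X\n", (unsigned)host_value, (unsigned)received);
    return 0;
}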

Alternatively, using a Byte Order Mark (BOM), such as 0xFEFF, as a prefix can signify the data’s byte order. If it matches a machine’s byte order, no translation is needed. Otherwise, conversion is required. However, the BOM method increases data overhead and may lead to issues if forgotten or misinterpreted.
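As a sketch of how a reader might act on the BOM, the following checks the first two bytes of a UTF-16 buffer (the helper bom_kind and the sample data are hypothetical; real decoders handle additional cases):

#include <stdio.h>

/* hypothetical helper: classify a UTF-16 byte-order mark */
static const char *bom_kind(const unsigned char *buf) {
    if (buf[0] == 0xFE && buf[1] == 0xFF) return "big-endian (FE FF)";
    if (buf[0] == 0xFF && buf[1] == 0xFE) return "little-endian (FF FE)";
    return "no BOM found";
}

int main(void) {
    unsigned char data[] = {0xFF, 0xFE, 'h', 0x00}; /* made-up little-endian payload */
    printf("%s\n", bom_kind(data));
    return 0;
}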

This approach is notably used in Unicode for multibyte character encodings. XML avoids these complexities by defaulting to UTF-8, a byte-oriented encoding for which endianness does not matter, simplifying cross-platform data exchange.

Managing byte order discrepancies is crucial in low-level networking to avoid data interpretation errors and ensure data portability.

Applications

Big-endian and little-endian are both widely used in digital electronics. The CPU typically determines the endianness in use.

For instance, many reduced instruction set computer (RISC)-based platforms, IBM 370 mainframes, and Motorola microprocessors are big-endian. Transmission Control Protocol/Internet Protocol (TCP/IP) also uses the big-endian approach, which is why big-endian is also known as ‘network order.’ Conversely, DEC Alphas, Intel processors, and many programs that run on them are little-endian.

Apart from big-endian and little-endian, mixed forms of endianness also exist. For instance, VAX floating point uses a mixed endian approach that is also known as middle endian. Bi-endian processors can operate in either big-endian mode or little-endian mode and even switch as needed.

Compilers for languages such as Java or Fortran must know the endianness of the platform for which they generate object code. Converters can change endianness as required.

Takeaway

Big-endian and little-endian are byte-order systems, and each has its advantages. For instance, big-endian systems store data in memory the way humans generally write numbers: most significant part first, from left to right, which makes low-level debugging easier. On the other hand, little-endian machines store the lowest byte first, so checking whether a number is odd or even requires reading only the byte at the lowest address, as sketched below.
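A sketch of that parity check, valid only on a little-endian machine (where the byte at the lowest address is the least significant one):

#include <stdio.h>

int main(void) {
    unsigned int n = 9499938;

    /* on a little-endian machine, the first byte is the least significant, */
    /* so its low bit alone decides odd versus even */
    unsigned char first = *(unsigned char *)&n;
    printf("%u is %s\n", n, (first & 1) ? "odd" : "even");
    return 0;
}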

Let’s end this article by addressing some burning questions readers might have: Why does endianness as a concept exist? Why does a single system not exist, and why do different computers have to be different?

The answer is simple: it is the same reason all humans don’t speak the same language, and some languages are written from left to right while others are written from right to left. Like human languages, byte-ordering conventions developed independently, and the need to interoperate arose only later!

Did this article help you understand endianness and the key comparisons between big-endian and little-endian? Let us know on Facebook, X, or LinkedIn! We’d love to hear from you.



Hossein Ashtari
Interested in cutting-edge tech from a young age, Hossein is passionate about staying up to date on the latest technologies in the market and writes about them regularly. He has worked with leaders in the cloud and IT domains, including Amazon—creating and analyzing content, and even helping set up and run tech content properties from scratch. When he’s not working, you’re likely to find him reading or gaming!