    _                _   _                 ____            _     _
   / \   _ __   ___ | |_| |__   ___ _ __  |  _ \ _ __ ___ (_) __| |	
  / _ \ | '_ \ / _ \| __| '_ \ / _ \ '__| | | | | '__/ _ \| |/ _` |
 / ___ \| | | | (_) | |_| | | |  __/ |    | |_| | | | (_) | | (_| |	
/_/   \_\_| |_|\___/ \__|_| |_|\___|_|    |____/|_|  \___/|_|\__,_|	
                                                                bbs
  XQTRs lair...

In the first issue we learned some basics about hex editors. Before we
move on to more practical matters, we must learn how computers store
numbers in files. It sounds dumb, but there is more than one way to do
it, and in some cases it gets confusing. Here is some basic background
from Wikipedia.

---------------------------------------------------------------------------

From Wikipedia, the free encyclopedia:

Endianness refers to the sequential order in which bytes are arranged
into larger numerical values when stored in memory or when transmitted
over digital links. Endianness is of interest in computer science because
two conflicting and incompatible formats are in common use: words may be
represented in big-endian or little-endian format, depending on whether
bits or bytes or other components are ordered from the big end (most
significant bit) or the little end (least significant bit).

In big-endian format, whenever addressing memory or sending/storing
words bytewise, the most significant byte—the byte containing the most
significant bit—is stored first (has the lowest address) or sent first,
then the following bytes are stored or sent in decreasing significance
order, with the least significant byte—the one containing the least
significant bit—stored last (having the highest address) or sent last.

Little-endian format reverses this order: the sequence
addresses/sends/stores the least significant byte first (lowest address)
and the most significant byte last (highest address). Most computer systems
prefer a single format for all their data; using the system's native format
is automatic. But when reading memory or receiving transmitted data from a
different computer system, it is often necessary to translate the data from
its native endianness format to the opposite format.

The order of bits within a byte or word can also have endianness (as
discussed later); however, a byte is typically handled as a single
numerical value or character symbol and so bit sequence order is obviated.

Both big and little forms of endianness are widely used in digital
electronics. The choice of endianness for a new design is often arbitrary,
but later technology revisions and updates perpetuate the existing
endianness and many other design attributes to maintain backward
compatibility. As examples, the IBM z/Architecture mainframes and the
Motorola 68000 series use big-endian while the Intel x86 processors use
little-endian. The designers of System/360, the ancestor of z/Architecture,
chose its endianness in the 1960s; the designers of the Motorola 68000 and
the Intel 8086, the first members of the 68000 and x86 families, chose
their endianness in the 1970s.

Big-endian is the most common format in data networking; fields in the
protocols of the Internet protocol suite, such as IPv4, IPv6, TCP, and UDP,
are transmitted in big-endian order. For this reason, big-endian byte order
is also referred to as network byte order. Little-endian storage is popular
for microprocessors, in part due to significant influence on microprocessor
designs by Intel Corporation. Mixed forms also exist; for instance, in VAX
floating point, the ordering of bytes in a 16-bit word differs from the
ordering of 16-bit words within a 32-bit word. Such cases are sometimes
referred to as mixed-endian or middle-endian. There are also some bi-endian
processors that operate in either little-endian or big-endian mode.

---------------------------------------------------------------------------
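
To make the excerpt concrete, here is a minimal Python sketch using the
standard struct module. The value 0x0A0B0C0D is just an example number
(the same one used in the diagram), and the variable names are made up
for illustration. It packs the same 32-bit number in both byte orders and
shows what happens when bytes are decoded with the wrong order.

```python
import struct

value = 0x0A0B0C0D  # example 32-bit number

# ">" = big-endian, "<" = little-endian, "!" = network byte order
big = struct.pack(">I", value)
little = struct.pack("<I", value)
network = struct.pack("!I", value)

print(big.hex())        # 0a0b0c0d
print(little.hex())     # 0d0c0b0a
print(network == big)   # True: network byte order is big-endian

# Decoding big-endian bytes as little-endian yields a different number
wrong = struct.unpack("<I", big)[0]
print(hex(wrong))       # 0xd0c0b0a
```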

        Big Endian            Little Endian
        
                         32bit int
                 0A0B0C0D         0A0B0C0D   
 mem             | | | |          | | | |           mem
      .----.     | | | |          | | | |    .----.
 a    | 0A | <---' | | |          | | | '--> | 0D | a
      '----'       | | |          | | |      '----'
      .----.       | | |          | | |      .----.
 a+1  | 0B | <-----' | |          | | '----> | 0C | a+1
      '----'         | |          | |        '----'
      .----.         | |          | |        .----.
 a+2  | 0C | <-------' |          | '------> | 0B | a+2
      '----'           |          |          '----'
      .----.           |          |          .----.
 a+3  | 0D | <---------'          '--------> | 0A | a+3
      '----'                                 '----'



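So... in big-endian the bytes of a number are stored inside the file in
the same order as you write it, but in little-endian the bytes are stored
in reverse order. Only the byte order flips; the two hex digits inside
each byte stay as they are. Always remember that.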
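
The diagram can be reproduced with a small Python sketch. The helper name
layout() is made up for this example, and the number is assumed to be an
unsigned 32-bit value. It prints which byte lands at which address (a,
a+1, ...) for each byte order, then reports the byte order of the machine
running the script.

```python
import sys

number = 0x0A0B0C0D  # the number from the diagram

def layout(value, byteorder):
    """Return the 4 bytes of `value` in the order they would be stored."""
    return value.to_bytes(4, byteorder)

for order in ("big", "little"):
    print(order)
    # offset 0 is address a, offset 1 is a+1, and so on
    for offset, byte in enumerate(layout(number, order)):
        print(f"  a+{offset}: {byte:02X}")

print("this machine is", sys.byteorder, "endian")
```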

If we search for a number with a hex editor and that number is stored in
little-endian format, we have to search for its bytes in reverse order.
As in the example above, if we want to find the number 0A0B0C0D in a file
that stores it in little-endian, we enter 0D0C0B0A as the search string in
the hex editor. If we don't, we probably won't find a match.
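
The same search can be sketched in Python. Both find_number() and the
sample bytes are invented for this example; a real file would be read
with open(path, "rb").read() instead. The function turns the number into
its stored byte sequence first and then looks for that sequence.

```python
def find_number(data, value, size=4, byteorder="little"):
    """Return every offset in `data` where `value` is stored in `byteorder`."""
    pattern = value.to_bytes(size, byteorder)
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets

# Made-up buffer standing in for a file's contents
sample = bytes.fromhex("00 11 0D 0C 0B 0A 22 33 0A 0B 0C 0D 44")

print(find_number(sample, 0x0A0B0C0D, byteorder="little"))  # [2]
print(find_number(sample, 0x0A0B0C0D, byteorder="big"))     # [8]
```

If you don't know which byte order a file uses, trying both is a cheap
way to find out.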

It's very important, and we need it for the next lesson in hex editing files ;)