Friday 26 October 2007

Bits and Bytes

A very basic tutorial with an introduction to steganography.

First, you should know that the information here is kept very basic and is not always precise. If you find any horrible mistakes, please leave a comment. I tried to use simple vocabulary because this post is aimed at beginners.

I. A computer is like a big calculator.
The only thing it can do is calculate with numbers, which are stored in binary format.
The binary format (numbers represented in base 2) is used because the two possible binary digits (0 and 1) are easy to represent electrically (voltage on or voltage off). Everything your computer does is just computing lots of these binary numbers and sending them to, or receiving them from, other components. Your monitor, for example, shows colors that represent certain numbers, and your keyboard sends numbers to your computer when you press a key.

A computer can only process a fixed number of bits at once, which depends on the processor (usually 32 or 64 bits today).
Also, it addresses its memory in steps of one byte (it is byte-addressable).
A group of 8 bits is called a byte and can hold 256 (2^8) different values (from 0 to 255).
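
If you want to check the numbers yourself, here is a small sketch in Python (my choice of language for the examples, the names are just made up for illustration). It counts how many values fit into 8 bits and prints the smallest and largest one in binary.

```python
BITS_PER_BYTE = 8

# Each extra bit doubles the number of possible values: 2 * 2 * ... (8 times).
values = 2 ** BITS_PER_BYTE
largest = values - 1  # counting starts at 0, so the largest value is one less

print(values)                   # 256
print(largest)                  # 255
print(format(0, "08b"))         # 00000000 (all bits off)
print(format(largest, "08b"))   # 11111111 (all bits on)
```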

In the early days of computers there was a need for a code table that maps numbers to characters.
This code table is called the ASCII table.
Standard ASCII defines 128 characters using 7 bits, but each character is usually stored in one byte (8 bits).
An example: in the ASCII table, the number 65 represents an 'A'.
65 in binary is 01000001. (Try the Windows calculator in scientific view; among other things, it can convert numbers between binary and decimal.)
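
The same conversion can be tried without a calculator. This is just a minimal sketch using Python's built-in functions ord, chr, format and int; the variable names are only for illustration.

```python
code = ord("A")             # look up the number of the character 'A'
bits = format(code, "08b")  # write that number as 8 binary digits
back = int(bits, 2)         # read the binary digits as a number again
letter = chr(back)          # look up the character for that number

print(code, bits, back, letter)  # 65 01000001 65 A
```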

- - - - - - - - - -

II. As you might know, steganography is the art of hiding a message.
For example, you can write with invisible ink between the lines of an ordinary letter.

In the age of computers there are lots of ways to hide data inside other data,
but let's focus on the very basics: bits and bytes.

On a computer you can hide letters by hiding their bits.
I'll "hide" the string ABC in some garbage data by applying this rule:
Alternately take one bit of the message and one bit of the garbage,
like this (M = Message; G = Garbage):
MGMGMGMGMGMG....

ABC: 01000001 01000010 01000011
Garbage: 11110000 11110000 11110000
Mixed up: 011101010000001001110101000010000111010100001010
MGMGMGMGMGMGMGMG.....
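
For those who want to try it, the mixing can be sketched in a few lines of Python. The helper names to_bits and interleave are made up for this example, and it assumes the message contains only ASCII characters and that the garbage is exactly as long as the message.

```python
def to_bits(text):
    """Turn each character into its 8-bit ASCII code and join them."""
    return "".join(format(ord(c), "08b") for c in text)

def interleave(message_bits, garbage_bits):
    """Alternate bits: message, garbage, message, garbage, ..."""
    if len(message_bits) != len(garbage_bits):
        raise ValueError("message and garbage must have the same length")
    return "".join(m + g for m, g in zip(message_bits, garbage_bits))

message = to_bits("ABC")    # 010000010100001001000011
garbage = "11110000" * 3    # the same garbage bytes as in the example
mixed = interleave(message, garbage)

print(mixed)  # 011101010000001001110101000010000111010100001010
```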

You can easily decode this by taking only every second bit, starting with the first, and grouping the result into bytes again.
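
As a rough sketch of the decoding in Python (variable names are only for illustration): slicing with a step of 2 picks every second bit, and the message bits are then cut into groups of 8 and looked up as ASCII characters.

```python
mixed = "011101010000001001110101000010000111010100001010"

message_bits = mixed[0::2]  # bits at positions 0, 2, 4, ... (the message)
garbage_bits = mixed[1::2]  # bits at positions 1, 3, 5, ... (the garbage)

# Cut the message bits into bytes and convert each byte to a character.
chars = [
    chr(int(message_bits[i:i + 8], 2))
    for i in range(0, len(message_bits), 8)
]
decoded = "".join(chars)

print(decoded)  # ABC
```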

You could also use your own code table to represent letters, like A = 1, B = 2 and so on, which needs fewer bits than ASCII (5 bits are enough for 26 letters).
It's also possible to simply leave out the highest bit (which is always 0 in standard ASCII), or even leave out the leading '01', because every ASCII letter starts with a binary '01'.
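
Both ideas can be sketched in Python. The function names are made up for this example, and the custom table assumes the message contains only the uppercase letters A to Z.

```python
def encode_5bit(text):
    """Custom code table: A = 1, B = 2, ..., Z = 26, 5 bits per letter."""
    return "".join(format(ord(c) - ord("A") + 1, "05b") for c in text)

def decode_5bit(bits):
    """Reverse of encode_5bit: read 5 bits at a time and look up the letter."""
    return "".join(
        chr(int(bits[i:i + 5], 2) + ord("A") - 1)
        for i in range(0, len(bits), 5)
    )

def drop_leading_01(text):
    """Keep only the last 6 bits of each letter's 8-bit ASCII code."""
    return "".join(format(ord(c), "08b")[2:] for c in text)

print(encode_5bit("ABC"))      # 000010001000011 (15 bits instead of 24)
print(decode_5bit(encode_5bit("ABC")))  # ABC
print(drop_leading_01("ABC"))  # 000001000010000011 (18 bits instead of 24)
```

To decode the second variant, you would put the '01' back in front of every group of 6 bits before looking the byte up in the ASCII table.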
