Steganography with Python

Learn how to play with Crypto to make something useful

Have you ever wondered how to hide your data away from prying eyes? Of course, there are many sophisticated techniques out there to accomplish the task, but there are some simple yet effective techniques too. In this write-up, I will introduce you to one such way of hiding data: Steganography.

What is Steganography?

Steganography is a technique used to hide data in “plain sight”. Yes, that’s right. Now, you might wonder, how can the data be hidden if it is present in plain sight? Well, to do that, we use another file as a carrier of the hidden data. This carrier can be anything: a photo, audio, video, or any other file. In this tutorial, I will show you a simple steganography technique to hide plaintext messages in image files.

Basic Info

We can express any ASCII string as a binary string. Standard ASCII values lie in the range [0, 127], and extended 8-bit character sets go up to 255, so 8 bits are always enough to convert such a character to its binary representation. We then concatenate the binary representations of all the characters to form the final binary string.
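
To make the conversion concrete, here is a minimal sketch for the two-character string 'Hi'. The helper name char_to_bits is only for illustration and is not part of the tool built later in this post.

```python
# Illustrative helper (hypothetical name): one character -> 8 bits.
def char_to_bits(char):
    return format(ord(char), '08b')

# 'H' has the value 72 and 'i' has the value 105.
bits = ''.join(char_to_bits(c) for c in 'Hi')
print(bits)  # 0100100001101001
```

Each character contributes exactly 8 bits, so a message of n characters becomes a binary string of 8n bits.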

Strategy

First, we read the image as an array. After flattening it to a 1-D array, we can use each element of this array to store one bit as follows:

  1. If the bit to be stored is one, we ensure that the element is odd:
    • If the element is already odd, no change is made.
    • If the element is even, we reduce its value by 1 to make it odd. The one exception is 0, which we increase to 1 so that the value remains in the range [0, 255].
  2. If the bit to be stored is 0, we ensure that the element is even:
    • If the element is already even, no change is made.
    • If the element is odd, we reduce its value by 1 to make it even.

Clearly, in this way, we can store any binary string. Note that increasing or decreasing a color intensity value by 1 doesn’t affect the picture much. In fact, it is virtually impossible to tell the doctored image from the original just by looking at it.
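
The parity rule can be sketched on a single value. This is a plain-Python illustration, assuming values in [0, 255]. The function name set_parity is hypothetical.

```python
def set_parity(value, bit):
    """Return value (0..255) adjusted so that value % 2 equals bit."""
    if value % 2 == bit:
        return value          # parity already matches, nothing to do
    if value == 0:
        return 1              # only reachable when bit == 1
    return value - 1          # flips the parity and stays in range

for value, bit in [(200, 1), (201, 1), (0, 1), (201, 0), (255, 0)]:
    print(value, bit, '->', set_parity(value, bit))
# 200 1 -> 199
# 201 1 -> 201
# 0 1 -> 1
# 201 0 -> 200
# 255 0 -> 254
```

In every case the value moves by at most 1, which is why the change is invisible to the eye.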

JPEG images use lossy compression. This means that after saving the image, the color intensity values might be slightly different when we read the image the next time. But we need the exact values for our strategy to work. Therefore, we save the final image in the PNG format, which is lossless and preserves the exact color intensity values.
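
To see why exact values matter, here is a toy comparison. The lossless side uses zlib, which implements DEFLATE, the compression method PNG is built on. The lossy side is NOT real JPEG: it is a made-up rounding step that only stands in for the idea that values can shift slightly. The pixel values are hypothetical.

```python
import zlib

# Hypothetical pixel values whose parities encode the bits 1, 0, 1, 1.
pixels = bytes([201, 54, 17, 255])

def parities(values):
    return [v % 2 for v in values]

# Lossless round trip: every value comes back unchanged.
lossless = zlib.decompress(zlib.compress(pixels))

# Toy lossy step (an assumption, not JPEG): snap values to multiples of 4.
lossy = bytes(v // 4 * 4 for v in pixels)

print(parities(pixels))    # [1, 0, 1, 1]
print(parities(lossless))  # [1, 0, 1, 1]
print(parities(lossy))     # [0, 0, 0, 0]
```

Even a small shift in the values wipes out the hidden bits, so a lossless output format is required.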

Now that you understand the strategy, let’s get our hands dirty.

Show me the Code!

First of all, we make the necessary imports. We use cv2 (OpenCV) to read and save image files, and sys to accept command-line arguments from the user.

import cv2
import sys

After that, we write a helper function that converts an ASCII string to a binary string.

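def str_to_binary(message):
  #concatenating the 8-bit binary form of every character
  binary_string=''
  for char in message:
    num=ord(char)
    #characters above 255 would need more than 8 bits
    if num>255:
      raise ValueError('Only 8-bit characters are supported')
    binary=format(num, '08b')
    binary_string+=binary
  return binary_string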

We will also need a function to convert a binary string to an ASCII string during decryption.

def binary_to_str(binary_message):
  message=''
  for i in range(0, len(binary_message), 8):
    binary=binary_message[i:i+8]
    num=int(binary, 2)
    char=chr(num)
    message+=char
  return message

Now, coming to the encryption method: first, we read the image as an array. Then the message string is converted to a binary string using the str_to_binary function defined above. Note that we will also need to know the length of the binary string during decryption. To do that, we reserve the first few bits (16 in this implementation) for storing the length of the binary string, which limits the message to 65,535 bits. As discussed earlier, we will use the elements of the image array to store the bits.

def encryption(name, message, result):
  #reading the image; cv2.imread returns None if the file cannot be read
  img=cv2.imread(name)
  if img is None:
    print("Could not read the image")
    return
  data=img.reshape((-1))

  #processing the message
  msg=str_to_binary(message)
  length=len(msg)
  #the length has to fit in the 16-bit header
  if length>=2**16:
    print("Message is too long")
    return
  binary_length=format(length, '016b')
  msg=binary_length+msg

  #exiting the function if the message can't be stored completely
  if data.size<len(msg):
    print("Image is too small")
    return

  #using array elements to store bits
  for i in range(len(msg)):
    if data[i]%2==int(msg[i]):
      continue
    if data[i]==0 and msg[i]=='1':
      data[i]=1
    else:
      data[i]-=1

  #reshaping the array to its initial shape and then saving the image
  data=data.reshape((img.shape))
  cv2.imwrite(result, data)
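
The 16-bit length header can be looked at in isolation. The sketch below is a standalone illustration: the helper add_length_header and the constant HEADER_BITS are hypothetical names, not part of the tool.

```python
HEADER_BITS = 16  # the number of bits reserved for the length

def add_length_header(bits):
    # Hypothetical helper: prefix a bit string with its own length.
    if len(bits) >= 2 ** HEADER_BITS:
        raise ValueError('message too long for a 16-bit length header')
    return format(len(bits), '016b') + bits

payload = add_length_header('0100100001101001')  # the 16 bits of 'Hi'
print(payload[:16])                  # 0000000000010000
print(len(payload))                  # 32
print((2 ** HEADER_BITS - 1) // 8)   # 8191
```

So hiding 'Hi' needs 32 array elements in total, and with a 16-bit header a message can be at most 8191 characters long.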

For decryption, we first extract the length of the binary string from the parities of the first 16 elements. We then read that many of the following elements to get the binary string itself. This binary string is then converted to the ASCII message using the binary_to_str function defined above.

def decryption(name):
  img=cv2.imread(name)
  data=img.reshape((-1))
  msg=''
  length=''
  for i in range(16):
    length+=str(data[i]%2)
  length=int(length, 2)
  for i in range(16, length+16):
    msg+=str(data[i]%2)
  message=binary_to_str(msg)
  return message
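
To check that hiding and recovering really are inverses, here is a small round trip on a plain Python list, so it runs without an image or cv2. All function names and the stand-in pixel values are assumptions made for this sketch.

```python
def embed(values, bits):
    # Adjust the parity of one value per bit, as in the strategy above.
    out = list(values)
    for i, bit in enumerate(bits):
        if out[i] % 2 != int(bit):
            out[i] = 1 if out[i] == 0 else out[i] - 1
    return out

def read_bits(values, start, stop):
    return ''.join(str(v % 2) for v in values[start:stop])

def hide(values, bits):
    # 16-bit length header followed by the message bits
    return embed(values, format(len(bits), '016b') + bits)

def reveal(values):
    length = int(read_bits(values, 0, 16), 2)
    return read_bits(values, 16, 16 + length)

cover = list(range(100, 124))      # 24 stand-in pixel values
stego = hide(cover, '01101001')    # the 8 bits of 'i'
print(reveal(stego))                                   # 01101001
print(max(abs(a - b) for a, b in zip(cover, stego)))   # 1
```

The recovered bits match what was hidden, and no value moved by more than 1.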

Finally, to make it usable as a command-line tool, we add the following:

if __name__=='__main__':
  # Instructions:
  #   usage:
  #     for encryption: pass args: en path_to_image message_to_hide path_for_final_image
  #     for decryption: pass args: de path_to_image
  #   the final image should have the png extension
  try:
    if len(sys.argv)<2:
      raise Exception('Invalid Arguments')
    if sys.argv[1]=='en':
      if len(sys.argv)!=5:
        raise Exception('Invalid Arguments')
      encryption(sys.argv[2], sys.argv[3], sys.argv[4])
    elif sys.argv[1]=='de':
      if len(sys.argv)!=3:
        raise Exception('Invalid Arguments')
      print(decryption(sys.argv[2]))
    else:
      #unknown mode
      raise Exception('Invalid Arguments')
  except Exception:
    print("Wrong usage or invalid arguments")
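
As an alternative, the same two modes could be described with the standard argparse module, which generates usage messages automatically. This is only a sketch of the argument parsing: the function name build_parser and the file names in the example are hypothetical, and the parsed values would still have to be passed on to the encryption and decryption functions.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description='Hide text in an image')
    sub = parser.add_subparsers(dest='mode', required=True)
    en = sub.add_parser('en', help='hide a message in an image')
    en.add_argument('image')     # path to the input image
    en.add_argument('message')   # text to hide
    en.add_argument('result')    # path for the final png image
    de = sub.add_parser('de', help='recover a hidden message')
    de.add_argument('image')     # path to the image to read
    return parser

# Parsing an explicit list instead of sys.argv, for demonstration.
args = build_parser().parse_args(['en', 'cover.jpg', 'hello', 'out.png'])
print(args.mode, args.image, args.message, args.result)
# en cover.jpg hello out.png
```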

Additional Steps

The above approach is just a start. To get more out of the technique, we can first encrypt the plaintext into ciphertext and then hide the ciphertext using the above strategy. This adds an additional layer of security: even if someone extracts the hidden bits, they still cannot read the message without the key.
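
Here is a rough sketch of the idea. The cipher below is a repeating-key XOR, used only because it fits in a few lines. It is a toy and is NOT secure, so a real tool should use a vetted cryptography library instead. The key and the function names are hypothetical.

```python
from itertools import cycle

def xor_bytes(data, key):
    # Toy cipher for illustration only: repeating-key XOR is NOT secure.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def bytes_to_bits(data):
    return ''.join(format(b, '08b') for b in data)

key = b'k3y'                        # hypothetical key
cipher = xor_bytes(b'hello', key)   # encrypt first
bits = bytes_to_bits(cipher)        # this bit string is what gets hidden
print(len(bits))                    # 40
print(xor_bytes(cipher, key))       # b'hello'
```

The bit string of the ciphertext takes the place of the plaintext bits. After extraction, applying the same key again recovers the original message.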

You can view the combined source code here: bhagwatgarg/steganography-blog

That’s all folks!

This post is licensed under CC BY 4.0 by the author.