Wednesday, 12 October 2016

CODZ ciphers, testing RC4 in Python

Some of the CODZ ciphers consist of binary data, and it is not clear what sort of cipher created them. Ciphers 2, 3, 7 and 10 are base64-encoded binary data, while ciphers 4, 5, 11 and 14 are hex data. I will be dealing with 2, 3, 7 and 10 in this post.

Cipher 2 decodes to: dd3ed3b56ff21a7751ebbbaf6ffd1086f190a6b29b49c2a47e656bcc91768346df0321db23505874cb9f8a71d978490e or 3a411e6479e27ee31d601d486c7c807dbd20d85273ad599f90a7126caae9406be1434fdbbeeefa45d9c6cbf5ad4fb4f7, depending on whether you read the base64 data forwards or backwards. Either way you get exactly 384 bits of binary, which suspiciously matches the SHA-384 digest length. I tried running John the Ripper on the two candidate hashes for a day or so, but nothing turned up. Either it is not SHA-384 or we're gonna need some better word lists.
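
As a rough illustration of the forwards/backwards decoding, here is a small Python 3 sketch. The base64 string in it is a made-up stand-in (48 arbitrary bytes), not the real cipher 2 text, and decode_both_ways is just a name picked for the example.

```python
import base64
import hashlib

def decode_both_ways(b64text):
    """Decode a base64 string read forwards and read backwards."""
    forwards = base64.b64decode(b64text)
    backwards = base64.b64decode(b64text[::-1])
    return forwards, backwards

# Stand-in data: 48 arbitrary bytes, NOT the real cipher 2.
sample = base64.b64encode(bytes(range(48))).decode("ascii")

fwd, bwd = decode_both_ways(sample)
print(fwd.hex())
print(bwd.hex())
# Bit lengths of both readings next to the SHA-384 digest length.
print(len(fwd) * 8, len(bwd) * 8, hashlib.sha384().digest_size * 8)
```

Reading the text backwards only decodes cleanly here because 48 bytes encode to 64 base64 characters with no '=' padding. A padded string would need the padding handled before reversing.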

Ciphers 3, 7 and 10 are longer. I have tested their entropy after breaking them up into segments of different bit lengths, and the entropy is pretty much always what you would expect from random data. This means they are highly unlikely to have been generated by a classical cipher, which tends to leave the symbol frequencies uneven. More likely it is something modern like RC4. I have tried decrypting the three ciphers as RC4 using a word list for the keys, but no luck.

Code for testing entropy:

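from base64 import b64decode
from math import log

from bitarray import bitarray

ctext = 'iW9cXmzOU7ZuZBtW40b3ng...' # I cut the rest of the cipher off to save space
data = b64decode(ctext)

# load the raw cipher bytes into a bit array so it can be cut into n-bit symbols
a = bitarray(endian='little')
a.frombytes(data)

for bitlen in range(1,16):
  # fixed-length code: symbol i <-> its bitlen-bit binary representation
  code = {}
  for i in range(2**bitlen):
    string = format(i,'0'+str(bitlen)+'b')
    code[i] = bitarray(string, endian='little')
  # drop trailing bits that do not fill a whole symbol, otherwise decode() fails
  usable = len(a) - len(a) % bitlen
  num = list(a[:usable].decode(code))
  freq = {}
  for i in num:
      if i in freq: freq[i] += 1.
      else: freq[i] = 1.
  N = len(num)
  en = 0
  for i in freq.values():
    p = i/N
    en -= p*log(p)
  # entropy is in nats; uniformly random symbols would give log(2**bitlen)
  print(bitlen,' entropy: ',en,' if random: ',log(2**bitlen))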
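
To get a feel for what the numbers mean, here is a stdlib-only Python 3 sketch of the same measure restricted to whole bytes (the bitlen 8 case). The byte_entropy and caesar helpers and the sample sentence are my own, purely for illustration. A Caesar shift stands in for a classical cipher: it only relabels letters, so its entropy equals that of the plaintext and stays well below the random value log(256).

```python
import os
from collections import Counter
from math import log

def byte_entropy(data):
    """Shannon entropy in nats over single bytes."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * log(c / n) for c in counts.values())

def caesar(text, shift):
    """Shift lowercase letters only; a stand-in for a classical cipher."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            ch = chr((ord(ch) - ord("a") + shift) % 26 + ord("a"))
        out.append(ch)
    return "".join(out)

# Hypothetical sample text, not one of the CODZ ciphers.
plain = "the quick brown fox jumps over the lazy dog " * 50
shifted = caesar(plain, 3)

print("plaintext :", byte_entropy(plain.encode()))
print("caesar    :", byte_entropy(shifted.encode()))
print("urandom   :", byte_entropy(os.urandom(len(plain))))
print("if random :", log(256))
```

Note that a short sample of truly random bytes will score slightly under log(256), simply because there are not enough bytes to fill every bin evenly. The same effect gets stronger at the larger bit lengths in the test above.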

For decrypting RC4, we keep the key whose decrypt has the lowest entropy. If the wrong key is used, the output will be high entropy, but if it decrypts to text we should get something easily identifiable. Python code for decrypting RC4 with a word list:

import base64
from math import log

data = base64.b64decode(ctext)

def rc4(data,key):
    # data and key are both bytes; returns the decrypted bytes
    S = list(range(256))
    j = 0
    out = []

    #KSA Phase
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i] , S[j] = S[j] , S[i]

    #PRGA Phase
    i = j = 0
    for byte in data:
        i = ( i + 1 ) & 0xFF
        j = ( j + S[i] ) & 0xFF
        S[i] , S[j] = S[j] , S[i]
        out.append(byte ^ S[(S[i] + S[j]) & 0xFF])
    return bytes(out)

# find a good word list and use it here
keylist = open(r"C:\Users\james\Desktop\cipher_stuff\simplesub_word\count_1w.txt")

besten = 10e10
bestkey = ""
bestout = b""
for count,line in enumerate(keylist):
  fields = line.split()
  if not fields: continue  # skip blank lines
  word = fields[0]
  # try each word capitalised, in upper case and in lower case
  for key in (word.capitalize(), word.upper(), word.lower()):
    out = rc4(data,key.encode())
    # compute entropy of the decoded bytes
    freq = {}
    for b in out:
        if b in freq: freq[b] += 1.
        else: freq[b] = 1.
    N = len(out)
    en = 0
    for f in freq.values():
        p = f/N
        en -= p*log(p)
    # remember the key giving the lowest-entropy decrypt so far
    if en < besten:
        besten = en
        bestkey = key
        bestout = out[:20]
  if count % 500 == 0: print(count, key, bestkey,besten,bestout)
print(count, key, bestkey,besten,bestout)
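
Before trusting a negative result from the key search, it is worth checking the RC4 routine itself against a known answer. The following is a self-contained Python 3 sketch of the same KSA/PRGA steps (the name rc4_bytes is mine), checked against the widely published test vector with key "Key", plaintext "Plaintext" and ciphertext BBF316E8D940AF0AD3.

```python
def rc4_bytes(data, key):
    """RC4 over bytes; encryption and decryption are the same operation."""
    S = list(range(256))
    j = 0
    # KSA: mix the key into the state
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]
    # PRGA: generate the keystream and XOR it with the data
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) & 0xFF])
    return bytes(out)

ct = rc4_bytes(b"Plaintext", b"Key")
print(ct.hex().upper())          # compare against BBF316E8D940AF0AD3
print(rc4_bytes(ct, b"Key"))     # running it again should undo the encryption
```

If the routine passes this check, a failed word list search points at the keys (or at the guess that the cipher is RC4 at all) rather than at a bug in the decryption code.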
