Python - Trying To Deal With The Bits Of A File
Solution 1:
To swap bytes 10010001 and 00100101:
#!/usr/bin/env python
import string

a, b = map(chr, [0b10010001, 0b00100101])
translation_table = string.maketrans(a + b, b + a)  # swap a, b
with open('input', 'rb') as fin, open('output', 'wb') as fout:
    fout.write(fin.read().translate(translation_table))
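On Python 3, `string.maketrans` no longer exists; the same idea can be sketched with `bytes.maketrans`, which builds the table that `bytes.translate` expects. The helper name `swap_bytes` is hypothetical, and the sketch works on an in-memory bytes object rather than on real files.

```python
# Minimal Python 3 sketch; swap_bytes is an illustrative name,
# not something taken from the original answer.
A = 0b10010001  # 0x91
B = 0b00100101  # 0x25

# bytes.maketrans maps each byte in the first argument to the byte
# at the same position in the second argument.
SWAP_TABLE = bytes.maketrans(bytes([A, B]), bytes([B, A]))


def swap_bytes(data: bytes) -> bytes:
    """Return a copy of data with every 0x91 and 0x25 exchanged."""
    return data.translate(SWAP_TABLE)


sample = bytes([0x91, 0x00, 0x25, 0x91])
swapped = swap_bytes(sample)
```

For files, the same call would sit between a binary-mode read and a binary-mode write, for example `fout.write(swap_bytes(fin.read()))`.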
Solution 2:
read() returns an immutable string, so you'll first need to convert that to a list of characters. Then go through your list and change the bytes as needed, and finally join the list back into a new string to write to the output file.
filedata = f.read()
filebytes = list(filedata)
for i, c in enumerate(filebytes):
    if ord(c) == 0x91:
        filebytes[i] = chr(0x25)
newfiledata = ''.join(filebytes)
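In Python 3, a file opened in binary mode returns a `bytes` object whose elements are already integers, so `ord` and `chr` are not needed. A rough equivalent using a mutable `bytearray` might look like this (the function name `replace_byte` is made up for this sketch):

```python
def replace_byte(data: bytes, old: int, new: int) -> bytes:
    """Return a copy of data with every byte equal to old replaced by new."""
    buf = bytearray(data)  # mutable copy of the immutable bytes object
    for i, value in enumerate(buf):
        if value == old:
            buf[i] = new
    return bytes(buf)


result = replace_byte(b"\x91\x10\x91", 0x91, 0x25)
```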
Solution 3:
Following Aaron's answer, once you have a string, you can also use translate or replace:
In [43]: s = 'abc'
In [44]: s.replace('ab', 'ba')
Out[44]: 'bac'
In [45]: tbl = string.maketrans('a', 'd')
In [46]: s.translate(tbl)
Out[46]: 'dbc'
Docs: Python string.
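One caveat when choosing between the two for a swap: chained `replace` calls run one after another, so the second call also rewrites bytes produced by the first, whereas `translate` maps each byte exactly once. A small Python 3 illustration on `bytes` (the sample data is made up):

```python
data = b"abba"

# Chained replace: after the first call every byte is b"b",
# so the second call turns every byte into b"a".
chained = data.replace(b"a", b"b").replace(b"b", b"a")

# translate looks each byte up once in a table, so a and b really swap.
table = bytes.maketrans(b"ab", b"ba")
translated = data.translate(table)
```

So for a true two-way swap, a translation table is the safer choice; `replace` fits one-way substitutions.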
Solution 4:
I'm sorry about this somewhat relevant wall of text -- I'm just in a teaching mood.
If you want to optimize such an operation, I suggest using numpy. The advantage is that the entire translation operation is done with a single numpy operation, and those are written in C, so it is about as fast as you can get it using python.
In the example below I simply XOR every byte with 0b11111111 using a lookup table: the first element is the translation of 0b00000000, the second the translation of 0b00000001, the third that of 0b00000010, and so on. By altering the lookup table, you can do any kind of translation that does not change within the file.
import numpy as np
import sys
data = np.fromfile(sys.argv[1], dtype="uint8")
lookup_table = np.array(
    [i ^ 0xFF for i in range(256)], dtype="uint8")
lookup_table[data].tofile(sys.argv[2])
To highlight the simplicity of it all I've done no argument checking. Invoke script like this:
python name_of_script.py input_file.txt output_file.txt
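To see what `lookup_table[data]` does, here is a small in-memory sketch (the sample values are made up): indexing a 256-entry table with a `uint8` array returns, for each byte, the table entry at that byte's value.

```python
import numpy as np

# Same table as in the script: entry i holds i XOR 0xFF.
lookup_table = np.array([i ^ 0xFF for i in range(256)], dtype="uint8")

# A few sample bytes standing in for the contents of a file.
data = np.array([0x00, 0x01, 0x91, 0xFF], dtype="uint8")

# Integer-array indexing: translated[k] == lookup_table[data[k]].
translated = lookup_table[data]
```

For this particular table the result is the same as `data ^ 0xFF`; the lookup form pays off when the mapping is not a simple arithmetic expression.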
To directly answer your question, if you want to swap 0b10010001 and 0b00100101, you replace the lookup_table = ... line with this:
lookup_table = np.array(range(256), dtype="uint8")
lookup_table[0b10010001] = 0b00100101
lookup_table[0b00100101] = 0b10010001
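A quick way to convince yourself the swap table behaves as intended is to apply it to a few in-memory bytes (sample values invented for this sketch) and check that applying it twice gives back the input:

```python
import numpy as np

# Identity table: every byte maps to itself...
lookup_table = np.arange(256, dtype="uint8")
# ...except the two values being exchanged.
lookup_table[0b10010001] = 0b00100101
lookup_table[0b00100101] = 0b10010001

data = np.array([0x91, 0x25, 0x42], dtype="uint8")
swapped = lookup_table[data]
restored = lookup_table[swapped]  # applying the swap twice undoes it
```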
Of course, any lookup-table encryption is easily broken using frequency analysis. But as you may know, encryption using a one-time pad is unbreakable, as long as the pad is truly random, at least as long as the message, never reused, and kept secret. This modified script encrypts or decrypts using a one-time pad, which you'll have to create yourself, store to a file, and somehow (there's the rub) securely transmit to the intended recipient of the message:
data = np.fromfile(sys.argv[1], dtype="uint8")
pad = np.fromfile(sys.argv[2], dtype="uint8")
(data ^ pad[:len(data)]).tofile(sys.argv[3])
Example usage (linux):
$ dd if=/dev/urandom of=pad.bin bs=512 count=5
$ python pytrans.py pytrans.py pad.bin encrypted.bin
Recipient then does:
$ python pytrans.py encrypted.bin pad.bin decrypted.py
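The script does not check that the pad is at least as long as the data. A hedged sketch of the same XOR on in-memory arrays, with that check added (the function name `xor_with_pad` and the use of `os.urandom` to make a demo pad are choices of this sketch, not of the original script):

```python
import os

import numpy as np


def xor_with_pad(data: np.ndarray, pad: np.ndarray) -> np.ndarray:
    """XOR data with the first len(data) bytes of pad."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return data ^ pad[:len(data)]


message = np.frombuffer(b"attack at dawn", dtype="uint8")
pad = np.frombuffer(os.urandom(32), dtype="uint8")  # demo pad only

encrypted = xor_with_pad(message, pad)
decrypted = xor_with_pad(encrypted, pad)  # the same operation decrypts
```

Because XOR is its own inverse, one function serves for both directions, which is why the script above can be run unchanged by the recipient.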
Voilà! Fast and unbreakable encryption with three lines (plus two import lines) in Python.