This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.
tesla1889's profile picture

huffman table compression

Started by tesla1889, 14 February 2013 - 12:55 PM
tesla1889 #1
Posted 14 February 2013 - 01:55 PM
one method to compress/decompress a string is the huffman table method.


compression:
  1. create a table of all bytes in the string, from most to least used
  2. if two bytes are equally used, the first byte in the string is first
  3. for every byte in the string, append the byte's table location number of 1 bits and a 0 bit to the output
  4. prepend all of the bytes in the table to the output
  5. prepend the number of bytes in the table to the output
  6. append 0 bits until divisible by 8
  7. return the string
note: the longer or more repetitive the string is, the more effective this encoding is.


example:
[indent=1]text: apple pie is great food[/indent]

[indent=1]huffman table: peaiolsgrtfd[/indent]

[indent=1]bitstream: 11110 110 110 11111110 1110 10 110 111110 1110 10 111110 111111110 10 1111111110 11111111110 1110 11110 111111111110 10 1111111111110 1111110 1111110 11111111111110[/indent]

[indent=1]output (in hex) [table]: 0D [20 70 65 61 69 6F 6C 73 67 72 74 66 64] F6 DF DD 6F BA FB FD 7F DF FB BD FF D7 FF BF 7E FF F8[/indent]


this method failed to compress this string because it was too small. larger strings will be more effective. if it had not been for the table prepended to it, the compression would have worked on the example string.


decompression:
  1. load the first byte number of bytes into the table (if not enough, handle error)
  2. read each sequence of 1 bits, adding the byte in the table according to the number of 1s before each 0
  3. return the string
immibis #2
Posted 14 February 2013 - 06:10 PM
This isn't really a tutorial, is it?
tesla1889 #3
Posted 14 February 2013 - 06:37 PM
it is, actually. there isn't any code, but it is a guide to huffman encoding

although, i must admit, this is partially a reaction to all of these "compression" schemes that keep popping up on here
Dlcruz129 #4
Posted 14 February 2013 - 06:38 PM
it is, actually. there isn't any code, but it is a guide to huffman encoding

although, i must admit, this is partially a reaction to all of these "compression" schemes that keep popping up on here

Code would definitely be useful.