[RFC] Universal Compressed Graphics version 2 - ComputerCraft Forums (archive)

ardera #1

ardera's profile picture

463 posts

Location Germany

Posted 22 January 2016 - 01:17 PM

Purpose & Introduction
UCG (Universal Compressed Graphics) is a file format for storing bitmaps as compact as possible, but still relatively fast to decompress.

Concept
The format uses a combination of Huffman Code & Run-Length encoding. Basically, it Huffman encodes both the colors (the 16 main colors + 1 transparent "color") and the run-lengths, and writes the results to a file.

Performance & Results
A 213pixels high and 320pixels wide UCG image file takes about 0.4 seconds to decode in ComputerCraft redux, which is pretty good considering the complexity of Huffman Code reading and the low computing speed of ComputerCraft computers. But while the original paintutils file is about 65kilobytes in size, the UCG file only takes up 25kilobytes.
Generally, an UCG file is about 10%-40% the size of the same image stored in paintutils format.

Advantages over paintutils images

UCG files are much smaller than paintutils files
the width & height of the image are stored at the start of the image, you don't have to read the whole image to find out the image size

See Also
You can find a detailed description and specification here: https://github.com/a.../Images/UCG1.md
The CraftOS-Standards Github repository where I proposed the standard: https://github.com/o...aftOS-Standards
And the discussion about the specification: https://github.com/o...andards/pull/34

Implementations
Here are 2 APIs for handling UCGs. They're not publicly released yet, so there's only little documentation about how to use them.
libucg: https://github.com/ardera/libucg/
xaif: https://github.com/ardera/xaif

Edited on 30 July 2017 - 08:24 PM

Bomb Bloke #2

Bomb Bloke's profile picture

7083 posts

Location Tasmania (AU)

Posted 23 January 2016 - 02:12 AM

Why 0xFF2137 for the header? Why not something like 0x554347?

Presumably you've got a pre-made encoder / decoder somewhere?

I'm sorta scratching my head over the idea of using LengthValues with Huffman. Why not just use Huffman? Would you not get the same results with less effort? Edit: Ah, I get it: Huffman first, LengthValues second.)

The reference numbers ("see section x.x.x") in sections 5.6 and 5.7 are all incorrect. I guess you shuffled things around but didn't update them.

Word: 2 byte / 16 bit binary number

You mean you mean a two-byte unsigned integer, yeah? And the point you're trying to make in the Byte section is that all multi-byte values are big-endian?

Edited on 23 January 2016 - 01:48 AM

ardera #3

ardera's profile picture

463 posts

Location Germany

Posted 23 January 2016 - 10:43 AM

I could've used any random number for the signature. But I chose 0x21 as the second byte, because U is the 21st letter in the alphabet and 0x37 as the third byte because C is the 3rd and G is the 7th letter in the alphabet.

Yes, you can find links to 2 APIs for handling UCGs in the See-Also section of the Specification:
[indent=1]libucg: https://github.com/ardera/libucg/[/indent]
[indent=1]xaif: https://github.com/ardera/xaif[/indent]
There is no real documentation of them, as I didn't really publicly release them yet. Both of them contain a bit information about how to use them in the file.

Bomb Bloke, on 23 January 2016 - 02:12 AM said:
I'm sorta scratching my head over the idea of using LengthValues with Huffman. Why not just use Huffman? Would you not get the same results with less effort? Edit: Ah, I get it: Huffman first, LengthValues second.)

I don't really understand what you mean. Do you mean Run-Length Encoding and Huffman Coding?

Bomb Bloke, on 23 January 2016 - 02:12 AM said:
The reference numbers ("see section x.x.x") in sections 5.6 and 5.7 are all incorrect. I guess you shuffled things around but didn't update them.

Oh ok, I will update them as soon as possible.

Bomb Bloke, on 23 January 2016 - 02:12 AM said:
You mean you mean a two-byte unsigned integer, yeah? And the point you're trying to make in the Byte section is that all multi-byte values are big-endian?

Yes and yes, all multi-byte values are big endian.

Bomb Bloke #4

Bomb Bloke's profile picture

7083 posts

Location Tasmania (AU)

Posted 23 January 2016 - 12:46 PM

ardera, on 23 January 2016 - 10:43 AM said:
I don't really understand what you mean. Do you mean Run-Length Encoding and Huffman Coding?

I'd got it into my head that you were figuring out your length values for RLE, bundling them up as LengthValues, then running them through Huffman. Long story short, you're not. Nothing to worry about.

ardera #5

ardera's profile picture

463 posts

Location Germany

Posted 30 July 2017 - 02:28 PM

BUMP Updated UCG to version 2 (already in march), uses markov chains as a prefilter. Encoding speed decreased a bit because the prefilter uses a lot of table manipulation; but the decoder is now blazing fast.

I don't even know what compression methods I should implement next. The file size actually goes up or stays the same when compressing an ucg file with 7-Zip (using Deflate or LZMA, small wordsize since the file is very small, compression level ultra), meaning the information density is already pretty high. Google Brotli decreases the file size by another 6 percent, from 2,27 to 2,12 kilobytes, but its main advantage is that it uses a predefined dictionary and I'm not gonna implement that.

Engineer #6

Engineer's profile picture

1522 posts

Location The Netherlands

Posted 02 August 2017 - 07:48 PM

Is UCG faster than paintutils?

Bomb Bloke #7

Bomb Bloke's profile picture

7083 posts

Location Tasmania (AU)

Posted 03 August 2017 - 02:30 AM

In terms of rendering, it's barely any more efficient… but you really need to go out of your way to be less efficient than the paintutils method.

Keeping the image in RLE format when loaded not only takes less storage, but it means you can render faster, too! But for real speed, you really want to switch to term.blit().

oeed #8

oeed's profile picture

2151 posts

Location Auckland, New Zealand

Posted 08 August 2017 - 11:20 AM

In terms of printing maybe, but parsing the file, paintutils is quite a bit quicker. I was using UCG a lot in Quartz and I did find that with 'large' images (I'm talking 320 x 200) loading them did take a little while (over 100ms from memory). I'd imagine this was due to using the bit API though, rather than having native versions.

Edited on 08 August 2017 - 09:20 AM