This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

C2 - Pure Lua file and folder compression system

Started by AfterLifeLochie, 29 January 2013 - 09:25 PM
AfterLifeLochie #1
Posted 29 January 2013 - 10:25 PM
C2 is an actually-compressive (LZ78 functional, LZMA planned) extensible archive system I've been writing in pure Lua since I've returned from my short hiatus - designed with minimal overhead and space usage, while retaining complete flexibility. This is also a demonstration of what happens when I get my hands on OOP ideas.

C2 is currently in its infancy - as yet, it cannot actually store files - but the LZ78 code is fully functional (although it doesn't yet return its output in exactly the desired final format). The LZ78 implementation I've written is tested with standard (a-z, A-Z, 0-9, symbols, space) characters, and I assume all characters (including unprintable ones) will function correctly in and out of the implementation. Please let me know if it throws back any hiccups!
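
For readers unfamiliar with LZ78, here is a minimal sketch of the general technique in plain Lua. It is not C2's code: the function names and the { index, char } pair layout are assumptions made for illustration, chosen because the post mentions that the implementation currently returns a table.

```lua
-- Minimal LZ78 sketch. Names and output layout are hypothetical, not C2's API.
-- Output: a list of { index, char } pairs, where index refers to an earlier
-- phrase (0 means "no prefix") and char is the character that extends it.
local function lz78_compress(input)
  local dict, nextIndex = {}, 1   -- phrase -> dictionary index
  local out = {}
  local phrase = ""
  for i = 1, #input do
    local c = input:sub(i, i)
    local candidate = phrase .. c
    if dict[candidate] then
      phrase = candidate          -- keep growing the current match
    else
      out[#out + 1] = { dict[phrase] or 0, c }
      dict[candidate] = nextIndex
      nextIndex = nextIndex + 1
      phrase = ""
    end
  end
  if phrase ~= "" then
    -- Input ended in the middle of a known phrase: emit it with no character.
    out[#out + 1] = { dict[phrase], "" }
  end
  return out
end

local function lz78_decompress(pairList)
  local phrases, parts = {}, {}
  for i, pair in ipairs(pairList) do
    local prefix = (pair[1] == 0) and "" or phrases[pair[1]]
    local phrase = prefix .. pair[2]
    phrases[i] = phrase           -- the i-th pair defines dictionary entry i
    parts[#parts + 1] = phrase
  end
  return table.concat(parts)
end
```

Round-tripping a string through lz78_compress and lz78_decompress should give back the original, including strings that contain unprintable bytes, since Lua strings are plain byte sequences.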

As ComputerCraft does not natively support binary file operations such as :seek() or :getSize(), I've also written a "file.binary" handler wrapper - which admittedly still needs work, but functions like a binary file held in a buffer.
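
As an illustration of the buffer idea, here is a minimal sketch of a buffer-backed binary file in plain Lua. The BinaryBuffer name and every method on it are assumptions for this example, not the actual "file.binary" wrapper; the buffer is filled from a string so that the sketch runs outside ComputerCraft.

```lua
-- Hypothetical buffer-backed "binary file". Not C2's file.binary handler.
local BinaryBuffer = {}
BinaryBuffer.__index = BinaryBuffer

-- Build a buffer from a string; each byte is stored as a number 0..255.
function BinaryBuffer.new(data)
  local self = setmetatable({ bytes = {}, pos = 1 }, BinaryBuffer)
  data = data or ""
  for i = 1, #data do
    self.bytes[i] = data:byte(i)
  end
  return self
end

function BinaryBuffer:getSize()
  return #self.bytes
end

-- Move the cursor to a zero-based offset from the start, clamped to the buffer.
function BinaryBuffer:seek(offset)
  if offset < 0 then offset = 0 end
  if offset > #self.bytes then offset = #self.bytes end
  self.pos = offset + 1
  return offset
end

-- Read one byte at the cursor, or nil at end of buffer.
function BinaryBuffer:read()
  local b = self.bytes[self.pos]
  if b then self.pos = self.pos + 1 end
  return b
end

-- Write one byte at the cursor, overwriting or appending.
function BinaryBuffer:write(byte)
  self.bytes[self.pos] = byte % 256
  self.pos = self.pos + 1
end

function BinaryBuffer:toString()
  local chars = {}
  for i, b in ipairs(self.bytes) do
    chars[i] = string.char(b)
  end
  return table.concat(chars)
end
```

Because everything lives in a Lua table, seeking is just a change of index, at the cost of holding the whole file in memory.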

You can take a peek at the development code here at my GitHub - if you want to test the speed of C2, you can use the "packtest" script, or, if you want to use/fiddle with the LZ78 implementation, "comptest" is your friend. Please be aware I'll most likely be changing the LZ78's output to return "flattened-binary" and not its current disgusting table - and that stuff in these APIs/code is not guaranteed to stay as it is now.
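
The thread does not specify what the "flattened-binary" format will look like. Purely as an illustration, the sketch below flattens a list of { index, char } pairs into a string using an assumed layout: a 16-bit big-endian index followed by one character byte, with a trailing index-only pair stored as just two bytes. The function names and the 16-bit limit are assumptions, not C2's format.

```lua
-- Hypothetical flattening of { index, char } pairs. Assumed layout, not C2's.
-- Each pair becomes 3 bytes: index high byte, index low byte, character.
-- A final pair with an empty character becomes 2 bytes (index only).
local function flatten(pairList)
  local out = {}
  for _, pair in ipairs(pairList) do
    local index, char = pair[1], pair[2]
    assert(index >= 0 and index < 65536, "index does not fit in 16 bits")
    local hi = math.floor(index / 256)
    local lo = index % 256
    out[#out + 1] = string.char(hi, lo) .. char
  end
  return table.concat(out)
end

local function unflatten(data)
  local pairList = {}
  local i = 1
  while i <= #data do
    local hi, lo = data:byte(i, i + 1)
    -- sub() past the end of the string yields "", which restores the
    -- index-only trailing pair.
    pairList[#pairList + 1] = { hi * 256 + lo, data:sub(i + 2, i + 2) }
    i = i + 3
  end
  return pairList
end
```

A fixed 3-byte record is simple but not compact; a real format would probably use variable-width indexes, which is where the byte-pinching starts.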

Ninja-edit: Forgot this - if you don't want to download, here's a preview of compression:
[Spoiler: compression preview, contents not captured in this snapshot]
NeverCast #2
Posted 29 January 2013 - 10:51 PM
The random person working on this was you?

Edit: Forgot to say what I should have said!
Amazing! Your classloader approach is very interesting also.
When my eyes aren't trying to close on me I'll take a better look at this!

I can already think of so many uses, and I'm not even mad about the table :P Can write this out to many formats. Fantastic A-Tab!
AfterLifeLochie #3
Posted 29 January 2013 - 10:52 PM
NeverCast said:
"The random person working on this was you?

Edit: Forgot to say what I should have said!
Amazing! Your classloader approach is very interesting also.
When my eyes aren't trying to close on me I'll take a better look at this!

I can already think of so many uses, and I'm not even mad about the table :P Can write this out to many formats. Fantastic A-Tab!"
Yeah, I had/have been using a different nick on IRC on and off the last few days. I seem to have confused a lot of people. :P

Edit: Thanks! I'm definitely writing out the table very, very soon - it's one of the "must do before continuing", and I'm finalizing a flexible format tonight. I'm still being a "byte-pincher" and trying to reduce overhead.
NeverCast #4
Posted 29 January 2013 - 10:55 PM
Sneaky Sneaking! tut tut xD
KillaVanilla #5
Posted 01 February 2013 - 01:46 PM
Wow, I didn't think this was possible.
NeverCast #6
Posted 01 February 2013 - 02:43 PM
AfterLifeLochie said:
"Edit: Thanks! I'm definitely writing out the table very, very soon - it's one of the "must do before continuing", and I'm finalizing a flexible format tonight. I'm still being a "byte-pincher" and trying to reduce overhead."

Ahh yes, I am really bad for that.
I was doing it when working on BuildCraft networking, and now I'm really bad for it in the CCTube format. Everything must be packed into every available bit! If there is 1 unused bit, then I could save an entire byte every 8 bytes! etc
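
To make the arithmetic concrete: if every value only needs 7 bits, eight values fit in 56 bits, which is 7 bytes instead of 8. Below is a minimal sketch of that packing in plain Lua, using arithmetic rather than bit operators so it runs on Lua 5.1. The pack7 and unpack7 names are made up for this example and are not taken from CCTube or C2.

```lua
-- Hypothetical 7-bit packing helpers, written with plain arithmetic.
-- pack7 takes a list of numbers in 0..127 and returns a byte string.
local function pack7(values)
  local out = {}
  local acc, nbits = 0, 0          -- pending bits and how many are held
  for _, v in ipairs(values) do
    assert(v >= 0 and v < 128, "value must fit in 7 bits")
    acc = acc * 128 + v            -- append 7 bits on the right
    nbits = nbits + 7
    while nbits >= 8 do
      local shift = 2 ^ (nbits - 8)
      local byte = math.floor(acc / shift)   -- take the top 8 bits
      out[#out + 1] = string.char(byte)
      acc = acc - byte * shift
      nbits = nbits - 8
    end
  end
  if nbits > 0 then
    -- Pad the final partial byte with zero bits on the right.
    out[#out + 1] = string.char(math.floor(acc * 2 ^ (8 - nbits)))
  end
  return table.concat(out)
end

-- unpack7 needs the value count, because padding bits are indistinguishable
-- from data.
local function unpack7(data, count)
  local values = {}
  local acc, nbits, i = 0, 0, 1
  while #values < count do
    while nbits < 7 do
      acc = acc * 256 + data:byte(i)
      i = i + 1
      nbits = nbits + 8
    end
    local shift = 2 ^ (nbits - 7)
    local v = math.floor(acc / shift)        -- take the top 7 bits
    values[#values + 1] = v
    acc = acc - v * shift
    nbits = nbits - 7
  end
  return values
end
```

Packing eight values this way yields a 7-byte string, which is the "one byte saved every 8 bytes" from the post.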
AfterLifeLochie #7
Posted 01 February 2013 - 05:23 PM
KillaVanilla said:
"Wow, I didn't think this was possible."
It's not impossible, just time-consuming, and it takes a lot of care. Lost bits are going to be lost forever - and that's not a good thing.

NeverCast said:
"I was doing it when working on BuildCraft networking, and now I'm really bad for it in the CCTube format. Everything must be packed into every available bit! If there is 1 unused bit, then I could save an entire byte every 8 bytes! etc"
You should be able to use either C2 on its own, or the raw compression engine (I'm dusting up a really decent API to make things really easy). Officials are still on the table, though, as I've not decided exactly what's going on. LZ78 doesn't compress quite as… compress-y as I'd hoped, so I'm extending to LZW support. LZMA may come in the future, as well.
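
For comparison with LZ78, here is a minimal sketch of textbook LZW in plain Lua. LZW starts with a dictionary of all 256 single bytes and emits only codes, with no explicit character per entry, which is where its savings over LZ78 tend to come from. This is the generic algorithm with made-up function names, not C2's planned implementation.

```lua
-- Textbook LZW sketch. Names are hypothetical; output is a list of codes.
local function lzw_compress(input)
  local dict, nextCode = {}, 256
  for i = 0, 255 do dict[string.char(i)] = i end   -- seed with single bytes
  local out = {}
  local phrase = ""
  for i = 1, #input do
    local c = input:sub(i, i)
    local candidate = phrase .. c
    if dict[candidate] then
      phrase = candidate
    else
      out[#out + 1] = dict[phrase]
      dict[candidate] = nextCode
      nextCode = nextCode + 1
      phrase = c                 -- unlike LZ78, the new character is reused
    end
  end
  if phrase ~= "" then out[#out + 1] = dict[phrase] end
  return out
end

local function lzw_decompress(codes)
  if #codes == 0 then return "" end
  local dict, nextCode = {}, 256
  for i = 0, 255 do dict[i] = string.char(i) end
  local previous = dict[codes[1]]
  local parts = { previous }
  for i = 2, #codes do
    local entry = dict[codes[i]]
    if not entry then
      -- The code refers to the entry being defined right now; it must be
      -- the previous phrase plus its own first character.
      entry = previous .. previous:sub(1, 1)
    end
    parts[#parts + 1] = entry
    dict[nextCode] = previous .. entry:sub(1, 1)
    nextCode = nextCode + 1
    previous = entry
  end
  return table.concat(parts)
end
```

The codes still have to be written out as bits or bytes afterwards, so the real saving depends on how tightly they are packed.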

Edit: I've been asked for LZ78 benchmarks. Here's a sample test:
Spoiler