This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

A better serializer - rewrite of textutils.serialize

Started by ElvishJerricco, 12 June 2013 - 03:53 AM
ElvishJerricco #1
Posted 12 June 2013 - 05:53 AM
Textutils is a nice API, but its serialize function could be better. It has no option for formatted output, and it doesn't take advantage of Lua's table syntax to shorten the resulting string. For example, values whose keys run in sequence from 1 to n can be encoded like

{1,2,3,4, ... n}

Also, there is no need for a comma on the last element.


{1,2,3}

And finally, string keys that are valid Lua names can be written without the brackets and quotes.


{a=1,b=2}

Put all this together and you get dramatically shorter strings.


{[1]=1,[2]=2,[3]=3,[4]=4,["a"]=4,["b"]="test"}
vs
{1,2,3,4,a=4,b="test"}

And on top of that, with the new serializePretty, you can get it formatted


{
	1,
	2,
	a={
		b="c"
	},
	d="test"
}
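
For readers who want to see the idea in code, here is a minimal sketch of the three shortenings plus the pretty mode. It is not the code from the pastebin: sketchSerialize, isBareKey and RESERVED are made-up names, and the sketch deliberately leaves out cycle detection and special numbers such as inf.

```lua
-- Illustrative sketch only; all names are hypothetical, not the pastebin code.
local RESERVED = {}
for word in ([[and break do else elseif end false for function goto if in
local nil not or repeat return then true until while]]):gmatch("%a+") do
	RESERVED[word] = true
end

-- A string key can be written bare only if it is a valid, non-reserved Lua name.
local function isBareKey(k)
	return type(k) == "string"
		and k:match("^[%a_][%w_]*$") ~= nil
		and not RESERVED[k]
end

local function sketchSerialize(value, pretty, depth)
	depth = depth or 0
	local kind = type(value)
	if kind == "string" then
		return string.format("%q", value)
	elseif kind == "number" or kind == "boolean" then
		return tostring(value)
	elseif kind ~= "table" then
		error("cannot serialize a " .. kind)
	end

	local parts = {}
	-- Keys 1..n are written positionally, without a "[i]=" prefix.
	local n = 0
	while value[n + 1] ~= nil do
		n = n + 1
		parts[#parts + 1] = sketchSerialize(value[n], pretty, depth + 1)
	end
	-- Remaining keys are written as key=value, bare where possible.
	for k, v in pairs(value) do
		local positional = type(k) == "number" and k >= 1 and k <= n and k % 1 == 0
		if not positional then
			local key = isBareKey(k) and k or "[" .. sketchSerialize(k, pretty, depth + 1) .. "]"
			parts[#parts + 1] = key .. "=" .. sketchSerialize(v, pretty, depth + 1)
		end
	end

	-- table.concat only puts separators between items, so there is no trailing comma.
	if not pretty then
		return "{" .. table.concat(parts, ",") .. "}"
	end
	if #parts == 0 then
		return "{}"
	end
	local inner = string.rep("\t", depth + 1)
	return "{\n" .. inner .. table.concat(parts, ",\n" .. inner) .. "\n" .. string.rep("\t", depth) .. "}"
end

print(sketchSerialize({ 1, 2, 3, 4, a = 4 }))
print(sketchSerialize({ 1, a = { b = "c" } }, true))
```

Note that pairs does not guarantee an order, so tables with several named keys may come out in a different key order from run to run.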

USAGE:
Get the code off pastebin with

pastebin get sQsQvmzC betterTU

To install it, you can run the file before using its functions, paste it into your program, or os.loadAPI it if you want to, although then you'd have an empty betterTU table in your _G…

To use it, use the same code as before, because it installs the functions into textutils itself.


local str = textutils.serialize({1,2,3,4})
local pStr = textutils.serializePretty({1,2,3,4})

local t = textutils.unserialize(str) -- or pStr, either will work

That's it! Hope you like the further compressed text and formatted text!
theoriginalbit #2
Posted 12 June 2013 - 06:17 AM
I hate for the first reply to be about problems, but you have some, and they're big ones. Here are 3:

Problem 1:
This table

local t = { ["something brilliant"] = "not!" }
Any key that contains a space must be wrapped in [""]; your function currently does not do this.
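
A quick way to see why this matters, as a rough sketch: a bare key with a space is not a valid Lua name, so the serialized string cannot be loaded back. The variable names here (compile, bad, good) are just for illustration.

```lua
-- loadstring is the Lua 5.1 name (the one ComputerCraft has); newer Lua calls it load.
local compile = loadstring or load

-- A bare key containing a space is a syntax error, so nothing is compiled.
local bad, err = compile('return { something brilliant = "not!" }')
print(bad, err) -- bad is nil, err holds the syntax error message

-- The bracketed form compiles and keeps the key intact.
local good = compile('return { ["something brilliant"] = "not!" }')
print(good()["something brilliant"])
```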

Problem 2:
You do not stop recursive entries like the default textutils.serialize does. This means the following code will crash the program with a stack overflow.

local t = { 3, 16, 4 }
local a = { t }
table.insert( t, a )
print(textutils.serialize(t))

Problem 3:
The default textutils.unserialize is unable to unserialize your serialization, so you will need to write an unserializer too, one that can handle both pretty and non-pretty input. You shouldn't expect the developer to know which format it is in…
ElvishJerricco #3
Posted 12 June 2013 - 06:43 AM
I have already solved the first problem.

I had forgotten about the second one. I'll get right on that. Shouldn't be hard.

Actually, default textutils.unserialize works perfectly with this. At least it does with this example text of 621 lines representing some lasm language details.


{
	symbols = {
		[")"] = true,
		["("] = true
	},
	keywords = {
		[".arguments"] = true,
		[".options"] = true,
		[".stacksize"] = true,
		[".local"] = true,
		[".upvalue"] = true,
		[".argcount"] = true,
		[".const"] = true,
		[".params"] = true,
		[".vararg"] = true,
		[".upval"] = true,
		[".end"] = true,
		[".args"] = true,
		[".function"] = true,
		[".func"] = true,
		[".maxstacksize"] = true
	},
	opcodes = {
		NOT = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				}
			},
			code = 19
		},
		LOADK = {
			type = "iABx",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "C",
					index = "Bx"
				}
			},
			code = 1
		},
		SETTABLE = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 9
		},
		CLOSE = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				}
			},
			code = 35
		},
		GETTABLE = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 6
		},
		DIV = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 15
		},
		TEST = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "C"
				}
			},
			code = 26
		},
		ADD = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 12
		},
		SETLIST = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "B"
				},
				{
					type = "N",
					index = "C"
				}
			},
			code = 34
		},
		TFORLOOP = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "C"
				}
			},
			code = 33
		},
		LOADBOOL = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "B"
				},
				{
					type = "N",
					index = "C"
				}
			},
			code = 2
		},
		TESTSET = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				},
				{
					type = "N",
					index = "C"
				}
			},
			code = 27
		},
		FORLOOP = {
			type = "iAsBx",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "sBx"
				}
			},
			code = 31
		},
		UNM = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				}
			},
			code = 18
		},
		CALL = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "B"
				},
				{
					type = "N",
					index = "C"
				}
			},
			code = 28
		},
		EQ = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 23
		},
		LT = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 24
		},
		RETURN = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "B"
				}
			},
			code = 30
		},
		JMP = {
			type = "iAsBx",
			arguments = {
				{
					type = "N",
					index = "sBx"
				}
			},
			code = 22
		},
		CLOSURE = {
			type = "iABx",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "P",
					index = "Bx"
				}
			},
			code = 36
		},
		SETGLOBAL = {
			type = "iABx",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "C",
					index = "Bx"
				}
			},
			code = 7
		},
		MUL = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 14
		},
		TAILCALL = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "B"
				}
			},
			code = 29
		},
		POW = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 17
		},
		CONCAT = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				},
				{
					type = "R",
					index = "C"
				}
			},
			code = 21
		},
		GETUPVAL = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "U",
					index = "B"
				}
			},
			code = 4
		},
		NEWTABLE = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "B"
				},
				{
					type = "N",
					index = "C"
				}
			},
			code = 10
		},
		SETUPVAL = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "U",
					index = "B"
				}
			},
			code = 8
		},
		SELF = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 11
		},
		SUB = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 13
		},
		LOADNIL = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				}
			},
			code = 3
		},
		LEN = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				}
			},
			code = 20
		},
		FORPREP = {
			type = "iAsBx",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "sBx"
				}
			},
			code = 32
		},
		MOD = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 16
		},
		VARARG = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "N",
					index = "B"
				}
			},
			code = 37
		},
		MOVE = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "R",
					index = "B"
				}
			},
			code = 0
		},
		LE = {
			type = "iABC",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "RK",
					index = "B"
				},
				{
					type = "RK",
					index = "C"
				}
			},
			code = 25
		},
		GETGLOBAL = {
			type = "iABx",
			arguments = {
				{
					type = "R",
					index = "A"
				},
				{
					type = "C",
					index = "Bx"
				}
			},
			code = 5
		}
	}
}

That was originally written in JSON with my JSON API. I then used these textutils functions to translate it, which is where I noticed the first problem you pointed out. Again, I fixed that. So now I just need to stop recursive entries.

EDIT: Got it fixed. Only took a couple of minutes. Yeah, writing this quickly after a night of no sleep, because I was working on various other things, does not make for a programmer fit to catch every problem in the first version =P
theoriginalbit #4
Posted 12 June 2013 - 06:47 AM
Actually, default textutils.unserialize works perfectly with this. At least it does with this example text of 621 lines representing some lasm language details.
Pretty sure it didn't for me. Maybe I hadn't hit save after the Problem 1 code change, because it's working now…


So now I just need to stop recursive entries.
Look into doing it the same way the default one does: store a reference to every table it has serialised, and make sure the current table is not already on that list.
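
Roughly, that tracking idea could look like the sketch below. This is a hypothetical illustration rather than the actual textutils source: checkForCycles and the error text are assumptions, and clearing the entry on the way out (so a table referenced twice without a cycle stays legal) is a design choice of this sketch.

```lua
-- Hypothetical sketch of cycle detection; not the actual textutils source.
local function checkForCycles(value, tracking)
	if type(value) ~= "table" then
		return
	end
	if tracking[value] then
		-- Level 0 keeps the message free of file/line position info.
		error("Cannot serialize table with recursive entries", 0)
	end
	tracking[value] = true -- this table is now on the current path
	for k, v in pairs(value) do
		checkForCycles(k, tracking)
		checkForCycles(v, tracking)
	end
	tracking[value] = nil -- leaving the table: shared, non-cyclic references stay legal
end

-- The recursive table from Problem 2: t contains a, and a contains t.
local t = { 3, 16, 4 }
local a = { t }
table.insert(t, a)

local ok, message = pcall(checkForCycles, t, {})
print(ok, message)
```

A serializer would do the same check while it walks the table, passing the tracking table down through each recursive call.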
ElvishJerricco #5
Posted 12 June 2013 - 06:54 AM
As noted in my edit, it's fixed now. Yeah, I thought about sending tracking info through a function parameter, but I didn't like that concept, so I went to see whether textutils does it that way, and it does. So I figured I'd go ahead and do it that way.