2 posts
Posted 14 October 2013 - 12:12 AM
I just joined this forum because I thought I should report this bug, but it seems I need at least 3 approved posts before I can do it.
I'll still submit my bug report here in case anyone sees fit to fix it.
There is an error with the way textutils.serialize detects recursive references. The way it's done is that it makes sure that the same table does not exist twice anywhere within the tree structure, instead of the correct approach, which would be to make sure that same table does not exist twice inside the same branch.
This should be allowed, but is not
A
| \
B B
or in code
B = {}
A = {B, B}
This is (should be) absolutely fine, but textutils will throw an error.
Suggested fix:
http://pastie.org/8400569I added a shallowcopy method that does a simple key/value copy, and I modified the serializeImpl method in such a way that it passes copies of the tTracking list instead of the whole list, thus different branches don't interfere with each other.
8543 posts
Posted 14 October 2013 - 12:40 PM
Split into new topic.
The suggested fix takes a slightly too naïve approach. Unserializing the resulting table would result in A = {B, C} where tables B and C are identical (with the exception of any tables they contain, etc.), so a somewhat different approach would need to be taken. A change in approach capable of re-assembling tables with correct sets of matched references would probably be able to handle recursive tables correctly as well. I believe there is an example of such a serialization function on either the lua-users wiki, or in the PIL.
808 posts
Posted 14 October 2013 - 07:51 PM
Split into new topic. The suggested fix takes a slightly too naïve approach. Unserializing the resulting table would result in A = {B, C} where tables B and C are identical (with the exception if any tables they contain, etc.), so a somewhat different approach would need to be taken. A change in approach capable of re-assembling tables with correct sets of matched references would probably be able to handle recursive tables correctly as well. I believe there is an example of such a serialization function on either the lua-users wiki, or in the PIL.
The issue with that is that you can't reference other values in the table in the serialized table. So you can't have a serialized table that looks like
{[1]={},[2]=[1]}
You would need to take a code-like approach if you wanted it to be parsed by loadstring the way it currently is.
(function() local A = {} local B = {} A[1] = B A[2] = B return A end)()
Problem is, that would be disgusting to try to write the serializer for. I've got some ideas on how to do it but I'm not going to write them out.
8543 posts
Posted 14 October 2013 - 10:06 PM
That really wouldn't be all that bad to write a serializer for. Here's a very lightly tested serializer/unserializer (you'll note that it's very similar to the existing unserializer) that handles the A = {B, B} case correctly. It should handle most other cases correctly, though there are probably some corner cases I'm forgetting about. I'm not sure why that seems so disgusting to you. It is a little bigger than the existing serializer, of course.
function serialize(target, tracking, cursor)
local firstRun = false
local outString = ""
if not tracking then tracking = {[target] = "__locTable", size = 1} outString = "__locTable = {}\n" cursor = "__locTable" firstRun = true end
if not tracking[target] or tracking.size < 2 then
for k,v in pairs(target) do
if type(v) == "function" then error("Cannot serialize type: function!") end
local keyName = ""
if type(k) == "string" then
keyName = '["' .. k .. '"]'
else
keyName = "[" .. k .. "]"
end
outString = outString..cursor..keyName.." = "
if type(v) == "table" then
if not tracking[v] then
outString = outString .. "{}\n"
tracking[v] = cursor..keyName
tracking.size = tracking.size + 1
local insertString = serialize(v, tracking, cursor..keyName)
outString = outString..insertString
else
outString = outString..tracking[v]
end
elseif type(v) == "string" then
outString = outString..string.format("%q", v).."\n"
else
outString = outString..tostring(v).."\n"
end
end
if firstRun then outString = outString.."return __locTable\n" end
return outString
end
return ""
end
function unserialize(str)
local func, e = loadstring(str)
if not func then
return str
else
setfenv(func, {})
return func()
end
end
2 posts
Posted 14 October 2013 - 10:47 PM
I agree that my method is pretty naive, but I didn't want to spend a long time coding a better approach when I didn't know if my post would even get any responses.
If you want to, and think it's worth the effort to write and test a more solid approach that'd be awesome. I'm not sure how easy it is to submit bug fixes to CC, but seeing as how you're an admin with over 4k posts I'm thinking you'll probably have 100x chance that I have.
8543 posts
Posted 15 October 2013 - 12:30 AM
For reference, I used the word naïve in a
specific coding context. Just wanted to ensure that was understood as I meant it.
The developers to read through this section and the Suggestions section on occasion. I'd imagine that they're probably satisfied with the current state of the serialization function. It accomplishes good, compact serialization that successfully covers a large majority of the use cases. There are other serialization functions floating around on the forums that are available for use, and I believe some even claim to properly handle functions in tables. If the developers wanted to change out the serialization function, they've got a wide range to choose from. One major downside is that any change that requires a different unserialization method would break all of the existing serialized tables. For those reasons (easy availability of alternatives for those who want them and breaking changes to switch to a more feature-rich method), I find it unlikely that they would choose to incorporate a different serialization method into the ROM.
7508 posts
Location
Australia
Posted 15 October 2013 - 01:56 AM
Also as another point, I have not seen one single Lua-side bug fix, code improvement, code replacement or anything of the such in all the time I've been using ComputerCraft… I've seen more programs added, I've seen more bugs added, I've never seen any of said bugs resolved or fixed. The reason for this is because there is a system in place which easily allows us, the users, to fix the bugs ourselves and use the better versions that we've written… It also gives us inspiration and motivation for writing programs that the community would like as they can address pain points of the general user. Lastly, I would much prefer that dan and Cloudy spend more time working on adding new features to the mod and squishing the Java-side bugs, than attempting to fix or improve something that I can do myself on the computer.
808 posts
Posted 15 October 2013 - 04:59 AM
That really wouldn't be all that bad to write a serializer for. Here's a very lightly tested serializer/unserializer (you'll note that it's very similar to the existing unserializer) that handles the A = {B, B} case correctly. It should handle most other cases correctly, though there are probably some corner cases I'm forgetting about. I'm not sure why that seems so disgusting to you. It is a little bigger than the existing serializer, of course.
I just tried your code in a lua shell, it does not work.
> x = {}
> a = {}
> a.b = {}
> a.c = {}
> a.b.d = x
> a.c.d = x
> s = serialize(a)
> = s
__locTable = {}
["c"] = {}
["b"] = {}
> -- wut?
That's not even all that complex of a table. And attempting to loadstring() that obviously errors with
[string "__locTable = {}..."]:2: unexpected symbol near '['
[/code]
321 posts
Location
Melbourne, Australia
Posted 15 October 2013 - 05:47 AM
Also as another point, I have not seen one single Lua-side bug fix, code improvement, code replacement or anything of the such in all the time I've been using ComputerCraft… I've seen more programs added, I've seen more bugs added, I've never seen any of said bugs resolved or fixed. The reason for this is because there is a system in place which easily allows us, the users, to fix the bugs ourselves and use the better versions that we've written… It also gives us inspiration and motivation for writing programs that the community would like as they can address pain points of the general user. Lastly, I would much prefer that dan and Cloudy spend more time working on adding new features to the mod and squishing the Java-side bugs, than attempting to fix or improve something that I can do myself on the computer.
http://www.computercraft.info/forums2/index.php?/topic/10921-typo-in-adventure/?
808 posts
Posted 15 October 2013 - 06:25 AM
Also as another point, I have not seen one single Lua-side bug fix, code improvement, code replacement or anything of the such in all the time I've been using ComputerCraft… I've seen more programs added, I've seen more bugs added, I've never seen any of said bugs resolved or fixed. The reason for this is because there is a system in place which easily allows us, the users, to fix the bugs ourselves and use the better versions that we've written… It also gives us inspiration and motivation for writing programs that the community would like as they can address pain points of the general user. Lastly, I would much prefer that dan and Cloudy spend more time working on adding new features to the mod and squishing the Java-side bugs, than attempting to fix or improve something that I can do myself on the computer.
Yea bit. I don't really think you're right on this. For example, I pointed out a lua-side rednet bug where the side couldn't be a modem attached via peripheral proxy from OpenPerihperal, and Dan came along and fixed that. There's been a number of lua fixes. It'd be stupid to leave a bunch of code as bug-ridden just because you want other people to fix your bugs for you.
8543 posts
Posted 15 October 2013 - 10:20 AM
Hah! I copied the wrong version! Let me see if I can fix it real quick, sorry.
Edit: there, fixed the code I posted above. It should work now. Sorry about my miscopying. That's what I get for having it open in multiple windows. If it still doesn't work, I'll copy over the correct version of it when I get home from work.
808 posts
Posted 15 October 2013 - 01:27 PM
Hah! I copied the wrong version! Let me see if I can fix it real quick, sorry.
Edit: there, fixed the code I posted above. It should work now. Sorry about my miscopying. That's what I get for having it open in multiple windows. If it still doesn't work, I'll copy over the correct version of it when I get home from work.
Still not right. The d fields in b and c don't get set to a common table, or at all.
> x = {}
> a = {}
> a.b = {}
> a.c = {}
> a.b.d = x
> a.c.d = x
> s = serialize(a)
> =s
__locTable = {}
__locTable["c"] = {}
__locTable["b"] = {}
return __locTable
> --wut?
8543 posts
Posted 15 October 2013 - 01:44 PM
Huh. That's odd. I'll have to drag out the working version in a few hours when I get home. Clearly, trying to make from memory the fixes I had figured out last night isn't working. I'll be sure to run the test cases you've posted before I upload it, but I believe both of them are fine in the actual working version. Sigh.
808 posts
Posted 15 October 2013 - 01:50 PM
Huh. That's odd. I'll have to drag out the working version in a few hours when I get home. Clearly, trying to make from memory the fixes I had figured out last night isn't working. I'll be sure to run the test cases you've posted before I upload it, but I believe both of them are fine in the actual working version. Sigh.
I'm working on my own version of this right now. The way I see it, it should be done in two passes. One to identify all values, and one to write them in a format. The result would look something like:
> x = {}
> a = {}
> a.b = {}
> a.c = {}
> a.b.d = x
> a.c.d = x
> = serialize(a)
local c1={}
local c2={d=c1}
local c3={d=c1}
local c4={b=c2,c=c3}
return c4
That result would be a well optimized version and I'm not sure I'll be willing to go that far with it, but I think the arbitrary cx values is the way to go.
808 posts
Posted 15 October 2013 - 02:50 PM
Ok this worked perfectly. The serialized text isn't the smallest but it works perfectly, even for recursively stored tables
Spoiler
function serialize(v)
-- two pass
-- pass 1: identify constants
-- pass 2: write constants to format
local function find(t, v)
for k,v2 in pairs(t) do
if v2 == v then return k end
end
end
local function contains(t, v)
for k,v2 in pairs(t) do
if v2 == v then return true end
end
return false
end
local function pass1(v, track)
if type(v) == "table" then
if not contains(track, v) then
table.insert(track, v)
for k,v in pairs(v) do
pass1(v, track)
end
end
end
end
local function strForVal(v, track)
local t = type(v)
if t == "string" then
return ("%q"):format(v)
elseif t == "table" then
return "c"..find(track, v)
elseif t == "number" or t == "boolean" or t == "nil" then
return tostring(v)
else
error("Invalid type: " .. t)
end
end
local function format(v, track)
local str = ""
if type(v) ~= "table" then return str end
for i,v in ipairs(track) do
for k,v in pairs(v) do
str = str .. "c"..i.."["..strForVal(k,track).."]=" .. strForVal(v,track).."\n"
end
end
return str
end
local function pass2(v, track)
local str = ""
for i,v in ipairs(track) do
str = str .. "local c"..i.."={}\n"
end
str = str .. format(v, track)
str = str .. "return "..strForVal(v, track)
return str
end
local track = {}
pass1(v, track)
return pass2(v, track)
end
function unserialize(s)
local f,e = loadstring(s)
if f then
setfenv(f, {})
return f()
end
end
This worked:
> x = {}
> a = {}
> a.b = {}
> a.c = {}
> a.b.d = x
> a.c.d = x
> x.a = a
> = serialize(a)
local c1={}
local c2={}
local c3={}
local c4={}
c1["c"]=c2
c1["b"]=c4
c2["d"]=c3
c3["a"]=c1
c4["d"]=c3
return c1
> a1 = unserialize(serialize(a))
> = a1.b.d.a == a1
true
8543 posts
Posted 16 October 2013 - 12:16 AM
Here, dug it up:
function serialize(target, tracking, cursor)
local firstRun = false
local outString = ""
if not tracking then tracking = {[target] = "__locTable", size = 1} outString = "local __locTable = {}\n" cursor = "__locTable" firstRun = true end
for k,v in pairs(target) do
if type(v) == "function" then error("Cannot serialize type: function!") end
local keyName = ""
if type(k) == "string" then
keyName = '[' ..string.format("%q", k).. ']'
else
keyName = "[" .. k .. "]"
end
outString = outString..cursor..keyName.." = "
if type(v) == "table" then
if not tracking[v] then
outString = outString .. "{}\n"
tracking[v] = cursor..keyName
tracking.size = tracking.size + 1
local insertString = serialize(v, tracking, cursor..keyName)
outString = outString..insertString
else
outString = outString..tracking[v].."\n"
end
elseif type(v) == "string" then
outString = outString..string.format("%q", v).."\n"
else
outString = outString..tostring(v).."\n"
end
end
if firstRun then
outString = outString.."return __locTable"
end
return outString
end
I didn't actually test the unserializer I posted above, but I don't see any reason it wouldn't work properly. Anyway, that should work, and it seems to be a fair bit more compact than the version you wrote above. An interesting exercise nonetheless!
808 posts
Posted 16 October 2013 - 12:39 AM
Here, dug it up:
That one's pretty good. Not to keep pointing out problems, but some still exist in there =P String keys aren't formatted via %q so there's potential syntax errors if the key contains new lines. Also it doesn't allow tables as keys.
8543 posts
Posted 16 October 2013 - 12:45 AM
True, the tables as keys is indeed a problem. I've no idea what on earth I was doing last night when I was working on this. The version I copied over originally did have the string.format, but this version only has the block quotes. Ugh. Tables as keys would indeed not be handled by this one. Gonna paste the string.format fix into my post above for reference, but I've no particular interest in getting tables as keys going on with it.
Edit: Oh, keys. Yeah. Fixed that too. Now I'm done. :P/>