This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

Transparent recursive filesystem persistence API

Started by kazagistar, 05 June 2012 - 06:05 AM
kazagistar #1
Posted 05 June 2012 - 08:05 AM
Download

Overview:
So, I decided to try my hand at coding up a persistence API, with all the bells and whistles, to hopefully make it the API of choice for the very few people who don't decide to make one from scratch :) This API does NOT go in the apis folder. You just call shell.run("path/to/persistence") from wherever you depend on it, and it creates a table "persistence" that holds all the functions, so you don't have to muck around with registering it or whatnot.
It uses metatables to stay simple and minimalistic while preserving data types (number, boolean, string, table). It translates nested tables into nested folders. Essentially, you can create a "persistence table" by calling persistence.new("folder"), and any values you store to that table are stored to the folder, and any values you read from the table are read from the folder, automatically. You can easily read through the folder and see all the data and data types, and even edit it manually (easy, but not recommended, no real point). Also, any tables you read out from a persistence table will also be persistence tables. There are also a number of utility functions included to make the persistence tables easier to work with, since they don't function QUITE like normal tables.

Example:
To store the initial position of a turtle, I can do the following:

data = persistence.new("datastore")
data.location = {x=10, y=-432,z=0,o=2}
Now I can restart the turtle and the data will still be there. To access any of it, I can do anything like the following examples:

data = persistence.new("datastore")
print(data.location.x)

data = persistence.new("datastore")
data["location"]["y"] = 49

location = persistence.new("datastore/location")
location.z = location.z + 1
It should be fairly straightforward, but if you have any questions, feel free to ask!

Utility functions:
  • persistence.path(ptable): returns the filesystem path of the given persistence table's folder
  • persistence.list(ptable): returns a numbered list of keys in the table
  • persistence.pairs(ptable): The pairs() function is broken for persistence tables so use this as a drop in replacement.
  • persistence.table(ptable): returns the table in the native, NON persistent format, for when you want the contained variables to NOT all be backed up to the filesystem.
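To see why pairs() comes up empty on a table like this, here is a minimal sketch, assuming a persistence table is an empty proxy whose reads and writes are redirected by a metatable. This is not the API's actual code: new_proxy, list and proxy_pairs are hypothetical names, a plain in-memory table stands in for the folder on disk, and keys are assumed to be strings.

```lua
-- Hypothetical stand-in: maps each proxy to the plain table that plays
-- the role of its folder. Weak keys let unused proxies be collected.
local backing = setmetatable({}, { __mode = "k" })

local function new_proxy(initial)
  local proxy = {}
  backing[proxy] = initial or {}
  return setmetatable(proxy, {
    -- reads and writes never touch the proxy itself
    __index = function(self, key) return backing[self][key] end,
    __newindex = function(self, key, value) backing[self][key] = value end,
  })
end

-- Returns a sorted, numbered list of the keys (string keys assumed).
local function list(proxy)
  local keys = {}
  for key in pairs(backing[proxy]) do keys[#keys + 1] = key end
  table.sort(keys)
  return keys
end

-- Replacement iterator: walks the key list and reads through the proxy.
local function proxy_pairs(proxy)
  local keys, i = list(proxy), 0
  return function()
    i = i + 1
    local key = keys[i]
    if key ~= nil then return key, proxy[key] end
  end
end

local p = new_proxy()
p.x = 1
p.y = 2

local raw_count = 0
for _ in pairs(p) do raw_count = raw_count + 1 end  -- the proxy holds no fields

local seen = {}
for key, value in proxy_pairs(p) do seen[key] = value end
```

The plain pairs() loop finds nothing because every value lives in the backing store, which is why a dedicated iterator is needed.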
Final notes:
I plan to add a couple more features for custom data type storage (I have some of the hooks in already) but I will get to that as I need it. For now, if you have a table with a metatable (for those of you who know what that is) it will just be saved like any other table by iterating through the pairs(), stripping the metatable information.
Please post your comments, suggestions, and criticisms!
Oddstr13 #2
Posted 29 June 2012 - 04:27 PM
This works just as I expected.
The fact that I can specify a directory to store a table in is awesome!

local data = persistence.new("/disk/data")
if data.number_of_runs == nil then data.number_of_runs = 0 end
data.number_of_runs = data.number_of_runs + 1 -- count this run
print("This is run number " .. tostring(data.number_of_runs))


There's just one thing to say: Thank you very much!


Best Regards
Oddstr13
delkrak #3
Posted 29 July 2012 - 05:18 PM
This is ingeniously simple!!

However.. I don't get the mechanism that saves my variables. Example:

data_old = persistence.new("datastore")
data.location = {x=10, y=-432,z=0,o=2} -- no store call
data_new = persistence.new("datastore")  -- i only see a call to setmetatable, no further loading
print(data.x)

I don't get how this could remotely work.. Could you please explain this magic? :ph34r:
Vendan #4
Posted 01 August 2012 - 10:49 PM
It works by not really being a normal table.

When you do
persistence.new("blah")
it returns a table with a metatable attached. The metatable tells Lua that there is something special about the table, and that certain actions are handled by the metatable instead of by the normal Lua functions.

Specifically, when you do a

data.location = {x=10, y=-432,z=0,o=2}
It actually doesn't put it in the table, it calls

put(data, "location", {x=10, y=-432,z=0,o=2})
and that writes the information out to the file system.

Retrieving info happens the same way, just in reverse.

print(data.x)
actually looks (to the computer) more like

print(get(data, "x"))
and that pulls it from the file system. Note that nothing was saved under "x" directly, so it wouldn't print anything, but you could do

print(data.location.x)
which the computer would translate into

print(get(get(data,"location"), "x"))

It's kinda complex, but a nice demonstration of Lua's power
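For anyone who wants to poke at the idea outside the game, here is a rough sketch of the mechanism described above, assuming the __index and __newindex metamethods do the redirecting. It is not the API's real source: the files table is a hypothetical in-memory stand-in for the disk, put and get take a path rather than a table, and keys are assumed to be strings.

```lua
local files = {}  -- hypothetical stand-in for the disk: path -> value
local DIR = {}    -- sentinel value marking a path as a folder

local new         -- forward declaration, defined below

-- Write a value at a path. Tables become a folder with one entry per field.
local function put(path, value)
  -- first remove whatever was stored at this path or below it
  local prefix = path .. "/"
  for stored in pairs(files) do
    if stored == path or stored:sub(1, #prefix) == prefix then
      files[stored] = nil
    end
  end
  if type(value) == "table" then
    files[path] = DIR
    for key, field in pairs(value) do put(prefix .. key, field) end
  else
    files[path] = value
  end
end

-- Read a value from a path. Folders come back as new proxy tables.
local function get(path)
  local value = files[path]
  if value == DIR then return new(path) end
  return value
end

function new(path)
  -- the proxy stays empty, so every read and write hits the metatable
  return setmetatable({}, {
    __index = function(_, key) return get(path .. "/" .. key) end,
    __newindex = function(_, key, value) put(path .. "/" .. key, value) end,
  })
end

local data = new("datastore")
data.location = { x = 10, y = -432, z = 0, o = 2 }  -- no explicit store call

local reopened = new("datastore")                   -- no explicit load call
print(reopened.location.x)                          --> 10
```

Because the proxy never holds any fields itself, a second proxy for the same path sees the same data, which is what makes the "no store call, no load call" example work.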
natedogith1 #5
Posted 26 August 2012 - 07:20 PM
You can turn a function into a string with "string.dump(function)" and reload it with "loadstring(string)", though you might have issues with embedded nulls.
kazagistar #6
Posted 06 September 2012 - 10:21 PM
Just FYI, the problem with functions is that, in my experience, the main use of storing a function in a table is that it is a closure of some kind, and closures over local variables get cleared when saved and restored.
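A small sketch of that limitation, using only string.dump and loadstring (or load on newer Lua versions, which is what the load_chunk alias below assumes). make_counter and double are made-up names for illustration.

```lua
-- Lua 5.1 provides loadstring; Lua 5.2 and later use load for strings.
local load_chunk = loadstring or load

-- A function with no upvalues survives the round trip.
local function double(n) return n * 2 end
local double_copy = load_chunk(string.dump(double))

-- A closure keeps its state in the upvalue "count".
local function make_counter()
  local count = 0
  return function()
    count = count + 1
    return count
  end
end

local counter = make_counter()
counter()
counter()  -- count is now 2 inside the closure

-- Only the code is dumped, not the captured value of count.
local restored = load_chunk(string.dump(counter))
local ok = pcall(restored)  -- fails: the restored copy has lost count
```

The plain function comes back working, while the restored counter errors out instead of continuing from 2, so persisting closures this way does not preserve their state.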
CoolisTheName007 #7
Posted 24 October 2012 - 07:41 PM
What's the advantage of this API in comparison with the serialize() function?
Orwell #8
Posted 25 October 2012 - 08:11 PM
You don't ever need to worry about explicitly storing the variables to disk. Because of operator overloading (sort of) it just acts like a regular variable. So you create them once, and then you just use them as any normal variable. A main advantage would be converting an existing program to use persistent variables by simply changing the definition of the variables and nothing more.
CoolisTheName007 #9
Posted 28 October 2012 - 09:04 PM
Ah, I feel dumb for not having seen it.
Btw, kazagistar, it would be great if you could implement this serialize API from immibis:
http://www.computerc...-serialization/

And someone named O'Reillys used it in a persistence API, but it lacks the simplicity of the operator overloading whossname:
http://www.computercraft.info/forums2/index.php?/topic/1131-13-oreillys-persistent-variables-tables-encryption-namespaces/page__hl__persistent__fromsearch__1
kazagistar #10
Posted 31 October 2012 - 03:55 PM
Yeah. I was kinda gone for more than a month… I will update this to a newer version soon, since there seems to be renewed interest around the programs I created way back then. Plus the compliments stroke my ego, so I really have no choice. :P

EDIT: Ugh, I went back to the code, poked around for an hour or two, and remembered all the horrible ugly bits. I feel like I should document all the nasty problems with recursive filesystem persistence for posterity.

1) Non-string keys. Only string keys can be used trivially… number and boolean keys could be added by storing a "type" alongside each key, but that would be annoying at best, and tables, threads, functions and so on, used as keys, would be absurd. Personally, I only accepted strings as keys. Wanna persist a numbered list? Tough luck.

2) Buffering. If you read from file every time you use a variable, like I did in my first version here, there is no synchronization problem, but this is suboptimal if you wish to write once and read often, such as with a settings file. This can be solved by having the file system read into table buffers just once. These buffers have to be stored centrally by the library, so that multiple references access the same buffer, to keep them consistent. This is a solvable problem for sure, but does add a layer of complexity.

3) Table writing. This has a dozen subproblems, each of which adds a large chunk of complexity. When you store a table in a persistence table, the following have to be dealt with:

a) Normal tables get stored by reference, but here, we basically have to store by value, copying each value. When it is this transparent, it could be confusing.

b) Assume t1 is a persistence table, and we call "t1.t2 = t1.t2.t3" or something similar. Normally, when a persistence table is overwritten, we delete the table first, and then write the data, but in this case, that would delete the data to be written. Thus, you need to detect that you are using persistence tables, create a copy of t3, and then delete t2 and write out the stored t3. Alternatively, you could create some kind of beastly state sync system.

c) Similar to b, but now we do "t1.t2.t3 = t1.t2"… if the others are solved, this should be easy enough… you have to just copy the current state of t2 to t3, minus the modifications you are currently making.
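Cases b) and c) can be illustrated with a rough sketch, assuming plain nested tables stand in for folders and that deleting a folder destroys everything below it. All four helpers are hypothetical names, not part of the API, and the copy does not handle cyclic tables.

```lua
-- Copies by value, which is also what case a) amounts to.
local function deep_copy(value)
  if type(value) ~= "table" then return value end
  local copy = {}
  for key, field in pairs(value) do copy[key] = deep_copy(field) end
  return copy
end

-- Mimics deleting a folder: everything below it is destroyed too.
local function delete_tree(tbl)
  for key, field in pairs(tbl) do
    if type(field) == "table" then delete_tree(field) end
    tbl[key] = nil
  end
end

-- Delete first, then write: the source may already be gone.
local function naive_assign(parent, key, value)
  if type(parent[key]) == "table" then delete_tree(parent[key]) end
  parent[key] = deep_copy(value)
end

-- Snapshot first, then delete, then write the snapshot.
local function safe_assign(parent, key, value)
  local snapshot = deep_copy(value)
  if type(parent[key]) == "table" then delete_tree(parent[key]) end
  parent[key] = snapshot
end

local function fresh()
  return { t2 = { t3 = { a = 1 }, other = "x" } }
end

local broken = fresh()
naive_assign(broken, "t2", broken.t2.t3)  -- case b), t3 is wiped before the copy

local fixed = fresh()
safe_assign(fixed, "t2", fixed.t2.t3)     -- case b), t3 survives as the new t2

local nested = fresh()
safe_assign(nested.t2, "t3", nested.t2)   -- case c), t2 is copied into its own t3
```

With the naive order the new t2 ends up empty, while taking the snapshot first gives the result a user would expect from ordinary tables.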

I am mostly not sure if there is any use case for transparent recursive filesystem persistence. For things like settings files, flat files with buffering should be enough, no folders needed. If it is changed often, the performance will be hideous, and manual saving is probably better.
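The central buffering idea from point 2 might look roughly like this. It is only a sketch under assumptions: read_from_disk is a hypothetical stand-in for the real file read, the settings values are invented, and flushing changes back to disk is left out.

```lua
local disk_reads = 0

-- Hypothetical stand-in for reading and parsing a settings file.
local function read_from_disk(path)
  disk_reads = disk_reads + 1
  return { volume = 5 }
end

-- One buffer per path, owned by the library, shared by every caller.
local buffers = {}

local function open(path)
  local buffer = buffers[path]
  if not buffer then
    buffer = read_from_disk(path)  -- only the first open touches the disk
    buffers[path] = buffer
  end
  return buffer
end

local a = open("settings")
local b = open("settings")
a.volume = 7  -- visible through b as well, since both share one buffer
```

Because both references point at the same buffer they cannot drift apart, and repeated reads cost nothing after the first one.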