I'm sure many of us know that Lua runs in a virtual machine via bytecode. You can get the bytecode for a function with string.dump(func). This is nice because you can compile Lua code into bytecode and save the bytecode instead of the source.
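To make that concrete, here is a small sketch of dumping a function and loading it back. The function name greet and the loadChunk fallback are just for illustration; the fallback is only there so the snippet runs on both Lua 5.1 (loadstring) and newer versions (load).

```lua
-- loadstring exists in Lua 5.1; newer versions use load instead
local loadChunk = loadstring or load

-- an illustrative function with no upvalues, so it dumps cleanly
local function greet()
  return "Hello from bytecode"
end

-- string.dump returns the compiled function as a binary string
local bytecode = string.dump(greet)

-- loading that string gives back a callable function
local reloaded = assert(loadChunk(bytecode))
print(reloaded())
```

The dumped string is what you would write to a file in place of the source.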
LASM is an assembly language for this bytecode. Its entire purpose is to allow direct programming of Lua bytecode. On the surface, this isn't so useful, but in theory it could be useful for creating new languages. It's much easier to target an assembly language in a compiler than it is to target Lua itself.
I'm not the first to create a LASM assembler for Lua, but I am the first to make one that's compatible with LuaJ and CC. And I think mine's pretty nice.
Here is a large writeup on the Lua bytecode specification as it stands today. The syntax used in their examples isn't exactly like my LASM implementation's syntax, but it's close.
For those of you that don't know, assembly languages are very nearly an exact representation of every byte in the bytecode. You will type out every instruction to the Lua VM.
-- helloworld.lasm
.stacksize 2
.const "Hello, World!" -- constant is at index 0
.const "print"
getglobal 0 1 -- load into register 0, the global with name from constant at index 1 ("print")
loadk 1 0 -- load into register 1, the constant at constant index 0 ("Hello, World!")
call 0 2 1 -- call the function at register 0, with 1 parameter, and keeping no returns.
This simple LASM program prints the well-known string "Hello, World!". First, we declare our constants. The order they're declared in determines the index they're at, so declaring the constant "Hello, World!" first puts it at index 0 (unlike Lua, the Lua VM usually works with 0 indexing). Then "print" is kept at constant index 1.
Next, we use the "getglobal" instruction to load the "print" global into register 0. Next we use "loadk" to put "Hello, World!" into register 1. Finally, we call register 0 (print), with one parameter (register 1, the "Hello, World!"), and keeping no return values. For more info on "call," read the writeup posted above.
So what are these "register" and "constant index" things? The Lua VM has four different stacks: the register stack, the constant stack, the upvalue stack, and the function prototype stack. The register stack is managed manually by the program. That's what .stacksize 2 is doing: it tells the Lua VM that we will use no more than 2 registers (0 and 1). We don't have to use all the registers we ask for, but we can't use more.
The constant stack is automatic. As you declare constants they get added to the stack. This is managed at compile time, not runtime, so there's no modifying it.
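If it helps to see registers and constants in motion, here is a toy model in plain Lua of the three instructions from helloworld.lasm. It is only a sketch and not how the real VM or my assembler is written: the run function, the table layout, and the stand-in print are all made up for illustration, and it ignores the special cases of call (such as operands of 0).

```lua
local unpack = table.unpack or unpack  -- table.unpack in Lua 5.2+, unpack in 5.1

-- Toy model only: executes getglobal, loadk and call over 0-indexed tables.
local function run(constants, code, globals)
  local registers = {}  -- keyed from 0 to match the VM's indexing
  for _, ins in ipairs(code) do
    local op, a, b, c = ins[1], ins[2], ins[3], ins[4]
    if op == "getglobal" then
      -- register a = global whose name is constant b
      registers[a] = globals[constants[b]]
    elseif op == "loadk" then
      -- register a = constant b
      registers[a] = constants[b]
    elseif op == "call" then
      -- b - 1 arguments sit in registers a+1 .. a+b-1; c - 1 results are kept
      local args = {}
      for i = 1, b - 1 do args[i] = registers[a + i] end
      local results = { registers[a](unpack(args, 1, b - 1)) }
      for i = 0, c - 2 do registers[a + i] = results[i + 1] end
    end
  end
  return registers
end

-- helloworld.lasm expressed as data, with a stand-in for print that records its input
local printed = {}
local globals = { print = function(s) printed[#printed + 1] = s end }
local constants = { [0] = "Hello, World!", [1] = "print" }
local registers = run(constants, {
  { "getglobal", 0, 1 },
  { "loadk", 1, 0 },
  { "call", 0, 2, 1 },
}, globals)
```

After the run, register 1 still holds the string and the stand-in print has been called once with it.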
The upvalue stack is a stack that you can reference to get data from registers (or other upvalues) of the function prototype that you are a child of.
The function prototype stack is pretty much the same as the constant stack, except the data held is the prototypes of functions. You see, when you declare a function in code, you're not writing magical function code to memory or anything. The bytecode has a special section where all your functions are written, and your code creates closures to use those functions (see closure instruction).
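In ordinary Lua terms, the split between prototypes, closures, and upvalues looks roughly like the sketch below. The names makeCounter and count are just illustrative. The inner function is written once (one prototype), but each call to makeCounter creates a separate closure with its own upvalue.

```lua
local function makeCounter()
  local count = 0        -- a local of makeCounter, so it lives in one of its registers
  return function()      -- one prototype, but a new closure is created on every call
    count = count + 1    -- the inner function reaches count as an upvalue
    return count
  end
end

local a = makeCounter()
local b = makeCounter()
print(a(), a(), b())     -- a and b count independently
```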
The writeup has a different way of doing some things in its examples. For example, when you declare a function, you don't follow it with four numbers like the writeup does. Number of parameters and upvalues are managed automatically by the compiler, and varargs and stacksize are handled by the programmer.
.function
.vararg 4
.stacksize 0
.end
.vararg defaults to 2, and .stacksize must be specified in every function. Locals and upvalues do require a string following them that acts as their name, just like in the writeup. But now they serve a purpose besides debugging data for the VM.
.stacksize
.local "a"
.upvalue "b"
getupval %a %b
Using the string names, we can reference their indexes in the register and upvalue stacks. Just a handy feature of the compiler.
Params and functions can optionally be named the same way, and constants can be declared inline by prefixing them with "&".
.stacksize 2
.function "a"
.stacksize 3 -- registers 0 (param "b"), 1 and 2 are all used below
.param "b" -- register index 0
getglobal 1 &"print"
move 2 %b
call 1 2 1
.end
.local "myClosure"
closure %myClosure %a
loadk 1 &"Hello, World!"
call %myClosure 2 1
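For the curious, here is roughly how a compiler could turn %name and &"text" operands into plain indexes. This is a hypothetical sketch, not the code inside LASM: newScope, declareLocal, internConstant, and resolve are names invented for the example, and reusing the index of a repeated constant is an assumption on my part.

```lua
-- Hypothetical operand resolver; none of these names come from the LASM source.
local function newScope()
  return { names = {}, nextRegister = 0,
           constants = {}, constantIndex = {}, nextConstant = 0 }
end

-- A named local or param takes the next free register index, starting at 0.
local function declareLocal(scope, name)
  local index = scope.nextRegister
  scope.names[name] = index
  scope.nextRegister = index + 1
  return index
end

-- An inline constant reuses its index if already seen, else it is appended.
local function internConstant(scope, value)
  local index = scope.constantIndex[value]
  if index == nil then
    index = scope.nextConstant
    scope.constants[index] = value
    scope.constantIndex[value] = index
    scope.nextConstant = index + 1
  end
  return index
end

-- Turn one operand token into a plain number.
local function resolve(scope, token)
  local name = token:match("^%%(.+)$")      -- %name refers to a declared name
  if name then
    return assert(scope.names[name], "unknown name: " .. name)
  end
  local text = token:match('^&"(.*)"$')     -- &"text" is an inline constant
  if text then
    return internConstant(scope, text)
  end
  return assert(tonumber(token), "bad operand: " .. token)
end

-- The body of function "a" from the example above
local scope = newScope()
declareLocal(scope, "b")                    -- .param "b" lands in register 0
local printIndex = resolve(scope, '&"print"')
local bIndex = resolve(scope, "%b")
```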
And that about does it. That's my language.
Download: pastebin get ZghTBkmh lasm
Usage:
lasm [in file] [out file]
There is one little requirement though. LASM was designed with my Project NewLife in mind (which has been updated and now includes LASM), so it doesn't have any way of automatically making CraftOS able to run Lua bytecode files. So either at startup or at least before you try to run the output file, run the following code somehow.
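function _G.loadfile(inFile)
  -- Read the file byte by byte in binary mode so the bytecode is not mangled
  local data = {}
  local file = assert(fs.open(inFile, "rb"))
  for i = 1, fs.getSize(inFile) do
    data[i] = string.char(file.read())
  end
  file.close()
  -- loadstring accepts bytecode as well as source; the file name becomes the chunk name
  return loadstring(table.concat(data), fs.getName(inFile))
end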
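Once a loadfile that understands bytecode is in place, running an output file could look something like this. The helper name runCompiled and the file name are made up for the example. The pcall wrappers are there because loadfile can either raise an error (the assert on fs.open above) or return nil plus a message.

```lua
-- Hypothetical helper: load a compiled file and run it, reporting failures
-- as false plus a message instead of crashing the shell.
local function runCompiled(path)
  local ok, chunk, err = pcall(loadfile, path)
  if not ok then return false, chunk end    -- loadfile itself raised an error
  if not chunk then return false, err end   -- the file was not a loadable chunk
  return pcall(chunk)                       -- run it, catching runtime errors
end

local ok, message = runCompiled("helloworld.out")  -- hypothetical output file name
if not ok then print(message) end
```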