This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

Optimization and micro-optimization

Started by HDeffo, 05 September 2015 - 07:19 PM
HDeffo #1
Posted 05 September 2015 - 09:19 PM
Introduction
I have spent over a month testing various code styles and formats in ComputerCraft with a profiler, as well as reading Lua optimization guides. This is my quick and simple guide for the ComputerCraft community on getting the most out of their programs.

Some of these methods offer a 30-90% increase in speed over the conventional methods used in most programs, while others shave only a minuscule amount off the run time and may not be worth the extra effort it takes to implement them. Anything that gives more than a 20% increase in average speed I will mark as "MAJOR" so you know which methods to look at first. If a tip is marked "WEAK", using it will usually be more hassle than it is worth, and if implemented wrong it could even cause a decrease in speed.

Before beginning optimization, first ask yourself: does my program even need to be optimized? Yes, we all want an insanely blazing-fast program, but not all programs are time critical. A frame buffer should always be as fast as possible, because it is relied on for its speed. A door password, on the other hand, doesn't need to be blazing fast as long as its speed is reasonable. After asking yourself whether a program needs optimizing, you also need to ask: can I make something faster even before applying any optimization tricks? Many times you will gain a much larger speed difference simply by rewriting a program in a more resource-friendly way than you will by changing how you write and use those resources with optimization hacks. If you have passed those two questions and still decided your program is taking too long to do what you want, please follow this guide to push that extra bit out of your code.


This guide is divided into four parts.
The good practice section contains tips that either increase readability while having a major effect on speed, or have such a massive effect on speed that it completely outweighs the impact on readability. These tips should always be followed whenever you are coding anything they apply to.
The optimization section features tips that can give a decent boost in speed while sacrificing some readability in return. These should only be followed if you feel speed is crucial to your program.
The micro-optimization section contains tips that only offer a minor impact on speed, and that will usually hurt readability and often functionality as well. They are left here more for academic purposes, in case anyone ever does have a reason to micro-optimize. Remember the first law of micro-optimization: Don't do it.
The misinformation section contains tips that were at one time helpful for optimization but now have no impact or a negative impact, as well as tips that are misinformed or are just bad practices people tend to pick up when trying to improve their code. This is essentially a section of things you SHOULD NOT DO.


It should also be noted that this guide ONLY applies to Lua in ComputerCraft, in conjunction with the latest version. Outdated versions, other Lua platforms, and ComputerCraft emulators may not benefit in the same way from these tips. If you want to apply them on a platform outside of Minecraft ComputerCraft, please test them before assuming they still apply.


Good Practice
MAJOR: use local variables

local foo = "Hello World!"
print(foo)
is faster than

foo = "Hello World"
print(foo)
This also applies to global functions and library functions such as math.sin. You can store them in a local to gain a large (roughly 30%) speed increase in your program.


--#this is slower
local foo = os.time()
for i=1,100000 do
   x = math.sin(i)
end
print(os.time()-foo)


sleep(0.0)

--#this is faster
local sin = math.sin
foo = os.time()
for i=1,100000 do
   x = sin(i)
end
print(os.time()-foo)
Why? When Lua accesses a local variable, the compiled bytecode only needs a single instruction, because locals live directly in registers. If our Lua code was

a = a + b
if a and b are both local, and assuming they sit in registers 0 and 1 respectively, then Lua runs

ADD 0 0 1
however if a and b are both global values this becomes

GETGLOBAL 0 0
GETGLOBAL 1 1
ADD 0 0 1
SETGLOBAL 0 0
it is easy to see how this would build up very quickly on larger programs.

MAJOR: pre-index tables
I consider this the most important optimization tip in this guide, because I have never seen a program take this step and it offers one of the most beneficial increases in speed (I recorded up to a 90% speed increase in my profiler). Essentially, you should create a table with the number of indexes you expect it to have, or as close to that as possible.

--# this is slower
local foo = {}
for i=1,5 do
   foo[i] = i
end
--#than this
local bar = {true,true,true,true,true}
for i=1,5 do
   bar[i] = i
end

Please note that if you expect to use non-numerical (or non-sequential) indexes, your pre-indexed table should be built a bit differently:


local foo = {[1] = true,[2] = true,[3] = true}

If you don't know the names of your non-numerical indexes in advance, you can use the above example: name the placeholders 1 to (size), then each time you add a single new index, first remove one of the existing placeholders (e.g. foo[5] = nil).
Why? Lua tables are built from arrays behind the scenes. A single Lua table consists of an array part and a hash part. Both of these require a preset size, and whenever a table increases in size both of them need to be rebuilt behind the scenes according to the new size. When you add a new numerical index in sequential order (1, 2, 3, 4…) this increases the size of the array. When you add a non-sequential or non-numerical index, e.g. ["foobar"] or [6], it increases the size of the hash. When you set a value in a table to nil, the sizes won't decrease until after you set a new value into the table, which is why the last example, removing a preset placeholder before every added value, still gives that speed increase.
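The placeholder-swapping idea for unknown keys can be sketched roughly as follows. This is only an illustration of the description above: the names SIZE, cache, and put are hypothetical, and whether it really avoids a rebuild depends on how the VM resizes its hash part, so profile it before relying on it.

```lua
-- Hypothetical sketch: a table pre-filled with SIZE placeholder keys.
local SIZE = 4
local cache = {[1] = true, [2] = true, [3] = true, [4] = true}
local nextPlaceholder = 1

-- Adds key/value, first freeing one placeholder (while any are left)
-- so the total number of entries stays the same.
local function put(key, value)
  if nextPlaceholder <= SIZE then
    cache[nextPlaceholder] = nil
    nextPlaceholder = nextPlaceholder + 1
  end
  cache[key] = value
end

put("alpha", 1)
put("beta", 2)
print(cache.alpha, cache.beta, cache[1], cache[3])
```

Once all placeholders are used up, put behaves like a normal assignment and the table grows as usual.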

string metatables

local s = "foobar"
string.len(s)--#slowest
s:len()--#faster
local len = string.len
len(s)--#fastest
Why? The string metatable takes fewer and less intensive lookups than a global function, but more than a local function.

buffer nested tables

foo.y.x = foo.y.x + 1 --#slower

local y = foo.y
y.x = y.x+1 --#faster
For anyone who wasn't aware, assigning an existing table to a new variable only creates a reference to it, meaning that changing y.x through our new variable also changes foo.y.x.
Why? Buffering a nested table means the machine has to do fewer lookups, since it has to do 1 lookup per table in the nest every time you index a nested value.
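A small sketch of the aliasing behaviour described above, using only standard Lua. The table contents are made up for illustration.

```lua
local foo = { y = { x = 1 } }

local y = foo.y            -- y refers to the same table as foo.y, nothing is copied
for _ = 1, 3 do
  y.x = y.x + 1            -- one table lookup per access instead of two
end

print(foo.y.x)             -- the change is visible through foo as well
print(rawequal(y, foo.y))  -- both names point at the very same table
```

Note that the alias goes stale if foo.y is later replaced with a different table, so keep it in a narrow scope.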

MAJOR: use internal buffers instead of iterators
Essentially, find workarounds that don't require you to use an iterator such as pairs(). Iterators add a LOT of extra function overhead and slow down a program.
Why? Most iterators in ComputerCraft Lua iterate through the entire table given every time, regardless of the output. So if you have a table of 10 items and iterate through all of them, it has to go through 100 items to return your 10.

use factorization

x*(y+z) --#faster
x*y + x*z --#slower
Why? Lua can't factor on its own, and a factored expression is simply less work for the machine (one multiplication instead of two).

Optimization
MAJOR: no variables in functions

local tbl = {"Hello","world!","I","am","me"}
function foo()
   for i=1,#tbl do
      print(tbl[i])
   end
end
is inherently faster than if the table were created inside the function. This only applies to functions that will be called more than once. An object-building function, for example, would usually only be run once per program, in which case there would be no noticeable speed difference.
Why? Every time a function with a variable defined in it is run, that variable is recreated and added to that function's local registers. Creating the variable takes a bit more time than simply looking up an existing local variable or upvalue.
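To make the comparison concrete, here is a minimal sketch of both variants side by side. The function names are made up for illustration; the only difference between them is where the table is created.

```lua
-- Built once when the file loads, then captured as an upvalue.
local GREETING = {"Hello", "world!"}

local function joinHoisted()
  return GREETING[1] .. " " .. GREETING[2]
end

local function joinInline()
  -- A fresh table is allocated on every single call.
  local greeting = {"Hello", "world!"}
  return greeting[1] .. " " .. greeting[2]
end

print(joinHoisted(), joinInline())
```

One design caveat: the hoisted table is shared between calls, so if a function modifies it, later calls see the modification. The inline version always starts from a clean table.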

MAJOR: don't put constants in loops
Same use and reasoning as above in "no variables in functions". The difference is that this one only applies to constants (variables the loop itself won't be changing) and doesn't have a catch 22 to it. Simply define your constants before the loop instead of inside it.
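A minimal sketch of hoisting a loop constant, assuming a made-up degrees-to-radians conversion as the workload. Both functions return the same values; only the position of the constant differs.

```lua
local function toRadiansInLoop(degrees)
  local out = {}
  for i = 1, #degrees do
    local scale = math.pi / 180   -- recomputed on every pass, never changes
    out[i] = degrees[i] * scale
  end
  return out
end

local function toRadiansHoisted(degrees)
  local scale = math.pi / 180     -- computed once, before the loop
  local out = {}
  for i = 1, #degrees do
    out[i] = degrees[i] * scale
  end
  return out
end

local a = toRadiansInLoop({0, 90, 180})
local b = toRadiansHoisted({0, 90, 180})
print(a[2], b[2])
```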

MAJOR: use table.concat to concat strings
Please note this tip is only major if your program uses a lot of string concats ("..") or concats very large strings. Otherwise it is pretty self-explanatory.
Why? In Lua strings are immutable while tables are not. This means that every time you concat a string, the machine must iterate through every character in the two strings and create a whole new string in memory. When you concat a table, the values are simply appended together to return the new string. Appending is faster because the size of the first string doesn't need to be taken into account when determining the time it takes to finish, while in a string concat the size of the first string still counts because it has to be remade along with the rest of the strings. On top of this, a string concat has to make a new string per concat in the series, which is where the major speed advantage comes from.

print("H".."e".."l".."l".."o".."!")--#creates 10 new strings to print Hello!

--[[#if you are confused
#"H" + "e" = 2
#"He" + "l" = 4
#"Hel" + "l" = 6
#"Hell" + "o" = 8
#"Hello" + "!" = 10
#]]

print(table.concat({"H","e","l","l","o","!"},""))--#creates 1 new string to print Hello!
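The case where this tends to matter most is building a string piece by piece in a loop. The sketch below uses hypothetical function names and only standard Lua; it shows the two approaches producing the same result, not a benchmark.

```lua
local function joinWithOperator(n)
  local s = ""
  for i = 1, n do
    s = s .. i .. ","          -- every pass builds a new, longer string
  end
  return s
end

local function joinWithTable(n)
  local parts = {}
  for i = 1, n do
    parts[i] = i .. ","        -- collect the small pieces first
  end
  return table.concat(parts)   -- build the final string in one go
end

print(joinWithOperator(5))
print(joinWithTable(5))
```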

use sequential numeric indexes in tables
Use 1,2,3,4,5… in your table indexes instead of ["foo"] or ["bar"] when possible.
Why? Array lookups are a small bit faster than hash lookups.

Don't use math.max or math.min
Instead of max or min use

x = foo>bar and foo or bar --#max
x = foo<bar and foo or bar --#min
Why? max and min both add extra function call overhead that is not needed.
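As a rough illustration, the and/or idiom can be inlined in a loop like this (the values table is made up). It is safe for numbers because every number, including 0, counts as true in Lua; it would break if the operands could be false or nil.

```lua
local values = {3, 9, 0, -4, 7}
local largest, smallest = values[1], values[1]

for i = 2, #values do
  local v = values[i]
  largest = v > largest and v or largest      -- inline max
  smallest = v < smallest and v or smallest   -- inline min
end

print(largest, smallest)
```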

use or when handling nil evaluation
There is a slight speed gain in putting nil checks in an or expression.

if not foo then bar = "empty" else bar = foo end--#slower

bar = foo or "empty"--#faster
Why? if statements have a little bit more overhead when evaluating a nil value compared to or.
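One thing to keep in mind with this idiom: or treats false the same way as nil. A small sketch with hypothetical helper names, showing where the short form is fine and where an explicit nil check is still needed:

```lua
-- Short form: replaces both nil and false with the default.
local function withDefault(value)
  return value or "empty"
end

-- Explicit form: only replaces nil, so false survives.
local function withDefaultStrict(value)
  if value == nil then return "empty" end
  return value
end

print(withDefault(nil), withDefault("x"), withDefault(false))
print(withDefaultStrict(nil), withDefaultStrict(false))
```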

Micro-optimization
WEAK: avoid upvalues
While still faster than a global variable, an upvalue (or enclosed local variable) is a small bit slower than a local variable. An example of an upvalue is below.

local foo = 6
local bar = 5
function getfunction()
   return function(name)
      return foo+bar
   end
end
This can be a catch 22. You should avoid creating functions that use any type of outside variable if possible (it usually isn't). If you instead move the local variable into the function, this will cause a decrease in speed larger than using the upvalue. This is explained in the "no variables in functions" tip above.
Why? Simply put, an upvalue merely references the existing local variable instead of creating a new one. This gives the machine one extra step in order to find a value, but this step runs very fast so the difference isn't much.

multiply, don't divide

x*.5--#faster

x/2--#slower
Why? The Lua VM is better at multiplication than division.

put all non variable math first

(1+2*3)+x --#faster
x+1+2*3 --#slower
Why? Lua has to look up the value of a variable every time it does a math operation on that variable, then sets the value and continues to the next operation in the series. Making the variable the last item handled, or as far back as it can be, means fewer lookups and sets.

avoid loadstring

local foo = loadstring("return true")--#slower
local foo = function() return true end--#faster
Self-explanatory.
Why? loadstring triggers the Lua compiler, which is a more intensive effort than simply creating a new function.
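When the code really does arrive as a string, the compile cost can at least be paid only once. A minimal sketch, assuming a made-up source string; the first line is only there so the snippet also runs on newer Lua versions, where loadstring was folded into load.

```lua
local compile = loadstring or load   -- loadstring in Lua 5.1 / ComputerCraft
local source = "return 1 + 1"        -- illustrative source string

local compiled = compile(source)     -- compile once, outside the loop

local total = 0
for _ = 1, 3 do
  total = total + compiled()         -- reuse the already compiled function
end

print(total)
```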

WEAK: use for instead of while
for loops are a very, very small bit faster than while loops. Probably not worth the loss in readability if the program didn't need a for statement.
Why? The Lua virtual machine has very specific instructions for for statements. The way these were written happens to be just slightly more micro-optimized than the instructions for a while statement.

avoid table.insert

foo[#foo+1] = 5--#faster
table.insert(foo,5)--#slower
Why? table.insert adds the extra overhead of a function call, slowing down the program.

avoid using the assert function
This function is more intensive than a simple if statement.
Why? An if statement skips everything inside it when its condition is false. assert is an ordinary function call, so its arguments (including any error message you build) are still evaluated every time, whether or not the check passes.
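The difference is easiest to see when the error message is expensive to build. In this sketch, describe is a hypothetical helper that counts how often it is called; both checks pass, yet only the assert version pays for the message.

```lua
local builds = 0

-- Stand-in for an expensive message builder; counts its own calls.
local function describe(v)
  builds = builds + 1
  return "unexpected value: " .. tostring(v)
end

local function checkWithAssert(v)
  assert(v ~= nil, describe(v))   -- describe() runs even though the check passes
  return v
end

local function checkWithIf(v)
  if v == nil then error(describe(v)) end   -- describe() runs only on failure
  return v
end

checkWithAssert(1)
local afterAssert = builds
checkWithIf(1)
local afterIf = builds
print(afterAssert, afterIf)
```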

MAJOR: avoid unpack
Although it is more writing, if you know the total number of returned values, use foo[1],foo[2],foo[3],foo[4]… instead of unpack(foo).
Why? unpack adds extra function overhead and slows the call down by 50%.
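A short sketch of the two call styles, with a made-up three-element table and helper. The first line only exists so the snippet also runs on newer Lua versions, where unpack lives in the table library.

```lua
local unpack = unpack or table.unpack   -- plain unpack in Lua 5.1 / ComputerCraft

local point = {10, 20, 30}

local function sum3(a, b, c)
  return a + b + c
end

local viaUnpack = sum3(unpack(point))                 -- extra function call
local viaIndex = sum3(point[1], point[2], point[3])   -- direct indexing

print(viaUnpack, viaIndex)
```

Direct indexing only works when the number of values is known in advance; for variable-length tables unpack is still the tool to use.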

use multiplication instead of exponents
x^2 is slower than x*x
Why? The machine is not 'smart' enough to simplify the expression down into x*x itself.

MAJOR: avoid pairs() and ipairs()
The old-fashioned i=1,x or i=1,#tbl loop is twice as fast as pairs() and ipairs(). Even though you can't always replace them, it is best to do so whenever possible.
Why? pairs() and ipairs() add the overhead of several extra functions, not just their own.
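For a plain sequence the two loop styles are interchangeable, as this small sketch shows (the items table is made up). The numeric form relies on the table having no holes, and tables with string keys still need pairs().

```lua
local items = {"a", "b", "c"}
local viaIpairs, viaNumeric = {}, {}

for i, v in ipairs(items) do       -- iterator call on every step
  viaIpairs[i] = v:upper()
end

for i = 1, #items do               -- plain numeric loop, no iterator function
  viaNumeric[i] = items[i]:upper()
end

print(table.concat(viaIpairs), table.concat(viaNumeric))
```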

Misinformation/Bad practices
DO NOT localize your arguments
In most Lua sandboxes (including old versions of ComputerCraft), function calls ran faster if they were passed local variables as arguments instead of global or unsaved variables (such as a previous argument's variable). This is not the case in the current version of ComputerCraft. Depending on what you are saving, you could even slow the run time down by following this outdated advice.
Edited on 07 September 2015 - 04:10 PM
MKlegoman357 #2
Posted 05 September 2015 - 09:32 PM
Well, this is going to be helpful. I already know where I'll be putting the table-string concat method.

EDIT: never assume something is self-explanatory. I've seen many people assuming that, while the "self-explanatory" wasn't that clear.
Edited on 05 September 2015 - 07:34 PM
Lupus590 #3
Posted 05 September 2015 - 09:38 PM
You may want to include warnings for errors people may think they find. The one I'm thinking of is declaring a local in a loop and then trying to use it outside of the loop.
HDeffo #4
Posted 05 September 2015 - 09:44 PM
You may want to include warnings for errors people may think they find. The one I'm thinking of is declaring a local in a loop and then trying to use it outside of the loop.

I actually advised against that, and suggested putting the local outside of the loop instead, as it is faster if the variable is just reassigned every iteration.

That being said, when I get back and finish writing this I will add warnings on what negatives each tip could have. The only ones with real negatives are the ones with WEAK in the spoiler title.
Edited on 05 September 2015 - 07:45 PM
HDeffo #5
Posted 05 September 2015 - 11:17 PM
EDIT: never assume something is self-explanatory. I've seen many people assuming that, while the "self-explanatory" wasn't that clear.

You are very right xD later today I will add more detail on those and explain them when I am back on a computer
Bomb Bloke #6
Posted 06 September 2015 - 12:53 AM
Some other things to check, if you're still in the mood. I've often pondered them but never bothered to sit down and test:

Spreading assignments over multiple lines. (I imagine that executes slower than clumping them all into one, but it may be sorted out at compile time for all I know.)

Term updates; if you need to update just the characters at either end of a line, is it faster to move the cursor once and re-blit the whole line, or is it faster to move the cursor twice in order to just do the ends? Does line width / use of monitors make a difference?

Bit shifts vs multiplying/dividing by two. The function call is a penalty, but when dealing with larger numbers, I suspect shifting may still win out.

Checking for odd numbers via bit BANDing instead of a modulus operation. Same as above, I suspect BAND tests are faster for larger numbers.

lua tables are built off of arrays behind the scenes. a single lua table is built off of an array and a hash. both of these require a preset size and whenever a table increases in size both of these need to be rebuilt behind the scenes according to this new size. When you add a new numerical index in sequential order 1,2,3,4… this increases the size of the array. When you add a non sequential non numerical index e.g. ["foobar"] or [6] it increases the size of the hash. When setting a value in a table to nil the sizes wont decrease until after you set a new value into the table which is why the last example of presetting the removing a preset value before every added value still works for that speed increase.

Just to elaborate for those who aren't familiar with Java's arrays;

Think of them as being like a table which only accepts numeric indexes. But on top of that, you have to specify exactly how many indexes you want available within them before you can start using them.

If you later find that you've run out of indexes and need more, then to "expand" an array, you typically 1) define a new one of a larger size, and then 2) manually copy every index from the old array to the new.

LuaJ stores all the numeric indexes of a Lua table in a Java array (it uses a separate hash map to store all the other keys). Every time it finds itself needing to perform the above process, the new array will be sized according to whatever power of two will suffice. Eg, if you have a table with 128 indexes and want to add one more, LuaJ will initialise a new 256 index array and copy over the existing 128 indexes before adding number 129.

So if you DO need to use a loop or something to fill a large one, theoretically doing so in reverse order should end up being faster (as you'll be targeting the highest index first, thereby resizing the array only once - directly to the maximum size needed).
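To get a feel for how often such a resize would happen, here is a small simulation of the power-of-two growth described above. It is only a model written for illustration (simulateFill is a made-up name and it does not inspect LuaJ itself); it counts the resizes and element copies needed to fill indexes 1 to n in order.

```lua
-- Models an array that doubles its capacity whenever it runs out of room.
local function simulateFill(n)
  local capacity, resizes, copied = 0, 0, 0
  for size = 1, n do
    if size > capacity then
      copied = copied + (size - 1)                      -- existing entries move to the new array
      capacity = (capacity == 0) and 1 or capacity * 2  -- next power of two
      resizes = resizes + 1
    end
  end
  return resizes, copied, capacity
end

print(simulateFill(129))   --> 9   255   256
```

Under this model, adding index 129 to a full 128-slot array triggers the copy of all 128 existing entries into a 256-slot array, matching the example in the post.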

use 1,2,3,4,5… in your table indexes instead of ["foo"] or ["bar"] when possible

… and if you do use strings as key names, use the format myTable.keyName to reference them where possible, as opposed to myTable["keyName"]. This saves LuaJ a step in figuring out whether you're providing a numeric index or not.

lua has to look up the value for a variable every time it does a math operation to that variable then sets the value and continues to the next operation in the series. setting the variable to the last item handled or as far back as it can be means fewer lookups and sets.

Er, but why does the order matter if the variable is still only mentioned once? Wouldn't the same amount of operations get performed on it either way?
Edited on 05 September 2015 - 10:58 PM
HDeffo #7
Posted 06 September 2015 - 01:10 AM
@bomb bloke (quotes don't work well on my phone so I'm doing this)


Thanks I'll look through your suggestions and test the speeds of each I hadn't thought of those things.

Reverse indexing would actually be worse. If indexes are saved in reverse, Lua doesn't initially read them as being sequential. Lua saves non-sequential indexes in its hash and not its array. So you'll be increasing the hash size each time you add a new index, and then the last one increases the size of the array and drops the size of the hash, moving all indexes over to the array.

Compile time may change when using foo.bar instead of foo["bar"], but run speeds are exactly the same. Both compile to the same 4 operation codes.

I will be honest here: I actually don't know. I only know the official Lua website says to order operations so the variable comes last, and my profiler tests have shown this does offer a slight speed increase. As to why, I don't know the op codes behind its math, so I really can't say. I assume for some reason it must get the value of the variable after each operation, but that doesn't really make sense… so I'm sorry, that one I don't have the best explanation for.
Edited on 05 September 2015 - 11:16 PM
Bomb Bloke #8
Posted 06 September 2015 - 03:30 AM
Reverse indexing would actually be worse. If indexes are saved in reverse it doesn't initially read them as being sequential. Lua saves non sequential indexes in it's hash and not it's array. So you'll be increasing the hash size each time you add a new index and then at the last one increases the size of the array and drops the size of the hash moving all indexes over to the array.

Ah… I'd missed that! Interesting on two levels; I'd long been wondering what shoving a random value into some super-high index did to RAM usage. Not much, as it turns out.

I suppose this means that if you do have to build a table out of sequence, it may be worth unpacking it into a new table definition afterwards in order to "smooth it out", so to speak.

Compile time may change using foo.bar instead of foo["bar"] but run speeds are exactly the same. Both save using 4 operation codes both of which are exactly the same.

You're right, at run-time the check shouldn't "need" to be performed in the same manner, so I guess the compiler would filter it out. And compilation times seldom matter.

It still needs to be done at run-time if the key is being provided via a variable, but since you can't use dot notation if you're needing to do that, that's a moot point.
HDeffo #9
Posted 06 September 2015 - 04:51 AM
-snip-

Sorry, I was wrong, it is 3 op codes; here they are exactly. Although, to be even more technical, the lookup itself is only 2 regardless of method, so even better :P



b=a.x
GETGLOBAL 0 ; a
GETDOTTED 2 ; x
SETGLOBAL 3 ; b
b=a["x"]
GETGLOBAL 0 ; a
GETDOTTED 2 ; x
SETGLOBAL 3 ; b
Edited on 06 September 2015 - 02:52 AM
Bomb Bloke #10
Posted 06 September 2015 - 05:57 AM
Out of interest, what does this resolve to?

b = a[x]
HDeffo #11
Posted 06 September 2015 - 06:08 AM
Out of interest, what does this resolve to?
b = a[x]

offhand from what i know of lua op codes it would be

GETGLOBAL 0 --#a
GETGLOBAL 1 --#x
GETDOTTED 2 --#value of x
SETGLOBAL 3 --#b

I could be wrong on this I only just recently got into looking through the lua codes
Bomb Bloke #12
Posted 06 September 2015 - 06:19 AM
Silly question, but I assume you're also considering how LuaJ actually handles these codes?
HDeffo #13
Posted 06 September 2015 - 06:31 AM
generally more codes=slower run time. That being said I haven't managed to look through LuaJ in its entirety so many of these are purely off of what my profiler is telling me along with what other people have written in on lua-users. The only one I have directly looked into on how LuaJ handles the codes were for tables because I wanted to make sure it handled tables the same backend as it did when wrapped over C++ before I made the suggestion of pre-indexing tables.

Just tested and confirmed by looking through the java end of the code. Line width when using term.blit does not matter.
Bomb Bloke #14
Posted 06 September 2015 - 10:48 AM
So if you want to alter multiple parts of a line, doing it in one call is always the better option? Thanks, I'd suspected as much. :)
SquidDev #15
Posted 06 September 2015 - 01:07 PM
Whilst I did know most of these, it is interesting seeing these all put together in one place. I will stress though, readable/maintainable code is far more important than micro-optimised code. These could probably be split into three sections:
  • efficient programming (such as table.concat, using hash lookups instead of a linear search, etc…)
  • necessary evils (caching global/table lookups)
  • "micro-optimisations" (avoid ipairs/pairs/min/max/unpack, for not while).
For those interested in finding more about the LuaJ VM, the relevant code can be found here. Just a couple of questions/bit of scepticism though:

put all non variable math first

(1+2*3)+x --#faster
x+1+2*3 --#slower
lua has to look up the value for a variable every time it does a math operation to that variable then sets the value and continues to the next operation in the series. setting the variable to the last item handled or as far back as it can be means fewer lookups and sets.

You've said this adds performance increases, though I have no clue why. Some (older) versions of Lua implemented constant folding, which putting the variable first would remove - however I'm pretty sure LuaJ doesn't implement it and so it should have no effect. Very odd. The relevant line for multiplication is here. What it effectively looks like is:


locals[result] = (isLeftConstant ? constants[left] : locals[left]).add((isRightConstant ? constants[right] : locals[right]))
So the performance increase is odd indeed.

In the same line of things:
don't put constants in loops
does not need a why or explanation. Same use and reasoning as above in "don't use variables in functions". The difference in this one is it only applies for constants (variables the loop itself wont be changing) and doesn't have a catch 22 to it. Simply define your constants before the loop instead of inside it

Constant creation is cached, so again this should have no effect. Odd.

localize arguments
the extent of this tip depends on what is being sent as an argument. from a very minor <5% speed difference in numbers to a whopping 92% speed improvement when sending functions. when convenient save an argument as a local variable before using it in a function.

when saving a variable as a local first it allows the function after to use a pointer to the variable instead of creating a new value in its registry.

Could you provide an example/the code you used to profile? I'm probably misunderstanding what you're saying.


Just another couple of optimisations/enhancements on above you can do:

Upvalues are quicker than tables
When writing classes, the common way of doing it is to use metatables:

local awesomeClass = {}
function awesomeClass:foo() return self.something end
local function new() return setmetatable({something  = "HELLO"}, {__index = awesomeClass}) end

It is quicker to use upvalues:

local function new()
  local something = "HELLO"
  return {
    foo = function() return something end,
  }
end
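For comparison, here are both styles in one self-contained sketch with a usage example. The names Meta, newWithMetatable, newWithClosure, and get are made up for illustration and are not from the post.

```lua
-- Metatable style: the field lives in the object's table.
local Meta = {}
Meta.__index = Meta
function Meta:get() return self.something end

local function newWithMetatable()
  return setmetatable({something = "HELLO"}, Meta)
end

-- Closure style: the field lives in an upvalue of the method.
local function newWithClosure()
  local something = "HELLO"
  return {
    get = function() return something end,
  }
end

local a, b = newWithMetatable(), newWithClosure()
print(a:get(), b.get())
```

The trade-off is that the closure style creates one new function per method for every object, so it costs more at construction time and in memory, while reads afterwards skip the self lookup and the metatable.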

Closure creation is slow
Creating a closure (a function within a function) will happen every time you call the outer function. If you can, try to move it outside that function. Here is a fairly contrived example:


local function foo(x)
   local function bar(y)
      -- Do something with x and y
   end

   return bar("Foo")
end
--#---------------------
local function bar(x, y)
   -- Do something with x and y
end
local function foo(x)
   return bar(x, "Foo")
end
Edited on 06 September 2015 - 11:08 AM
Bomb Bloke #16
Posted 06 September 2015 - 01:25 PM
How does the time it takes to define a new local compare to the time it takes to perform a table lookup?

That is to say, in this example:

foo.y.x = foo.y.x + 1 --#slower

local y = foo.y
y.x = y.x+1 --#faster

… you're trading one of four table lookups, for one extra assignment and variable initialisation. What's the average difference there?
Edited on 06 September 2015 - 11:29 AM
SquidDev #17
Posted 06 September 2015 - 02:01 PM

foo.y.x = foo.y.x + 1 --#slower

local y = foo.y
y.x = y.x+1 --#faster

… you're trading one of four table lookups, for one extra assignment and variable initialisation. What's the average difference there?

One thing to remember about Lua is that it is not entirely stack based but more a 'slot/register based' bytecode (the only time when the stack really comes into play is with variable number of arguments/return values). IIRC foo.y.x = foo.y.x + 1 is converted to the equivalent of:


local temp = foo
temp = temp.y

local temp2 = foo
temp2 = temp2.y

temp = temp.x
temp = temp + 1
temp2.x = temp

So there is no overhead to 'declaring' a new local variable as every expression is stored to a local variable anyway.

I recommend downloading ChunkSpy. You'll need to patch the IO library slightly (or just run it in normal Lua). If you run ChunkSpy.lua --auto --interact then you can type in expressions and get a dump of the Lua bytecode.
Edited on 06 September 2015 - 12:03 PM
HDeffo #18
Posted 07 September 2015 - 01:50 AM
when i posted this below the formatting went to complete hell.. be warned because its a little too much for me to clean up afterwards
I am very poor at explaining things and left people a little bit confused sorry! Until I can update the guide in a few hours please read through this of good points brought up by another player. A lot either I didn't explain well at all or misunderstood the reason why and he does know a bit more on instructions than me so I will fix those accordingly

WEAK:avoid upvalues //partially true, upvalues are usually a good approach and in most cases can not be superseded by any other means. Users should not 'avoid' upvalues since doing so will probably render the resultant design inefficient.
[member='viluon']
I put that this one was only partially true depending on case. This is considered a micro optimization and why i left it as WEAK

MAJOR: no variables in functions //false, very very false. The code itself is a function, which is why loadstring returns a function, and the bytecode accepts the lowest level as a function!! Code execution cannot occur outside of a function, therefore all your variables are indeed in functions!

whenever possible especially in conjunction with strings lua attempts to keep only 1 point of reference for each variable and instead relies on pointers for any further references to it. This is especially true in terms of tables. When you place a variable definition inside of a function the bytecode has to define that variable every time the function runs again instead of just using its pointer. Allocation for it to create and dump a variable over and over again is more taxing on speed than merely telling it to hold onto that variable.

MAJOR: pre-index tables (please everyone read this one especially) //true, should be stated that the best way to preindexing is filling with nil values, since such a preindex can then execute in a single instruction

nil isnt the best way to do this because the handle can also lower its own array size. This is a very misunderstood technique and people just started assuming making a lot of nils would be better because its fewer instructions. When you first create this table it will be of the size you want it to be however as soon as you add your first item to this table it will check its size and lower to match that single item putting even more work into it than had you just not preindexed at all. while Lua doesnt check for adding nil into a table it will check the number of items soon as a non nil is added.

avoid loadstring //false, there is no other way to natively compile Lua source code. Correct name for this advice would be something like 'Avoid frequent code compilation'.

not false similar to avoid upvalues. This is considered a micro-optimization technique where you sacrifice some readability for speed. There are ways occasionally to get around loadstring it is just most often preferred.

MAJOR:use table.concat to concat strings //false, complete hoax! the native concatenation operator is faster and can concatenate multiple strings at once (using a single instruction), while saving the overhead of a function call AND a global variable lookup (table) AND the table lookup (concat) !!!
EDIT: Here is some evidence, taken from the document linked below:

>local a = "foo".."bar".."baz"
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "a" ; 0
.const "foo" ; 0
.const "bar" ; 1
.const "baz" ; 2
[1] loadk 0 0 ; "foo"
[2] loadk 1 1 ; "bar"
[3] loadk 2 2 ; "baz"
[4] concat 0 0 2
[5] return 0 1
; end of function

In the second example, three strings are concatenated together. Note that there is no string constant folding. Lines [1] through [3] loads the three constants in the correct order for concatenation; the CONCAT on line [4] performs the concatenation itself and assigns the result to local a.
// end of evidence

This issue with this one is all of my benchmark tests continually show again and again table.concat is faster than string concats. every Benchmark test ive seen anyone in Lua do has shown string being slower. Every Lua site suggests using table instead of string. some of the official Lua pages mention using table instead of string. And again ive not seen a single benchmark show either being the same speed or string being faster. Your evidence shows maybe i dont understand why its faster but it has nothing to do with the fact it still is faster.

use sequential numeric indexes in table //true, arrays are faster than tables

MAJOR: don't put constants in loops //false, local variables belong to the parent function, Lua VM has a special structure that stores information about when a local variable goes out of the scope. This does not affect loops in any way!

this is again like the function one. Benchmark and semi-official/official websites all state differently. If I am mistake on why this method is slower and its not defining the function every single iteration forgive me on that regardless the speed tests and other users dont lie.
<div> </div>
<div>
</div>
<div>multiply dont divide //not tested, but this definitely depends on the implementation, so cannot be applied to all Lua platforms</div>
<div>
</div>
<div>tested thoroughly on Computercraft LuaJ implementation. valid enough?</div>
<div> </div>
<div>
</div>
<div>put all non variable math first //true, but not because the reason given. The Lua compiler does limited constant folding for numbers, so such an expression is evaluated at compile time and saved as the resultant constant, has nothing to do with lookups (which are pretty much instantaneous in case of local variables and constants btw)v</div>
<div> </div>
<div>
</div>
<div>I stated I wasnt sure why this is true it is just backed up by the numbers. I will use your explanation on it as to why</div>
<div> </div>
<div>
</div>
<div>use factorization //idk about this one</div>
<div>
</div>
<div>simpler equations are better basically its a common suggestion for any language because it holds true for any</div>
<div> </div>
<div>
</div>
<div>WEAK: use for instead of while //false, this was only true for Lua 4.x and no longer applies to Lua 5. A &amp;#39;while true do … end&amp;#39; loop will translate into a single unconditional jump instruction, so is very very efficient, much better than for loops actually (but then again, fors are more efficient in other scenarios)</div>
<div> </div>
<div>
</div>
<div>NORMALLY you are right yes. however with the implementation of Computercraft for and while loops both execute at extremely similar speeds. A for statement tested over repeated tests taking place over a week completed on average about 5% faster than a while statement. That being said both are very efficient and fast. This is considered micro-optimization and listed as WEAK</div>
<div> </div>
<div>
</div>
<div>avoid table.insert //partially true, but table.insert is more readable and easier to use if you aren&amp;#39;t appending to the end of the array (and table.insert could be implemented Java/C/C++/Whatever-side, so it actually could be faster, but generally isn&amp;#39;t)</div>
<div>
</div>
<div>micro-optimization tip. Some optimization takes speed in favor of readability.</div>
<div> </div>
<div>
</div>
<div>string metatables //true, but fastest would be #&amp;quot;asdf&amp;quot;</div>
<div>
</div>
<div>I believe you are right on that one and honestly didnt even think of that method. I will test it later and add that as an option if it does prove to be faster(i dont see why it wouldnt be with no function overhead)</div>
<div> </div>
<div>
</div>
<div>avoid using the assert function //false, you apparently misunderstood what assert does. Assert errors if the condition is not true, and is used e.g. for checking correct types of arguments.</div>
<div> </div>
<div>
</div>
<div>assert evaluates everything inside of it. Even if it doesnt continue and error the error message is still generated and evaluated. a simple if block only evaluates the if condition first and if that doesnt pass then it skips the code inside altogether. I do understand asserts an if statement is just faster</div>
<div> </div>
<div>
</div>
<div>MAJOR: avoid unpack //false, yes, it is better to avoid a function call if you know how many elements your table has, but unpack is pretty efficient (esp. when localized) and can be used on arrays of any length. Avoiding unpack can lead to hard-to-find bugs e.g. when you change the length of your array </div>
<div> </div>
<div>
</div>
<div>while being micro-optimization so i might remove the MAJOR in it this still is very true. In LuaJ according to speed tests unpack is extremely slow compared to not using it. consistently not using unpack pulls out a very massive speed DIFFERENCE. now both methods can be considered efficient because lua itself is very efficient. while the speed difference is radical between these since the speed in the first place is extremely low the noticeable speeds arent quite as radical </div>
<div> </div>
<div>
</div>
<div>Don&amp;#39;t use math.max or math.min //true, due to extra function calls</div>
<div>WEAK: use or when handling nil evaluation //true, should be stated that</div>
<div>
</div>
<div>local x = a==nil and b or c</div>
<div>
</div>
<div>can also be used (will return b if a is nil, or c if b is nil too. If a is different from nil, will return c)</div>
<div>use multiplication instead of exponents //true</div>
<div>
</div>
<div>I will add your part to the suggestions on nil handling :3</div>
<div> </div>
<div>
</div>
<div>MAJOR: localize arguments(all readers should also focus on this one) //true</div>
<div>
</div>
<div>IRONICALLY i just tested this one again in the latest version(most of my tests are from one version back) as of the current computercraft version this tip is no longer true at all. Not even remotely true anymore and i need to update this very quickly. While in previous versions of computercraft this held true and in C++ versions of lua it still seems to be the current one sometimes saving the value as a local can increase the speed because of the time it took to save those values as locals and since theres no speed difference between passing a global or local through a function argument.</div>
<div> </div>
<div>
</div>
<div>MAJOR: avoid pairs() and ipairs() //false, why?? They&amp;#39;re native and pairs can not be replaced, since it will also iterate on non-array elements</div>
<div>
</div>
<div>pairs() is faster than ipairs() and for i=1,x is faster than both. both use several functions behind the scenes especially if i remember right the next function. This function is written very inefficiently and iterates through the entire table each time it is run and it is run for each item in a table. In other words the larger your table is the slower pairs() and ipairs() become yet having very little effect on i=1,x</div>
<div> </div>
<div>
</div>
<div>buffer nested tables //true, while not very useful. Avoid nested tables if possible, to prevent the need for buffering</div>
<div>use a table for comparisons //true</div>
<div>
</div>
<div> </div>
<div>
</div>
<div>MAJOR: use internal buffers instead of iterators //eeh.. probably true but iterators aren&amp;#39;t that bad. If you are having trouble, change your design. Thats all.</div>
<div>
</div>
<div>this is essentially a very minor version of memoization that im suggestiong. While iterators handle what they do very well memoization is a lot lot better and is considered very good coding practice and should be at least attempted for an understand at.</div>
<div> </div>
<div>
</div>
<div>I don&amp;#39;t have time now but I will post more info on the topic when I do. While I was writing my message [member=&amp;#39;SquidDev&amp;#39;] replied to the topic. He&amp;#39;s a pro, listen to him.</div>
<div>
</div>
<div>It was actually SquidDev who told me how to get a very accurate profiler test set up for computercraft and first got me into running all of this.</div>
<div>
</div>
<div>There is no such thing as a GETDOTTED instruction. GETGLOBAL has two arguments, so none of the bytecode [member=&amp;#39;HDeffo&amp;#39;] posted would ever work.</div>
<div> </div>
<div>
</div>
<div>wont argue on this one i was just using what i found on lua-users since you are right I am sure on the bytecode end of Lua you know more than I do</div>
Edited on 06 September 2015 - 11:52 PM
Bomb Bloke #19
Posted 07 September 2015 - 02:02 PM
The trick there is to mash the rich text editor button until it bugs out in reverse and fixes its prior mistakes.

I am very poor at explaining things and left people a little bit confused, sorry! Until I can update the guide in a few hours, please read through this list of good points brought up by another player. Many of them I either didn't explain well or misunderstood the reason for, and he knows more about instructions than I do, so I will fix those accordingly.

WEAK:avoid upvalues //partially true, upvalues are usually a good approach and in most cases can not be superseded by any other means. Users should not 'avoid' upvalues since doing so will probably render the resultant design inefficient.
[member='viluon']
I put that this one was only partially true depending on case. This is considered a micro optimization and why i left it as WEAK

MAJOR: no variables in functions //false, very very false. The code itself is a function, which is why loadstring returns a function, and the bytecode accepts the lowest level as a function!! Code execution cannot occur outside of a function, therefore all your variables are indeed in functions!
Whenever possible, especially with strings, Lua tries to keep only one point of reference for each value and relies on pointers for any further references to it. This is especially true of tables. When you place a variable definition inside a function, the bytecode has to define that variable every time the function runs instead of just reusing its pointer. Creating and discarding a variable over and over is more taxing on speed than simply holding onto it.

MAJOR: pre-index tables (please everyone read this one especially) //true, should be stated that the best way to preindexing is filling with nil values, since such a preindex can then execute in a single instruction
nil isn't the best way to do this because the table can also shrink its own array size. This is a very misunderstood technique; people assumed that making a lot of nils would be better because it takes fewer instructions. When you first create the table it will be the size you want, but as soon as you add your first item it checks its size and shrinks to match that single item, which is more work than if you had not pre-indexed at all. While Lua doesn't check when nil is added to a table, it does check the number of items as soon as a non-nil value is added.

avoid loadstring //false, there is no other way to natively compile Lua source code. Correct name for this advice would be something like 'Avoid frequent code compilation'.
Not false; this is similar to avoiding upvalues. It is a micro-optimization where you sacrifice some readability for speed. There are occasionally ways to get around loadstring; it is just most often the preferred option.

MAJOR:use table.concat to concat strings //false, complete hoax! the native concatenation operator is faster and can concatenate multiple strings at once (using a single instruction), while saving the overhead of a function call AND a global variable lookup (table) AND the table lookup (concat) !!!
EDIT: Here is some evidence, taken from the document linked below:

>local a = "foo".."bar".."baz"
; function [0] definition (level 1)
; 0 upvalues, 0 params, 3 stacks
.function 0 0 2 3
.local "a" ; 0
.const "foo" ; 0
.const "bar" ; 1
.const "baz" ; 2
[1] loadk 0 0 ; "foo"
[2] loadk 1 1 ; "bar"
[3] loadk 2 2 ; "baz"
[4] concat 0 0 2
[5] return 0 1
; end of function
In the second example, three strings are concatenated together. Note that there is no string
constant folding. Lines [1] through [3] loads the three constants in the correct order for
concatenation; the CONCAT on line [4] performs the concatenation itself and assigns the
result to local a.
// end of evidence
The issue with this one is that all of my benchmark tests show again and again that table.concat is faster than string concatenation. Every benchmark I've seen anyone run in Lua has shown the string operator being slower. Every Lua site suggests using table instead of string, and some of the official Lua pages mention it too. I have not seen a single benchmark show the two being the same speed or string being faster. Your evidence shows that maybe I don't understand why it's faster, but that doesn't change the fact that it is faster.

use sequential numeric indexes in table //true, arrays are faster than tables

MAJOR: don't put constants in loops //false, local variables belong to the parent function, Lua VM has a special structure that stores information about when a local variable goes out of the scope. This does not affect loops in any way!
This is again like the function one. Benchmarks and semi-official/official websites all state otherwise. If I am mistaken about why this method is slower, and it's not redefining the value on every single iteration, forgive me; regardless, the speed tests and other users don't lie.

multiply dont divide //not tested, but this definitely depends on the implementation, so cannot be applied to all Lua platforms
tested thoroughly on Computercraft LuaJ implementation. valid enough?

put all non variable math first //true, but not for the reason given. The Lua compiler does limited constant folding for numbers, so such an expression is evaluated at compile time and saved as the resulting constant. It has nothing to do with lookups (which are pretty much instantaneous for local variables and constants, btw)
I stated I wasn't sure why this is true; it is just backed up by the numbers. I will use your explanation for the reason why.

use factorization //idk about this one
Simpler equations are better. It's a common suggestion for any language because it holds true in all of them.

WEAK: use for instead of while //false, this was only true for Lua 4.x and no longer applies to Lua 5. A 'while true do … end' loop will translate into a single unconditional jump instruction, so is very very efficient, much better than for loops actually (but then again, fors are more efficient in other scenarios)
NORMALLY you are right, yes. However, in ComputerCraft's implementation for and while loops execute at extremely similar speeds. In repeated tests run over a week, a for statement completed on average about 5% faster than a while statement. That said, both are very efficient and fast. This is considered micro-optimization and listed as WEAK.

avoid table.insert //partially true, but table.insert is more readable and easier to use if you aren't appending to the end of the array (and table.insert could be implemented Java/C/C++/Whatever-side, so it actually could be faster, but generally isn't)
Micro-optimization tip. Some optimization trades readability for speed.

string metatables //true, but fastest would be #"asdf"
I believe you are right on that one and honestly didnt even think of that method. I will test it later and add that as an option if it does prove to be faster(i dont see why it wouldnt be with no function overhead)

avoid using the assert function //false, you apparently misunderstood what assert does. Assert errors if the condition is not true, and is used e.g. for checking correct types of arguments.
assert evaluates everything passed to it. Even if the check passes and no error is raised, the error message is still built and evaluated. A simple if block evaluates only the condition first, and if that doesn't hold it skips the code inside altogether. I do understand assert; an if statement is just faster.
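A minimal sketch of that difference, assuming the error message is built by a function call. The helper name describe and the counter built are made up for illustration; this shows how often the message is constructed, not how large the timing gap is on any particular ComputerCraft version.

```lua
-- Hypothetical helper that counts how often the error message is built.
local built = 0
local function describe(value)
	built = built + 1
	return "expected a number, got " .. type(value)
end

local function withAssert(value)
	-- describe(value) runs on every call, even when the check passes,
	-- because arguments are evaluated before assert is called
	assert(type(value) == "number", describe(value))
	return value * 2
end

local function withIf(value)
	-- describe(value) only runs when the check fails
	if type(value) ~= "number" then
		error(describe(value), 2)
	end
	return value * 2
end

for i = 1, 1000 do withAssert(i) end
local afterAssert = built   -- message was built on every call

built = 0
for i = 1, 1000 do withIf(i) end
local afterIf = built       -- message was never built
```

If the message is a plain string constant rather than something computed, the remaining cost of assert is just the function call itself.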

MAJOR: avoid unpack //false, yes, it is better to avoid a function call if you know how many elements your table has, but unpack is pretty efficient (esp. when localized) and can be used on arrays of any length. Avoiding unpack can lead to hard-to-find bugs e.g. when you change the length of your array
While this is micro-optimization, so I might remove the MAJOR from it, it is still very true. In LuaJ, according to speed tests, unpack is extremely slow compared to not using it; consistently avoiding unpack yields a very large relative speed difference. Both methods can be considered efficient because Lua itself is very efficient. The relative difference between them is radical, but since the absolute time is extremely small to begin with, the noticeable difference isn't quite as radical.
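For anyone unsure what "not using unpack" looks like in practice, here is a small sketch. The table colour and its values are made up; the only claim is that both forms produce the same values when the length is known in advance.

```lua
-- Lua 5.1 (ComputerCraft) has a global unpack; newer Lua moved it to table.unpack
local unpack = table.unpack or unpack

local colour = { 255, 128, 0 }

-- Generic: works for any length, costs a function call
local r1, g1, b1 = unpack(colour)

-- Fixed length known in advance: plain indexing, no call
local r2, g2, b2 = colour[1], colour[2], colour[3]
```

The indexed form has to be edited by hand if the table ever grows, which is the maintenance risk mentioned in the objection above.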

Don't use math.max or math.min //true, due to extra function calls
WEAK: use or when handling nil evaluation //true, should be stated that

local x = a==nil and b or c
can also be used (will return b if a is nil, or c if b is nil too. If a is different from nil, will return c)
use multiplication instead of exponents //true
I will add your part to the suggestions on nil handling :3
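A short sketch of the nil handling being discussed, with one pitfall worth knowing. The function names greet and setFlag are hypothetical.

```lua
local function greet(name)
	-- falls back when name is nil (or false)
	name = name or "stranger"
	return "Hello, " .. name
end

local function setFlag(flag)
	-- Pitfall: `flag or true` would turn an explicit false into true,
	-- so compare against nil when false is a legitimate value.
	if flag == nil then flag = true end
	return flag
end
```

The same pitfall applies to the `a == nil and b or c` form: if b itself is nil or false, the expression falls through to c.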

MAJOR: localize arguments(all readers should also focus on this one) //true
IRONICALLY, I just tested this one again in the latest version (most of my tests are from one version back), and as of the current ComputerCraft version this tip is no longer true at all. Not even remotely true anymore, and I need to update this very quickly. In previous versions of ComputerCraft it held true, and in C++ versions of Lua it still seems to. In the current version, saving the value as a local can sometimes cost speed, because of the time it takes to save those values as locals and because there's no speed difference between passing a global or a local as a function argument.

MAJOR: avoid pairs() and ipairs() //false, why?? They're native and pairs can not be replaced, since it will also iterate on non-array elements
pairs() is faster than ipairs(), and for i=1,x is faster than both. Both use several functions behind the scenes, in particular, if I remember right, the next function. This function is written very inefficiently: it iterates through the entire table each time it is run, and it is run for each item in the table. In other words, the larger your table is, the slower pairs() and ipairs() become, while there is very little effect on i=1,x.
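For reference, the three loop styles being compared look like this on an array-like table. This is only a sketch with made-up data (the table name items is an assumption); it shows that the loops are interchangeable here, not which one is fastest on a given ComputerCraft version.

```lua
local items = {}
for i = 1, 100 do items[i] = i end

local sumPairs, sumIpairs, sumNumeric = 0, 0, 0

-- pairs: visits every key, in no guaranteed order
for _, v in pairs(items) do sumPairs = sumPairs + v end

-- ipairs: visits 1, 2, 3, ... until the first nil
for _, v in ipairs(items) do sumIpairs = sumIpairs + v end

-- Numeric for: no iterator function is called per element
for i = 1, #items do sumNumeric = sumNumeric + items[i] end
```

The numeric loop only replaces the other two when the table is a plain array without holes; for string keys or sparse tables pairs is still needed, as the objection points out.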

buffer nested tables //true, while not very useful. Avoid nested tables if possible, to prevent the need for buffering
use a table for comparisons //true

MAJOR: use internal buffers instead of iterators //eeh.. probably true but iterators aren't that bad. If you are having trouble, change your design. Thats all.
This is essentially a very minor version of memoization that I'm suggesting. While iterators handle what they do very well, memoization is a lot better, is considered very good coding practice, and is worth at least trying to understand.

I don't have time now but I will post more info on the topic when I do. While I was writing my message [member='SquidDev'] replied to the topic. He's a pro, listen to him.
It was actually SquidDev who told me how to get a very accurate profiler test set up for computercraft and first got me into running all of this.
There is no such thing as a GETDOTTED instruction. GETGLOBAL has two arguments, so none of the bytecode [member='HDeffo'] posted would ever work.
I won't argue on this one; I was just using what I found on lua-users. Since you are right, I am sure you know more about the bytecode end of Lua than I do.
Exerro #20
Posted 07 September 2015 - 05:32 PM
Avoiding defining constant locals and globals in functions is a valid point. Not quite sure if you're right with the pointer theory, but any code that is run in a function is likely to be run more than once, so if it doesn't need to be, better to have it out of the repeated code. The same goes for loops… why define it over and over in the loop when you can define it once outside of that loop? While defining a local really doesn't take much time, it does take time, and with thousands of iterations that can really add up.

Let's say it takes 0.001s to define a local (I'd suggest it takes a tiny proportion of this time really). You're also iterating through every pixel on a 51x19 screen like this.
for x = 1, 51 do
	for y = 1, 19 do
		local something = 5
	end
end
That local definition is being run 969 times, making the total time taken to define it 0.969 seconds (nearly a whole second).
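The hoisted version of that loop might look like the sketch below. The counter count is only a stand-in for real per-pixel work, and the names width and height are assumptions; the 0.001s figure above is a deliberately exaggerated illustration, so expect the real saving to be much smaller.

```lua
local width, height = 51, 19   -- screen size used in the example above

local something = 5            -- defined once, outside both loops
local count = 0
for x = 1, width do
	for y = 1, height do
		count = count + something   -- stand-in for real per-pixel work
	end
end
```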

With concatenation, it completely depends on how you're using it. If you're concatenating 5 things, using '..' is definitely quicker.

local s = "a" .. "b" .. "c" .. "d" .. "e"
-- will execute faster than
local s = table.concat { "a", "b", "c", "d", "e" }

However, you're generally concatenating things in a loop, something like this:

local s = ""
for i = 1, n do
	s = s .. v
end
Don't do this! Lua (C Lua at least) interns strings: every newly created string is hashed and looked up in a global string table, so that equal strings share a single copy. On top of that, each pass through the loop allocates a brand-new string and copies everything accumulated so far into it. Creating strings repeatedly like this is slow, so in that case using table.concat() is a lot better:

local t = {}
for i = 1, n do
	t[i] = v
end
local s = table.concat( t )
You'll notice ridiculous speed increases by doing this.
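Here is a self-contained version of the two loops that can be pasted and run. The values of n and v are made up, since the snippets above leave them undefined; the sketch only confirms that both approaches build the same string.

```lua
local n, v = 1000, "ab"   -- hypothetical values for illustration

-- Repeated concatenation: every pass builds a new, longer string
local slow = ""
for i = 1, n do
	slow = slow .. v
end

-- Buffer the pieces in a table, join once at the end
local parts = {}
for i = 1, n do
	parts[i] = v
end
local fast = table.concat(parts)
```

To see the speed difference, wrap each loop in os.clock() calls and raise n; how large the gap is will depend on the Lua implementation.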

There is, indeed, no GETDOTTED instruction. There is GETTABLE:
GETTABLE A B C    R(A) := R(B)[RK(C)]    Copies the value from a table element into register R(A). The table is referenced by register R(B), while the index to the table is given by RK(C), which may be the value of register R(C) or a constant number.

Basically. The only difference between a.x and a[x] where x is some string is that the latter means that the index needs to be loaded into a register first (1 instruction). There is no difference between a.x and a["x"], Lua is clever, it can tell "x" is a constant string. a["x" .. ""], however, is much longer - 3 instructions longer than a.x (3 times the instructions).

I'd also like to suggest mentioning that… an example is the only way I can explain this.

Let's say you want to determine the circumference of a circle given its radius. You could use this:

	return 2 * math.pi * radius
See how there's a constant there '2*math.pi'? That isn't constant folded, so every time you call this function, you'll be multiplying 2 numbers that are constant. I would suggest that it's much quicker to do this:

local pi2 = 2 * math.pi
...
	return pi2 * radius
Taking constant values like that out of functions to the code above is generally a good idea. Some good mathematical knowledge will allow you to factorise your code then calculate the constants above, so you're doing 2 operations instead of 5.
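Put together as something runnable, the circumference example might look like this. The names TWO_PI and circumference are just illustrative.

```lua
local TWO_PI = 2 * math.pi   -- computed once when the file loads

local function circumference(radius)
	-- one multiplication per call instead of two plus a table lookup
	return TWO_PI * radius
end
```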
Edited on 07 September 2015 - 03:32 PM
HDeffo #21
Posted 07 September 2015 - 05:37 PM
-snip-

You really seem to know your stuff in this case :D Your comments will be tested and added to the guide. Thanks for your input.
クデル #22
Posted 14 September 2015 - 08:24 AM
Thanks for sharing, will be definitely using some of these!
blunty666 #23
Posted 14 September 2015 - 06:27 PM
With concatenation, it completely depends on how you're using it. If you're concatenating 5 things, using '..' is definitely quicker.

local s = "a" .. "b" .. "c" .. "d" .. "e"
-- will execute faster than
local s = table.concat { "a", "b", "c", "d", "e" }

However, you're generally concatenating things in a loop, something like this:

local s = ""
for i = 1, n do
	s = s .. v
end
Don't do this! Lua (C Lua at least) interns strings: every newly created string is hashed and looked up in a global string table, so that equal strings share a single copy. On top of that, each pass through the loop allocates a brand-new string and copies everything accumulated so far into it. Creating strings repeatedly like this is slow, so in that case using table.concat() is a lot better:

local t = {}
for i = 1, n do
	t[i] = v
end
local s = table.concat( t )
You'll notice ridiculous speed increases by doing this.

Just to clarify on this, if I know exactly how many strings I am concatenating together I should use the ".." operator in one long statement. But if I don't know how many there'll be, I should stick them all in a table and call table.concat at the end.
Exerro #24
Posted 14 September 2015 - 09:02 PM
-snip-

Just to clarify on this, if I know exactly how many strings I am concatenating together I should use the ".." operator in one long statement. But if I don't know how many there'll be, I should stick them all in a table and call table.concat at the end.

Yeah pretty much, putting all of them in one line will be quickest because it's one instruction, but obviously you can't do that with a set of unknown length strings, so table.concat is the best way to go there.
Bomb Bloke #25
Posted 15 September 2015 - 02:28 AM
Note that table.concat() starts to beat out .. concatenation once you reach a certain number of elements. I'm finding that number to be about ten to fifteen.

local minElements, maxElements, word, reps, counter = 1, 200, "\"asdf\"", 1000000, 0

print("Test: "..word..".."..word..".."..word.." vs table.concat()")

repeat
	local elements, curElements = {}, math.floor((maxElements - minElements) / 2 + minElements)
	for i = 1, curElements do elements[i] = word end
	
	local func1 = loadstring("local t = os.clock()    for i = 1, "..reps.." do    local s = "..table.concat(elements,"..").."    end   return os.clock() - t")()
	sleep(0)
	local func2 = loadstring("local e = {"..table.concat(elements,",").."}    local t = os.clock()    for i = 1, "..reps.." do    local s = table.concat(e)    end    return os.clock() - t")()
	sleep(0)

	counter = counter + 1
	print("Set "..counter..", "..curElements.." elements: "..func1.."s (..), "..func2.."s (concat)")

	if func1 > func2 then maxElements = curElements else minElements = curElements end
until func1 == func2 or minElements == maxElements
Wergat #26
Posted 11 January 2016 - 03:10 PM
Hi,
I am currently working on a big project that requires a lot of optimizing, and I am trying to reduce unwanted latency caused by my bad coding. It would be great if you could answer a few questions I have about optimization in CC.

1) Is it recommended to cache an array's index amount?

local t = {elements = {1,2,3,4,5,6},amount = 6}
-- Using
t.amount
-- instead of
#t.elements
If i use the cached version X times, does it become faster?

2) Objects/Classes with "optimized" tables + get/setters or …not?

-- d is declared first so the accessor functions below can see it
local d = {
  {10,20,23984,43,6,8,293,.9239,921392},
  {1,2,13.332,34634,8,293,59,.0001,329},
}
local aObject = {
  d = d,
  getX = function(i) return d[1][i] end,
  getY = function(i) return d[2][i] end,
  setX = function(i,v) d[1][i] = v end,
  setY = function(i,v) d[2][i] = v end,
}
local bObject = {
  x = {10,20,23984,43,6,8,293,.9239,921392},
  y = {1,2,13.332,34634,8,293,59,.0001,329},
}
-- I know this example is pretty stupid but it might help you get my point
Which one would be better? Are there even better ways to solve something like this?

3) Is there a difference when using t.abc or t["abc"]?

local t = {foo = "bar"}
-- Is this faster?
t.foo
-- Or this?
t["foo"]

4) How fast is math.abs?
Faster than a function like this?

local function pos(num)
   return ((num<0) and num*-1 or num)
end

5) How about returning something after else

local function positive(num)
	if(num<0)then
		return num*-1
	else
		return num
	end
end
or

local function positive(num)
	if(num<0)then
		return num*-1
	end
	return num
end
Edited on 11 January 2016 - 02:10 PM
SquidDev #27
Posted 11 January 2016 - 04:05 PM
Putting my answers in a spoiler to prevent massive walls of text:
1) Is it recommended to cache an array's index amount?
I tested this quite recently. Yes:

-- Basic indexing
-- table.insert (cached to a local)
for i = 1, times do insert(tbl, i) end -- 3.6 seconds

-- Using #tbl + 1
for i = 1, times do tbl[#tbl + 1] = i end -- 2.7 seconds

-- Table length as a variable
local y, n = {}, 0 for i = 1, times do n = n + 1 y[n] = i  end -- 0.6 seconds
If you can, save the length as an upvalue. Otherwise, storing it in the table should be fine.
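A small sketch of the "length as a variable" pattern in a slightly more realistic setting, where not every iteration appends. The table name squares and the filter are made up for illustration.

```lua
local squares, n = {}, 0   -- n tracks the length so #squares is never evaluated

for i = 1, 10 do
	if i % 2 == 0 then
		n = n + 1
		squares[n] = i * i   -- append without table.insert or the # operator
	end
end
```

The counter has to be kept in sync by hand, so this is only worth doing in loops that really are hot.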

2) Objects/Classes with "optimized" tables + get/setters or …not?
Which one would be better? Are there even better ways to solve something like this?

Just use hashes instead of an array unless you really need to. With a table of 1000000 items and 100000000 accesses it takes 3.7 seconds for a hash and 2.4 for an array - not much difference.
3) Is there a difference when using t.abc or t["abc"]?
No. Both get converted into a GETTABLE instruction. There is a performance difference between table["abc"] and table[abc], but not much.

4) How fast is math.abs?
5) How about returning something after else
You can use -x instead of -1*x. As long as you are caching the math.abs access, it is quicker than calling a Lua function. An inlined implementation is twice as fast, but takes more time to write.

You can use os.clock() to get the current computer time, then try an operation n times and calculate the change in time: I like to do something 100000000 times, but you may need to change this to prevent too long without yielding errors. You can also use Linux's time command to test in the command line.
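As a rough sketch of that approach, a tiny timing helper could look like the following. The name timeIt and the repetition count are assumptions; in ComputerCraft a very large count may need to be lowered (or the loop made to yield) to avoid the "too long without yielding" error mentioned above.

```lua
-- Runs fn() reps times and returns the elapsed os.clock() time in seconds.
local function timeIt(fn, reps)
	local start = os.clock()
	for i = 1, reps do fn() end
	return os.clock() - start
end

local floor = math.floor
local elapsed = timeIt(function() return floor(3.7) end, 100000)
print(("100000 calls took %.3f s"):format(elapsed))
```

Running each candidate several times and comparing averages gives more trustworthy numbers than a single run.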

Also: Just to quote myself from another topic:

Gotta agree with ElvishJerricco, you should work on making the algorithms you use as fast as possible. There is a famous quote which is pulled out every time someone asks about optimisation, to warn people off it.
Premature optimization is the root of all evil

However, the full quote is as follows:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
Profile your code - write stubs for CC-only methods and then use LuaProfiler to check how well it runs, optimise them, and repeat. Don't optimise at the expense of readability, and don't optimise a method just because it might be slow when you haven't checked.
Edited on 11 January 2016 - 03:06 PM
Bomb Bloke #28
Posted 12 January 2016 - 12:24 AM
Definitely test things if in doubt - especially in this thread, people've got into the habit of digging up what other people have said online, but most people around the web are talking about C implementations of Lua - and ComputerCraft uses LuaJ, version 2.0.3.

Which is why using LuaProfiler will not give you accurate results, and nor will tests performed in many CC "emulators". You can get "hints" that way, but not accuracy. If you want best results within the CC environment, then test within the CC environment.

-- Using
t.amount
-- instead of
#t.elements
If i use the cached version X times, does it become faster?

Not as much as if you defined "amount" outside of the "t" table - pulling it out of there is slower than just defining it as a variable within the closest scope you can stick it in.
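The difference between the two kinds of caching can be sketched as follows. The table contents and the helper isLast are made up for illustration:

```lua
local t = { elements = { "a", "b", "c" } }

-- Cached inside the table: every use still costs one field lookup on t.
t.amount = #t.elements

-- Cached in locals in the closest scope: no table lookup per use.
local elements = t.elements
local amount = #elements

-- Hypothetical user of the cached value.
local function isLast(i)
  return i == amount
end

print(t.amount, amount, isLast(3))
```

Either cache has to be updated by hand whenever elements are added or removed.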

2) Objects/Classes with "optimized" tables + get/setters or …not?

As a general rule, function calls are slower than table lookups. As SquidDev points out, if things get complicated, simply take advantage of your table's ability to act as a hashmap.

4) How fast is math.abs?

Without testing it, I'd say your function is faster (assuming you fix it per SquidDev's comment) - because math.abs() involves a lookup into the "math" table to get the "abs" function. Even this would lead to a speed increase:

local pos = math.abs

Of course, sticking the (val>0 and val or -val) construct directly into your code where you need it would be faster again - because that eliminates the function call, too.
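The variants discussed here, side by side. This is only a sketch: the name absLua is hypothetical and stands in for the hand-written function from the question.

```lua
local abs = math.abs               -- cached once: later calls skip the "math" lookup

local function absLua(val)         -- Lua implementation: still pays for a function call
  return val > 0 and val or -val
end

local val = -7
local a = math.abs(val)            -- global lookup + table lookup + call
local b = abs(val)                 -- local access + call
local c = absLua(val)              -- local access + call into Lua code
local d = val > 0 and val or -val  -- inlined: no lookup, no call

print(a, b, c, d)
```

Which of these is fastest in a given CC version is something to measure rather than assume.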
HDeffo #29
Posted 23 January 2016 - 09:50 AM
Alright, so now that I am back to ComputerCraft again, I ran my benchmark tests on all my suggestions again and found something interesting… Basically every version of CC, even small updates, seems to change what saves time and what doesn't, which would also explain the many debates on some of the methods depending on date and version. Since I come and go from the community and don't have as much time as an extensive list like this requires, I would like to enlist some help in keeping this updated as much as possible. Any volunteers? I feel a guide like this can be handy, but obviously if it's outdated it'll only cause more problems.
SquidDev #30
Posted 23 January 2016 - 10:35 AM
It's odd that CC versions change the speed of the Lua VM. Dan doesn't really touch the LuaJ code. The latest versions have changed the string encoding methods when converting to and from Java - though this should only change performance when calling Java methods.

Do you have a GitHub repo with your benchmarks? I'd be happy to help put some more together. There are some other benchmarking programs around which might be worth looking into - but they focus more on comparing implementations of CC rather than actual code.
Bomb Bloke #31
Posted 23 January 2016 - 12:10 PM
I've also noticed speed differences between builds of ComputerCraft, but how much of them have to do with "ComputerCraft" and how much of them have to do with the ton of other mods in the packs I was using is unclear to me. I believe it to be faster in general than it was when I first started using it, back at CC version 1.5.

Probably the most important changes involve rendering. For example, CC 1.6 introduced the window API and rigged all advanced computers to render through it by default. Then CC 1.74 came along and said API was heavily optimised using the new term.blit() command - it's now actually easier to get better rendering rates using the window API than without, depending on the complexity of what you're drawing.

(Tip: Use term.current() to get a hold of the window object your multishell tab is using, set it to invisible before performing complex render operations, then set it visible to have them blitted to the screen all together in one go. Whoosh.)
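A rough sketch of that tip, written as a hypothetical helper drawBuffered so it can be tried against any object with a setVisible method. It assumes term.current() returns a window object, which may not hold on every computer or CC version, so it falls back to drawing directly:

```lua
-- Hypothetical helper: hide the window, draw, then show it again so the
-- finished result reaches the screen in one update.
local function drawBuffered(win, draw)
  -- Not every terminal object is a window; draw directly if it can't be hidden.
  if type(win.setVisible) ~= "function" then
    draw(win)
    return false
  end
  win.setVisible(false)
  local ok, err = pcall(draw, win)
  win.setVisible(true)             -- restore visibility even if drawing failed
  if not ok then error(err, 0) end
  return true
end

-- Assumed usage inside ComputerCraft:
-- drawBuffered(term.current(), function(t)
--   t.clear()
--   t.setCursorPos(1, 1)
--   t.write("Hello")
-- end)
```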

But I suspect the event-handling backend of ComputerCraft has undergone some changes as well, and certainly, much has changed within the Lua-based APIs we use on a regular basis.
Edited on 23 January 2016 - 11:20 PM
Lyqyd #32
Posted 23 January 2016 - 06:29 PM
You may have meant term.current, unless the function name was changed in the most recent versions.
Bomb Bloke #33
Posted 24 January 2016 - 12:16 AM
Yeah, that.
HDeffo #34
Posted 24 January 2016 - 06:34 PM
It would actually be good to know why versions have such a big impact on the speed of individual functions as well. As for benchmarks, I have been meaning to start a raw table of different tests, but as I said I just haven't had much free time lately and my schedule is only just now starting to free up. I currently use pepperfish for benchmarking; however, since that relies on a slight change in ComputerCraft to enable the debug API, I also test as a secondary with various timing functions such as os.clock. Both of these methods unfortunately have some overhead, and I am not technical enough to completely rebuild ComputerCraft with profiling tests that add no overhead while staying as true to function speeds as possible.
Sewbacca #35
Posted 11 July 2016 - 04:20 PM
When is it more useful to save the type of a variable in a loop than to ask for it every time?
Example:

-- Saving the type
for i = 1, #tab do
  local typ = type(tab[i])
  if typ == 'table' then
  elseif typ == 'function' then
  -- <...>
  end
end
-- Predeclaring a var
local typ;
for i = 1, #tab do
  typ = type(tab[i])
  if typ == 'table' then
  elseif typ == 'function' then
  -- <...>
  end
end
-- Asking the type every time
for i = 1, #tab do
  if type(tab[i]) == 'table' then
  elseif type(tab[i]) == 'function' then
  -- <...>
  end
end
Exerro #36
Posted 11 July 2016 - 04:43 PM
Predeclaring a variable does nothing performance-wise, it just changes which variables are 'visible' to sub-blocks (unless it's taken outside of a function, in which case upvalues are involved, but not here).

The first example should be better every time: rather than getting a global (type), calling it for every block, then performing the comparison, you're just performing the comparison. Because of how Lua treats locals, there's no performance difference in these two:


local t = type( x )
print( t == v )

and


print( type( x ) == v )

…and that's because in the first case, the result of the call is stored in the register for 't' which is then used in the comparison, and in the latter, the result of the call is stored in a newly allocated register and then used in the comparison. Basically, they're both stored in a register, so it's all the same. However, if you were to use the result of type( x ) again, it'd be easier to just use the register for 't' rather than calling the function again.

At the same time, if you only have a few cases in that if statement, there's no significant advantage performance-wise to storing the type in a variable, so it's just down to what looks best and what's more readable really.
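As a small illustration of reusing the stored result across several branches, with made-up sample data and counter names:

```lua
local tab = { {}, print, "text", 42, {} }
local counts = { tables = 0, functions = 0, other = 0 }

for i = 1, #tab do
  local typ = type(tab[i])         -- one call to type() per element
  if typ == "table" then           -- each branch only compares the local
    counts.tables = counts.tables + 1
  elseif typ == "function" then
    counts.functions = counts.functions + 1
  else
    counts.other = counts.other + 1
  end
end

print(counts.tables, counts.functions, counts.other)
```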
Sewbacca #37
Posted 11 July 2016 - 05:40 PM
Thank You =)
CrazedProgrammer #38
Posted 11 July 2016 - 08:14 PM
Wow, never saw this topic, great job!
Although I already do around 90% of the optimizations you pointed out, this will help me make even faster code!
Thanks! :D
Edited on 11 July 2016 - 06:14 PM
The Crazy Phoenix #39
Posted 11 July 2016 - 11:58 PM
If I'm trying to initialize a table of dynamic or insanely high size, how can I optimize it such that LuaJ will initialize its array with the size I want? For example, a 1,048,576-sized array.

Whilst you address many interesting optimizations, to some, there really is no way of abusing them when using dynamic sizes.
Bomb Bloke #40
Posted 12 July 2016 - 01:31 AM
It can be done, but I can't think of any method that'd actually be faster than simply sticking the values in one index at a time.

For example, an array the size you're talking about can be built within a twentieth of a second via a simple "for" loop - you'd need to go a lot larger than that before it'd start to matter.
Edited on 11 July 2016 - 11:32 PM
The Crazy Phoenix #41
Posted 12 July 2016 - 08:42 PM
Timed it and it took 0.15 seconds, 15x more than you said. Initializing the array with the correct size initially could reduce that time by quite a bit.
KingofGamesYami #42
Posted 12 July 2016 - 09:00 PM
That's only 3 times more than 0.05. And it varies depending on the computer - it even varies depending on the test. Also, tables are heavily optimized from the Java side of LuaJ.
I performed a simple test, the test script being

local t = {}
local time = os.clock()
for i = 1, 1048576 do
  t[ i ] = i
end
print( os.clock() - time )
And I got these (rather varied) results:
Edited on 12 July 2016 - 08:09 PM
The Crazy Phoenix #43
Posted 13 July 2016 - 10:37 AM
I was also creating 1,048,576 zeros in my test, so that may have something to do with it. I got similar results every time I ran it (about 0.15 seconds).
Edited on 13 July 2016 - 08:38 AM
HDeffo #44
Posted 19 July 2016 - 09:34 AM
If I'm trying to initialize a table of dynamic or insanely high size, how can I optimize it such that LuaJ will initialize its array with the size I want? For example, a 1,048,576-sized array.

Whilst you address many interesting optimizations, to some, there really is no way of abusing them when using dynamic sizes.

Please note the following is all untested. I remember reading on the Lua talk pages of a little "hack" where you could initialize an empty table, then use some bytecode hacking to set it to a predefined size of nil values, then save it again as the new larger table. That being said, as it is untested, I am not sure of the proper bytecode to achieve this, nor am I sure how this would affect run times. Unless you feel like delving into the really complicated and needless parts of Lua, you are best off sticking with Bomb Bloke's suggestion. In LuaJ a for loop doesn't increase run times at all, not even to declare its running variable, or at least by so little that a profiler couldn't pick it up. Alternatively, as stated, declaring it full of nil values prior to running that for loop would still be faster, but no one wants to type out tbl = {(1 million+ nils)}. For loops seem the most practical solution to this in any case.
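One way to get a pre-filled constructor without typing it out is to generate the source as a string and load it. This is only a sketch under assumptions: it fills the table with zeros rather than nils, uses a small size for illustration, and has not been timed under LuaJ, so it may well be no faster than the plain for loop.

```lua
local n = 1024                       -- small assumed size, for illustration only

-- loadstring exists in Lua 5.1 (the version CC's LuaJ implements);
-- newer Lua versions accept a string in load instead.
local loader = loadstring or load

-- Builds the source "return {0,0,0,...}" with n entries and compiles it.
local build = assert(loader("return {" .. string.rep("0,", n) .. "}"))
local tbl = build()

print(#tbl)
```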