This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

Efficiency and Tokenizing strings

Started by nothign, 26 June 2013 - 02:25 PM
nothign #1
Posted 26 June 2013 - 04:25 PM

I've written some prototypes and methods to split a string into tokens, then color them if they're comments, numbers, strings or they match predefined strings.
I'm writing it for an in-game editor.
All the code works fine, and as intended.

But I'm new to Lua, so my question is: is there a better way to do this? Can I make my code more efficient?

I've put what I've done into a separate file so that it contains only what it needs and can be run to see the results.
Note: an advanced terminal is needed; otherwise it serializes the results to a file named "tableSaved".

The code can be found here:
http://pastebin.com/ZKKKLwb5
Cranium #2
Posted 26 June 2013 - 06:00 PM
Split to new topic.
GopherAtl #3
Posted 26 June 2013 - 09:35 PM
Lua patterns could probably help make this more efficient. You can learn about Lua's patterns (which are quite similar to regular expressions, but ultimately are not regular expressions) from the Lua reference manual or, if you're more comfortable with it, from this tutorial by 1lann.
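
To make the suggestion concrete, here is a minimal sketch of a pattern-driven tokenizer. It is not the code from the pastebin: the rule table, the token type names and the tokenize function are all made up for illustration, and quoted strings are deliberately left out because they need special handling. Each rule is anchored with ^ and tried at the current position, and the first rule that matches wins.

```lua
-- Illustrative rule table (names are assumptions, not from the original code).
-- Order matters: the first rule that matches at the current position wins.
local rules = {
  { "comment",    "^%-%-.*" },       -- "--" to the end of the line
  { "number",     "^%d+%.?%d*" },    -- 17, 4.5
  { "identifier", "^[%a_][%w_]*" },  -- names and keywords
  { "whitespace", "^%s+" },
  { "symbol",     "^%p" },           -- one punctuation character at a time
}

local function tokenize(line)
  local tokens, pos = {}, 1
  while pos <= #line do
    local matched = false
    for _, rule in ipairs(rules) do
      -- With a start position, ^ anchors the match at that position.
      local s, e = line:find(rule[2], pos)
      if s then
        tokens[#tokens + 1] = { type = rule[1], text = line:sub(s, e) }
        pos = e + 1
        matched = true
        break
      end
    end
    if not matched then
      -- Unknown character: emit it alone so the loop always advances.
      tokens[#tokens + 1] = { type = "other", text = line:sub(pos, pos) }
      pos = pos + 1
    end
  end
  return tokens
end

for _, tok in ipairs(tokenize("local x = 17 + 4.5 -- sum")) do
  print(tok.type, tok.text)
end
```

Keywords could then be colored by looking up each identifier token's text in a table of predefined strings.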
nothign #4
Posted 27 June 2013 - 02:44 AM
I implemented a solution using patterns and timed it against my original, and it's much faster, often more than twice as fast. So that's great.

Buuuuut I could not find anywhere on the internet, or create myself, a pattern that matches strings very well. That is to say, double-quote/single-quote pairs, taking escapes into account.

What I've done is here http://pastebin.com/e4Qqahzv

Or, if that's too much to look at, the specific pattern I'm using to match strings is:

"^\".*[^\\]?\""

Running the original method with this string:

"local test = \"\\\"\"..(17 + 47 + tonumber(\"932\"))..\"\\\" is my number!\" --and this is my comment!"
returns these strings (others omitted):

"\"\\\"\""    ".."   "("   "17"   " "   "+"   " "   "47"   " "   "+"   " "   "tonumber"   "("   "\"932\""   ")"   ")"   ".."   "\"\\\" is my number!\""
Running the same string through the pattern-based method, on the other hand, returns this single string for that area:

"\"\\\"\"..(17 + 47 + tonumber(\"932\"))..\"\\\"
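
A plausible reason for the over-match, offered as a guess rather than a diagnosis of the pastebin code: .* is greedy, so it runs to the last quote it can reach, and [^\\]? is optional, so it never forces the character before the closing quote to be a non-backslash. The sketch below uses made-up sample lines (not the string above) to show the greedy behaviour, and why simply switching to the lazy .- is not enough once escaped quotes appear.

```lua
-- Sample inputs are illustrative; long brackets keep the backslashes literal.
local line = [[x = "a".."b" -- c]]

-- Greedy: .* backtracks from the end of the line, so the match ends at the
-- LAST quote it can reach, swallowing both strings and the ".." between them.
local greedy = line:match('".*[^\\]?"')
print(greedy)   --> "a".."b"

-- Lazy: .- stops at the first closing quote it finds.
local lazy = line:match('".-"')
print(lazy)     --> "a"

-- Lazy still breaks on an escaped quote: it stops right after the backslash.
local escaped = [["a\"b" rest]]
local broken = escaped:match('^".-"')
print(broken)   --> "a\"
```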
GopherAtl #5
Posted 27 June 2013 - 03:52 AM
I'm afraid that if there's a way to properly match quoted strings, allowing for escaped quotes and empty strings, with a single Lua pattern, I have never found it. You may have to fall back on a more manual approach there, or at least do a series of pattern matches. The only way I can think of to avoid stopping and checking for a \ before each " after the first is to do a substitution first, replacing every \" and \' with some placeholder, then match the strings, then reverse the substitution; but any speed benefit may be lost by that point.
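
A rough sketch of that substitution idea follows, with two variations that are assumptions on my part rather than anything tested against the pastebin code. First, every backslash escape is masked (not only \" and \'), so that an escaped backslash right before a closing quote is handled. Second, the placeholder has the same length as the escape, so match positions on the masked copy can be applied directly to the original and no reverse substitution is needed. The helper names maskEscapes and matchStringAt are hypothetical.

```lua
-- Hypothetical helpers, not from the thread's pastebin.

-- Replace every backslash escape (a backslash plus the character after it)
-- with two placeholder characters, so the masked copy keeps the same length
-- and the same positions as the original.
local function maskEscapes(src)
  return (src:gsub("\\.", "__"))
end

-- Try to match a quoted string starting exactly at pos.
-- Returns the original (unmasked) text and the position after it, or nil.
local function matchStringAt(src, masked, pos)
  -- %1 requires the closing quote to be the same kind as the opening one.
  local s, e = masked:find("^([\"']).-%1", pos)
  if s then
    return src:sub(s, e), e + 1
  end
  return nil
end

local line = [[local test = "\""..(17 + 47 + tonumber("932")).."\" is my number!" --and this is my comment!]]
local masked = maskEscapes(line)  -- mask once per line, reuse for every match
print(matchStringAt(line, masked, 14))  -- the first quote is at position 14
```

Whether this beats a manual character scan would need timing in-game; the masking pass touches the whole line once, after which each string match is a single find call.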

If you do come up with a single pattern that correctly matches these in a single pass, be sure to share it; I'd be very interested to see it!