This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

How to split a string at specified character

Started by Bye., 03 July 2016 - 09:04 AM
Bye. #1
Posted 03 July 2016 - 11:04 AM
Hi everyone,
I have a question for you: how can I split an input string at a specified character, as in the code below?

output = {}
input = read()
-- input = "command arg1 arg2"
output = splitString(input, " ")
-- output = {"command","arg1","arg2"}

Thank you!
Edited on 03 July 2016 - 09:04 AM
LBPHacker #2
Posted 03 July 2016 - 11:27 AM
There is a solution that involves a for loop checking every letter. It'd add a letter to a buffer if it wasn't a space, and would flush the buffer (would insert its content into a table and empty it) if it was a space. That's the more tedious one, and I won't be explaining that.
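
For reference, the buffer approach described above could be sketched roughly as follows. The name splitBySpace is made up for illustration, and skipping empty buffers (so that repeated spaces produce no empty words) is an assumption, not something the description requires.

```lua
-- Hypothetical helper: split on spaces by scanning one character at a time.
local function splitBySpace(str)
    local words = {}
    local buffer = ""
    for i = 1, #str do
        local ch = str:sub(i, i)
        if ch == " " then
            -- Flush the buffer into the table; skip it when it is empty
            -- (assumption: consecutive spaces should not yield empty words).
            if buffer ~= "" then
                table.insert(words, buffer)
                buffer = ""
            end
        else
            -- Not a space: keep collecting the current word.
            buffer = buffer .. ch
        end
    end
    -- Flush whatever is left after the last space.
    if buffer ~= "" then
        table.insert(words, buffer)
    end
    return words
end
```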

Here's one with string.gmatch (read about it here):
local words = {}
for word in str:gmatch("%S+") do
    table.insert(words, word)
end

The above code splits str into words not containing anything that matches the character class "%s" (remember, not containing, hence the "%S" with the uppercase S), which covers space, tab, newline, carriage return, vertical tab and form feed. Because of the "+", it also ignores empty matches, so splitting
"a  b" -- note the double space
would result in
{"a", "b"}
rather than
{"a", "", "b"}

You could use "[^ ]+" instead of "%S+" to allow anything but a space in the resulting words, or exclude any other character instead: for example, "[^;]+" would split str along semicolons.
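
Wrapped into a function, the "[^;]+" idea might look like the sketch below. splitString is the name from the original question, not a built-in; the escaping step is an addition so that separators which are special inside a character set (such as "]", "^", "-" or "%") are taken literally.

```lua
-- Hypothetical helper: split str at every occurrence of the character(s) in sep.
local function splitString(str, sep)
    -- Prefix every non-alphanumeric character with "%" so it is literal inside [^...].
    local escaped = sep:gsub("%W", "%%%0")
    local parts = {}
    -- Collect each run of characters that are not separators.
    for part in str:gmatch("[^" .. escaped .. "]+") do
        table.insert(parts, part)
    end
    return parts
end

local output = splitString("command arg1 arg2", " ")
```

Like the loop above it, this drops empty fields, so "a;;b" split at ";" gives two entries, not three.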

Since you're planning to parse command lines with this, I must add that this solution is a bit complicated to extend to allow commands with quotes or backslashes to escape spaces, such as
program arg1 arg2 "this is arg3"
program arg1 arg2 this\ is\ arg3
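
One possible way to handle those two cases is to go back to a character loop. This is only a rough sketch: parseCommandLine is a made-up name, and the rules it assumes (double quotes group words, a backslash makes the next character literal, quotes cannot be nested) are a guess at what a simple shell would want, not an existing ComputerCraft API.

```lua
-- Hypothetical helper: split a command line, honouring "double quotes" and backslash escapes.
local function parseCommandLine(line)
    local args = {}
    local buffer = ""
    local inQuotes = false
    local hasToken = false -- true once the current argument has started
    local i = 1
    while i <= #line do
        local ch = line:sub(i, i)
        if ch == "\\" and i < #line then
            -- Backslash: copy the next character as-is and skip over it.
            buffer = buffer .. line:sub(i + 1, i + 1)
            hasToken = true
            i = i + 1
        elseif ch == '"' then
            -- Quote: toggle quoted mode; the quote itself is not stored.
            inQuotes = not inQuotes
            hasToken = true
        elseif ch == " " and not inQuotes then
            -- Unquoted space: finish the current argument, if there is one.
            if hasToken then
                table.insert(args, buffer)
                buffer, hasToken = "", false
            end
        else
            buffer = buffer .. ch
            hasToken = true
        end
        i = i + 1
    end
    -- Finish the last argument.
    if hasToken then
        table.insert(args, buffer)
    end
    return args
end
```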
Edited on 03 July 2016 - 09:29 AM
Bye. #3
Posted 03 July 2016 - 11:33 AM
Thank you!
Where can I find a for loop tutorial? I mean a tutorial about advanced loops like the one you used.
LBPHacker #4
Posted 03 July 2016 - 11:36 AM
string.gmatch (or anyString:gmatch) returns an iterator (read about them here). This "advanced" for loop is just another variant that can be used to, well, iterate using these iterators. The page I linked explains it better.
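
A small sketch of what that means in practice: the iterator is just a function that hands back the next value each time it is called and nil when it is done. The countdown function below is made up for illustration.

```lua
-- gmatch returns a function; each call yields the next match.
local nextWord = ("command arg1 arg2"):gmatch("%S+")
local first = nextWord()  -- "command"
local second = nextWord() -- "arg1"

-- A hand-written iterator (hypothetical) works with the same for syntax.
local function countdown(n)
    return function()
        if n > 0 then
            n = n - 1
            return n + 1
        end
        -- Returning nothing (nil) ends the for loop.
    end
end

local seen = {}
for value in countdown(3) do
    table.insert(seen, value)
end
-- seen now holds 3, 2, 1
```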
Edited on 03 July 2016 - 09:37 AM
Bye. #5
Posted 03 July 2016 - 11:38 AM
Thank you!