Regarding the Lua encoding - ComputerCraft Forums (archive)

SavinaRoja #1

15 posts

Posted 30 July 2012 - 02:03 PM

It appears to me that ComputerCraft can't use Lua scripts that are encoded in UTF-8. I re-encoded one of my scripts from ANSI to UTF-8 this morning and it failed to execute, always erroring on the first line due to an unrecognized character.

Has anyone else observed this? Is it generally understood that you should not encode in UTF-8 for CC?

ChiknNuggets #2

127 posts

Posted 30 July 2012 - 03:13 PM

UTF-8 doesn't have as many characters as ANSI. The ones that are missing are things like ", and it replaces them with a symbol such as •, which is not the right symbol when run as a program, so it throws an error. If you want more info on this, check the codepage layout section on these 2 links: http://en.wikipedia.org/wiki/UTF-8#Codepage_layout http://en.wikipedia.org/wiki/Windows-1252#Codepage_layout

I hope this explains what you wanted to know

MysticT #3

1604 posts

Posted 30 July 2012 - 05:13 PM

ChiknNuggets, on 30 July 2012 - 03:13 PM said:
UTF-8 doesn't have as many characters as ANSI. The ones that are missing are things like ", and it replaces them with a symbol such as •, which is not the right symbol when run as a program, so it throws an error. If you want more info on this, check the codepage layout section on these 2 links: http://en.wikipedia....Codepage_layout http://en.wikipedia....Codepage_layout

I hope this explains what you wanted to know

???
UTF-8 is a Unicode encoding, so it can encode more characters than ANSI, since it uses multi-byte characters.
The LuaJ compiler/interpreter is made to read ANSI files; that's why you can't use UTF-8.
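
To make the multi-byte point concrete, here is a minimal Python sketch (not from the thread) that compares how a few characters are stored in each encoding. It assumes "ANSI" means Windows-1252, as on the Wikipedia page linked earlier; the function name encoded_sizes and the sample characters are made up for illustration.

```python
# Compare how many bytes a character takes in Windows-1252 ("ANSI",
# an assumption about which code page is meant) and in UTF-8.

def encoded_sizes(ch):
    """Return (cp1252 byte count, utf-8 byte count) for one character."""
    return len(ch.encode("cp1252")), len(ch.encode("utf-8"))

# Hypothetical sample: two ASCII characters and two non-ASCII ones.
samples = ['"', "a", "é", "•"]

for ch in samples:
    ansi_len, utf8_len = encoded_sizes(ch)
    print(repr(ch), "cp1252 bytes:", ansi_len, "utf-8 bytes:", utf8_len)
```

Plain ASCII characters such as " and a take one byte in both encodings, so only characters outside ASCII are stored differently.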

SavinaRoja #4

15 posts

Posted 30 July 2012 - 06:08 PM

MysticT, on 30 July 2012 - 05:13 PM said:
ChiknNuggets, on 30 July 2012 - 03:13 PM said:
UTF-8 doesn't have as many characters as ANSI. The ones that are missing are things like ", and it replaces them with a symbol such as •, which is not the right symbol when run as a program, so it throws an error. If you want more info on this, check the codepage layout section on these 2 links: http://en.wikipedia....Codepage_layout http://en.wikipedia....Codepage_layout

I hope this explains what you wanted to know
???
UTF-8 is a Unicode encoding, so it can encode more characters than ANSI, since it uses multi-byte characters.
The LuaJ compiler/interpreter is made to read ANSI files; that's why you can't use UTF-8.

Yeah, it didn't make sense to me to say that a Unicode encoding had fewer characters than a Latin-only encoding. Thanks to both of you, though, for helping clear up ANSI and why it doesn't work. I wasn't sure if this was OS environment dependent (i.e. server running on Windows -> ANSI) or just specific to CC. If I understand correctly now, this is general to LuaJ but not Lua.
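
One possible cause of an error on the very first line, which nobody in this thread confirmed, is a UTF-8 byte order mark (BOM): some Windows editors write three extra bytes at the start of a file saved as UTF-8. The Python sketch below shows how one might check a script for that and strip it outside the game. The helper names has_utf8_bom and strip_utf8_bom and the sample script are hypothetical.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """True if the raw bytes start with the UTF-8 byte order mark (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data

# Hypothetical one-line Lua script, encoded with and without a BOM.
script = 'print("hello")\n'
with_bom = script.encode("utf-8-sig")   # "utf-8-sig" prepends the BOM
without_bom = script.encode("utf-8")

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(strip_utf8_bom(with_bom) == without_bom)
```

If the script contains only ASCII characters, the bytes left after stripping the BOM are the same as the ANSI version, so saving as UTF-8 without a BOM may be enough. That is an assumption to test, not a confirmed fix.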