This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

Extended String Library

Started by theoriginalbit, 04 January 2013 - 11:15 AM
theoriginalbit #1
Posted 04 January 2013 - 12:15 PM
Thread Closed in favour of merging to a single, easier to manage, thread and website
Extended String Library
v2.3

This library was inspired by my attempts to use common string functions from high-level languages that don't exist in Lua. So at 2am, I decided to go through the Java, C# and Objective-C string classes and implement most of their functions in Lua.

Some of the implemented functions already exist in Lua, but I did them anyway. If there are any you would like added, or if you find any bugs or incorrect implementations, please report them here.

Java Functions ( from here )
function charAt( str, index )
Returns the character at the specified index of the string

function compareTo( oneString, anotherString )
Compares two strings lexicographically

function compareToIgnoreCase( oneString, anotherString )
Compares two strings lexicographically ignoring case differences

function concat( oneString, anotherString )
Returns a new string with the second string concatenated to the first

function contains( str, seq )
Returns true if, and only if, this string contains the specified sequence of character values

function contentEquals( oneValue, anotherValue )
Compares the string against the specified value

function copyValueOf( tData )
Returns a String that represents the characters in the table specified

function copySubsetOf( tData, offset, count )
Returns a String of the same length as count that represents the characters in the table specified starting from offset

function endsWith( str, suffix )
Tests if the string ends with the specified suffix

function equals( str, value )
Compares the string to the specified value

function equalsIgnoreCase( oneStr, anotherStr )
Compares the string to another string, ignoring case considerations

function getBytes( str )
Encodes the string into a sequence of bytes, storing the result into a new byte array

function getChars( str )
Converts this string to a new character table.

function hashCode( str )
Returns a hash code for this string ( Java's algorithm )

function indexOf( str, ch )
Returns the index within the string of the first occurrence of the specified character

function indexOfAfter( str, ch, index )
Returns the index within the string of the first occurrence of the specified character, starting the search at the specified index.

function indexOfSubstring( str, sub )
Returns the index within the string of the first occurrence of the specified substring.

function indexOfSubstringFrom( str, sub, index )
Returns the index within the string of the first occurrence of the specified substring, starting at the specified index.

function isEmpty( str )
Returns true if, and only if, length is 0.

function lastIndexOf( str, ch )
Returns the index within the string of the last occurrence of the specified character.

function lastIndexOfFrom( str, ch, index )
Returns the index within the string of the last occurrence of the specified character, searching backward starting at the specified index.

function lastIndexOfSubstring( str, sub )
Returns the index within the string of the rightmost occurrence of the specified substring.

function lastIndexOfSubstringFrom( str, sub, index )
Returns the index within the string of the last occurrence of the specified substring, searching backward starting at the specified index.

function regionMatches( str, sOffset, other, oOffset, len)
Tests if two string regions are equal.

function replace( str, old, new )
Returns a new string resulting from replacing all occurrences of old in the string with new

function replaceFirst( str, old, new )
Returns a new string resulting from replacing the first occurrence of old in the string with new

function split( str, regex )
Returns table of the string split around matches of the given regular expression

function splitLimit( str, regex, lim )
Returns table of the string split around matches of the given regular expression, until limit - 1 is reached

function startsWith( str, prefix )
Tests if this string starts with the specified prefix.

function subSequence( str, beginIndex, endIndex )
Returns a new character sequence that is a subsequence of the string.

function toCharTable( str )
Converts this string to a new character table.

function trim( str )
Returns a copy of the string, with leading and trailing whitespace omitted.

function valueOf( tData )
Returns a string representation of the table supplied

function valueOfWith( tData, offset, count )
Returns a string representation of length count of the table supplied from offset
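
As an illustration of the "Java's algorithm" note on hashCode above, here is a minimal sketch. It assumes the function works on bytes (which matches Java's result for plain ASCII text) and that the result should wrap like a Java 32-bit signed int. The body is an assumption for illustration, not the library's actual source.

```lua
-- Hypothetical sketch; not taken from the library's source.
-- Java defines String.hashCode() as s[0]*31^(n-1) + ... + s[n-1],
-- evaluated with 32-bit signed integer overflow.
local function hashCode(str)
  local h = 0
  for i = 1, #str do
    -- keep the running value inside 32 bits so it wraps like a Java int
    h = (h * 31 + str:byte(i)) % 4294967296
  end
  -- reinterpret the unsigned 32-bit value as a signed one
  if h >= 2147483648 then
    h = h - 4294967296
  end
  return h
end

print(hashCode("hello"))  -- 99162322
```

Reducing modulo 2^32 on every step keeps the intermediate value well inside the range a Lua number can hold exactly.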

C# Functions ( from here )
function isWhitespace( str )
Indicates whether a specified string consists only of white-space characters.

function padLeft( str, count )
Returns a new string that right-aligns the characters in this instance by padding them with spaces on the left, for a specified total length.

function padLeftWith( str, count, char )
Returns a new string that right-aligns the characters in this instance by padding them on the left with a specified character, for a specified total length.

function padRight( str, count )
Returns a new string that left-aligns the characters in this string by padding them with spaces on the right, for a specified total length.

function padRightWith( str, count, char )
Returns a new string that left-aligns the characters in this string by padding them on the right with a specified Unicode character, for a specified total length.

function trimEnd( str )
Removes all trailing occurrences of a set of characters specified in an array from the current String object.

function trimStart( str )
Removes all leading occurrences of a set of characters specified in an array from the current String object.
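
The padding functions above can be sketched roughly as follows. This assumes `char` is a single character and that a string already at least `count` long is returned unchanged. The names mirror the listing, but the bodies are an assumption, not the library's source.

```lua
-- Hypothetical sketch of the padding helpers; bodies are illustrative only.
local function padLeftWith(str, count, char)
  -- string.rep returns "" when the repeat count is zero or negative,
  -- so strings that are already long enough come back unchanged
  return string.rep(char, count - #str) .. str
end

local function padRightWith(str, count, char)
  return str .. string.rep(char, count - #str)
end

-- the space-padding variants just fix the pad character
local function padLeft(str, count) return padLeftWith(str, count, " ") end
local function padRight(str, count) return padRightWith(str, count, " ") end

print("[" .. padLeft("42", 5) .. "]")            -- [   42]
print("[" .. padRightWith("42", 5, ".") .. "]")  -- [42...]
```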

Objective-C Functions ( from here )
function stringWithContentsOfFile( path )
Returns a string with contents of the file at the given path

function stringWithContentsOfURL( url )
Returns a string with the contents of the given url

function writeToFile( str, path )
Writes the given string to a file at the given path

function characterAtIndex( str, index )
Returns the character at the given index of the string

function getCharacters( str, range )
Converts this string to a new character table.

function componentsSeparatedByString( str, sub )
Returns a table containing substrings from the string that have been divided by a given separator.

function substringFromIndex( str, index )
Returns a new string containing the characters of the string from the one at a given index to the end.

function substringToIndex( str, index )
Returns a new string containing the characters of the string up to, but not including, the one at a given index.

function substringWithRange( str, index, len )
Returns a string object containing the characters of the string that lie within a given range.

function rangeOfString( str, sub )
Finds and returns the range of the first occurrence of a given string within the string.

function rangeOfStringWithinRange( str, sub, index, len )
Finds and returns the range of the first occurrence of a given string, within the given range of the string.

function stringByReplacingOccurrencesOfStringWithString( str, old, new )
Returns a new string in which all occurrences of a target string in the string are replaced by another given string.

function capitalizedString( str )
Returns a capitalized representation of the string.

function uppercaseString( str )
Returns an uppercased representation of the string.

function lowercaseString( str )
Returns a lowercased representation of the string.

function intValue( str )
Returns a floored number of the given string

function boolValue( str )
Returns a boolean based on the string, if the string is not a boolean, returns nil

function pathComponents( str )
Returns a table of strings containing, in order, each path component of the string.

function lastPathComponent( str )
Returns the last path component of the string.

function isAbsolutePath( str )
Returns a Boolean value that indicates whether the string represents an absolute path.

function pathExtension( str )
Interprets the string as a path and returns the string's extension, if any.

function stringByAppendingPathComponent( str, comp )
Returns a new string made by appending to the string a given string.

function stringByAppendingPathExtension( str, ext )
Returns a new string made by appending to the string an extension separator followed by a given extension.

function stringByDeletingLastPathComponent( str )
Returns a new string made by deleting the last path component from the string, along with any final path separator.

function stringByDeletingPathExtension( str )
Returns a new string made by deleting the extension (if any, and only the last) from the string.

function stringByAddingPercentEscapes( str )
Returns a representation of the string converted into a legal URL string.

function stringByReplacingPercentEscapes( str )
Returns a new string made by replacing in the string all percent escapes with the matching characters
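
A few of the path functions above could look something like this. It is a rough sketch that assumes "/"-separated paths and ignores edge cases such as a bare "/", so it may differ from both NSString and the library's real behaviour.

```lua
-- Hypothetical sketch; assumes "/" is the only path separator.
local function lastPathComponent(str)
  -- drop trailing separators, then capture everything after the final "/"
  local trimmed = str:gsub("/+$", "")
  return trimmed:match("([^/]*)$")
end

local function pathExtension(str)
  -- text after the last "." of the last component, or "" when there is none
  return lastPathComponent(str):match("%.([^%.]*)$") or ""
end

local function stringByDeletingPathExtension(str)
  -- remove only the final ".ext", and only when it sits in the last component
  return (str:gsub("%.[^%./]*$", ""))
end

local path = "/rom/apis/strLib.lua"
print(lastPathComponent(path))              -- strLib.lua
print(pathExtension(path))                  -- lua
print(stringByDeletingPathExtension(path))  -- /rom/apis/strLib
```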

Functions added by me, that I felt would be useful:
function containsIgnoreCase( str, seq )
Returns true if, and only if, this string contains the specified sequence of character values, ignoring case differences


function startsWithIgnoreCase( str, prefix )
Tests if this string starts with the specified prefix, ignoring case differences


function endsWithIgnoreCase( str, suffix )
Tests if the string ends with the specified suffix, ignoring case differences


function regionMatchesIgnoreCase( str, sOffset, other, oOffset, len)
Tests if two string regions are equal, ignoring case differences


function sentenceCase( str )
Returns a string with the first letter of each sentence capital and the rest lowercase.


function titleCase( str )
Returns a string with the first letter of each word capital and the rest lowercase.


function splitLineToTable( str, width )
Returns a table containing lines of text that do not exceed the supplied width, without breaking a word across two lines.


function splitLine( str, width )
Returns a string containing line breaks at the given width without splitting a word over the line


function count( str, regex )
Returns the number of times the regex appears in the string


function countIgnoreCase( str, regex )
Returns the number of times the regex appears in the string ignoring the case


function isLower( str )
Returns a boolean representing if all characters in the string are lowercase


function isUpper( str )
Returns a boolean representing if all characters in the string are uppercase


function isAlpha( str )
Returns a boolean representing if all characters in the string are within the standard ASCII alphabet


function isAlphaNumeric( str )
Returns a boolean representing if all characters in the string are either within the standard ASCII alphabet or a number


function isNumeric( str )
Returns a boolean representing if all characters in the string are numbers


function isPunctuation( str )
Returns a boolean representing if all characters in the string are punctuation
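
The word-wrapping pair splitLineToTable / splitLine can be sketched with a simple greedy loop. This assumes runs of whitespace collapse to a single space and that a word longer than `width` is left on a line of its own. The real library may handle both cases differently.

```lua
-- Hypothetical sketch of greedy word wrapping; not the library's source.
local function splitLineToTable(str, width)
  local lines, current = {}, ""
  for word in str:gmatch("%S+") do
    if current == "" then
      current = word
    elseif #current + 1 + #word <= width then
      -- the word (plus a separating space) still fits on this line
      current = current .. " " .. word
    else
      -- start a new line rather than breaking the word
      lines[#lines + 1] = current
      current = word
    end
  end
  if current ~= "" then
    lines[#lines + 1] = current
  end
  return lines
end

local function splitLine(str, width)
  return table.concat(splitLineToTable(str, width), "\n")
end

for _, line in ipairs(splitLineToTable("the quick brown fox jumps", 10)) do
  print(line)  -- "the quick", then "brown fox", then "jumps"
end
```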

Requested
Requested by GravityScore:
function isHexadecimal( str )
Returns a boolean representing if all characters in the string are hexadecimal ( NOTE: Cannot support 0x20 formatted hexadecimal )
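
Since the change log says the is* checks use pattern matching, here is a minimal sketch of how that might look with Lua's character classes. The helper `allMatch` is a made-up name, and treating the empty string as false is an assumption.

```lua
-- Hypothetical sketch; `allMatch` is an illustrative helper, not a library function.
local function allMatch(str, class)
  -- ^ and $ anchor the match to the whole string; + rejects the empty string
  return str:match("^" .. class .. "+$") ~= nil
end

local function isHexadecimal(str) return allMatch(str, "%x") end
local function isNumeric(str)     return allMatch(str, "%d") end
local function isPunctuation(str) return allMatch(str, "%p") end

print(isHexadecimal("DEADbeef"))  -- true
print(isHexadecimal("0x20"))      -- false, because "x" is not a hex digit
```

This also shows why the 0x20 format is rejected: the "x" in the prefix fails the %x class.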

Using the OO (Object-Oriented) Library
Creating an extended string object
local output = str:new( "Yes this is string!" )

Interacting with an extended string object
output:charAt( 1 )
This can be done with any function*

*The following functions are not OO and should be called as normal functions:
  • copyValueOf( tData )
  • copySubsetOf( tData, offset, count )
  • valueOf( tData )
  • valueOfWith( tData, offset, count )
  • stringWithContentsOfFile( path )
  • stringWithContentsOfURL( url )
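
For anyone curious how an object wrapper like this can work, below is a rough sketch using a metatable. Everything in it is an assumption for illustration: `strLib` is a two-function stand-in for the real library, the `value` field is a made-up name, indexes are taken to be 1-based, and results come back as plain strings rather than new objects.

```lua
-- Hypothetical stand-in for the function library.
local strLib = {}
function strLib.charAt(s, index) return s:sub(index, index) end
function strLib.replace(s, old, new)
  -- escape pattern magic characters so `old` is treated as plain text
  local plain = old:gsub("%p", "%%%0")
  -- a function replacement stops "%" in `new` being read as a capture
  return (s:gsub(plain, function() return new end))
end

-- Hypothetical object wrapper around a plain Lua string.
local str = {}
str.__index = function(self, key)
  local fn = strLib[key]
  if fn then
    -- forward the call, passing the wrapped value as the first argument
    return function(obj, ...) return fn(obj.value, ...) end
  end
end
str.__tostring = function(self) return self.value end

function str:new(value)
  return setmetatable({ value = value }, self)
end

local output = str:new("Yes this is string!")
print(output:charAt(1))                      -- Y
print(output:replace("string", "a string"))  -- Yes this is a string!
```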


Download the standard library here
Download the String Object library here
Bug and Suggestion Reporter here

Change log:
v2.3
  • FIXED: isLower, isUpper, isAlpha, isAlphaNumeric — all now use pattern matching
  • ADDED: isNumeric, isPunctuation, isHexadecimal
v2.2
  • ADDED: isLower, isUpper, isAlpha, isAlphaNumeric
v2.1
  • I can't remember… oops…
v2.0
  • ADDED: New download, string object library - deal with this api as an object
  • FIXED: General bugfix
v1.3.2
  • ADDED: 2 new functions
v1.3.1
  • ADDED: 2 new functions
v1.3
  • ADDED: Implemented a selection of Objective-C methods
  • FIXED: Misc bugfixes
v1.2
  • ADDED: Implemented a selection of C# methods
  • FIXED: Misc bugfixes
v1.1.1
  • FIXED: Bug in compareTo and compareToIgnoreCase functions, weren't comparing lexicographically
v1.1
  • ADDED: hashCode function to Java's implementation
  • FIXED: Bug in isEmpty; it now returns true if length = 0
v1.0
  • Initial release

Edited on 23 January 2013 - 06:27 PM
theoriginalbit #2
Posted 04 January 2013 - 10:12 PM
Update 1.1

Adds Java standard hashCode function!
Fixes bug in isEmpty, now evaluates as per Java implementation
kornichen #3
Posted 04 January 2013 - 11:34 PM
This is so great!
theoriginalbit #4
Posted 05 January 2013 - 12:21 AM
This is so great!

Why thank you. :D

Currently out to dinner. When I get back home I'll fix up the OP to explain the functions and give some nicer details, so check back in a few hrs ;)
GravityScore #5
Posted 05 January 2013 - 09:18 AM
Wow this is useful :D. Good luck with the Objective-C library - it's like 180 functions long!

Please, in the Objective-C library add the "working with paths" functions (like [string stringByAppendingPathComponent:@"hello"], etc…).
theoriginalbit #6
Posted 05 January 2013 - 11:40 AM
Wow this is useful :D

Thanks, I was hoping it would be useful to others. :)

Good luck with the Objective-C library - it's like 180 functions long!

Please, in the Objective-C library add the "working with paths" functions (like [string stringByAppendingPathComponent:@"hello"], etc…).
Yeah, Obj-C isn't tiny, but it has some nice stuff. I started programming with Obj-C, and I've always liked what it had.

I shall give it a go… after I figure out exactly what it does ;) That's the hardest part sometimes: replicating exactly what the functions do. On that note, I just noticed a wrong implementation of one of the functions. Damn.
theoriginalbit #7
Posted 07 January 2013 - 05:20 PM
UPDATE!

Changes:
All Versions: BUGFIXES!
v1.2: Implemented some C# functions
v1.3: Implemented some Obj-C functions

Again, if you find any bugs or have any suggestions for functions in this library, report them here
theoriginalbit #8
Posted 09 January 2013 - 04:50 PM
Minor Update

Added 2 new functions:

function splitLineToTable( str, width )
Returns a table containing lines of text that do not exceed the supplied width, without breaking a word across two lines.

function splitLine( str, width )
Returns a string containing line breaks at the given width without splitting a word over the line
anonimo182 #9
Posted 09 January 2013 - 04:59 PM
Nice library of new functions!
theoriginalbit #10
Posted 09 January 2013 - 05:04 PM
Thank you :)

If there are any that you wish to see, just post them on the suggestions page. :)
theoriginalbit #11
Posted 10 January 2013 - 10:07 PM
Minor Update
v1.3.2

Added 2 new functions:


count( str, regex )
Returns the number of times the regex appears in the str


countIgnoreCase( str, regex )
Returns the number of times the regex appears in the str ignoring the case
NeverCast #12
Posted 11 January 2013 - 12:43 PM
Loading this API in cc-emu crashes the computer. not tried it in actual minecraft yet. Know anything about this? Because I don't, I can't even protected call the os.loadAPI without issues.. I'll try running the code globally.. one moment

Right so after testing it seems I cannot override the global string instance with this api, correct?
theoriginalbit #13
Posted 11 January 2013 - 01:16 PM
Loading this API in cc-emu crashes the computer. not tried it in actual minecraft yet. Know anything about this? Because I don't, I can't even protected call the os.loadAPI without issues.. I'll try running the code globally.. one moment

Right so after testing it seems I cannot override the global string instance with this api, correct?
I was going to say "What?! I'm running cc-emu and it's perfectly fine!"

Yes, that is correct: the string metatable is protected and cannot be modified. Doing this will work, though:


if not strLib then os.loadAPI( "path/to/lib/strLib" ) end

s = "Test:string"
s = strLib.replace( s, ":", " " )
NeverCast #14
Posted 11 January 2013 - 02:45 PM
That explains it then! At least I can use it fine :) Thanks
theoriginalbit #15
Posted 11 January 2013 - 03:26 PM
That explains it then! At least I can use it fine :) Thanks

You're welcome… I'm constantly adding new functions to this; if there are any you want added, just report them on the suggestions page (see OP)… One of my little side projects is to make this OO, so strings are dealt with like objects. So instead of doing
var = strLib.replace( var, "\r\n", "\n" )
you would be able to do
var = var:replace( "\r\n", "\n" )
theoriginalbit #16
Posted 12 January 2013 - 02:36 AM
Update!
v2.0
  • Now has an Object version! ( See OP )
  • General bug-fixes
NeverCast #17
Posted 14 January 2013 - 04:00 PM
After messing around I think you can make your string the only implementation.

Let's assume that your lib is called stringx internally.

-- remember the native string table
local nativestring = string
-- create an index to it so we carry on the functions using a metatable
local stringx_mt = { __index = nativestring }
-- set the metatable to stringx
setmetatable(stringx, stringx_mt)
-- use rawset to set the string instance without invoking any metatable functions
rawset(_G, "string", stringx)

-- now later on outside the api, we can test that we successfully overrode the string table
v = "hello   "
print(v:trim())

Good luck :)
NeverCast #18
Posted 14 January 2013 - 04:03 PM
It seems I may have been wrong: although you can override functions that string already has, you can't seem to create new ones. Probably because of the metatable that string has internally. I got my hopes up when I managed to make string.len return the length and print out hello world.
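
One plausible explanation, sketched here for stock Lua rather than ComputerCraft specifically: method calls such as v:trim() are resolved through the string metatable's __index, which keeps pointing at the original string table even after the global name string is replaced. The `stringx` table and its `trim` function below are stand-ins, not the library's code.

```lua
-- Hypothetical sketch; run in plain Lua, behaviour inside CC may differ.
local native = string
local stringx = setmetatable({}, { __index = native })
function stringx.trim(s)
  return (native.gsub(s, "^%s*(.-)%s*$", "%1"))
end

-- replace the global name, as in the post above
rawset(_G, "string", stringx)

-- calls made through the global name now reach stringx
local viaGlobal = string.trim("hello   ")

-- method calls on a string value go through the string metatable's __index,
-- which still refers to the original table, so trim is not found there
local ok = pcall(function() return ("hello   "):trim() end)

rawset(_G, "string", native)  -- put the original table back

print(viaGlobal == "hello")  -- true
print(ok)                    -- false
```

That would fit the observation that overriding worked when functions were called as string.len( ... ) but new functions were missing when called with the colon syntax.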
theoriginalbit #19
Posted 14 January 2013 - 04:13 PM
Yeah, the string metatable is protected… believe me, if it wasn't I would have been overriding it! ;)
NeverCast #20
Posted 14 January 2013 - 04:16 PM
There must be a way to remove this ( Besides replace bios.lua )! *Hacker Face!*
theoriginalbit #21
Posted 14 January 2013 - 04:18 PM
There must be a way to remove this ( Besides replace bios.lua )! *Hacker Face!*
glhf and tell me if you manage it without rewrites…
Eric #22
Posted 14 January 2013 - 08:47 PM
There must be a way to remove this ( Besides replace bios.lua )! *Hacker Face!*
glhf and tell me if you manage it without rewrites…

Something like this could probably do it, with:


real_getmetatable = select(some_n, steal_stack(getmetatable))
theoriginalbit #23
Posted 14 January 2013 - 08:52 PM
Something like this could probably do it, with:


real_getmetatable = select(some_n, steal_stack(getmetatable))

Is there meant to be a code download on that page or something? Because last I checked steal_stack wasn't a function in Lua
Eric #24
Posted 15 January 2013 - 11:38 AM
Something like this could probably do it, with:


real_getmetatable = select(some_n, steal_stack(getmetatable))

Is there meant to be a code download on that page or something? Because last I checked steal_stack wasn't a function in Lua

Yep, the code is:
local steal_stack = loadstring("\27\76\117\97\81\0\1\4\4\4\8\0\0\0\0\0\0\0\0\0\7\0\0\0\0\1\2\250\4\0\0\0\65\62\0\0\101\0\0\0\28\64\0\0\94\0\0\0\1\0\0\0\0\0\0\0\0\4\0\0\0\3\0\0\0\4\0\0\0\5\0\0\0\7\0\0\0\0\0\0\0\0\0\0\0");

Unfortunately, that doesn't work in CC, presumably since LuaJ uses a stackless VM
theoriginalbit #25
Posted 15 January 2013 - 11:51 AM
Yep, the code is:
local steal_stack = loadstring("\27\76\117\97\81\0\1\4\4\4\8\0\0\0\0\0\0\0\0\0\7\0\0\0\0\1\2\250\4\0\0\0\65\62\0\0\101\0\0\0\28\64\0\0\94\0\0\0\1\0\0\0\0\0\0\0\0\4\0\0\0\3\0\0\0\4\0\0\0\5\0\0\0\7\0\0\0\0\0\0\0\0\0\0\0");
Where was that?! I couldn't see that on there… never mind, I now see it… damn byte code, didn't stand out to me :P

Unfortunately, that doesn't work in CC, presumably since LuaJ uses a stackless VM
Damn… oh well…
NeverCast #26
Posted 15 January 2013 - 11:54 AM
Byte code is what I'll be using. The endianness of LuaJ has put me a bit behind, but dumping functions to compiled bytecode, pulling out the chunk string and changing the bytecode, then reloading it back into a function and reassigning the function should do it. It's a long shot, and it'll be a while before I'm confident enough with my Lua bytecode.
theoriginalbit #27
Posted 15 January 2013 - 11:56 AM
Byte code is what I'll be using. The endianness of LuaJ has put me a bit behind, but dumping functions to compiled bytecode, pulling out the chunk string and changing the bytecode, then reloading it back into a function and reassigning the function should do it. It's a long shot, and it'll be a while before I'm confident enough with my Lua bytecode.
Yeh I'm not even going to bother with bytecode…
Eric #28
Posted 15 January 2013 - 11:59 AM
Byte code is what I'll be using. The endianness of LuaJ has put me a bit behind, but dumping functions to compiled bytecode, pulling out the chunk string and changing the bytecode, then reloading it back into a function and reassigning the function should do it. It's a long shot, and it'll be a while before I'm confident enough with my Lua bytecode.
Not sure you can pull off the "reloading it back in to a function" step. A quick test shows it's more complicated:

> ms = string.dump(getmetatable)
> loaded_getmetatable = loadstring(ms)
> getmetatable({})
> loaded_getmetatable({})
string:-1: vm error: java.lang.NullPointerException
> string.dump(loaded_getmetatable) == string.dump(getmetatable)
true
theoriginalbit #29
Posted 18 January 2013 - 11:36 AM
Update!
v2.2
  • Added 4 new functions: isLower, isUpper, isAlpha, isAlphaNumeric
GravityScore #30
Posted 18 January 2013 - 10:43 PM
Request: isHexadecimal
theoriginalbit #31
Posted 19 January 2013 - 01:11 AM
Request: isHexadecimal
Done! And more! :P

Update
v2.3
  • FIXED: isLower, isUpper, isAlpha, isAlphaNumeric — all now use pattern matching
  • ADDED: isNumeric, isPunctuation, isHexadecimal
AfterLifeLochie #32
Posted 25 January 2013 - 12:48 AM
Locked by request.