This is a read-only snapshot of the ComputerCraft forums, taken in April 2020.

bCode Interpreter Renewed

Started by Gumball, 22 November 2014 - 06:41 PM
Gumball #1
Posted 22 November 2014 - 07:41 PM
Alright, first of all, I KNOW. My last bCode interpreter was such a piece of crap, but this one is much better. It has variables, strings, and some built-in functions. The variables are all stored in one table/array. To create a variable, do:
createVar(string)

That creates a variable. Remember to pass the variable's name as a string; when you use the variable later, it won't be a string.
To define a variable that's been created, do this:
defineVar(variable,string)

Other functions that are built in:
read(string/variable, or nil) or getInput
print(variable/string)
write(variable/string)
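
For readers wondering how a table can back the variables, here is a minimal sketch in plain Lua. It is not taken from the pastebin code: the `vars` table, the `getVar` helper and the error messages are assumptions made up for illustration, and only the names `createVar` and `defineVar` come from the post.

```lua
-- Hypothetical sketch: every interpreter variable lives in one local table,
-- keyed by the variable's name.
local vars = {}

-- Register a variable name; 'false' marks "created but not yet defined".
local function createVar(name)
  assert(type(name) == "string", "variable name must be a string")
  if vars[name] == nil then
    vars[name] = false
  end
end

-- Give a previously created variable a value.
local function defineVar(name, value)
  if vars[name] == nil then
    error("variable '" .. name .. "' was never created", 0)
  end
  vars[name] = value
end

-- Look a variable up (assumed helper, not part of the post).
local function getVar(name)
  local value = vars[name]
  if value == nil then
    error("unknown variable '" .. name .. "'", 0)
  end
  return value
end

createVar("greeting")
defineVar("greeting", "Hello")
print(getVar("greeting")) --> Hello
```

Keeping `vars` local means the interpreted program can only reach its variables through these functions.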
OK, and most of the comments below are for the old bCode interpreter, so just ignore them.

Pastebin: pastebin get GATC4FS2 int
Edited on 17 February 2015 - 04:20 AM
Lignum #2
Posted 22 November 2014 - 08:11 PM
Okay, here's a list of problems I've found:
  • It's hardcoded. I have to type import {api{screen}} exactly. If there's one space too much, it won't work, it has to be exactly that line.
  • What's the point of importing anyway? Looking at your code it's just a statement that enables a certain feature. What's wrong with keeping all features enabled at the same time?
  • Your code has global variables. This would be a huge security leak if the program had more impacting features (such as filesystem access)…
  • … which is another problem. It's incomplete. All you can do is print text and compute a sum (which also prints it…?). I understand that you'll probably be adding more features in the future but you shouldn't release it on the same day you made it.
So in summary, this project needs a lot of work in order for it to be useful. I don't want to discourage you but I don't think the project is quite on the right track… you'd probably manage to create a working language but the code would just be a giant mess in the end.
Gumball #3
Posted 23 November 2014 - 03:40 AM
You try to make something like this. It was hard for me just to figure out how to read the file lines. I MEANT THIS IS A WORK IN PROGRESS. Also, I'm going to add privates (basically local in Lua, or private/protected in Java. That's right, I know some Java :) ), and I'm still adding things at this moment. I'm also trying to implement LOADING ACTUAL FILES AS AN API. As for the import, I KNOW THAT IT'S JUST A VARIABLE. That's for if you don't want so many "APIs" available. Try that. HARD! So yeah. WORK IN PROGRESS. Sorry if I'm a little aggressive. It's just that tons of people never really point out the POSITIVE. Quite a few do, but whenever something's new they always point out the negative. -bluebird

EDIT:
And how would the global variables be a "huge security leak"? Holy shit call the biggest program McAfee has.
Edited on 23 November 2014 - 05:11 AM
MKlegoman357 #4
Posted 23 November 2014 - 09:29 AM
Making your own programming language is hard. Making an interpreter for it is harder. What Lignum pointed out were things you should be looking at. Yes, those points are negative, but you need them in order to make something better; they are not meant to make you feel bad.

Using global variables is bad because:
  • local variables are faster than global variables
  • global variables can be accessed outside the program

If you, let's say, overwrite the FS API and save the original (not overwritten) FS API in a global variable, all someone would have to do is use that global variable to bypass your FS overwrite. This is just an example, but from it you should be able to imagine why using globals is bad.
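
To make the FS example concrete, here is a rough sketch that runs in plain Lua. The `fs` table is a stand-in built by a made-up `makeFakeFs` helper (an assumption so the code runs outside ComputerCraft), and `oldFs` and `original` are illustrative names, not anything from bCode.

```lua
-- Stand-in for the real fs API (assumption: lets the sketch run anywhere).
local function makeFakeFs()
  return { delete = function(path) return "deleted " .. path end }
end

-- Unsafe version: the original API is saved in a GLOBAL variable.
fs = makeFakeFs()
oldFs = fs -- global, so every other program can see it
fs = { delete = function(path) return "blocked " .. path end }

print(fs.delete("startup"))    --> blocked startup
print(oldFs.delete("startup")) --> deleted startup (the override is bypassed)

-- Safer version: the original API is kept in a LOCAL variable.
oldFs = nil
fs = makeFakeFs()
do
  local original = fs -- local, invisible outside this block
  fs = {
    delete = function(path)
      if path == "startup" then
        return "blocked " .. path
      end
      return original.delete(path)
    end,
  }
end

print(fs.delete("startup")) --> blocked startup
print(fs.delete("temp"))    --> deleted temp
print(oldFs)                --> nil
```

In the second version the only way to reach the original `delete` is through the wrapper, because `original` goes out of scope at the end of the block.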

And about the language itself. It looks quite interesting, not as bad as other 'custom' programming languages might be. I have some suggestions actually:

How about ending a function (or whatever they are called) with a bracket?


screen.log {
Hello
} <-- right here

Also, instead of an arithmetic library, why not use symbols (+, -, *, /)?

As I said, creating a language can be quite a challenge. I would suggest that you first design it fully, and only then make an interpreter for it. By designing I mean something like this (just an example):


# this is my language
# all comments start with ( # )

# calling a function "print" which is in a library called "system"
# giving it a string argument "Hello World!"
system-print{ <Hello World!> }

# defining a variable of type "int" (integer) called "myNumber" and assigning '3' to it
def int myNumber -> 3

So this is an example of a design of a language that I made up while writing this post.

Good luck with it :)
zekesonxx #5
Posted 23 November 2014 - 10:24 AM
While I'm all for new programming concepts, I think going against well-established patterns, like "/' for strings, is a bad idea.
ElvishJerricco #6
Posted 23 November 2014 - 10:41 AM
You're going to need to learn about the theory of programming languages. The main thing is understanding language grammars. How do you define what a statement is and how it's written? Understanding this is the key to designing a language. Take a look at this file (from a project of mine) to get a quick look at how these things work. The main parts are:
  1. The lexer
    The lexer is responsible for reading input and turning it into a series of tokens. Think of any program as one long string. The lexer reads the characters in this string one at a time and recognizes when a substring of characters makes a "token". A token is an entity in the language. For example, in the English language, each word and punctuation mark is a token. In Lua, each variable name, operator or control statement (like if, else, or while) is a token, and there are many more in Lua.
  2. The parser
    The parser gets to see a program as a series of tokens. This is where the knowledge of grammars becomes useful. A grammar is a mathematical description of how different tokens are capable of coming together. We can define a rudimentary grammar for English like this:
    
    G = sentence
    sentence = subject predicate
    subject = noun | adjective subject
    predicate = verb | adverb predicate | predicate adverb
    
    G is the grammar, describing all that can be our simple English. The grammar starts as "sentence". A sentence can be a subject followed by a predicate, and nothing else. A subject can be a noun, or it can be another subject prefixed with an adjective (blue car, for example). Note that noun and adjective are each a "terminal", which means that there isn't a grammar definition for it because it directly represents a token that a lexer would generate. The sentence "Jake ran swiftly" fits our simple English like this:
    
           G
           |
        sentence
         |    |
    subject  predicate
    |               |
    noun(Jake)      (predicate adverb(swiftly))
                      |
                     verb(ran)
    
    This is known as the "abstract syntax tree", or "AST". Notice that each node descends into further nodes, except the "terminal" nodes noun, verb, and adverb. And notice that if you read the terminal nodes from the leftmost one to the rightmost one with no regard for depth, you get our sentence back: "Jake ran swiftly".
  3. The generator
    Only compiled languages worry about this part. They have to take the AST generated by the parser and turn it into machine code. Interpreted languages, however, are allowed to just read the AST and do the jobs that each node calls for on the fly. So you probably don't need to worry about generation. The file I linked above doesn't have a generator. Rather, it emits an AST and a different file interprets that AST.
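
As a rough illustration of the lexer and parser steps above, here is a small sketch in plain Lua for the toy English grammar. It is not the linked file: the `wordTypes` dictionary and the function names `lex`, `parse` and `leaves` are assumptions made up for this example, and the left-recursive rule "predicate = predicate adverb" is handled with a loop, which is one common way to avoid infinite recursion.

```lua
-- Hypothetical dictionary: which terminal each known word is.
local wordTypes = {
  Jake = "noun", car = "noun",
  blue = "adjective",
  ran = "verb",
  swiftly = "adverb",
}

-- Lexer: split the input on whitespace and tag every word as a token.
local function lex(text)
  local tokens = {}
  for word in text:gmatch("%S+") do
    local kind = wordTypes[word]
    if not kind then
      error("Unknown word: " .. word, 0)
    end
    tokens[#tokens + 1] = { type = kind, text = word }
  end
  return tokens
end

-- Parser: turn the token list into a tree that follows the grammar.
local function parse(tokens)
  local pos = 1

  local function peek()
    return tokens[pos] and tokens[pos].type
  end

  local function take(kind)
    local tok = tokens[pos]
    if not tok or tok.type ~= kind then
      error("Expected " .. kind .. " at token " .. pos, 0)
    end
    pos = pos + 1
    return { type = kind, text = tok.text }
  end

  -- subject = noun | adjective subject
  local function subject()
    if peek() == "adjective" then
      local adj = take("adjective")
      local rest = subject()
      return { type = "subject", adj, rest }
    end
    return { type = "subject", take("noun") }
  end

  -- predicate = verb | adverb predicate | predicate adverb
  local function predicate()
    local node
    if peek() == "adverb" then
      local adv = take("adverb")
      local rest = predicate()
      node = { type = "predicate", adv, rest }
    else
      node = { type = "predicate", take("verb") }
    end
    -- "predicate adverb": wrap the node once per trailing adverb
    while peek() == "adverb" do
      local adv = take("adverb")
      node = { type = "predicate", node, adv }
    end
    return node
  end

  -- sentence = subject predicate
  local subj = subject()
  local pred = predicate()
  if pos <= #tokens then
    error("Unexpected trailing word: " .. tokens[pos].text, 0)
  end
  return { type = "sentence", subj, pred }
end

-- Collect the terminal nodes from left to right.
local function leaves(node, out)
  out = out or {}
  if node.text then
    out[#out + 1] = node.text
  else
    for _, child in ipairs(node) do
      leaves(child, out)
    end
  end
  return out
end

local tree = parse(lex("Jake ran swiftly"))
print(table.concat(leaves(tree), " ")) --> Jake ran swiftly
```

An interpreter would walk `tree` and act on each node instead of only printing the leaves.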

Anyway I hope this helps. Having a solid understanding of tokenization and parsing should make your language infinitely more capable. Languages are some of my favorite projects (hence my involvement in four of them in CC alone =P). Good luck!
Edited on 23 November 2014 - 09:46 AM
Gumball #7
Posted 23 November 2014 - 07:37 PM
I know, I have the ability to get as advanced as oeed. I can understand his code, but not make it, at least not easily. I'm just trying to use simple API libraries, such as string.find, and tables to figure out what lines to ignore and to make the error detector not care about the line, like the # message. Here's what I would use:


elseif string.sub(line, 1, 1) == "#" then
  -- the line starts with '#', so remember it as a line to skip
  ignoreLine[lineA] = line
end

something like that. What it does:

I have multiple variables that all rely on one variable, like lineA. If I don't want the interpreter to read one line and forget another, I use lineA to help me keep track of that. So it basically just ignores anything with # at the beginning.

That code might give errors, but yeah. Work in progress.

While I'm all for new programming concepts, I think going against well-established patterns, like "/' for strings, is a bad idea.

And yeah, I agree the / to begin and \ to end a string or something like that is a pretty weird and dumb idea. USE THE "s. Which I will try to implement.
Saldor010 #8
Posted 23 November 2014 - 07:50 PM
.. So I guess if someone tries to make a comment after an already written line, it won't be recognized as a comment?
A better way might be to use string.find to find the first # in the line, and then ignore everything after that. Of course, some tinkering would have to be made if someone wanted to include a # in a string or other variable.
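
One possible way to do that tinkering, sketched in plain Lua: instead of string.find, walk the line one character at a time and remember whether you are inside a string. The function name `stripComment` is made up, and the sketch assumes bCode strings are written with double quotes and backslash escapes, which the thread does not confirm.

```lua
-- Returns the line with a trailing '#' comment removed.
-- Assumption: strings use "double quotes" and backslash escapes,
-- so a '#' inside a string is not treated as a comment.
local function stripComment(line)
  local inString = false
  local i = 1
  while i <= #line do
    local ch = line:sub(i, i)
    if inString and ch == "\\" then
      i = i + 1 -- skip the escaped character
    elseif ch == '"' then
      inString = not inString
    elseif ch == "#" and not inString then
      return line:sub(1, i - 1)
    end
    i = i + 1
  end
  return line
end

print(stripComment('print("hi") # say hello'))   -- comment removed
print(stripComment('print("#1") # number sign')) -- the '#' in the string stays
print(stripComment("# whole line comment"))      -- empty string
```

A line that comes back empty (or only whitespace) can then be skipped entirely, which also covers the original case of a comment that starts in the first column.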
wieselkatze #9
Posted 23 November 2014 - 08:26 PM
So when I start your program without arguments, it doesn't even work.
Looking into the code, I saw you used "crash("Usage: bCode [file]")" - that function does not even exist.
I think you meant "error("Usage: bCode [file]")" , so please test your code before you upload it.

Other than that I would encourage you to do some stuff with the string library before you go onto making your own language.
Learning it is especially helpful if you're trying to read some stuff from a file and want to bring it to a universal form - e.g. getting that import thing to work with spaces.

For example converting

{ api   {screen  } }
to

{api{screen}}
with

string.gsub( "{ api   {screen  } }", "%s+", "")

This should also be really helpful in understanding things, as you already know the basics of a language.
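
A short sketch of how that gsub call could be used on a whole import line, in plain Lua. The variable names are made up, and the comparison string assumes the import line quoted earlier in the thread.

```lua
-- An import line with extra spaces scattered through it.
local raw = "import { api   {screen  } }"

-- string.gsub returns two values: the new string and the number of
-- replacements, so capture both (or wrap the call in parentheses).
local normalized, count = string.gsub(raw, "%s+", "")

print(normalized) --> import{api{screen}}
print(count)      --> 5

-- Compare against the whitespace-free form of the statement.
if normalized == "import{api{screen}}" then
  print("screen API enabled")
end
```

Note that this removes every space, so it is only suitable for lines like the import statement; applied to a line containing a string, it would also eat the spaces inside the string.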

~wieselkatze
Gumball #10
Posted 24 November 2014 - 02:21 AM
You might not have seen the string.sub. It says: start chopping the string at column 0, stop at 1, then check whether the chopped piece is a #, so no tinkering would have to happen. :)

Check your code before you use it. The string.sub didn't do anything but give an error.
ElvishJerricco #11
Posted 24 November 2014 - 04:51 AM
Check your code before you use it. The string.sub didnt do anything but give an error.

That's string.gsub…
Gumball #12
Posted 24 November 2014 - 05:46 AM
Oh, I thought that was a typo. Well, ADDING :D
Saldor010 #13
Posted 24 November 2014 - 01:09 PM
… I'm not sure you even read my post when you replied to it. I know how your current comment system works (or how you would make it work), what I'm trying to do is suggest a BETTER system for comments. All you really did, was just restate how your comment system works…
MKlegoman357 #14
Posted 24 November 2014 - 04:00 PM
I would like to point out one thing that ElvishJerricco said in his very nice post:

The lexer
… Think of any program as one long string. The lexer reads the characters in this string one at a time and recognizes when a substring of characters makes a "token". …

Reading the code one character at a time is probably a better strategy than using string.find or something similar.

The link that ElvishJerricco gave (his own shell script implementation) is a very good example on how to make a lexer and a parser.

The lexer of that file:

local function lexer(sProgram)
    local lex = {}
    lex.t = {}
    local cursor = 1
    local c

    function lex.nextc()
        c = sProgram:sub(cursor, cursor)
        if c == "" then
            c = "EOF"
        else
            cursor = cursor + 1
        end
        return c
    end

    function lex._next()
        while true do
            if c == "\n" or c == ";" then
                lex.nextc()
                return "TK_NEWLINE", ";"
            elseif c:find("%s") then
                lex.nextc()
            elseif c == "EOF" then
                return "EOF", "EOF"
            elseif c == "\"" then
                local s = ""
                lex.nextc()
                while c ~= "\"" do
                    if c == "\n" or c == "EOF" then
                        error("Unfinished string", 0)
                    end
                    if c == "\\" then
                        lex.nextc()
                        if c ~= "\"" then
                            s = s .. "\\"
                        end
                    end
                    s = s .. c
                    lex.nextc()
                end
                lex.nextc() -- skip trailing quote
                return "TK_STRING", s
            else
                local s = c
                lex.nextc()
                if symbolChars:find(s, 1, true) then
                    -- symbol
                    while symbolChars:find(c, 1, true) do
                        s = s .. c
                        lex.nextc()
                    end
                    grin.assert(nameTokenMap[s], "Unknown token: " .. s, 0)
                else
                    while not (c:find("[%s;]") or symbolChars:find(c, 1, true) or c == "EOF") do
                        if c == "\\" then
                            c = lex.nextc()
                        end
                        s = s .. c
                        lex.nextc()
                    end
                end

                if nameTokenMap[s] then
                    return nameTokenMap[s], s
                end

                return "TK_STRING", s
            end
        end
    end

    function lex.next()
        local token, data = lex._next()
        lex.t = lex.lookahead
        lex.lookahead = {token=token, data=data}
    end

    lex.nextc()
    lex.next() -- fill the first lookahead
    return lex
end
ElvishJerricco #15
Posted 24 November 2014 - 05:27 PM
-snip-

Yea I remember trying to use string functions the first time I made a lexer. That was a nightmare. Wasn't long before I switched over to custom recognition. It really does help. Being able to deduce tokens in very particular ways is important.
Gumball #16
Posted 25 November 2014 - 04:23 AM
Oh yeah, I see. I thought of that when I was just now adding the comment system.
Gumball #17
Posted 02 February 2015 - 08:07 PM
Alright, first of all, I KNOW. My last bCode interpreter was such a piece of crap, but this one is much better. It has variables, strings, and some built-in functions. The variables are all stored in one table/array. To create a variable, do:
createVar(string)

That creates a variable. Remember to pass the variable's name as a string; when you use the variable later, it won't be a string.
To define a variable that's been created, do this:
defineVar(variable,string)

Other functions that are built in:
read(string/variable, or nil) or getInput
print(variable/string)
write(variable/string)

Pastebin: pastebin get GATC4FS2 int
Lyqyd #18
Posted 02 February 2015 - 09:14 PM
Threads merged. Feel free to edit the original post and the topic title as you update things, but please stick to one thread per program.