when: throw Error

Maybe I need an exception block for my header.

While it would be great to assume that everything will work perfectly fine and that errors will be an exception, the fact is that errors are more of a rule and a reality than an exception. Computers are nice in that they require perfect input. Why shouldn’t interpreters be the same? Rather than crashing outright, however, interpreters have the nice benefit of perhaps knowing when they will fall and being able to catch themselves. I find it disappointing that even the most mature language interpreters still segfault, not due to runtime engine internals, but due to negligent programming in the language itself. Of course, the first problem is having a null pointer, but even if that happens to be handled, there are two fates that usually seem inevitable: loss of important memory or… loss of important memory. In the first case, the memory is deleted, so the host language encounters a segfault, and in the second case, variables are so tangled in a mess of reference counters that things don’t die until the program itself does. Both can be avoided by proper memory management, but as I discovered, it does require doing some odd things with the language that can hide those bugs. I could cross my fingers and hope that the resulting bugs are acceptable because the errors they create should ideally be trivial, but that would be foolish. No doubt the creative people who do dare to use this mound of code will be creative enough to break it. When that happens, there need to be ways to catch those errors and inform the programmer.

Rather than resulting in a segfault, the attempt to access destroyed memory in Copper results in the creation of new, unique data. This prevents the usual security hole of restoring what should be dead. Second, it allows me to generate a warning for the programmer while continuing processing. Who knows, the programmer may have intentionally designed his application to utilize this feature. I’m not going to stand in the way in that regard.
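To make that behavior concrete, here is a minimal sketch in Python rather than Copper. The Store class, its method names, and the warning text are all hypothetical stand-ins, not how the Copper engine is actually written. The point is only that a read of destroyed data hands back something brand new and logs a warning instead of resurrecting the old value or crashing.

```python
class Store:
    """Hypothetical variable store, for illustration only."""

    def __init__(self):
        self._slots = {}
        self.warnings = []

    def set(self, name, value):
        self._slots[name] = value

    def destroy(self, name):
        # Drop the data entirely; nothing is kept around to be restored.
        self._slots.pop(name, None)

    def get(self, name):
        if name not in self._slots:
            # Never restore what should be dead: create new, unique data,
            # record a warning, and keep processing.
            self.warnings.append(f"access to destroyed or missing data: {name}")
            self._slots[name] = object()
        return self._slots[name]


store = Store()
store.set("a", "secret")
store.destroy("a")
fresh = store.get("a")  # new data, not "secret"
```

After the last line, store.warnings holds one entry and processing simply continues.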

Errors need to be returned for things the programmer really should not do. First, errors should be returned for syntax errors. As simple as I intend to make my language, there are always things you can do wrong. If even a single character is wrong, there has to be an error. No exceptions. Any token out of place must return an error. Visually, there may be some flexibility due to the prevalence and play of whitespace, but that’s about it. Second, errors should be returned for unrecoverable runtime failures, with exceptions. Many languages tend to complicate exception handling by throwing all sorts of errors, requiring the programmer to catch each type of error, depending on what they are doing. That’s helpful if you want to return your own error messages, but if error messages and proper memory deallocation are handled for you, I can’t think of other cases where you would need a specific error returned from a function. The alternative, which I chose, is to return an expected default value that isn’t likely to break anything if not caught. Consider the following chunk of pseudo-code:

variable f = file("Myfile")
variable r = read(f)
variable d = splitLines(r)
for variable i in d: print(i)

Suppose the program failed to open the file. The variable “f” would normally break in other languages. A generic default return (an empty variable) from all functions – in this case, from “file” – and passed to the “read” function solves this problem. How? The “read” function would, upon receiving the default, return the default, which is then passed to “splitLines”, which would also return the default. Finally, this default would prematurely halt the for-loop without breaking any code or requiring any exception-block code. A similar thing would happen if “read” failed instead of “file” or if “splitLines” failed instead of “read”. If memory management is automatically handled, there is no need to worry about cleanup in this sort of scenario. Wonderful, no? (Ok, ok. I can already hear some of you griping about hidden bugs.)
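For anyone who wants to poke at the idea outside of Copper, the chain above can be roughly imitated in Python. This is only a sketch under my own assumptions: EMPTY stands in for the default return, and open_file, read, split_lines, each, and run are hypothetical names mirroring the pseudo-code, not real built-ins of any language.

```python
EMPTY = object()  # hypothetical stand-in for the default "empty" return


def open_file(path):
    # An operating system failure becomes the default instead of an exception.
    try:
        return open(path, encoding="utf-8")
    except OSError:
        return EMPTY


def read(handle):
    if handle is EMPTY:
        return EMPTY  # pass the default along
    with handle:
        return handle.read()


def split_lines(text):
    if text is EMPTY:
        return EMPTY
    return text.splitlines()


def each(lines):
    # The default halts the loop early by yielding nothing to iterate over.
    return [] if lines is EMPTY else lines


def run(path):
    printed = []
    for line in each(split_lines(read(open_file(path)))):
        print(line)
        printed.append(line)
    return printed
```

Calling run on a path that does not exist prints nothing and returns an empty list. No step in the chain needs its own exception block.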

Every program will inevitably encounter errors from the operating system. Some can be handled, and in the scenario above, you can see how a default return from all functions is suitable for handling both operating system errors (such as file-loading) and errors from built-in functions. The interpreter can be designed such that user-created functions also return the default value when they return nothing else.

To make checking for errors easier, I decided to add an is_empty() function that checks if the return from a function is an empty function (the default return in case of failure).

At this point we must ask, what happens to the internal state of the engine when an error is encountered? In languages such as JavaScript, all code immediately stops. In the Python REPL, something may throw an error, but the state of the system remains the same. Why the difference? Aside from the mere whim of the original developers, in my opinion, there is a more subtle issue that forces the hand of developers, so to speak. In a language such as Python, both class and function definitions end when the indentation ceases. Two things can cause this: either something is encountered in the file with less indentation or an error is thrown in REPL mode. From the perspective of the interpreter, all it receives are tokens and it tries to maintain an internal state that depends very much on the stream of tokens it receives. Because classes and function bodies only depend on indentation, they can technically end at any time, which allows for an easy way to handle errors without destroying the full internal state. In a language such as JavaScript, however, there is no way to tell where the end of a class or function is without scanning ahead. There might not even be one! If an error occurs, the interpreter is left with no reasonable choice except to stop execution there. Without knowledge of the function ending – and no way to guarantee the function ends – the interpreter must destroy the stack. Since data is tied to the stack (via variables), all the data is also destroyed. It’s a total loss. Oh well.

The jobs of informing programmers of their errors and helping correct them are the duties of an IDE. Rather than building a stand-alone interpreter, it makes more sense to tweak it with some addenda that make IDE integration easier, along with providing some complementary documentation on how to do so. The primary pieces of information needed by the IDE are: (1) the index of the character where the error occurred and (2) a message pertaining to the internal state of the engine. The first bit of information is so the IDE can reconstruct the surrounding context and display it to the programmer. The second bit of information should be an error code indicating the type of error that occurred. The IDE can then customize a message for the user based on this information. The task of sorting errors from warnings and other information can be made easier by passing the IDE an indicator of the info type along with the error code.
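Here is a rough sketch, in Python, of what that hand-off might look like from the IDE's side. Everything named here is an assumption for illustration: the Report fields, the InfoType values, and the two error codes in MESSAGES are invented, not the engine's real interface.

```python
from dataclasses import dataclass
from enum import Enum


class InfoType(Enum):
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


@dataclass(frozen=True)
class Report:
    char_index: int      # index of the character where the problem occurred
    code: int            # error code chosen by the engine
    info_type: InfoType  # lets the IDE sort errors from warnings


# Hypothetical codes; the IDE owns the wording shown to the user.
MESSAGES = {1: "unexpected token", 2: "access to destroyed data"}


def locate(source, char_index):
    """Return (line number, column, line text), with 1-based line and column."""
    line_start = source.rfind("\n", 0, char_index) + 1
    line_end = source.find("\n", char_index)
    if line_end == -1:
        line_end = len(source)
    line_number = source.count("\n", 0, char_index) + 1
    return line_number, char_index - line_start + 1, source[line_start:line_end]


def render(source, report):
    line, column, text = locate(source, report.char_index)
    message = MESSAGES.get(report.code, f"unknown code {report.code}")
    pointer = " " * (column - 1) + "^"
    return f"{report.info_type.value} at {line}:{column}: {message}\n{text}\n{pointer}"
```

The engine only hands over three small values. Rebuilding the context and choosing the wording stay on the IDE's side.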

~ Misuse ~

Any system with only one data type can obviously result in a number of hidden bugs. How do you tell when you’ve assigned something wrong? Some language designers deliberately chose to have static typing for this reason. I understand their reasoning. I also understand that programmers can’t be all that terrible, since the standards committee for C++ saw fit to add an “auto” keyword. The keyword “auto” in a static language like C++ is bizarre. It is not the same as the dynamic typing in Python, but it is enough to either simplify the code or make it less readable. In any case, if you have a function that requires two parameters of the same type (be it integers, decimal numbers, strings, etc.; both non-const or both const, assuming the language supports “const”), static typing won’t help you catch arguments passed in the wrong order. So static typing only relieves some issues. Honestly, I can’t say that having only one type is better, or that focusing purely on logic will help you avoid bugs from parameters being in the wrong order. Consistent interfaces may help solve this issue, but at the very least, built-in functions will need to report misplaced parameters (which may indicate a much deeper bug).

~ Final Remarks ~

I could receive a great deal of criticism for this article. After all, we’re so ingrained with expecting errors and distinguishing between them within the actual code itself that turning a blind eye to problems “in the wild” seems counter-intuitive. I haven’t actually seen any such system tested before, so criticisms here may very likely be both well warranted and yet not based on experience. (Maybe they all figured out something I didn’t.) Time will tell if, in the long run, it turns out to be a sorry idea.

I may also receive criticism for being “lazy” because I didn’t bother adding an exception-checking system. After all, such a system is complicated. Rest assured, however, I did consider such an option, and it would not have been super complicated to implement. However, I feel more like it’s mixing dirty work in with logic. We all know (or should know) that each link in the chain is a weak point and can break in dozens of ways. Handling every such option distracts us from focusing on the logic of the program. Therefore, it’s a deliberate exclusion.

On top of that (yes, there’s more), if you really want to handle errors, you can always do so with a callback system. Let’s suppose “file” allowed a callback function for success and a callback function for failure.

file( locals, fn(openFile, locals) { do stuff on success }, fn(failMsg, locals) { do stuff on failure } )

There’s no need for a try-catch block setup if functions are set up like this. (“Locals” refers to local variables passed in, since the callback will be executed outside the scope it was created in.)
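The same shape can be tried out in Python. This is a sketch under assumptions: open_with_callbacks is a hypothetical name, and passing a plain dictionary as “locals” is just my stand-in for handing local variables to the callbacks.

```python
def open_with_callbacks(path, local_vars, on_success, on_failure):
    """Open a file and dispatch to exactly one of the two callbacks."""
    try:
        handle = open(path, encoding="utf-8")
    except OSError as err:
        # The failure message goes to the failure callback, not the caller.
        return on_failure(str(err), local_vars)
    with handle:
        return on_success(handle, local_vars)


def on_open(handle, local_vars):
    local_vars["text"] = handle.read()
    return True


def on_fail(message, local_vars):
    local_vars["problem"] = message
    return False


state = {}
ok = open_with_callbacks("Myfile", state, on_open, on_fail)
```

The caller never writes a try block; it only checks ok, or looks in state for whichever key the callbacks filled in.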

In future articles, I may cover specific examples of errors and how I distinguish between what should trigger an error and what should merely result in a warning.
