I did some speed profiling… – something I probably should have done a before outright stating my virtual machine is “not improvable” on its blog. Of course, I know better, and that’s why I performed speed profiling.
The profiling narrowed down the bulk of the speed bottleneck to function calls (actually, it was a confirmed educated guess, but I think I know where the other bottlenecks are).
Not to the surprise of any expert, the primary issue turned out to be string comparison. Strings add up to be amazingly expensive. Currently, there are too many function calls that depend on them in the script execution section of the engine.
The current type system of Copper is very dependent on string-based typing, which makes it very easy to extend and keep as a nice self-contained box, but also very slow.
If you recall from a previous post (“FFI = cu.translate(other)”), I had debated on what type system to use. Strings are the most flexible, but the consequences were enormous speed slow downs for all these type checks. Despite the integration, it wouldn’t be difficult to replace it. However, switching away from it still leaves the problem of figuring out exactly what typing system to replace it with.
One possibility is to have the foreign function instances have their parameter names all converted internally based on a table of type names (that are really just hashed names). The problem is that new objects passed into the engine would still need to be converted to such a system.
Another option is to require that the only objects that can be added are those from a creation function that must first obtain the object type number, which can then be shared with all of the functions requiring this type. Internally, typenames for objects would be converted from strings into this special type number. This is too complicated and too restrictive.
Finally, and perhaps the best option, is to allow users to define their own type system via an enumeration. I can have the basic type system (functions + data) and have a user-data type enumeration (bool + string + number) for which the first three can’t be initialized. Or I could combine them into one and let the user decide, since this would be faster for the engine anyways.
It’s simple to extend enumerations (just cast, assuming the root one is forced to compile to a large size (which can be done)), and putting the enum in a namespace (as opposed to a struct) would easily permit such an extension. However, it would conflict with compiler settings that prevent cross-casting (I seem to recall something along those lines creeping into the standard, but given the need for backwards compatibility, it may have no effect on what I’m proposing). This is when I start picking friends. The actual type itself could be an integer. I’m actually quite certain this would work. While this limits the number of unique object types to the number of bits in an unsigned integer, I highly doubt users are going to need 65000 or more types, which is as small as int comes.
Switching away from a string-based type system should relieve some speed bottlenecks in other areas of the engine. For example, if-structures are dependent on boolean values, which must be identified by their type string like all other objects.
One of the downsides is that it doesn’t allow for adding extensions willy-nilly without direction. The type number will need to be passed to the structs of the foreign function instances (these are objects that contain the call() method that the engine calls when the foreign function is called from within Copper) so that these instances know what value to use for type checking. Still, I don’t think this will add too much complexity that the user can’t handle.
What a mess for the sake of speed. Alas, such is the nature of a typing system.
Note on string speed: For strings, even short strings require multiple functions to complete the comparison because you have to iterate between each character. Even if you have only two characters, you have five function calls: a iterator creation, a loop condition, two string character access calls, and a increment call. Compare this to a single function call for integer or enum comparison.
The good news is that I’m starting to believe I can actually improve this thing enough for it to have reasonably usable speed, despite the fact that anything I do I know will restrict the system in ways I don’t want.
The bad news is that this will be a new branch, so the branch name “Cheetah” will be deceptive, even though it’s a useful and powerful branch. Also, it will be a ton of work (easy, but time-consuming) to replace the current system.