Progress Report #6

A number of things have changed with regard to the internals. A great deal of time has been spent trying to optimize the engine, resulting in some interesting discoveries.

The Opcodes system was reduced to a single class. It had been a number of classes and was, in a way, more readable. In an attempt to increase the speed of the system, I eliminated all of the implementation classes, though this didn’t increase the current speed. I tried doing this to eliminate the need for containers so that the opcodes could be on the heap. However, the containers were designed that way to make moving the opcodes around the program easy. Consequently, they aren’t going away.

Variable Addresses

The variable address code was changed. Variables are accessed by their “address”. In Copper and a host of other languages, you access members of classes by using the period (“.”). When written out, it’s a chain of names with dots in between. Copper saves this entire chain as a variable “address”. It’s quite important that this be fast during runtime. At first, it was held in a doubly-linked list. This is slow to allocate per node, but it’s fast for building at the parse stage. To increase speed, I implemented an array-based version. Oddly, on my benchmark tests (see return(benchmark) and Number^Power), it ran a full second slower than the list version! I checked the areas where I thought it might be slowing things down, but the cause was hard to pin down. My guess is that it was slower because the array was heap-allocated, so it took longer to access its elements by index than it did to access them with an iterator. To fix the issue and save time, I decided to make the variable address class a container for a singly-linked list. Profiling showed that this new version ran as fast as the original code. Bizarre.
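As a rough illustration of the idea, here is a minimal sketch of an address class wrapping a singly-linked list. It is not Copper's actual code: the class name VarAddress, its methods, and the use of std::forward_list are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <sstream>
#include <string>

// Hypothetical stand-in for a variable address: the chain of names in
// "a.b.c", stored in a singly-linked list.
class VarAddress {
public:
    // Build the chain from dotted source text, e.g. "player.pos.x".
    explicit VarAddress(const std::string& dotted) {
        std::istringstream in(dotted);
        std::string name;
        auto tail = names_.before_begin();
        while (std::getline(in, name, '.')) {
            tail = names_.insert_after(tail, name);  // O(1) append at the tail
            ++length_;
        }
    }

    std::size_t length() const { return length_; }

    // Walk the chain front to back with an iterator and join the names.
    std::string joined(char sep) const {
        std::string out;
        for (auto it = names_.begin(); it != names_.end(); ++it) {
            if (it != names_.begin()) out += sep;
            out += *it;
        }
        return out;
    }

private:
    std::forward_list<std::string> names_;
    std::size_t length_ = 0;
};
```

Building appends at the tail in constant time, and lookups only ever walk the chain from front to back, which is the one access pattern a singly-linked list handles well.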

On the bright side, making an independent class for the variable addresses allows them to be made into reference-counted objects. I changed the opcode processing to take advantage of this, so addresses are now shared. The result was a speedup of almost a full second (from around 7 seconds to around 6) in my benchmarks.
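The sharing can be sketched as follows. This uses std::shared_ptr as a stand-in for whatever reference counting the engine really uses, and the names VarAddress, Opcode, and emitSharing are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical address type; only its identity matters here.
struct VarAddress {
    std::string dotted;
};

// Hypothetical opcode that refers to an address without owning a private copy.
struct Opcode {
    std::shared_ptr<const VarAddress> address;
};

// Emit `count` opcodes that all share one parsed address.
std::vector<Opcode> emitSharing(const std::string& dotted, int count) {
    auto shared = std::make_shared<const VarAddress>(VarAddress{dotted});
    std::vector<Opcode> ops;
    for (int i = 0; i < count; ++i) {
        ops.push_back(Opcode{shared});  // bumps the reference count, no deep copy
    }
    return ops;
}
```

Each opcode copies a pointer and increments a counter instead of duplicating the whole chain of names, which is presumably where the saved time comes from.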

Math

The math extension functions were reimplemented for this new branch to account for numbers being built-in.

Lists

A list class for user objects has been built in! With this, users will be able to create lists natively in Copper. This was badly needed. There was a list class implementation for older versions of Copper, which I may talk about later, but programming with it was tedious due to the underlying memory management model of Copper. That model has changed slightly. At one point, only Variables could “own” function-objects, which meant that the lifetime of such objects could never be tied to the lifetime of the list, and any assignment from the list to a variable would cause that data to be tied to the lifetime of that variable. The new ownership model abstracts ownership and makes the nodes of the built-in list object owners of their function-objects. That way, any function-objects assigned to the list have their lifetimes tied to the lifetime of the list.
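To make the ownership idea concrete, here is a small sketch under loud assumptions: the types Owner, Variable, ListNode, ObjectList, and FunctionObject are invented for illustration, and std::unique_ptr stands in for Copper's own memory management.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical function-object that counts how many instances are alive.
struct FunctionObject {
    static int alive;
    FunctionObject() { ++alive; }
    ~FunctionObject() { --alive; }
    FunctionObject(const FunctionObject&) = delete;
    FunctionObject& operator=(const FunctionObject&) = delete;
};
int FunctionObject::alive = 0;

// Abstract owner: anything that can hold a function-object.
class Owner {
public:
    Owner() : object_(new FunctionObject()) {}
    virtual ~Owner() = default;
    FunctionObject* owned() const { return object_.get(); }
private:
    std::unique_ptr<FunctionObject> object_;
};

// In the older model, a variable was the only kind of owner.
class Variable : public Owner {};

// In the newer model, a list node is an owner too.
class ListNode : public Owner {};

// The list owns its nodes, so each node's object lives as long as the list.
class ObjectList {
public:
    void append() { nodes_.push_back(std::unique_ptr<ListNode>(new ListNode())); }
    std::size_t size() const { return nodes_.size(); }
private:
    std::vector<std::unique_ptr<ListNode>> nodes_;
};
```

When an ObjectList goes out of scope, its nodes are destroyed and each node releases its function-object, with no variable involved in the lifetime.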

Does it work? I don’t know yet. It has to be tested.

I’ve debated whether to add functions for handling lists directly to the engine or to make them an extension, just like numbers. Adding them internally is more work but likely to be slightly faster, so I’m favoring the internal approach.

Foreign Function Interface

The FFI has a new helper class and helper function. These two make it really easy to add functions to the engine, but they add another layer of indirection. I would like to make the FFI faster, but there doesn’t seem to be a “safe” approach to doing this. Currently, foreign functions are set up like so:

class ForeignFunc {
public:
    virtual bool call( FFIServices& ffi );
};

There are some other minor methods in that class. The helper function allows adding a foreign function to the engine in a single function call by utilizing the new function wrapper. For example, consider the user function:

bool AreZero( FFIServices& ffi );

This can be added as:

addForeignFuncInstance( engine, "are_zero", AreZero );
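How the helper class and helper function fit together can be sketched roughly as below. Everything here is a simplified stand-in rather than Copper's real declarations: the Engine struct, the ForeignFuncWrapper name, the argCount field, and the body of AreZero are all assumptions for the example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// All types below are hypothetical stand-ins, not Copper's real declarations.
struct FFIServices {
    int argCount = 0;  // pretend argument info supplied by the engine
};

class ForeignFunc {
public:
    virtual ~ForeignFunc() = default;
    virtual bool call( FFIServices& ffi ) = 0;
};

// Helper class: adapts a plain function pointer to the ForeignFunc interface.
class ForeignFuncWrapper : public ForeignFunc {
public:
    typedef bool (*Callback)( FFIServices& );
    explicit ForeignFuncWrapper( Callback fn ) : fn_(fn) {}
    // The extra layer of indirection: a virtual call, then a pointer call.
    bool call( FFIServices& ffi ) override { return fn_(ffi); }
private:
    Callback fn_;
};

// Minimal stand-in for the engine's table of foreign functions.
struct Engine {
    std::map<std::string, std::unique_ptr<ForeignFunc>> funcs;
};

// Helper function: wrap and register in a single call.
void addForeignFuncInstance( Engine& engine, const std::string& name,
                             ForeignFuncWrapper::Callback fn ) {
    engine.funcs[name].reset( new ForeignFuncWrapper(fn) );
}

// The user function from the text; this body is invented for the sketch.
bool AreZero( FFIServices& ffi ) { return ffi.argCount == 0; }
```

The convenience comes at the cost described above: each invocation goes through a virtual call and then a function-pointer call.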

Speed

My existing benchmark (a simple loop) was probably unfair for comparing Copper to Python, at least for Copper. Python has lots of optimizations built in, but these don’t always kick in when the processing isn’t straightforward. I decided to try someone else’s benchmarks for a change.

First, I tried out the benchmarks from carlesmateo. Old benchmarks, yes, and definitely sketchy, but the overall code design is the interesting part.

from datetime import datetime
import time
print ("Starting at: " + str(datetime.now()))
s_unixtime_start = str(time.time())
i_counter = 0
# From 0 to 31999
for i_loop1 in range(0, 10):
    for i_loop2 in range(0, 32000):
        for i_loop3 in range(0, 32000):
            i_counter += 1
            if ( i_counter > 50 ) :
                i_counter = 0
print ("Ending at: " + str(datetime.now()))
s_unixtime_end = str(time.time())
# The timestamps are strings holding fractional seconds, so parse them as floats
i_seconds = int(float(s_unixtime_end) - float(s_unixtime_start))
s_seconds = str(i_seconds)
print ("Total seconds:" + s_seconds)

This benchmark ran too long. I eventually just killed the process after several minutes. The blog writer claims it took roughly 807 seconds, but I waited longer than that and got nothing. Interesting.

For comparison, I tried it out in Copper.

benchmark = {
    i_counter = 0
    i_loop1 = 0
    i_loop2 = 0
    i_loop3 = 0
    loop {
        if ( equal( i_loop1: 10 ) ) { stop }
        loop {
            if ( equal( i_loop2: 32000 ) ) { stop }
            loop {
                if ( equal( i_loop3: 32000 ) ) { stop }
                ++(i_counter:)
                if ( equal( i_counter: 50 ) ) {
                    i_counter = 0
                }
                ++(i_loop3:)
            }
            ++(i_loop2:)
        }
        ++(i_loop1:)
    }
}

The resulting time? Under 1 second! And almost consistently, no less.
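One caveat when comparing these two listings: Python's nested range() loops restart each inner counter at 0, so the innermost body runs 10 × 32000 × 32000 = 10,240,000,000 times. The Copper listing sets i_loop2 and i_loop3 to 0 only once, before the outermost loop. Assuming Copper variables keep their values across loop iterations the way C++ locals do, the inner loops would stop immediately after their first pass. The sketch below (hypothetical function names, small bounds for quick checking) counts innermost-body executions for both shapes.

```cpp
#include <cassert>
#include <cstdint>

// Innermost-body executions when each inner counter restarts at 0,
// as Python's nested range() loops do.
std::uint64_t countWithReset(std::uint64_t n1, std::uint64_t n2, std::uint64_t n3) {
    std::uint64_t body = 0;
    for (std::uint64_t i1 = 0; i1 != n1; ++i1)
        for (std::uint64_t i2 = 0; i2 != n2; ++i2)
            for (std::uint64_t i3 = 0; i3 != n3; ++i3)
                ++body;
    return body;
}

// Same loops with the counters initialised once, before the outermost loop,
// mirroring the layout of the Copper listing.
std::uint64_t countWithoutReset(std::uint64_t n1, std::uint64_t n2, std::uint64_t n3) {
    std::uint64_t body = 0, i1 = 0, i2 = 0, i3 = 0;
    while (i1 != n1) {
        while (i2 != n2) {
            while (i3 != n3) { ++body; ++i3; }
            ++i2;  // i3 is never reset, so the inner loop is skipped from now on
        }
        ++i1;      // i2 is never reset either
    }
    return body;
}
```

With bounds of 2, 30, and 40, the first function counts 2400 executions and the second counts 40. If the assumption holds for Copper, part of the time difference above would come from doing far less work, so resetting the inner counters before each inner loop would make the comparison fairer.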

Then I tried another benchmark test, this one a little more varied in what it asks for. I also used a better clock for Python.

#!/usr/bin/env python3
# from https://nileshgr.com/2012/08/25/a-benchmark-about-speed-of-programming-languages
import time
t1 = time.process_time()
mx = 100000
a = b = c = 0
for i in range(mx):
    a = a + b + i
    b = a * 5678 * i
    c = i * i
    print(i)
t2 = time.process_time()
print((t2 - t1) * 1000)

The benchmark time? 118816.02004 milliseconds… or about 119 seconds. That’s a long time.

The Copper version admittedly gets some convenient shortcuts for this benchmark, like the fact that you can pass several arguments to the addition function.

benchmark = {
    mx = 100000
    i = 0
    a = b = c = dcml(0)
    loop {
        if ( equal(i: mx:) ) { stop }
        a = +(a: b: i:)
        b = *(a: 5678 i:)
        c = *(dcml(i:) i:)
        ++(i:)
        print(i: "\n")
    }
}

The benchmark time? Roughly 3.5 seconds on average (tested a few times). That’s crazy fast compared to Python. In this case, I attribute some of the speed to the fact that Copper uses double for the big built-in numbers, whereas Python has its own built-in numbers that get larger and larger. I intend to incorporate MPFR (or at least GMP) one day, but it’s not needed for most use cases anyway. Another factor in the time difference is probably the streamlined nature of Copper, whereas Python does code analysis and discovers that it can’t cheat in this test.
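The effect of fixed-size numbers can be seen by running the same recurrence with plain doubles. This is only a sketch that assumes the Copper loop reduces to ordinary double arithmetic; the function name is made up for the example.

```cpp
#include <cassert>
#include <cmath>

// Runs the benchmark's recurrence in double precision and returns the first
// iteration at which b is no longer finite, or -1 if that never happens.
int firstOverflowIteration(int mx) {
    double a = 0.0, b = 0.0, c = 0.0;
    for (int i = 0; i < mx; ++i) {
        a = a + b + i;
        b = a * 5678 * i;
        c = static_cast<double>(i) * i;
        if (!std::isfinite(b)) return i;  // b has overflowed to infinity
    }
    (void)c;  // c is computed only to mirror the benchmark
    return -1;
}
```

Since b is multiplied by 5678 × i on every pass, a double overflows to infinity long before the 100000th iteration, and every step after that costs the same. Python's integers keep growing instead, so each multiplication gets more expensive as the loop goes on.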

While I can’t say the Python measurements were perfectly accurate, I at least have a ballpark for where my Copper interpreter stands. In short, I’m quite pleased with it.

Technically, I should be comparing Copper to ChaiScript, AngelScript, and maybe Lua, but I have yet to set those up.

A Look Ahead

The main branch of Copper has been changed to “Enu”. I name my branches after animals. It’s easier to remember this way. The predecessors of “Enu” are “Dolphin” and “Cheetah”, and these differed from Enu primarily in internals. Enu is likely to change in interface also.

One critical part needs to change before the prototype chapter of Copper closes: the foreign object type system. I have a new type system in mind for testing. It will require some additions to the engine, and it will mean more work for the user when adding a new object type, all in the name of speed and keeping the internals opaque to the user (which I think is a good thing).

Once this is all done and tested, the Copper engine will finally be ready for its first project!
