List and Tuple(1,2,3)

Lists are tuples are very common features for languages. Arrays and lists have been around for some time, so they are almost expected. Python does away with arrays, understandably so, and it almost makes little sense to keep they around for Copper. Tuples, on the other hand, seem primarily for mental models, but can be useful in preventing misuse and accidental deletions or additions of elements as well as providing convenient means for extraction. They sort of fill the gaps for simplicity that lists, arrays, and general objects don’t fill. Both of these should be discussed in detail.

Lists

In programming, you iterate through stuff all the time. I had plenty of time to think about how it would all fit together, and I discovered that it would be both restrictive to implement (in the sense that it creates unwanted dependencies on the numeric classes) and slow in execution to allow lists to be accessed by index. In fact, it makes even more sense to use iterators. These can be handled safely by having the list class and the iterator class reference count the instances of the list node class (without the nodes themselves applying reference counting on each other, since that would create a cyclic loop of reference and a memory sink). I had to abandon a couple days of work on a list class, but I’m not starting totally from scratch again. I created a couple of classes and functions to make adding extension functions much, much easier.

With a new design for lists, the function list would look something like this:

  • List( [item1, item2,…] )
  • Length( list )
  • Iter( list )
  • Next( iter )
  • Prev( iter )
  • Elem( iter )
  • Insert_next( iter, elem )
  • Insert_prev( iter, elem )

That’s not a complete list, but those are the main functions.

Access Time

While thinking about list design, I started to question a number of things. First was the need to be accessible via numeric index. This would be rather slow as it would both force Copper for-loops to an O(n) operation and then another O(n) operation in C++ itself (unless I used a “midix” – an iterator hidden in the list class that points to the last-accessed node – as I did in a linked list called LinusListix). Given how little this is used in comparison with iterators (though it is used, yes), it makes more sense to force to user to perform alittle extra work for an index access and let iteration be the fast work. Furthermore, tailoring to iterators allows keeping the list class independent of the numeric classes on the C++ side.

Copper does not come with built-in for-loops (though I intend to add one for the list extension, which I’ll talk about in a moment). Here’s a comparison in Copper between the different approaches to implementing one, starting with the old way for doing things:
foreach = fn(items, act) {
num = Length(items())
i = 0
loop {
if ( Equal(i() num()) ) { stop }
act( Item(items() i()) )
i = Sum(i() 1)
}
}

This code assumes a number of things, but it should be clear what the function names would imply. The “items” is a list of items and the “act” is a function that is run on those items. You would use such a foreach function as follows:
foreach( List(1 2 3) fn(p) { print(p()) } )

As you can see, it requires passing in a function. At first, it might seem like local variables are inaccessible, but that’s where the convenience of assigned-parameters in functions comes into play. For example, suppose we had a variable “data” that needed to be used in the “act” function:
foreach( List(1 2 3) fn(p data=data()) { print(Sum(p() this.data()) } )

While this for loop is relatively small, it is still verbose and, more than that, it is, as I said, an O(n) + O(n) operation. (Note: I find it deceiving to call it an O(2n) operation because so much more happens on the C++ side for a single iteration that the time for a Copper O(n) is significantly longer than the C++ O(n) even if, according to the notation rules, they may be considered “equivalent”, but then again, I’ve never heard of comparing big-O notation for a combined interpreted+compiled running).

Of course, this loop is doesn’t allow for breaking at a bad or triggered point of the loop. Taking into account extension names for clarity, the actual loop looked as follows:
foreach = fn(list act) {
i = 0
cnt = Length(list())
loop { if ( Int_equal(i() cnt()) ) { stop }
result = act(Item(list() i()))
if ( all(is_bool(result()) not(result())) ) {stop}
i = Int_sum(i() 1)
}
}

Compare to C++ …
template
void foreach(List& list, bool (act*)(T&)) {
unsigned i, cnt; // or "unsigned long" depending on list limits
i = 0;
cnt = list.size()
for(; i < cnt; i++) {
if( act( list[i] ) ) break;
}
}

The C++ version is shorter, albeit useless (who would even use this thing?).

Supposing that we use iterators, some of the loop code disappears and the code becomes much faster:
foreach = fn( items act ) {
i = Iter(list:)
loop {
act(Elem(i:))
if ( not(Next(i:)) ) { stop }
}
}

I’ve excluded the result-checking code for the sake of clarity. This new method removes two lines of code.

A more practical C++ comparison…
for(Iter i = list.start(); i != list.end(); ++i) {
act( *i );
}

… assuming you don’t have to write out something like util::List<Object*>::Iter i = list.start();

The C++ code is still shorter (and faster, duh), but now we aren’t playing a game of double access time in either case. You might also notice that the C++ loop isn’t wrapped. Indeed, the Copper loop doesn’t need to be either.
i = Iter(list:)
loop {
# do stuff #
if ( not(Next(i:)) ) { stop }
}

The reason for wrapping in a function is the convenience of a one-line call. However, I do plan on making a for-loop function for lists on the C++ side. It would look identical to the foreach call I gave above (with the exception of being written “Foreach” instead of “foreach”).

Naming Conventions

Another thing I considered about lists was nomenclature. Lists are thought of on a next-previous basis, but we have functions like “push_front” and “push_back” which are entirely meaningless unless you know what they do. Is the side to which you are adding the “front” or the “back”? What about “pop” for stacks? If lists are used as stacks, would you pop the “front” or “back”? Sure, you might have the answer after programming for some time, but I felt like, if you really wanted these names, you could just change them in the extensions. lol. Oh the beauty of an embedded language. Anyways… I’ve greatly considered simplifying the mental model to “left” and “right” or “top” and “bottom”. In fact, the latter sound better. Consider the function names with this:

  • List( [item1, item2,…] )
  • Length( list )
  • Iter( list )
  • Go_up( iter )
  • Go_down( iter )
  • Elem( iter )
  • Insert_above( iter, elem )
  • Insert_below( iter, elem )

Admittedly, “Next” makes more sense in the context of a list, so an alias is fine (and the Copper VM allows you to add any function more than once under different names). However, in the case that I use these names, the bottom of the list would be the start. The mental model seems just as easy to teach because you can compare it to a stack of papers on a desk, to which you always add to the top and can work your way down if you wish.

You may be wondering how the iterators work. In C++, you might run the iterator right off the end of the list. In Copper, no such activity happens. For this reason, the iteration and check for the next element are done at the end of the loop: where the iterator can return true only if there is another element. If there isn’t the iterator doesn’t need to jump off the end of the list. If the list doesn’t have any elements, there doesn’t need to be an iterator. If you attempt to create an iterator for an empty list, you get an empty function. If you destroy a list while there are still iterators, each iterator saves the node it controlled, so you effectively don’t destroy anything.

I continue to sit around and daydream about how to break the system. The breakage is most likely to occur with iterators attempting to delete nodes themselves. Deletion by iterator is a necessary feature, in my opinion. After all, lists can be very long. But if the operation isn’t carefully handled, it’s easy to get a segfault – the thing at the top of the list of things I don’t want out of Copper.

Tuples

As an extension, tuples might provide a unique feature to Copper that I’m excluding with lists. Remember, lists aren’t going to have access by index, which makes them a poor choice for return values of small quantities. Consider:
a = { ret( List("name" Date(10)) ) }

Suppose the calling function only needs the second argument. Rather than require making an iterator, it would make more sense to allow direct access. Plus, it would make the “second argument” more meaningful (because something couldn’t be inserted before it). This is different from using function-classes and calling from the hash table, as in the following:
a = { ret( fn(name="name" date=Date(10)) ) }

Returning a tuple means that you can set a function without disconnecting its owning variable from the pointers that share the function. This is very important from a design standpoint, in my opinion, but at the moment, I don’t want to contrive what might be a complex (and contestable) example.

Assigning member names for a function also means locking the meaning of the alleged “second argument” into a name (not to mention, it may not actually be the second argument anymore), which, for lack of interest, programmers often end up naming something generic like “data”.

The tuple syntax could be very simple and have some operations that mimic tuples in other languages.
a = Tuple(1,2,"name")
Unpack(Tuple("name" Date()), b, c)
a = Pack(Tuple("name",1), "name", 2, Tuple("name", 3))
a = Item(Tuple("name", 1) 0)

The “Pack” function here would unpack any tuples it was given and combine all the data into a single tuple, as opposed to the Tuple function, which should not do any unpacking. I haven’t decided on whether contents being unpacked should be recombined into another tuple if there aren’t enough functions to unpack them into. There is some overhead cost associated with this, but it might make more sense in the long run.

One problem with creating a tuple extension is that it will rely on a numeric extension for obvious reasons. I suppose I could limit tuple sizes to a certain size and perform simple operations for extracting numbers. After all, who is going to make tuples huge other than crackers? For the sake of simplicity, I have greatly considered requiring that all number classes at least have a function that returns an unsigned int – a safe conversion – thereby making the number class more useful. Ironically, it would allow for indexing lists and thereby reduce the need for tuples.

If it does so happen that I implement list indexing, then the priority of implementing tuples would depend on the usefulness of features like “Pack” and “Unpack” and the need for a constant ordering of elements. I tend to avoid using tuples or similar structures in C++, and it is possible to avoid using them in Copper, but the important points here are if it is convenient and provides a simple alternative not otherwise available.

Advertisements

2 thoughts on “List and Tuple(1,2,3)

    1. foreach() just returns empty function. not(empty function) returns true (because default is false). But I don’t expect people to pass foreach to not() in any language. That doesn’t have any meaning.

      Notably, I forgot to check for empty lists before iterating in Copper. That’s another bool check.

      Liked by 1 person

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s