Zach Beane — Generic vs. plain functions
@20190822 13:28 · 2 days agoI really like this comment from Stas Boukarev on Methods or Functions, when do I use which on reddit:
[You] don’t make an extensible program by just making each function generic, if each function still makes some assumption on the data. You can’t say “oh, it’s extensible, but you have to define a hundred methods of your own”, an extensible protocol will require defining a handful of pertinent methods and then the hundred of other functions will automatically work on new classes.
Lispers.de — Berlin Lispers Meetup, Monday, 26th August 2019
@20190821 08:00 · 4 days agoWe meet again on Monday 8pm, 26th August. Our host this time is James Anderson (www.dydra.com).
Berlin Lispers is about all flavors of Lisp including Clojure, Scheme and Common Lisp.
Willem Broekema will talk about his workinprogress on "Developments in the AllegroGraph query engine".
We meet in the TautHaus at Engeldamm 70 in BerlinMitte, the bell is "James Anderson". It is located in 10min walking distance from U Moritzplatz or U Kottbusser Tor or Ostbahnhof. In case of questions call Christian +49 1578 70 51 61 4.
Nicolas Hafner — The End of Daily Gamedev  Confession 87
@20190820 10:32 · 4 days ago
It's been two months now since I started to do daily game development streams. I've been trying my best, but it is time for this to come to a close. In this article I'll talk about the various things that happened, why I'm stopping, and the future of the Leaf game. Strap in!
It's actually been slightly longer than two months, but since I missed some days due to being sick, and some others because I didn't feel like streaming  more on that later  I'll just count it as two months. In any case, in this time I've done 56 streams, almost all of them two hours long. That's a lot of hours, and I'm truly impressed that some people stuck around for almost all of them. Thank you very much! A lot happened in that time too, and I think it would be interesting to go over some of the major features and talk about them briefly.
New Features in Leaf
Slopes and Collision
Collision detection was heavily revised from the previous version. The general procedure is to scan the current chunk for hits until there are no more hits to be found. If we have more than ten hits we assume that the player is in a wall somehow and just die. The number ten is obviously arbitrary, but somehow it seems sufficient and I haven't had any accidental deaths yet.
When a hit is detected, it dispatches on the type of tile or entity that was collided with. It does so in two steps, the first is a test whether the collision will happen at all, to allow subtile precision, and the second is the actual collision resolution, should a full hit have been detected. The first test can be used to elide collisions with jumpthrough platforms or slopes if the player moves above the actual slope surface. The actual collision resolution is typically comprised of moving the player to the collision point, updating velocity along the hit normal, and finally zipping out of the ground if necessary to avoid floating point precision issues.
The collision detection of the slopes itself is surprisingly simple and works on the same principle as swept AABB tests: we can enlarge the slope triangle by simply moving the line towards the player by the player's halfsize. Once this shift is done we only need to do a rayline collision test. During resolution there's some slight physics cheating going on to make the player stick to the ground when going down a slope, rather than flying off, but that's it.
Packets and File Formats
Leaf defines a multitude of file formats. These formats are typically all defined around the idea of a packet  a collection of files in a directory hierarchy. The idea of a packet allows me to define these formats as both directly on disk, inmemory as some data structure, or encapsulated within an archive. The packet protocol isn't that complicated and I intend on either at least putting it into Trial, or putting it into its own library altogether. Either way, it allows the transparent implementation of these formats regardless of backing storage.
The actual formats themselves also follow a very similar file structure: a meta.lisp
file for a brief metadata header, which identifies the format, the version, and some authoring metadata fields. This file is in typical sexpression form and can be used to create a version object, which controls the loading and writing process of the rest of the format. In the current v0
, this usually means an extra data.lisp
payload file, and a number of other associated payload files like texture images.
The beauty of using generic functions with methods that specialise both on the version and object at the same time is that it allows me to define new versions in terms of select overrides, so that I can specify new behaviour for select classes, rather than having to redo the entire de/serialisation process, or breaking compatibility altogether.
Dialogue and Quests
The dialogue and quests are implemented as very generic systems that should have the flexibility (I hope) to deal with all the story needs I might have in the future. Dialogue is written in an extended dialect of Markless. For instance, the following is a valid dialogue snippet:
~ Fi
 (:happy) Well isn't this a sight for sore eyes!
 Finally a bit of sunshine!
 I don't like rain
~ Player
 I don't mind the rain, actually.
 Makes it easier to think.
 Yeah!
~ Player
 Yeah, it's been too long! Hopefully this isn't announcing the coming of a sandstorm.
! incf (favour 'fi)
 ...
! decf (favour 'fi)
~ Fi
 ? (< 3 (favour 'fi))
  So, what's our next move?
 ?
  Alright, good luck out there!
The list is translated into a choice for the player to make, which can impact the dialogue later. The way this is implemented is through a syntax extension in the clmarkless parser, followed by a compiler from the Markless AST to an assembly language, and a virtual machine to execute the assembly. The user of the dialogue system only needs to implement the evaluation of commands, the display of text, and the presentation of choices.
The quest system on the other hand is based on node graphs. Each quest is represented as a directed graph of task nodes, each describing a task the player must fulfil through an invariant and a success condition. On success, one or more successor tasks can be unlocked. Tasks can also spawn dialogue pieces to become available as interactions with NPCs or items. The system is smart enough to allow different, competing branches, as well as parallel branches to complete a quest. I intend on building a graph editor UI for this once Alloy is further along.
Both of these systems are, again, detached enough that I'll either put them into Trial, or put them into a completely separate library altogether. I'm sure I'll need to adjust things once I actually have some written story on hand to use these systems with.
Platforming AI
The platforming AI allows characters to move along the terrain just like the player would. This is extremely useful for story reasons, so that characters can naturally move to select points, or idle around places rather than just standing still. The way this is implemented is through a node graph that describes the possible movement options from one valid position to the next. This graph is built through a number of scanline passes over the tile map that either add new nodes or connect existing nodes together in new ways.
The result is a graph with nodes that can connect through walk, crawl, fall, or jump edges. A character can be moved along this graph by first running A* to find a shortest path to the target node, and then performing a realtime movement through the calculated path. Generally the idea is to always move the player in the direction of the next target node until that node has been reached, in which case it's popped off the path. The jump edges already encode the necessary jump parameters to use, so when reaching a jump node the character just needs to assume the initial velocity and let standard physics do the rest.
The implementation includes a simple visualiser so that you can see how characters would move across the chunk terrain. When the chunk terrain changes, the node graph is currently just recomputed from scratch which isn't fast, but then again during gameplay the chunk isn't going to change anyway so it's only really annoying during editing. I'll think about whether I want to implement incremental updates.
Lighting
Leaf has gone through two lighting systems. The old one worked through signed distance fields that were implicitly computed through a light description. New light types required new shader code to evaluate the SDF, and each light required many operations in the fragment stage, which is costly.
The new system uses two passes, in the first lights are rendered to a separate buffer. The lights are rendered like regular geometry, so we can use discrete polygons to define light areas, and use other fancy tricks like textured lights. In the second pass the fragment shader simply looks up the current fragment position in the light texture and mixes the colours together.
In effect this new system is easier to implement, more expressive, and much faster to run. Overall it's a massive win in almost every way I can imagine. There's further improvements I want to make still, such as shadow casting, dynamic daylights, and light absorption mapping to allow the light to dissipate into the ground gradually.
Alloy
Alloy is a new user interface toolkit that I've been working on as part of Leaf's development. I've been in need for a good UI toolkit that I can use within GL (and otherwise) for a while, and a lot of Leaf's features had to be stalled because I didn't have one yet. However, a lot of Alloy's development is also only very distantly related to game development itself, and hardly at all related to the game itself. Thus I think I'll talk more about Alloy in other articles sometime.
Why I'm Stopping
I initially started this daily stuff to get myself out of a rut. At the time I wasn't doing much at all, and that bothered me a lot, so committing to a daily endeavour seemed like a good way to kick myself out of it. And it was! For a long time it worked really well. I enjoyed the streams and made good progress with the game.
Unfortunately I have the tendency to turn things like this into enormous burdens for myself. The stream turned from something I wanted to do into something I felt I had to do, and then ultimately into something I dreaded doing. This has happened before with all of my projects, especially streaming ones. With streams I quickly feel a lot of pressure because I get the idea that people aren't enjoying the content, that it's just a boring waste of time. Maybe it is, or maybe it isn't, I don't know. Either way, having to worry about the viewers and not just the project I'm working on, especially trying to constrain tasks to interesting little features that can fit into two hours turns into a big constraint that I can't keep up anymore.
There's a lot of interesting work left to be done, sure, but I just can't bear things anymore at the moment. Dreading the stream poisoned a lot of the rest of my days and ultimately started to hurt my productivity and wellbeing over the past two weeks. Maybe I'll do more streams again at some point in the future, but for now I need a break for an indeterminate amount of time.
The Future of Leaf
Leaf isn't dead, though. I intend to keep working on it on my own, and I really do want to see it finished one day, however far away that day may be. Currently I feel like I need to focus on writing, which is a big challenge for me. I'm a very, very inexperienced writer, especially when it comes to longform stories and worldbuilding. There I have practically no idea on how to do anything. If you are a writer, or are interested in talking shop about stories, please contact me.
Other than writing I'm probably going to mostly work on Alloy in the immediate future. I hope to have a better idea of the writing once I'm done, and that should give rise to more features to implement in Leaf directly. I'll try to keep posting updates on the blog here as things progress in any case, and there's a few systems I'd like to elaborate on in technical articles as well.
Thanks to everyone who read my summaries, watched the streams or recordings, and chatted live during this time. It means a lot to me to see people genuinely interested in what I do.
Vsevolod Dyomkin — Programming Algorithms: Linked Lists
@20190819 12:32 · 5 days agoLinked data structures are in many ways the opposite of the contiguous ones that we have explored to some extent in the previous chapter using the example of arrays. In terms of complexity, they fail where those ones shine (first of all, at random access) — but prevail at scenarios when a repeated modification is necessary. In general, they are much more flexible and so allow the programmer to represent almost any kind of a data structure, although the ones that require such level of flexibility may not be too frequent. Usually, they are specialized trees or graphs.
The basic linked data structure is a singlylinked list.
Just like arrays, lists in Lisp may be created both with a literal syntax for constants and by calling a function — makelist
— that creates a list of a certain size filled with nil
elements. Besides, there's a handy list
utility that is used to create lists with the specified content (the analog of vec
).
CLUSER> '("hello" world 111)
("hello" WORLD 111)
CLUSER> (makelist 3)
(NIL NIL NIL)
CLUSER> (list "hello" 'world 111)
("hello" WORLD 111)
An empty list is represented as ()
and, interestingly, in Lisp, it is also a synonym of logical falsehood (nil
). This property is used very often, and we'll have a chance to see that.
If we were to introduce our own lists, which may be quite a common scenario in case the builtin ones' capabilities do not suit us, we'd need to define the structure "node", and our list would be built as a chain of such nodes. We might have wanted to store the list head and, possibly, tail, as well as other properties like size. All in all, it would look like the following:
(defstruct listcell
data
next)
(defstruct ourownlist
(head nil :type (or listcell null))
(tail nil :type (or listcell null)))
CLUSER> (let ((tail (makelistcell :data "world")))
(makeourownlist
:head (makelistcell
:data "hello"
:next tail)
:tail tail))
#S(OUROWNLIST
:HEAD #S(LISTCELL
:DATA "hello"
:NEXT #S(LISTCELL :DATA "world" :NEXT NIL))
:TAIL #S(LISTCELL :DATA "world" :NEXT NIL))
Lists as Sequences
Alongside arrays, list is the other basic data structure that implements the sequence abstract data type. Let's consider the complexity of basic sequence operations for linked lists:
 socalled random access, i.e. access by index of a random element, requires
O(n)
time as we have to traverse all the preceding elements before we can reach the desired one (n/2
operations on average)  yet, once we have reached some element, removing it or inserting something after it takes
O(1)
 subsequencing is also
O(n)
Getting the list length, in the basic case, is also O(n)
i.e. it requires full list traversal. It is possible, though, to store list length as a separate slot, tracking each change on the fly, which means O(1)
complexity. Lisp, however, implements the simplest variant of lists without size tracking. This is an example of a small but important decision that realworld programming is full of. Why is such a solution the right thing™, in this case? Adding the size counter to each list would have certainly made this common length
operation more effective, but the cost of doing that would've included: increase in occupied storage space for all lists, a need to update size in all list modification operations, and, possibly, a need for a more complex cons cell implementation[1]. These considerations make the situation with lists almost opposite to arrays, for which size tracking is quite reasonable because they change much less often and not tracking the length historically proved to be a terrible security decision. So, what side to choose? A default approach is to prefer the solution which doesn't completely rule out the alternative strategy. If we were to choose a simple conscell sans size (what the authors of Lisp did) we'll always be able to add the "smart" list data structure with the size field, on top of it. Yet, stripping the size field from builtin lists won't be possible. Similar reasoning is also applicable to other questions, such as: why aren't lists, in Lisp, doublylinked. Also, it helps that there's no security implication as lists aren't used as data exchange buffers, for which the problem manifests itself.
For demonstration, let's add the size field to ourownlist
(and, meanwhile, consider all the functions that will need to update it...):
(defstruct ourownlist
(head nil :type (or listcell nil))
(tail nil :type (or listcell nil))
(size 0 :type (integer 0)))
Given that obtaining the length of a list, in Lisp, is an expensive operation, a common pattern in programs that require multiple requests of the length field is to store its value in some variable at the beginning of the algorithm and then use this cached value, updating it if necessary.
As we see, lists are quite inefficient in random access scenarios. However, many sequences don't require random access and can satisfy all the requirements of a particular use case using just the sequential one. That's one of the reasons why they are called sequences, after all. And if we consider the special case of list operations at index 0 they are, obviously, efficient: both access and addition/removal is O(1)
. Also, if the algorithm requires a sequential scan, list traversal is rather efficient too, although not as good as array traversal for it still requires jumping over the memory pointers. There are numerous sequence operations that are based on sequential scans. The most common is map
, which we analyzed in the previous chapter. It is the functional programming alternative to looping, a more highlevel operation, and thus simpler to understand for the common cases, although less versatile.
map
is a function that works with different types of builtin sequences. It takes as the first argument the target sequence type (if nil
is supplied it won't create the resulting sequence and so will be used just for sideeffects). Here is a polymorphic example involving lists and vectors:
CLUSER> (map 'vector '+
'(1 2 3 4 5)
#(1 2 3))
#(2 4 6)
map
applies the function provided as its second argument (here, addition) sequentially to every element of the sequences that are supplied as other arguments, until one of them ends, and records the result in the output sequence. map
would have been even more intuitive, if it just had used the type of the first argument for the result sequence, i.e. be a "do what I mean" dwimmap
, while a separate advanced variant with resulttype selection might have been used in the background. Unfortunately, the current standard scheme is not for change, but we can define our own wrapper function:
(defun dwimmap (fn seq &rest seqs)
"A thin wrapper over MAP that uses the first SEQ's type for the result."
(apply 'map (typeof seq) fn seqs))
map
in Lisp is, historically, used for lists. So there's also a number of listspecific map variants that predated the generic map
, in the earlier versions of the language, and are still in wide use today. These include mapcar
, mapc
, and mapcan
(replaced in RUTILS by a safer flatmap
). Now, let's see a couple of examples of using mapping. Suppose that we'd like to extract odd numbers from a list of numbers. Using mapcar
as a listspecific map
we might try to call it with an anonymous function that tests its argument for oddity and keeps them in such case:
CLUSER> (mapcar (lambda (x) (when (oddp x) x))
(range 1 10))
(1 NIL 3 NIL 5 NIL 7 NIL 9)
However, the problem is that nonodd numbers still have their place reserved in the result list, although it is not filled by them. Keeping only the results that satisfy (or don't) certain criteria and discarding the others is a very common pattern that is known as "filtering". There's a set of Lisp functions for such scenarios: remove
, removeif
, and removeifnot
, as well as RUTILS' complements to them keepif
and keepifnot
. We can achieve the desired result adding remove
to the picture:
CLUSER> (remove nil (mapcar (lambda (x) (when (oddp x) x))
(range 1 10)))
(1 3 5 7 9)
A more elegant solution will use the removeif(not)
or keepif(not)
variants. removeifnot
is the most popular among these functions. It takes a predicate and a sequence and returns the sequence of the same type holding only the elements that satisfy the predicate:
CLUSER> (removeifnot 'oddp (range 1 10))
(1 3 5 7 9)
Using such highlevel mapping functions is very convenient, which is why there's a number of other if(not)
operations, like find(if(not))
, member(if(not))
, position(if(not))
, etc.
The implementation of mapcar
or any other list mapping function, including your own taskspecific variants, follows the same pattern of traversing the list accumulating the result into another list and reversing it, in the end:
(defun simplemapcar (fn list)
(let ((rez ()))
(dolist (item list)
(:= rez (cons (call fn item) rez)))
(reverse rez)))
The function cons
is used to add an item to the beginning of the list. It creates a new list head that points to the previous list as its tail.
From the complexity point of view, if we compare such iteration with looping over an array we'll see that it is also a linear traversal that requires twice as many operations as with arrays because we need to traverse the result fully once again, in the end, to reverse it. Its advantage, though, is higher versatility: if we don't know the size of the resulting sequence (for example, in the case of removeifnot
) we don't have to change anything in this scheme and just add a filter line ((when (oddp item) ...
), while for arrays we'd either need to use a dynamic array (that will need constant resizing and so have at least the same double number of operations) or preallocate the fullsized result sequence and then downsize it to fit the actual accumulated number of elements, which may be problematic when we deal with large arrays.
Lists as Functional Data Structures
The distinction between arrays and linked lists in many ways reflects the distinction between the imperative and functional programming paradigms. Within the imperative or, in this context, procedural approach, the program is built out of lowlevel blocks (conditionals, loops, and sequentials) that allow for the most finetuned and efficient implementation, at the expense of abstraction level and modularization capabilities. It also heavily utilizes inplace modification and manual resource management to keep overhead at a minimum. An array is the most suitable datastructure for such a way of programming. Functional programming, on the contrary, strives to bring the abstraction level higher, which may come at a cost of sacrificing efficiency (only when necessary, and, ideally, only for noncritical parts). Functional programs are built by combining referentially transparent computational procedures (aka "pure functions") that operate on more advanced data structures (either persistent ones or having special access semantics, e.g. transactional) that are also more expensive to manage but provide additional benefits.
Singlylinked lists are a simple example of functional data structures. A functional or persistent data structure is the one that doesn't allow inplace modification. In other words, to alter the contents of the structure a fresh copy with the desired changes should be created. The flexibility of linked data structures makes them suitable for serving as functional ones. We have seen the cons
operation that is one of the earliest examples of nondestructive, i.e. functional, modification. This action prepends an element to the head of a list, and as we're dealing with the singlylinked list the original doesn't have to be updated: a new cons cell is added in front of it with its next
pointer referencing the original list that becomes the new tail. This way, we can preserve both the pointer to the original head and add a new head. Such an approach is the basis for most of the functional data structures: the functional trees, for example, add a new head and a new route from the head to the newly added element, adding new nodes along the way — according to the same principle.
It is interesting, though, that lists can be used in destructive and nondestructive fashion likewise. There are both low and highlevel functions in Lisp that perform list modification, and their existence is justified by the use cases in many algorithms. Purely functional lists render many of the efficient list algorithms useless. One of the highlevel list modification function is nconc
. It concatenates two lists together updating in the process the next
pointer of the last cons cell of the first list:
CLUSER> (let ((l1 (list 1 2 3))
(l2 (list 4 5 6)))
(nconc l1 l2) ; note no assignment to l1
l1) ; but it is still changed
(1 2 3 4 5 6)
There's a functional variant of this operation, append
, and, in general, it is considered distasteful to use nconc
for two reasons:
 the risk of unwarranted modification
 funny enough, the implementation of
nconc
, actually, isn't mandated to be more efficient than that ofappend
So, forget nconc
, append
all the lists!
Using append
we'll need to modify the previous piece of code because otherwise the newly created list will be garbagecollected immediately:
CLUSER> (let ((l1 (list 1 2 3))
(l2 (list 4 5 6)))
(:= l1 (append l1 l2))
l1)
(1 2 3 4 5 6)
The lowlevel list modification operations are rplaca
and rplacd
. They can be combined with listspecific accessors nth
and nthcdr
that provide indexed access to list elements and tails respectively. Here's, for example, how to add an element in the middle of a list:
CLUSER> (let ((l1 (list 1 2 3)))
(rplacd (nthcdr 0 l1)
(cons 4 (nthcdr 1 l1)))
l1)
(1 4 2 3)
Just to reiterate, although functional list operations are the default choice, for efficient implementation of some algorithms, you'll need to resort to the ugly destructive ones.
Different Kinds of Lists
We have, thus far, seen the most basic linked list variant — a singlylinked one. It has a number of limitations: for instance, it's impossible to traverse it from the end to the beginning. Yet, there are many algorithms that require accessing the list from both sides or do other things with it that are inefficient or even impossible with the singlylinked one, hence other, more advanced, list variants exist.
But first, let's consider an interesting tweak to the regular singlylinked list — a circular list. It can be created from the normal one by making the last cons cell point to the first. It may seem like a problematic data structure to work with, but all the potential issues with infinite looping while traversing it are solved if we keep a pointer to any node and stop iteration when we encounter this node for the second time. What's the use for such structure? Well, not so many, but there's a prominent one: the ring buffer. A ring or circular buffer is a structure that can hold a predefined number of items and each item is added to the next slot of the current item. This way, when the buffer is completely filled it will wrap around to the first element, which will be overwritten at the next modification. By our bufferfilling algorithm, the element to be overwritten is the one that was written the earliest for the current item set. Using a circular linked list is one of the simplest ways to implement such a buffer. Another approach would be to use an array of a certain size moving the pointer to the next item by incrementing an index into the array. Obviously, when the index reaches array size it should be reset to zero.
A more advanced list variant is a doublylinked one, in which all the elements have both the next
and previous
pointers. The following definition, using inheritance, extends our original listcell
with a pointer to the previous element. Thanks to the basic objectoriented capabilities of structs, it will work with the current definition of ourownlist
as well, and allow it to function as a doublylinked list.
(defstruct (listcell2 (:include listcell))
prev)
Yet, we still haven't shown the implementation of the higherlevel operations of adding and removing an element to/from ourownlist
. Obviously, they will differ for singly and doublylinked lists, and that distinction will require us to differentiate the doublylinked list types. That, in turn, will demand invocation of a rather heavy OOmachinery, which is beyond the subject of this book. Instead, for now, let's just examine the basic list addition function, for the doublylinked list:
(defun ourcons2 (data list)
(when (null list) (:= list (makeourownlist)))
(let ((newhead (makelistcell2
:data data
:next (when list @list.head))))
(:= @list.head.prev newhead)
(makeourownlist
:head newhead
:tail @list.tail
:size (1+ @list.size))))
The first thing to note is the use of the @
syntactic sugar, from RUTILS, that implements the mainstream dot notation for slotvalue access (i.e. @list.head.prev
refers to the prev
field of the head
field of the provided list
structure of the assumed ourownlist
type, which in a more classically Lispy, although cumbersome, variants may look like one of the following: (ourcons2prev (ourownlisthead list))
or (slotvalue (slotvalue list 'head) 'prev)
[2]).
More important here is that, unlike for the singlylinked list, this function requires an inplace modification of the head element of the original list: setting its prev
pointer. Immediately making doublylinked lists nonpersistent.
Finally, the first line is the protection against trying to access the null list (that will result in a muchfeared, especially in Javaland, nullpointer exception class of error).
At first sight, it may seem that doublylinked lists are more useful than singlylinked ones. But they also have higher overhead so, in practice, they are used quite sporadically. We may see just a couple of use cases on the pages of this book. One of them is presented in the next part — a doubleended queue.
Besides doublylinked, there are also association lists that serve as a variant of keyvalue data structures. At least 3 types may be found in Common Lisp code, and we'll briefly discuss them in the chapter on keyvalue structures. Finally, a skip list is a probabilistic data structure based on singlylinked lists, that allows for faster search, which we'll also discuss in a separate chapter on probabilistic structures. Other more esoteric list variants, such as selforganized list and XORlist, may also be found in the literature — but very rarely, in practice.
FIFO & LIFO
The flexibility of lists allows them to serve as a common choice for implementing a number of popular abstract data structures.
Queue
A queue or FIFO has the following interface:
enqueue
an item at the enddequeue
the first element: get it and remove it from the queue
It imposes a firstinfirstout (FIFO) ordering on the elements. A queue can be implemented directly with a singlylinked list like ourownlist
. Obviously, it can also be built on top of a dynamic array but will require permanent expansion and contraction of the collection, which, as we already know, isn't the preferred scenario for their usage.
There are numerous uses for the queue structures for processing items in a certain order (some of which we'll see in further chapters of this book).
Stack
A stack or LIFO (lastinfirstout) is even simpler than a queue, and it is used even more widely. Its interface is:
push
an item on top of the stack making it the first elementpop
an item from the top: get it and remove it from the stack
A simple Lisp list can serve as a stack, and you can see such uses in almost every file with Lisp code. The most common pattern is result accumulation during iteration: using the stack interface, we can rewrite simplemapcar
in an even simpler way (which is idiomatic Lisp):
(defun simplemapcar (fn list)
(let ((rez ()))
(dolist (item list)
(push (call fn item) rez))
(reverse rez)))
Stacks hold elements in reversechronological order and can thus be used to keep the history of changes to be able to undo them. This feature is used in procedure calling conventions by the compilers: there exists a separate segment of program memory called the Stack segment, and when a function call happens (beginning from the program's entry point called the main
function in C) all of its arguments and local variables are put on this stack as well as the return address in the program code segment where the call was initiated. Such an approach allows for the existence of local variables that last only for the duration of the call and are referenced relative to the current stack head and not bound to some absolute position in memory like the global ones. After the procedure call returns, the stack is "unwound" and all the local data is forgotten returning the context to the same state in which it was before the call. Such stackbased historykeeping is a very common and useful pattern that may be utilized in userland code likewise.
Lisp itself also uses this trick to implement global variables with a capability to have contextdependent values through the extent of let
blocks: each such variable also has a stack of values associated with it. This is one of the most underappreciated features of the Lisp language used quite often by experienced lispers. Here is a small example with a standard global variable (they are called special in Lisp parlance due to this special property) *standardoutput*
that stores a reference to the current output stream:
CLUSER> (print 1)
1
1
CLUSER> (let ((*standardoutput* (makebroadcaststream)))
(print 1))
1
In the first call to print, we see both the printed value and the returned one, while in the second — only the return value of the print function, while it's output is sent, effectively, to /dev/null.
Stacks can be also used to implement queues. We'll need two of them to do that: one will be used for enqueuing the items and another — for dequeuing. Here's the implementation:
(defstruct queue
head
tail)
(defun enqueue (item queue)
(push item @queue.head))
(defun dequeue (queue)
;; Here and in the next condition, we use the property that an empty list
;; is also logically false. This is discouraged by many Lisp styleguides,
;; but, in many cases, such code is not only more compact but also more clear.
(unless @queue.tail
(do ()
((null @queue.head)) ; this loop continues until head becomes empty
(push (pop @queue.head) @queue.tail)))
;; By pushing all the items from the head to the tail we reverse
;; their order — this is the second reversing that cancels the reversing
;; performed when we push the items to the head and it restores the original order.
(when @queue.tail
(values (pop @queue.tail)
t))) ; this second value is used to indicate that the queue was not empty
CLUSER> (let ((q (makequeue)))
(print q)
(enqueue 1 q)
(enqueue 2 q)
(enqueue 3 q)
(print q)
(print q)
(dequeue q)
(print q)
(enqueue 4 q)
(print q)
(dequeue q)
(print q)
(dequeue q)
(print q)
(dequeue q)
(print q)
(dequeue q))
#S(QUEUE :HEAD NIL :TAIL NIL)
#S(QUEUE :HEAD (3 2 1) :TAIL NIL)
#S(QUEUE :HEAD (3 2 1) :TAIL NIL)
#S(QUEUE :HEAD NIL :TAIL (2 3))
#S(QUEUE :HEAD (4) :TAIL (2 3))
#S(QUEUE :HEAD (4) :TAIL (3))
#S(QUEUE :HEAD (4) :TAIL NIL)
#S(QUEUE :HEAD NIL :TAIL NIL)
NIL ; no second value indicates that the queue is now empty
Such queue implementation still has O(1)
operation times for enqueue
/dequeue
. Each element will experience exactly 4 operations: 2 push
s and 2 pop
s (for the head
and tail
).
Another stackbased structure is the stack with a minimum element, i.e. some structure that not only holds elements in LIFO order but also keeps track of the minimum among them. The challenge is that if we just add the min
slot that holds the current minimum, when this minimum is pop
ped out of the stack we'll need to examine all the remaining elements to find the new minimum. We can avoid this additional work by adding another stack — a stack of minimums. Now, each push
and pop
operation requires us to also check the head of this second stack and, in case the added/removed element is the minimum, push
it to the stack of minimums or pop
it from there, accordingly.
A wellknown algorithm that illustrates stack usage is fullyparenthesized arithmetic expressions evaluation:
(defun aritheval (expr)
"EXPR is a list of symbols that may include:
square brackets, arithmetic operations, and numbers."
(let ((ops ())
(vals ())
(op nil))
(dolist (item expr)
(case item
([ ) ; do nothing
((+  * /) (push item ops))
(] (:= op (pop ops)
val (pop vals))
(case op
(+ (:+ val (pop vals)))
( (: val (pop vals)))
(* (:* val (pop vals)))
(/ (:/ val (pop vals))))
(push val vals))
(otherwise (push item vals))))
(pop vals)))
CLUSER> (aritheval '([ 1 + [ [ 2 + 3 ] * [ 4 * 5 ] ] ] ]))
101
Deque
A deque is a short name for a doubleended queue, which can be traversed in both orders: FIFO and LIFO. It has 4 operations: pushfront
and pushback
(also called shift
), popfront
and popback
(unshift
). This structure may be implemented with a doublylinked list or likewise a simple queue with 2 stacks. The difference for the 2stacks implementation is that now items may be pushed back and forth between head
and tail
depending on the direction we're pop
ping from, which results in worstcase linear complexity of such operations: when there's constant alteration of front and back directions.
The use case for such structure is the algorithm that utilizes both direct and reverse ordering: a classic example being jobstealing algorithms, when the main worker is processing the queue from the front, while other workers, when idle, may steal the lowest priority items from the back (to minimize the chance of a conflict for the same job).
Stacks in Action: SAX Parsing
Custom XML parsing is a common task for those who deal with different datasets, as many of them come in XML form, for example, Wikipedia and other Wikidata resources. There are two main approaches to XML parsing:
 DOM parsing reads the whole document and creates its tree representation in memory. This technique is handy for small documents, but, for huge ones, such as the dump of Wikipedia, it will quickly fill all available memory. Also, dealing with the deep tree structure, if you want to extract only some specific pieces from it, is not very convenient.
 SAX parsing is an alternative variant that uses the stream approach. The parser reads the document and, upon completing the processing of a particular part, invokes the relevant callback: what to do when an open tag is read, when a closed one, and with the contents of the current element. These actions happen for each tag, and we can think of the whole process as traversing the document tree utilizing the socalled "visitor pattern": when visiting each node we have a chance to react after the beginning, in the middle, and in the end.
Once you get used to SAX parsing, due to its simplicity, it becomes a tool of choice for processing XML, as well as JSON and other formats that allow for a similar stream parsing approach. Often the simplest parsing pattern is enough: remember the tag we're looking at, and when it matches a set of interesting tags, process its contents. However, sometimes, we need to make decisions based on the broader context. For example, let's say, we have the text marked up into paragraphs, which are split into sentences, which are, in turn, tokenized. To process such a threelevel structure, with SAX parsing, we could use the following outline (utilizing CXML library primitives):
(defclass textsax (sax:saxparsermixin)
((parags :initform nil :accessor saxparags)
(parag :initform nil :accessor saxparag)
(sent :initform nil :accessor saxsent)
(tag :initform nil :accessor saxtag)))
(defmethod sax:startelement ((sax textsax)
namespaceuri localname qname attrs)
(declare (ignore namespaceuri qname attrs))
(:= (saxtag sax) (mkeyw localname))
(defmethod sax:endelement ((sax textsax)
namespaceuri localname qname)
(declare (ignore namespaceuri qname))
(withslots (tag parags sent) sax
(case tag
(:paragraph (push (reverse parag) parags)
(:= parag nil))
(:sentence (push (reverse sent) parag)
(:= sent nil)))))
(defmethod sax:characters ((sax textsax) text)
(when (eql :token (saxtag sax))
(push text (saxsent sax)))
(defmethod sax:enddocument ((sax textsax))
(reverse (saxparags sax)))
This code will return the accumulated structure of paragraphs from the sax:enddocument
method. And two stacks: the current sentence and the current paragraph are used to accumulate intermediate data while parsing. In a similar fashion, another stack of encountered tags might have been used to exactly track our position in the document tree if there were such necessity. Overall, the more you'll be using SAX parsing, the more you'll realize that stacks are enough to address 99% of the arising challenges.
Lists as Sets
Another very important abstract data structure is a Set. It is a collection that holds each element only once no matter how many times we add it there. This structure may be used in a variety of cases: when we need to track the items we have already seen and processed, when we want to calculate some relations between groups of elements,s and so forth.
Basically, its interface consists of settheoretic operations:
 add/remove an item
 check whether an item is in the set
 check whether a set is a subset of another set
 union, intersection, difference, etc.
Sets have an interesting aspect that an efficient implementation of elementwise operations (add/remove/member) and setwise (union/...) require the use of different concrete datastructures, so a choice should be made depending on the main use case. One way to implement sets is by using linked lists. Lisp has standard library support for this with the following functions:
adjoin
to add an item to the list if it's not already theremember
to check for item presence in the setsubsetp
for subset relationship queryunion
,intersection
,setdifference
, andsetexclusiveor
for set operations
This approach works well for small sets (up to tens of elements), but it is rather inefficient, in general. Adding an item to the set or checking for membership will require O(n)
operations, while, in the hashset (that we'll discuss in the chapter on keyvalue structures), these are O(1)
operations. A naive implementation of union
and other settheoretic operations will require O(n^2)
as we'll have to compare each element from one set with each one from the other. However, if our set lists are in sorted order settheoretic operations can be implemented efficiently in just O(n)
where n
is the total number of elements in all sets, by performing a single linear scan over each set in parallel. Using a hashset will also result in the same complexity.
Here is a simplified implementation of union
for sets of numbers built on sorted lists:
(defun sortedunion (s1 s2)
(let ((rez ()))
(do ()
((and (null s1) (null s2)))
(let ((i1 (first s1))
(i2 (first s2)))
(cond ((null i1) (dolist (i2 s2)
(push i2 rez))
(return))
((null i2) (dolist (i1 s1)
(push i1 rez))
(return))
((= i1 i2) (push i1 rez)
(:= s1 (rest s1)
s2 (rest s2)))
((< i1 i2) (push i1 rez)
(:= s1 (rest s1)))
;; just T may be used instead
;; of the following condition
((> i1 i2) (push i2 rez)
(:= s2 (rest s2))))))
(reverse rez)))
CLUSER> (sortedunion '(1 2 3)
'(0 1 5 6))
(0 1 2 3 5 6)
This approach may be useful even for unsorted listbased sets as sorting is a merely O(n * log n)
operation. Even better though, when the use case requires primarily settheoretic operations on our sets and the number of changes/membership queries is comparatively low, the most efficient technique may be to keep the lists sorted at all times.
Merge Sort
Speaking about sorting, the algorithms we discussed for array sorting in the previous chapter do not work as efficient for lists for they are based on swap operations, which are O(n)
, in the list case. Thus, another approach is required, and there exist a number of efficient list sorting algorithms, the most prominent of which is Merge sort. It works by splitting the list into two equal parts until we get to trivial oneelement lists and then merging the sorted lists into the bigger sorted ones. The merging procedure for sorted lists is efficient as we've seen in the previous example. A nice feature of such an approach is its stability, i.e. preservation of the original order of the equal elements, given the proper implementation of the merge procedure.
(defun mergesort (list comp)
(if (null (rest list))
list
(let ((half (floor (length list) 2)))
(mergelists (mergesort (subseq seq 0 half) comp)
(mergesort (subseq seq half) comp)
comp))))
(defun mergelists (l1 l2 comp)
(let ((rez ())
(do ()
((and (null l1) (null l2)))
(let ((i1 (first l1))
(i2 (first l2)))
(cond ((null i1) (dolist (i l2)
(push i rez))
(return))
((null i2) (dolist (i l1)
(push i rez))
(return))
((call comp i1 i2) (push i1 rez)
(:= l1 (rest l1)))
(t (push i2 rez)
(:= l2 (rest l2))))))
(reverse rez)))
The same complexity analysis as for binary search applies to this algorithm. At each level of the recursion tree, we perform O(n)
operations: each element is pushed into the resulting list once, reversed once, and there are at most 4 comparison operations: 3 null checks and 1 call of the comp
function. We also need to perform one copy per element in the subseq
operation and take the length of the list (although it can be memorized and passed down as the function call argument) on the recursive descent. This totals to not more than 10 operations per element, which is a constant. And the height of the tree is, as we already know, (log n 2)
. So, the total complexity is O(n * log n)
.
Let's now measure the real time needed for such sorting, and let's compare it to the time of prodsort
(with optimal array accessors) from the Arrays chapter:
CLUSER> (with ((lst (randomlist 10000))
(vec (makearray 10000 :initialcontents lst)))
(printsorttimings "Prod" 'prodsort vec)
(printsorttimings "Merge " 'mergesort lst))
= Prodsort of random vector =
Evaluation took:
0.048 seconds of real time
= Prodsort of sorted vector =
Evaluation took:
0.032 seconds of real time
= Prodsort of reverse sorted vector =
Evaluation took:
0.044 seconds of real time
= Merge sort of random vector =
Evaluation took:
0.007 seconds of real time
= Merge sort of sorted vector =
Evaluation took:
0.007 seconds of real time
= Merge sort of reverse sorted vector =
Evaluation took:
0.008 seconds of real time
Interestingly enough, Merge sort turned out to be around 5 times faster, although it seems that the number of operations required at each level of recursion is at least 23 times bigger than for quicksort. Why we got such result is left as an exercise to the reader: I'd start from profiling the function calls and looking where most of the time is wasted...
It should be apparent that the mergelists
procedure works in a similar way to settheoretic operations on sorted lists that we've discussed in the previous part. It is, in fact, provided in the Lisp standard library. Using the standard merge
, Merge sort may be written in a completely functional and also generic way to support any kind of sequences:
(defun mergesort (seq comp)
(if (or (null seq) ; avoid expensive length calculation for lists
(<= (length seq) 1))
seq
(let ((half (floor (length seq) 2)))
(merge (typeof seq)
(mergesort (subseq seq 0 half) comp)
(mergesort (subseq seq half) comp)
comp))))
There's still one substantial difference of Merge sort from the array sorting functions: it is not inplace. So it also requires the O(n * log n)
additional space to hold the half sublists that are produced at each iteration. Sorting and merging them inplace is not possible. There are ways to somewhat reduce this extra space usage but not totally eliminate it.
Parallelization of Merge Sort
The extraspace drawback of Merge sort may, however, turn irrelevant if we consider the problem of parallelizing this procedure. The general idea of parallelized implementation of any algorithm is to split the work in a way that allows reducing the runtime proportional to the number of workers performing those jobs. In the ideal case, if we have m
workers and are able to spread the work evenly the running time should be reduced by a factor of m
. For the Merge sort, it will mean just O(n/m * log n)
. Such ideal reduction is not always achievable, though, because often there are bottlenecks in the algorithm that require all or some workers to wait for one of them to complete its job.
Here's a trivial parallel Merge sort implementation that uses the eagerfuture2
library, which adds highlevel data parallelism capabilities based on the Lisp implementation's multithreading facilities:
(defun parallelmergesort (seq comp)
(if (or (null seq) (<= (length seq) 1))
seq
(with ((half (floor (length seq) 2))
(thread1 (eagerfuture2:pexec
(mergesort (subseq seq 0 half) comp)))
(thread2 (eagerfuture2:pexec
(mergesort (subseq seq half) comp))))
(merge (typeof seq)
(eagerfuture2:yield thread1)
(eagerfuture2:yield thread2)
comp))))
The eagerfuture2:pexec
procedure submits each mergesort
to the thread pool that manages multiple CPU threads available in the system and continues program execution not waiting for it to return. While eagerfuture2:yield
pauses execution until the thread performing the appropriate mergesort
returns.
When I ran our testing function with both serial and parallel merge sorts on my machine, with 4 CPUs, I got the following result:
CLUSER> (with ((lst1 (randomlist 10000))
(lst2 (copylist lst1)))
(printsorttimings "Merge " 'mergesort lst1)
(printsorttimings "Parallel Merge " 'parallelmergesort lst2))
= Merge sort of random vector =
Evaluation took:
0.007 seconds of real time
114.29% CPU
= Merge sort of sorted vector =
Evaluation took:
0.006 seconds of real time
116.67% CPU
= Merge sort of reverse sorted vector =
Evaluation took:
0.007 seconds of real time
114.29% CPU
= Parallel Merge sort of random vector =
Evaluation took:
0.003 seconds of real time
266.67% CPU
= Parallel Merge sort of sorted vector =
Evaluation took:
0.003 seconds of real time
266.67% CPU
= Parallel Merge sort of reverse sorted vector =
Evaluation took:
0.005 seconds of real time
220.00% CPU
A speedup of approximately 2x, which is also reflected by the rise in CPU utilization from around 100% (i.e. 1 CPU) to 250%. These are correct numbers as the merge procedure is still executed serially and remains the bottleneck. There are more sophisticated ways to achieve optimal m
times speedup, in Merge sort parallelization, but we won't discuss them here due to their complexity.
Lists and Lisp
Historically, Lisp's name originated as an abbreviation of "List Processing", which points both to the significance that lists played in the language's early development and also to the fact that flexibility (a major feature of lists) was always a cornerstone of its design. Why are lists important to Lisp? Maybe, originally, it was connected with the availability and the good support of this data structure in the language itself. But, quickly, the focus shifted to the fact that, unlike other languages, Lisp code is input in the compiler not in a custom stringbased format but in the form of nested lists that directly represent the syntax tree. Coupled with superior support for the list data structure, it opens numerous possibilities for programmatic processing of the code itself, which are manifest in the macro system, code walkers and generators, etc. So, "List Processing" turns out to be not about lists of data, but about lists of code, which perfectly describes the main distinctive feature of this language...
Footnotes:
[1] While, in the Lisp machines, cons cells even had special hardware support, and such change would have made it useless.
[2] Although, for structs, it is implementationdependent if this will work. In the major implementations, it will.
Quicklisp news — August 2019 Quicklisp dist update now available
@20190813 14:38 · 11 days ago cardiogram — Simple test framework — MIT
 cesdi — Provides a computeeffectiveslotdefinitioninitargs generic function that allows for more ergonomic initialization of effective slot definition objects. — Unlicense
 chameleon — Configuration management facilities for Common Lisp with multiple profile support. — MIT
 ciutils — A set of tools for using CI platforms — MIT
 clclsparse — Common Lisp bindings for clSPARSE — Apache License, Version 2.0
 clflattree — A flattree implementation in Common Lisp. — MIT
 clshlex — Lexical analyzer for simple shelllike syntax. — MIT
 datamuse — Common Lisp library for accessing the Datamuse wordfinding API — MIT
 datecalc — Package for simple date calculation — GPL or Artistic
 linearprogramming — A library for solving linear programming problems — MIT
 orizuruorm — An ORM for Common Lisp and PostgreSQL. — GPLv3
 stripe — A client for the Stripe payment API. — MIT
 trivialextensiblesequences — Portability library for the extensible sequences protocol. — zlib
 trivialpackagelocalnicknames — Portability library for packagelocal nicknames — Public domain
To get this update, use (ql:updatedist "quicklisp"). Enjoy!
Vsevolod Dyomkin — Programming Algorithms: Arrays
@20190812 13:37 · 12 days agoArrays are, alongside structs, the most basic data structure and, at the same time, the default choice for implementing algorithms. A onedimensional array that is also called a "vector" is a contiguous structure consisting of the elements of the same type. One of the ways to create such arrays, in Lisp, is this:
CLUSER> (makearray 3)
#(0 0 0)
The printed result is the literal array representation. It happens that the array is shown to hold 0's, but that's implementationdependent. Additional specifics can be set during array initialization: for instance, the :elementtype
, :initialelement
, and even full contents:
CLUSER> (makearray 3 :elementtype 'list :initialelement nil)
#(NIL NIL NIL)
CLUSER> (makearray 3 :initialcontents '(1.0 2.0 3.0))
#(1.0 2.0 3.0)
If you read back such an array you'll get a new copy with the same contents:
CLUSER> #(1.0 2.0 3.0)
#(1.0 2.0 3.0)
It is worth noting that the element type restriction is, in fact, not a limitation the default type is T
[1]. In this case, the array will just hold pointers to its elements that can be of arbitrary type. If we specify a more precise type, however, the compiler might be able to optimize storage and access by putting the elements in memory directly in the array space. This is, mainly, useful for numeric arrays, but it makes multiple orders of magnitude difference for them for several reasons, including the existence of vector CPU instructions that operate on such arrays.
The arrays we have created are mutable, i.e. we can change their contents, although we cannot resize them. The main operator to access array elements is aref
. You will see it in those pieces of code, in this chapter, where we care about performance.
CLUSER> (let ((vec (makearray 3 :initialcontents '(1.0 2.0 3.0))))
(print (aref vec 0))
(print (? vec 1))
(:= (aref vec 2) 4.0))
(print (? vec 2))
(aref vec 3))
1.0
2.0
4.0
; Evaluation aborted on #<SIMPLETYPEERROR expectedtype: (MOD 3) datum: 3>
In Lisp, array access beyond its boundary, as expected, causes an error.
It is also possible to create constant arrays using the literal notation #()
. These constants can, actually, be changed in some environments, but don't expect anything nice to come out of such abuse — and the compiler will warn you of that:
CLUSER> (let ((vec #(1.0 2.0 3.0)))
(:= (aref vec 2) nil)
(print vec))
; caught WARNING:
; Destructive function (SETF AREF) called on constant data.
; See also:
; The ANSI Standard, Special Operator QUOTE
; The ANSI Standard, Section 3.2.2.3
;
; compilation unit finished
; caught 1 WARNING condition
#(1.0 2.0 NIL)
RUTILS provides more options to easily create arrays with a shorthand notation:
CLUSER> #v(1 2 3)
#(1 2 3)
CLUSER> (vec 1 2 3)
#(1 2 3)
Although the results seem identical they aren't. The first version creates a mutable analog of #(1 2 3)
, and the second also makes it adjustable (we'll discuss adjustable or dynamic arrays next).
Arrays as Sequences
Vectors are one of the representatives of the abstract sequence
container type that has the following basic interface:
 inquire the length of a sequence — performed in Lisp using the function
length
 access the element by index — the RUTILS
?
operator is the most generic variant while the native one for arrays isaref
and a more generalelt
, for all builtin sequences (this also includes lists and, in some implementations, userdefined, socalled, extensible sequences)  get the subsequence — the standard provides the function
subseq
for this purpose
These methods have some specific that you should mind:
 the
length
function, for arrays, works inO(1)
time as length is tracked in the array structure. There is an alternative (more primitive) way to handle arrays, employed, primarily, in C when the length is not stored, and, instead, there's a special termination "symbol" that indicates the end of an array. For instance, C strings have a'\0'
termination character, and arrays representing commandline arguments, in the Unix syscalls API for such functions asexec
, are terminated with nullpointers. Such an approach is, first of all, not efficient from the algorithmic point of view as it requiresO(n)
time to query the array's length. But, what's even more important, it has proven to be a source of a number of catastrophic security vulnerabilities — the venerable "buffer overflow" family of errors  the
subseq
function creates a copy of the part of its argument, which is an expensive operation. This is the functional approach that is a proper default, but many of the algorithms don't involve subarray mutation, and, for them, a more efficient variant would be to use a sharedstructure variant that doesn't make a copy but merely returns a pointer into the original array. Such option is provided, in the Lisp standard, via the socalled displaced arrays, but it is somewhat cumbersome to use, that's why a more straightforward version is present in RUTILS which is namedslice
CLUSER> (with ((vec (vec 1 2 3))
(part (slice vec 2)))
(print part)
(:= (? part 0) 4)
(print part)
vec)
#(3)
#(4)
#(1 2 4)
Beyond the basic operations, sequences in Lisp are the target of a number of higherorder functions, such as find
, position
, removeif
etc. We'll get back to discussing their use later in the book.
Dynamic Vectors
Let's examine arrays from the point of view of algorithmic complexity. Generalpurpose data structures are usually compared by their performance on several common operations and, also, space requirements. These common operations are: access, insertion, deletion, and, sometimes, search.
In the case of ordinary arrays, the space used is the minimum possible: almost no overhead is incurred except, perhaps, for some metainformation about array size. Array element access is performed by index in constant time because it's just an offset from the beginning that is the product of index by the size of a single element. Search for an element requires a linear scan of the whole array or, in the special case of a sorted array, it can be done in O(log n)
using binary search.
Insertion (at the end of an array) and deletion with arrays is problematic, though. Basic arrays are static, i.e. they can't be expanded or shrunk at will. The case of expansion requires free space after the end of the array that isn't generally available (because it's already occupied by other data used by the program) so it means that the whole array needs to be relocated to another place in memory with sufficient space. Shrinking is possible, but it still requires relocation of the elements following the deleted one. Hence, both of these operations require O(n)
time and may also cause memory fragmentation. This is a major drawback of arrays.
However, arrays definitely should be the default choice for most algorithms. Why? First of all, because of the other excellent properties arrays provide and also because, in many cases, lack of flexibility can be circumvented in a certain manner. One common example is iteration with accumulation of results in a sequence. This is often performed with the help of a stack (as a rule, implemented with a linked list), but, in many cases (especially, when the length of the result is known beforehand), arrays may be used to the same effect. Another approach is using dynamic arrays, which add array resizing capabilities. And only in the case when an algorithm requires contiguous manipulation (insertion and deletion) of a collection of items or other advanced flexibility, linked data structures are preferred.
So, the first approach to working around the static nature of arrays is possible when we know the target number of elements. For instance, the most common pattern of sequence processing is to map
a function over it, which produces the new sequence of the same size filled with results of applying the function to each element of the original sequence. With arrays, it can be performed even more efficiently than with a list. We just need to preallocate the resulting vector and set its elements one by one as we process the input:
(defun mapvec (fn vec)
"Map function FN over each element of VEC
and return the new vector with the results."
(let ((rez (makearray (length vec))))
(dotimes (i (length vec))
(:= (aref rez i) (call fn (aref vec i))))
rez))
CLUSER> (mapvec '1+ #(1 2 3))
#(2 3 4)
We use a specific accessor aref
here instead of generic ?
to ensure efficient operation in the socalled "inner loop" — although, there's just one loop here, but it will be the inner loop of many complex algorithms.
However, in some cases we don't know the size of the result beforehand. For instance, another popular sequence processing function is called filter
or removeif
(not
) in Lisp. It iterates over the sequence and keeps only elements that satisfy/don't satisfy a certain predicate. It is, generally, unknown how many elements will remain, so we can't predict the size of the resulting array. One solution will be to allocate the fullsized array and fill only so many cells as needed. It is a viable approach although suboptimal. Filling the result array can be performed by tracking the current index in it or, in Lisp, by using an array with a fillpointer:
(defun clumsyfiltervec (pred vec)
"Return the vector with only those elements of VEC
for which calling pred returns true."
(let ((rez (makearray (length vec) :fillpointer t)))
(dotimes (i (length vec))
(when (call pred (aref vec i))
(vectorpush (aref vec i) rez)))
rez))
CLUSER> (describe (clumsyfiltervec 'oddp #(1 2 3)))
#(1 3)
[vector]
Elementtype: T
Fillpointer: 2
Size: 3
Adjustable: yes
Displacedto: NIL
Displacedoffset: 0
Storage vector: #<(SIMPLEVECTOR 3) {100E9AF30F}>
Another, more general way, would be to use a "dynamic vector". This is a kind of an array that supports insertion by automatically expanding its size (usually, not one element at a time but proportionally to the current size of the array). Here is how it works:
CLUSER> (let ((vec (makearray 0 :fillpointer t :adjustable t)))
(dotimes (i 10)
(vectorpushextend i vec)
(describe vec)))
#(0)
[vector]
Elementtype: T
Fillpointer: 1
Size: 1
Adjustable: yes
Displacedto: NIL
Displacedoffset: 0
Storage vector: #<(SIMPLEVECTOR 1) {100ED9238F}>
#(0 1)
Fillpointer: 2
Size: 3
#(0 1 2)
Fillpointer: 3
Size: 3
#(0 1 2 3)
Elementtype: T
Fillpointer: 4
Size: 7
...
#(0 1 2 3 4 5 6 7)
Fillpointer: 8
Size: 15
#(0 1 2 3 4 5 6 7 8)
Elementtype: T
Fillpointer: 9
Size: 15
#(0 1 2 3 4 5 6 7 8 9)
Elementtype: T
Fillpointer: 10
Size: 15
For such "smart" arrays the complexity of insertion of an element becomes asymptotically constant: resizing and moving elements happens less and less often the more elements are added. With a large number of elements, this comes at a cost of a lot of wasted space, though. At the same time, when the number of elements is small (below 20), it happens often enough, so that the performance is worse than for a linked list that requires a constant number of 2 operations for each insertion (or 1 if we don't care to preserve the order). So, dynamic vectors are the solution that can be used efficiently only when the number of elements is neither too big nor too small.
Why Are Arrays Indexed from 0
Although most programmers are used to it, not everyone understands clearly why the choice was made, in most programming languages, for 0based array indexing. Indeed, there are several languages that prefer a 1based variant (for instance, MATLAB and Lua). This is quite a deep and yet very practical issue that several notable computer scientists, including Dijkstra, have contributed to.
At first glance, it is "natural" to expect the first element of a sequence to be indexed with 1, second — with 2, etc. This means that if we have a subsequence from the first element to the tenth it will have the beginning index 1 and the ending — 10, i.e. be a closed interval also called a segment: [1, 10]
. The cons of this approach are the following:
It is more straightforward to work with halfopen intervals (i.e. the ones that don't include the ending index): especially, it is much more convenient to split and merge such intervals, and, also, test for membership. With 0based indexing, our example interval would be halfopen:
[0, 10)
.If we consider multidimensional arrays that are most often represented using onedimensional ones, getting an element of a matrix with indices
i
andj
translates to accessing the element of an underlying vector with an indexi*w + j
ori + j*h
for 0based arrays, while for 1based ones, it's more cumbersome:(i1)*w + j
. And if we consider 3dimensional arrays (tensors), we'll still get the obviousi*w*h + j*h + k
formula, for 0based arrays, and, maybe,(i1)*w*h + (j1)*h + k
for 1based ones, although I'm not, actually, sure if it's correct (which shows how such calculations quickly become untractable). Besides, multidimensional array operations that are much more complex than mere indexing also often occur in many practical tasks, and they are also more complex and thus errorprone with base 1.
There are other arguments, but I consider them to be much more minor and a matter of taste and convenience. However, the intervals and multidimensional arrays issues are quite serious. And here is a good place to quote one of my favorite anecdotes that there are two hard problems in CS: cache invalidation and naming things,.. and offbyone errors. Arithmetic errors with indexing are a very nasty kind of bug, and although it can't be avoided altogether 0based indexing turns out to be a much more balanced solution.
Now, using 0based indexing, let's write down the formula for finding the middle element of an array. Usually, it is chosen to be (floor (length array) 2)
. This element will divide the array into two parts, left and right, each one having length at least (1 (floor (length array) 2)
: the left part will always have such size and will not include the middle element. The right side will start from the middle element and will have the same size if the total number of array elements is even or be one element larger if it is odd.
MultiDimensional Arrays
So far, we have only discussed onedimensional arrays. However, more complex datastructures can be represented using simple arrays. The most obvious example of such structures is multidimensional arrays. There's a staggering variety of other structures that can be built on top of arrays, such as binary (or, in fact, any nary) trees, hashtables, and graphs, to name a few. If we have a chance to implement the data structure on an array, usually, we should not hesitate to take it as it will result in constant access time, good cache locality contributing to faster processing and, in most cases, efficient space usage.
Multidimensional arrays are a contiguous datastructure that stores its elements so that, given the coordinates of an element in all dimensions, it can be retrieved according to a known formula. Such arrays are also called tensors, and in case of 2dimensional arrays — matrices. We have already seen one matrix example in the discussion of complexity:
#2A((1 2 3)
(4 5 6))
A matrix has rows (first dimension) and columns (second dimension). Accordingly, the elements of a matrix may be stored in the rowmajor or columnmajor order. In rowmajor, the elements are placed row after row — just like on this picture, i.e., the memory will contain the sequence: 1 2 3 4 5 6
. In columnmajor order, they are stored by column (this approach is used in many "mathematical" languages, such as Fortran or MATLAB), so raw memory will look like this: 1 4 2 5 3 6
. If rowmajor order is used the formula to access the element with coordinates i
(row) and j
(column) is: (+ (* i n) j)
where n
is the length of the matrix's row, i.e. its width. In the case of columnmajor order, it is: (+ i (* j m))
where m
is the matrix's height. It is necessary to know, which storage style is used in a particular language as in numeric computing it is common to intermix libraries written in many languages — C, Fortran, and others — and, in the process, incompatible representations may clash.[2]
Such matrix representation is the most obvious one, but it's not exclusive. Many languages, including Java, use iliffe vectors
to represent multidimensional arrays. These are vectors of vectors, i.e. each matrix row is stored in a separate 1dimensional array, and the matrix is the vector of such vectors. Besides, more specific multidimensional arrays, such as sparse or diagonal matrices, may be represented using more efficient storage techniques at the expense of a possible loss in access speed. Higherorder tensors may also be implemented with the described approaches.
One classic example of operations on multidimensional arrays is matrix multiplication. The simple straightforward algorithm below has the complexity of O(n^3)
where n
is the matrix dimension. The condition for successful multiplication is equality of height of the first marix and width of the second one. The cubic complexity is due to 3 loops: by the outer dimensions of each matrix and by the inner identical dimension.
(defun m* (m1 m2)
(let ((n (arraydimension m1 1))
(n1 (arraydimension m1 0))
(n2 (arraydimension m2 1))
(rez (makearray (list n1 n2))))
(assert (= n (arraydimension m2 1)))
(dotimes (i n1)
(dotimes (j n2)
(let ((cur 0))
(dotimes (k n)
;; :+ is the incrementing analog of :=
(:+ cur (* (aref m1 i k)
(aref m2 k j))))
(:= (aref rez i j) cur))))
rez))
There are more efficient albeit much more complex versions using divideandconquer approach that can work in only O(n^2.37)
, but they have significant hidden constants and, that's why, are rarely used in practice, although if you're relying on an established library for matrix operations, such as the Fortranbased BLAS/ATLAS, you will find one of them underthehood.
Binary Search
Now, let's talk about some of the important and instructive array algorithms. The most prominent ones are searching and sorting.
A common sequence operation is searching for the element either to determine if it is present, to get its position or to retrieve the object that has a certain property (keybased search). The simple way to search for an element in Lisp is using the function find
:
CLUSER> (let ((vec #v((pair :foo :bar) (pair :baz :quux))))
(print (find (pair :foo :bar) vec))
(print (find (pair :foo :bar) vec :test 'equal))
(print (find (pair :bar :baz) vec :test 'equal))
(print (find :foo vec :key 'lt)))
NIL
(:FOO :BAR)
NIL
(:FOO :BAR)
In the first case, the element was not found due to the wrong comparison predicate: the default eql
will only consider to structures the same if it's the same object, and, in this case, there will be two separate pairs with the same content. So, the second search is successful as equal
performs deep comparison. Then the element is not found as it is just not present. And, in the last case, we did the keybased search looking just at the lt
element of all pairs in vec
.
Such search is called sequential scan because it is performed in a sequential manner over all elements of the vector starting from the beginning (or end if we specify :fromend t
) until either the element is found or we have examined all the elements. The complexity of such search is, obviously, O(n)
, i.e. we need to access each element of the collection (if the element is present we'll look, on average, at n/2
elements, and if not present — always at all n
elements).
However, if we know that our sequence is sorted, we can perform the search much faster. The algorithm used for that is one of the most famous algorithms that every programmer has to know and use, from time to time — binary search. The more general idea behind it is called "divide and conquer": if there's some way, looking at one element, to determine the outcome of our global operation for more than just this element we can discard the part, for which we already know that the outcome is negative. In binary search, when we're looking at an arbitrary element of the sorted vector and compare it with the item we search for:
 if the element is the same we have found it
 if it's smaller all the previous elements are also smaller and thus uninteresting to us — we need to look only on the subsequent ones
 if it's greater all the following elements are not interesting
Thus, each time we can examine the middle element and, after that, can discard half of the elements of the array without checking them. We can repeat such comparisons and halving until the resulting array contains just a single element.
Here's the straightforward binary search implementation using recursion:
(defun binsearch (val vec &optional (pos 0))
(if (> (length vec) 1)
(with ((mid (floor (+ beg end) 2))
(cur (aref vec mid)))
(cond ((< cur val) (binsearch val
(slice vec (1+ mid))
(+ pos mid 1)))
((> cur val) (binsearch val
(slice vec 0 (1+ mid))
pos))
(t (+ pos mid))))
(when (= (aref vec 0) val)
pos)))
If the middle element differs from the one we're looking for it halves the vector until just one element remains. If the element is found its position (which is passed as an optional 3rd argument to the recursive function) is returned. Note that we assume that the array is sorted. Generally, there's no way to quickly check this property unless we examine all array elements (and thus lose all the benefits of binary search). That's why we don't assert the property in any way and just trust the programmer :)
An important observation is that such recursion is very similar to a loop that at each stage changes the boundaries we're looking inbetween. Not every recursive function can be matched with a similar loop so easily (for instance, when there are multiple recursive calls in its body an additional memory data structure is needed), but when it is possible it usually makes sense to choose the loop variant. The pros of looping is the avoidance of both the function calls' overhead and the danger of hitting the recursion limit or the stack overflow associated with it. While the pros of recursion are simpler code and better debuggability that comes with the possibility to examine each iteration by tracing using the builtin tools.
Another thing to note is interesting counterintuitive arithmetic of additional comparisons. In our naive approach, we had 3 cond
clauses, i.e. up to 2 comparisons to make at each iteration. In total, we'll look at (log n 2)
elements of our array, so we have no more than (/ (1 (log n 2)) n)
chance to match the element with the =
comparison before we get to inspect the final 1element array. I.e. with the probability of ( 1 (/ (1 (log n 2)) n))
we'll have to make all the comparisons up to the final one. Even for such small n
as 10 this probability is 0.77 and for 100 — 0.94. And this is an optimistic estimate for the case when the element searched for is actually present in the array, which may not always be so. Otherwise, we'll have to make all the comparisons. Effectively, these numbers prove the equality comparison meaningless and just a waste of computation, although from "normal" programmer intuition it might seem like a good idea to implement early exit in this situation...
Finally, there's also one famous nonobvious bug associated with the binary search that was still present in many production implementations, for many years past the algorithm's inception. It's also a good example of the dangers of forfeiting boundary conditions check that is the root of many severe problems plaguing our computer systems by opening them to various exploits. The problem, in our code, may manifest in systems that have limited integer arithmetic with potential overflow. In Lisp, if the result of summing two fixnums is greater than mostpositivefixnum
(the maximum number that can be represented directly by the machine word) it will be automatically converted to bignums, which are a slower representation but with unlimited precision:
CLUSER> mostpositivefixnum
4611686018427387903
CLUSER> (typeof mostpositivefixnum)
(INTEGER 0 4611686018427387903)
CLUSER> (+ mostpositivefixnum mostpositivefixnum)
9223372036854775806
CLUSER> (typeof (+ mostpositivefixnum mostpositivefixnum))
(INTEGER 4611686018427387904)
In many other languages, such as C or Java, what will happen is either silent overflow (the worst), in which case we'll get just the remainder of division of the result by the maximum integer, or an overflow error. Both of these situations are not accounted for in the (floor (+ beg end) 2)
line. The simple fix to this problem, which makes sense to keep in mind for future similar situations, is to change the computation to the following equivalent form: (+ beg (floor ( end beg) 2))
. It will never overflow. Why? Try to figure out on your own ;)
Taking all that into account and allowing for a custom comparator function, here's an "optimized" version of binary search that returns 3 values:
 the final element of the array
 its position
 has it, actually, matched the element we were searching for?
(defun binsearch (val vec &key (less '<) (test '=) (key 'identity))
(when (plusp (length vec))
(let ((beg 0)
(end (length vec)))
(do ()
((= beg end))
(let ((mid (+ beg (floor ( end beg) 2))))
(if (call less (call key (aref vec mid)) val)
(:= beg (1+ mid))
(:= end mid))))
(values (aref vec beg)
beg
(call test (call key (aref vec beg)) val)))))
How many loop iterations do we need to complete the search? If we were to take the final oneelement array and expand the array from it by adding the discarded half it would double in size at each step, i.e. we'll be raising 2 to the power of the number of expansion iterations (initially, before expansion — after 0 iterations — we have 1 element, which is 2^0
, after 1 iteration, we have 2 elements, after 2 — 4, and so on). The number of iterations needed to expand the full array may be calculated by the inverse of exponentiation — the logarithmic function. I.e. we'll need (log n 2)
iterations (where n
is the initial array size). Shrinking the array takes the same as expanding, just in the opposite order, so the complexity of binary search is O(log n)
.
How big is the speedup from linear to logarithmic complexity? Let's do a quickanddirty speed comparison between the builtin (and optimized) sequential scan fucntion find
and our binsearch
:
CLUSER> (with ((size 100000000)
(mid (1+ (/ size 2)))
(vec (makearray size)))
(dotimes (i size)
(:= (? vec i) i))
(time (find mid vec))
(time (binsearch mid vec)))
Evaluation took:
0.591 seconds of real time
0.595787 seconds of total run time (0.595787 user, 0.000000 system)
100.85% CPU
...
Evaluation took:
0.000 seconds of real time
0.000000 seconds of total run time (0.000000 user, 0.000000 system)
100.00% CPU
...
Unfortunately, I don't have enough RAM on my notebook to make binsearch
take at least a millisecond of CPU time. We can count nanoseconds to get the exact difference, but a good number to remember is that (log 1000000 2)
is approximately 20, so, for the million elements array, the speedup will be 50000x!
The crucial limitation of binary search is that it requires our sequence to be presorted because sorting before each search already requires at least linear time to complete, which kills any performance benefit we might have expected. There are multiple situations when the presort condition may hold without our intervention:
 all the data is known beforehand and we can sort it just once prior to running the search, which may be repeated multiple times for different values
 we maintain the sorted order as we add data. Such an approach is feasible only if addition is performed less frequently than search. This is often the case with databases, which store their indices in sorted order
A final note on binary search: obviously, it will only work fast for vectors and not linked sequences.
Binary Search in Action
In one consumer internet company I was working for, a lot of text processing (which was the company's breadandbutter) relied on access to a huge statistical dataset called "ngrams". Ngrams is a simple Natural Language Processing concept: basically, they are phrases of a certain length. A unigram (1gram) is a single word, a bigram — a pair of words, a fivegram — a list of 5 words. Each ngram has some weight associated with it, which is calculated (estimated) from the huge corpus of texts (we used the crawl of the whole Internet). There are numerous ways to estimate this weight, but the basic one is to just count the frequency of the occurance of a specific ngram phrase in the corpus.
The total number of ngrams may be huge: for our case, the whole dataset, on disk, measured in tens of gigabytes. And the application requires constant random access to it. Using an offtheshelf database would have incurred us too much overhead as such systems are generalpurpose and don't optimize for the particular use cases, like the one we had. So, a specialpurpose solution was needed. In fact, now there is readilyavailable ngrams handling software, such as KenLM. We have built our own, and, initially, it relied on binary search of the inmemory dataset to answer the queries. Considering the size of the data, what do you think was the number of operations required? I don't remember it exactly, but somewhere between 25 and 30. For handling tens of gigabytes or hundreds of millions/billions of ngrams — quite a decent result. And, most important, it didn't exceed our application's latency limits! The key property that enabled such solution was the fact that all the ngrams were known beforehand and hence the dataset could be presorted. Yet, eventually, we moved to an even faster solution based on perfect hashtables (that we'll discuss later in this book).
One more interesting property of this program was that it took significant time to initialize as all the data had to be loaded into memory from disk. During that time, which measured in several dozens of minutes, the application was not available, which created a serious bottleneck in the whole system and complicated updates as well as put normal operation at additional risk. The solution we utilized to counteract this was also a common one for such cases: lazy loading in memory using the Unix mmap
facility.
Sorting
Sorting is another fundamental sequence operation that has many applications. Unlike searching, the sorted sequence, there is no single optimal algorithm for sorting, and different data structures allow different approaches to it. In general, the problem of sorting a sequence is to place all of its elements in a certain order determined by the comparison predicate. There are several aspects that differentiate sorting functions:
 inplace sorting is a destructive operation, but it is often desired because it may be faster and also it preserves space (especially relevant when sorting big amounts of data at once). The alternative is copying sort
 stable: whether 2 elements, which are considered the same by the predicate, retain their original order or may be shuffled
 online: does the function require to see the whole sequence before starting the sorting process or can it work with each element onebyone always preserving the result of processing the already seen part of the sequence in the sorted order
One more aspect of a particular sorting algorithm is its behavior on several special kinds of input data: already sorted (in direct and reversed order), almost sorted, completely random. An ideal algorithm should show better than average performance (up to O(1)
) on the sorted and almost sorted special cases.
Over the history of CS, sorting was and still remains a popular research topic. Not surprisingly, several dozens of different sorting algorithms were developed. But before discussing the prominent ones, let's talk about "Stupid sort" (or "Bogosort"). It is one of the sorting algorithms that has a very simple idea behind, but an outstandingly nasty performance. The idea is that among all permutations of the input sequence there definitely is the completely sorted one. If we were to take it, we don't need to do anything else. It's an example of the socalled "generate and test" paradigm that may be employed when we know next to nothing about the nature of our task: then, put some input into the black box and see the outcome. In the case of bogosort, the number of possible inputs is the number of all permutations that's equal to n!
, so considering that we need to also examine each permutation's order the algorithm's complexity is O(n * n!)
— quite a bad number, especially, since some specialized sorting algorithms can work as fast as O(n)
(for instance, Bucket sort for integer numbers). On the other hand, if generating all permutations is a library function and we don't care about complexity such an algorithm will have a rather simple implementation that looks quite innocent. So you should always inquire about the performance characteristics of 3rdparty functions. And, by the way, your standard library sort
function is also a good example of this rule.
(defun bogosort (vec comp)
(dolist (variant (allpermutations vec))
(dotimes (i (1 (length variant)))
;; this is the 3rd optional argument of dotimes header
;; that is evaluated only after the loop finishes normally
;; if it does we have found a completely sorted permutation!
(returnfrom bogosort variant))
(when (call comp (? variant (1+ i)) (? variant i))
(return))))) ; current variant is not sorted, skip it
O(n^2) Sorting
Although we can imagine an algorithm with even worse complexity factors than this, bogosort gives us a good lower bound on the sorting algorithm's performance and an idea of the potential complexity of this task. However, there are much faster approaches that don't have a particularly complex implementation. There is a number of such simple algorithms that work in quadratic time. A very wellknown one, which is considered by many a kind of "Hello world" algorithm, is Bubble sort. Yet, in my opinion, it's quite a bad example to teach (sadly, often it is taught) because it's both not very straightforward and has poor performance characteristics. That's why it's never used in practice. There are two other simple quadratic sorting algorithms that you actually have a chance to encounter in the wild, especially, Insertion sort that is used rather frequently. Their comparison is also quite insightful, so we'll take a look at both, instead of focusing just on the former.
Selection sort is an inplace sorting algorithm that moves lefttoright from the beginning of the vector one element at a time and builds the sorted prefix to the left of the current element. This is done by finding the "largest" (according to the comparator predicate) element in the right part and swapping it with the current element.
(defun selectionsort (vec comp)
(dotimes (i (1 (length vec)))
(let ((best (aref vec i))
(idx i))
(dotimes (j ( (length vec) i 1))
(when (call comp (aref vec (+ i j 1)) best)
(:= best (aref vec (+ i j 1))
idx (+ i j 1)))))
(rotatef (aref vec i) (aref vec idx))) ; this is the lisp's swap operator
vec)
Selection sort requires a constant number of operations regardless of the level of sortedness of the original sequence: (/ (* n ( n 1)) 2)
— the sum of the arithmetic progression from 1 to n
, because, at each step, it needs to fully examine the remainder of the elements to find the maximum, and the remainder's size varies from n
to 1
. It handles equally well both contiguous and linked sequences.
Insertion sort is another quadratictime inplace sorting algorithm that builds the sorted prefix of the sequence. However, it has a few key differences from Selection sort: instead of looking for the global maximum in the righthand side it looks for a proper place of the current element in the lefthand side. As this part is always sorted it takes linear time to find the place for the new element and insert it there leaving the side in sorted order. Such change has great implications:
 it is stable
 it is online: the left part is already sorted, and, in contrast with selection sort, it doesn't have to find the maximum element of the whole sequence in the first step, it can handle encountering it at any step
 for sorted sequences it works in the fastest possible way — in linear time — as all elements are already inserted into proper places and don't need moving. The same applies to almost sorted sequences, for which it works in almost linear time. However, for reverse sorted sequences, its performance will be the worse. In fact, there is a clear proportion of the algorithm's complexity to the average offset of the elements from their proper positions in the sorted sequence:
O(k * n)
, wherek
is the average offset of the element. For sorted sequencesk=0
and for reverse sorted it's(/ ( n 1) 2)
.
(defun insertionsort (vec comp)
(dotimes (i (1 (length vec)))
(do ((j i (1 j)))
((minusp j))
(if (call comp (aref vec (1+ j)) (aref vec j))
(rotatef (aref vec (1+ j)) (aref vec j))
(return))))
vec)
As you see, the implementation is very simple: we look at each element starting from the second, compare it to the previous element, and if it's better we swap them and continue the comparison with the previous element until we reach the array's beginning.
So, where's the catch? Is there anything that makes Selection sort better than Insertion? Well, if we closely examine the number of operations required by each algorithm we'll see that Selection sort needs exactly (/ (* n ( n 1)) 2)
comparisons and on average n/2
swaps. For Insertion sort, the number of comparisons varies from n1
to (/ (* n ( n 1)) 2)
, so, in the average case, it will be (/ (* n ( n 1)) 4)
, i.e. half as many as for the other algorithm. In the sorted case, each element is already in its position, and it will take just 1 comparison to discover that, in the reverse sorted case, the average distance of an element from its position is (/ ( n 1) 2)
, and for the middle variant, it's in the middle, i.e. (/ ( n 1) 4)
. Times the number of elements (n
). But, as we can see from the implementation, Insertion sort requires almost the same number of swaps as comparisons, i.e. (/ (* ( n 1) ( n 2)) 4)
in the average case, and it matches the number of swaps of Selection sort only in the close to best case, when each element is on average 1/2 steps away from its proper position. If we sum up all comparisons and swaps for the average case, we'll get the following numbers:
 Selection sort:
(+ (/ (* n ( n 1)) 2) (/ n 2)) = (/ (+ (* n n) n) 2)
 Insertion sort:
(+ (/ (* n ( n 1)) 2) (+ (/ (* ( n 1) ( n 2)) 4) = (/ (+ (* 1.5 n n) (* 2.5 n) 1) 2)
The second number is slightly higher than the first. For small n
s it is almost negligible: for instance, when n=10
, we get 55 operations for Selection sort and 63 for Insertion. But, asymptotically (for huge n
s like millions and billions), Insertion sort will need 1.5 times more operations. Also, it is often the case that swaps are more expensive operations than comparisons (although, the opposite is also possible).
In practice, Insertion sort ends up being used more often, for, in general, quadratic sorts are only used when the input array is small (and so the difference in the number of operations) doesn't matter, while it has other good properties we mentioned. However, one situation when Selection sort's predictable performance is an important factor is in the systems with deadlines.
Quicksort
There is a number of other O(n^2)
sorting algorithms similar to Selection and Insertion sorts, but studying them quickly turns boring so we won't. As there's also a number of significantly faster algorithms that work in O(n * log n)
time (almost linear). They usually rely on the divideandconquer approach when the whole sequence is recursively divided into smaller subsequences that have some property, thanks to which it's easier to sort them, and then these subsequences are combined back into the final sorted sequence. The feasibility of such performance characteristics is justified by the observation that ordering relations are recursive, i.e. if we have compared two elements of an array and then compare one of them to the third element, with a probability of 1/2 we'll also know how it relates to the other element.
Probably, the most famous of such algorithms is Quicksort. Its idea is, at each iteration, to select some element of the array as the "pivot" point and divide the array into two parts: all the elements that are smaller and all those that are larger than the pivot; then recursively sort each subarray. As all left elements are below the pivot and all right — above when we manage to sort the left and right sides the whole array will be sorted. This invariant holds for all iterations and for all subarrays. The word "invariant", literally, means some property that doesn't change over the course of the algorithm's execution when other factors, e.g. bounds of the array we're processing, are changing.
There're several tricks in Quicksort implementation. The first one has to do with pivot selection. The simplest approach is to always use the last element as the pivot. Now, how do we put all the elements greater than the pivot after it if it's already the last element? Let's say that all elements are greater — then the pivot will be at index 0. Now, if moving left to right over the array we encounter an element that is not greater than the pivot we should put it before, i.e. the pivot's index should increment by 1. When we reach the end of the array we know the correct position of the pivot, and in the process, we can swap all the elements that should precede it in front of this position. Now, we have to put the element that is currently occupying the pivot's place somewhere. Where? Anywhere after the pivot, but the most obvious thing is to swap it with the pivot.
(defun quicksort (vec comp)
(when (> (length vec) 1)
(with ((pivoti 0)
(pivot (aref vec (1 (length vec)))))
(dotimes (i (1 (length vec)))
(when (call comp (aref vec i) pivot)
(rotatef (aref vec i)
(aref vec pivoti))
(:+ pivoti)))
;; swap the pivot (last element) in its proper place
(rotatef (aref vec (1 (length vec)))
(aref vec pivoti))
(quicksort (slice vec 0 pivoti) comp)
(quicksort (slice vec (1+ pivoti)) comp)))
vec)
Although recursion is employed here, such implementation is spaceefficient as it uses array displacement ("slicing") that doesn't create new copies of the subarrays, so sorting happens inplace. Speaking of recursion, this is one of the cases when it's not so straightforward to turn it into looping (this is left as an exercise to the reader :) ).
What is the complexity of such implementation? Well, if, on every iteration, we divide the array in two equal halves we'll need to perform n
comparisons and n/2
swaps and increments, which totals to 2n
operations. And we'll need to do that (log n 2)
times, which is the height of a complete binary tree with n
elements. At every level in the recursion tree, we'll need to perform twice as many sorts with twice as little data, so each level will take the same number of 2n
operations. Total complexity: 2n * (log n 2)
, i.e. O(n * log n)
. In the ideal case.
However, we can't guarantee that the selected pivot will divide the array into two ideally equal parts. In the worst case, if we were to split it into 2 totally unbalanced subarrays, with n1
and 0 elements respectively, we'd need to perform sorting n
times and had to perform a number of operations that will diminish in the arithmetic progression from 2n
to 2. Which sums to (* n ( n 1))
. A dreaded O(n^2)
complexity. So, the worstcase performance for quicksort is not just worse, but in a different complexity league than the averagecase one. Moreover, the conditions for such performance (given our pivot selection scheme) are not so uncommon: sorted and reversesorted arrays. And the almost sorted ones will result in the almost worstcase scenario.
It is also interesting to note that if, at each stage, we were to split the array into parts that have a 10:1 ratio of lengths this would have resulted in n * log n
complexity! How come? The 10:1 ratio, basically, means that the bigger part each time is shortened at a factor of around 1.1, which still is a powerlaw recurrence. The base of the algorithm will be different, though: 1.1 instead of 2. Yet, from the complexity theory point of view, the logarithm base is not important because it's still a constant: (log n x)
is the same as (/ (log n 2) (log x 2))
, and (/ 1 (log x 2))
is a constant for any fixed logarithm base x
. In our case, if x
is 1.1 the constant factor is 7.27. Which means that quicksort, in the quite bad case of recurring 10:1 splits, will be just a little more than 7 times slower than, in the best case, of recurring equal splits. Significant — yes. But, if we were to compare n * log n
(with base 2) vs n^2
performance for n=1000
we'd already get a 100 times slowdown, which will only continue increasing as the input size grows. Compare this to a constant factor of 7...
So, how do we achieve at least 10:1 split, or, at least, 100:1, or similar? One of the simple solutions is called 3medians approach. The idea is to consider not just a single point as a potential pivot but 3 candidates: first, middle, and last points — and select the one, which has the median value among them. Unless accidentally two or all three points are equal, this guarantees us not taking the extreme value that is the cause of the alltonothing split. Also, for a sorted array, this should produce a nice near to equal split. How probable is stumbling at the special case when we'll always get at the extreme value due to equality of the selected points? The calculations here are not so simple, so I'll give just the answer: it's extremely improbable that such condition will hold for all iterations of the algorithm due to the fact that we'll always remove the last element and all the swapping that is going on. More precisely, the only practical variant when it may happen is when the array consists almost or just entirely of the same elements. And this case will be addressed next. One more refinement to the 3medians approach that will work even better for large arrays is 9medians that, as is apparent from its name, performs the median selection not among 3 but 9 equidistant points in the array.
Dealing with equal elements is another corner case for quicksort that should be addressed properly. The fix is simple: to divide the array not in 2 but 3 parts, smaller, larger, and equal to the pivot. This will allow for the removal of the equal elements from further consideration and will even speed up sorting instead of slowing it down. The implementation adds another index (this time, from the end of the array) that will tell us where the equaltopivot elements will start, and we'll be gradually swapping them into this tail as they are encountered during array traversal.
Production Sort
I was always wondering how it's possible, for Quicksort, to be the default sorting algorithm when it has such bad worstcase performance and there are other algorithms like Merge sort or Heap sort that have guaranteed O(n * log n)
ones. With all the mentioned refinements, it's apparent that the worstcase scenario, for Quicksort, can be completely avoided (in the probabilistic sense) while it has a very nice property of sorting inplace with good cache locality, which significantly contributes to better realworld performance. Moreover, production sort implementation will be even smarter by utilizing Quicksort while the array is large and switching to something like Insertion sort when the size of the subarray reaches a certain threshold (1020 elements). All this, however, is applicable only to arrays. When we consider lists, other factors come into play that make Quicksort much less plausible.
Here's an attempt at such — let's call it "Production sort" — implementation (the function 3medians
is left as an excercise to the reader).
(defun prodsort (vec comp &optional (eq 'eql))
(cond ((< (length vec) 2)
vec)
((< (length vec) 10)
(insertionsort vec comp))
(t
(rotatef (aref vec (1 (length vec)))
(aref vec (3medians vec comp eq)))
(with ((pivoti 0)
(pivotcount 1)
(lasti (1 (length vec)))
(pivot (aref vec lasti)))
(do ((i 0 (1+ i)))
((> i ( lasti pivotcount)))
(cond ((call comp (aref vec i) pivot)
(rotatef (aref vec i)
(aref vec pivoti))
(:+ pivoti))
((call eq (aref vec i) pivot)
(rotatef (aref vec i)
(aref vec ( lasti pivotcount)))
(:+ pivotcount)
(: i)))) ; decrement i to reprocess newly swapped point
(dotimes (i pivotcount)
(rotatef (aref vec (+ pivoti i))
(aref vec ( lasti i))))
(prodsort (slice vec 0 pivoti) comp eq)
(prodsort (slice vec (+ pivoti pivotcount)) comp eq))))
vec)
All in all, the example of Quicksort is very interesting, from the point of view of complexity analysis. It shows the importance of analyzing the worstcase and other cornercase scenarios, and, at the same time, teaches that we shouldn't give up immediately if the worst case is not good enough, for there may be ways to handle such corner cases that reduce or remove their impact.
Performance Benchmark
Finally, let's look at our problem from another angle: simple and stupid. We have developed 3 sorting functions' implementations: Insertion, Quick, and Prod. Let's create a tool to compare their performance on randomly generated datasets of decent sizes. This may be done with the following code and repeated many times to exclude the effects of randomness.
(defun randomvec (size)
(let ((vec (makearray size)))
(dotimes (i size)
(:= (? vec i) (random size)))
vec))
(defun printsorttimings (sortname sortfn vec)
;; we'll use inplace modification of the input vector VEC
;; so we need to copy it to preserve the original for future use
(let ((vec (copyseq vec))
(len (length vec)))
(format t "= ~Asort of random vector (length=~A) =~%"
sortname len)
(time (call sortfn vec '<))
(format t "= ~Asort of sorted vector (length=~A) =~%"
sortname len)
(time (call sortfn vec '<))
(format t "= ~Asort of reverse sorted vector (length=~A) =~%"
sortname len)
(time (call sortfn vec '>))))
CLUSER> (let ((vec (randomvec 1000)))
(printsorttimings "Insertion " 'insertionsort vec)
(printsorttimings "Quick" 'quicksort vec)
(printsorttimings "Prod" 'prodsort vec))
= Insertion sort of random vector (length=1000) =
Evaluation took:
0.128 seconds of real time
...
= Insertion sort of sorted vector (length=1000) =
Evaluation took:
0.001 seconds of real time
...
= Insertion sort of reverse sorted vector (length=1000) =
Evaluation took:
0.257 seconds of real time
...
= Quicksort of random vector (length=1000) =
Evaluation took:
0.005 seconds of real time
...
= Quicksort of sorted vector (length=1000) =
Evaluation took:
5.429 seconds of real time
...
= Quicksort of reverse sorted vector (length=1000) =
Evaluation took:
2.176 seconds of real time
...
= Prodsort of random vector (length=1000) =
Evaluation took:
0.008 seconds of real time
...
= Prodsort of sorted vector (length=1000) =
Evaluation took:
0.004 seconds of real time
...
= Prodsort of reverse sorted vector (length=1000) =
Evaluation took:
0.007 seconds of real time
Overall, this is a really primitive approach that can't serve as conclusive evidence on its own, but it has value as it aligns well with our previous calculations. Moreover, it once again reveals some things that may be omitted in those calculations: for instance, the effects of the hidden constants of the BigO notation or of the particular programming vehicles used. We can see that, for their worstcase scenarios, where Quicksort and Insertion sort both have O(n^2)
complexity and work the longest, Quicksort comes 10 times slower, although it's more than 20 times faster for the average case. This slowdown may be attributed both to the larger number of operations and to using recursion. Also, our Prodsort algorithm demonstrates its expected performance. As you see, such simple testbeds quickly become essential in testing, debugging, and finetuning our algorithms' implementations. So it's a worthy investment.
Finally, it is worth noting that array sort is often implemented as inplace sorting, which means that it will modify (spoil) the input vector. We use that in our test function: first, we sort the array and then sort the sorted array in direct and reverse orders. This way, we can omit creating new arrays. Such destructive sort behavior may be both the intended and surprising behavior. The standard Lisp's sort
and stablesort
functions also exhibit it, which is, unfortunately, a source of numerous bugs due to the application programmer forgetfulness of the function's sideeffects (at least, this is an acute case, for myself). That's why RUTILS provides an additional function safesort
that is just a thin wrapper over standard sort
to free the programmer's mind from worrying or forgetting about this treacherous sort
's property.
A few points we can take away from this chapter:
 Array is a goto structure for implementing your algorithms. First, try to fit it before moving to other things like lists, trees, and so on.
 Complexity estimates should be considered in context: of the particular task's requirements and limitations, of the hardware platform, etc. Performing some realworld benchmarking alongside backofthenapkin abstract calculations may be quite insightful.
 It's always worth thinking of how to reduce the code to the simplest form: checking of additional conditions, recursion, and many other forms of code complexity, although, rarely are a game changer, often may lead to significant unnecessary slowdowns.
Footnotes:
[1] or void*
in C, or some other type that allows any element in your language of choice
[2] Such incompatibility errors are not a cheap thing: for instance, it is reported that the crash of the first Arian V rocket happened due to interoperation of two programs that used the metric and the imperial measurement systems without explicit conversion of the data. There's an elegant solution to such problem: "dimensional numbers", which a custom reader macro to encode the measure alongside the number. Here is a formula expressed with such numbers:
(defun runningdistancefor1kgweightloss (mass)
(* 1/4 (/ #M37600kJ (* #M0.98m/s2 mass))))
CLUSER> (runningdistancefor1kgweightloss #M80kg)
119897.96
CLUSER> (runningdistancefor1kgweightloss #I200lb)
105732.45
The output is, of course, in metric units. Unfortunately, this approach will not be useful for arrays encoded by different languages as they are obtained not by reading the input but by referencing external memory. Instead, a wrapper struct/class is, usually, used to specify the elements order.
Nicolas Hafner — An Extensible Particle System  Gamedev
@20190806 08:36 · 18 days ago
This article was originally published on GameDev.NET. In it, I illustrate a new particle system that was developed for my Lisp game engine, Trial. It contains quite a bit of graphics stuff, but also a lot of Lisp. I thought it would be worthwhile to share it here as well. For those unfamiliar, a particle system deals in orchestrating a lot of very similar things (particles). The challenge is to efficiently draw and update these particles.
For the drawing we consider two separate parts  the geometry used for each particle, and the data used to distinguish one particle from another. We pack both of these two parts into a singular vertex array, using instancing for the vertex attributes of the latter part. This allows us to use instanced drawing and draw all of the particles in one draw call. In the particle shader we then need to make sure to add the particle's location offset, and to do whatever is necessary to render the geometry appropriately as usual. This can be done easily enough in any game engine, though it would be much more challenging to create a generic system that can easily work with any particle geometry and any rendering logic. In Trial this is almost free.
There's two parts in Trial that allow me to do this: first, the ability to inherit and combine opaque shader parts along the class hierarchy, and second, the ability to create structures that are backed by an opaque memory region, while retaining the type information. The latter part is not that surprising for languages where you can cast memory and control the memory layout precisely, but nonetheless in Trial you can combine these structures through inheritance, something not typically possible without significant hassle. Trial also allows you to describe the memory layout precisely. For instance, this same system is used to represent uniform buffer objects, as well as what we're using here, which is attributes in a vertex buffer.
If you'll excuse the code dump, we'll now take a look at the actual particle system implementation:
(defineglstruct (particle (:layoutstandard :vertexbuffer))
(lifetime :vec2 :accessor lifetime))
(defineshadersubject particleemitter ()
((liveparticles :initform 0 :accessor liveparticles)
(vertexarray :accessor vertexarray)
(particlebuffer :initarg :particlebuffer :accessor particlebuffer)))
(defmethod initializeinstance :after ((emitter particleemitter) &key particlemesh particlebuffer)
(setf (vertexarray emitter)
(addvertexbindings
particlebuffer
(changeclass particlemesh 'vertexarray))))
(defmethod paint ((emitter particleemitter) pass)
(let ((vao (vertexarray emitter)))
(gl:bindvertexarray (glname vao))
(%gl:drawelementsinstanced (vertexform vao) (size vao) :unsignedint 0 (liveparticles emitter))))
(defgeneric initialparticlestate (emitter tick particle))
(defgeneric updateparticlestate (emitter tick input output))
(defgeneric newparticlecount (emitter tick)) ; => N
(definehandler (particleemitter tick) (ev)
(let ((vbo (particlebuffer particleemitter))
(writeoffset 0))
(let ((data (structvector vbo)))
(declare (type simplevector data))
(loop for readoffset from 0 below (liveparticles particleemitter)
for particle = (aref data readoffset)
do (when (< (vx2 (lifetime particle)) (vy2 (lifetime particle)))
(when (updateparticlestate particleemitter ev particle (aref data writeoffset))
(incf writeoffset))))
(loop repeat (newparticlecount particleemitter ev)
while (< writeoffset (length data))
do (initialparticlestate particleemitter ev (aref data writeoffset))
(incf writeoffset))
(setf (liveparticles particleemitter) writeoffset)
(updatebufferdata vbo T))))
Let's go over this real quick. We first define a base class for all particles. This only mandates the lifetime field, which is a vector composed of the current age and the max age. This is used by the emitter to check liveness. Any other attribute of a particle is specific to the usecase, so we leave that up to the user.
Next we define our main particleemitter class. It's called a "shader subject" in Trial, which means that it has shader code attached to the class, and can react to events in separate handler functions. Anyway, all we need for this class is to keep track of the number of live particles, the vertex array for all the particles, and the buffer we use to keep the perparticle data. In our constructor we construct the vertex array be combining the vertex attribute bindings of the particle buffer and the particle mesh.
The painting logic is very light, as we just need to bind the vertex array and do an instanced draw call, using the liveparticles count for our current number of instances.
The three functions defined afterwards specify the protocol users need to follow to actually create and update the particles throughout their lifetime. The first function fills the initial state into the passed particle instance, the second uses the info from the input particle instance to fill the update into the output particle info, and the final function determines the number of new particles per update. These particle instances are instances of the particle class the user specifies through the particlebuffer, but their fields are backed by a common byte array. This allows us to make manipulation of the particles feel native and remain extensible, without requiring complex and expensive marshalling.
Finally we come to the bulk of the code, which is the tick update handler. This does not do too much in terms of logic, however. We simply iterate over the particle vector, checking the current lifetime. If the particle is still alive, we call the updateparticlestate function. If this succeeds, we increase the writeoffset into the particle vector. If it does not succeed, or the particle is dead, the writeoffset remains the same, and the particle at that position will be overwritten by the next live, successful update. This in effect means that live particles are always at the beginning of the vector, allowing us to cut off the dead ones with the liveparticles count. Then, we simply construct as many new particles as we should without overrunning the array, and finally we upload the buffer data from RAM to the GPU by using updatebufferdata, which in effect translates to a glBufferSubData call.
Now that we have this base protocol in place we can define a simple standard emitter, which should provide a much easier interface.
(defineglstruct (simpleparticle (:include particle)
(:layoutstandard :vertexbuffer))
(location :vec3 :accessor location)
(velocity :vec3 :accessor velocity))
(defineshadersubject simpleparticleemitter (particleemitter)
())
(defmethod initialparticlestate :before ((emitter simpleparticleemitter) tick particle)
(setf (location particle) (vec 0 0 0)))
(defmethod updateparticlestate ((emitter simpleparticleemitter) tick particle output)
(setf (location output) (v+ (location particle) (velocity particle)))
(let ((life (lifetime particle)))
(incf (vx2 life) (dt tick))
(setf (lifetime output) life)
(< (vx2 life) (vy2 life))))
(defmethod paint :before ((emitter simpleparticleemitter) (pass shaderpass))
(let ((program (shaderprogramforpass pass emitter)))
(setf (uniform program "view_matrix") (viewmatrix))
(setf (uniform program "projection_matrix") (projectionmatrix))
(setf (uniform program "model_matrix") (modelmatrix))))
(defineclassshader (simpleparticleemitter :vertexshader)
"layout (location = 0) in vec3 vtx_location;
layout (location) in vec3 location;
uniform mat4 model_matrix;
uniform mat4 view_matrix;
uniform mat4 projection_matrix;
void main(){
vec3 position = vtx_location + location;
gl_Position = projection_matrix * view_matrix * model_matrix * vec4(position, 1.0f);
}")
Okey! Again we define a new structure, this time including the base particle so that we get the lifetime field as well. We add a location and velocity on to this, which we'll provide for basic movement. Then we define a subclass of our emitter, to provide the additional defaults. Using this subclass we can provide some basic updates that most particle systems based on it will expect: an initial location at the origin, updating the location by the velocity, increasing the lifetime by the delta time of the tick, and returning whether the particle is still live after that.
On the painting side we provide the default handling of the position. To do so, we first pass the three standard transform matrices used in Trial as uniforms, and then define a vertex shader snippet that handles the vertex transformation. You might notice here that the second vertex input, the one for the perparticle location, does not have a location assigned. This is because we cannot know where this binding lies ahead of time. The user might have additional vertex attributes for their perparticle mesh that we don't know about. The user must later provide an additional vertexshader snippet that does define this.
So, finally, let's look at an actual usecase of this system.
(defineasset (workbench particles) vertexstructbuffer
'simpleparticle
:structcount 1024)
(defineshadersubject fireworks (simpleparticleemitter)
()
(:defaultinitargs :particlemesh (changeclass (makesphere 1) 'vertexarray :vertexattributes '(location))
:particlebuffer (asset 'workbench 'particles)))
(defmethod initialparticlestate ((fireworks fireworks) tick particle)
(let ((dir (polar>cartesian (vec2 (/ (sxhash (fc tick)) (ash 2 60)) (mod (sxhash (fc tick)) 100)))))
(setf (velocity particle) (vec (vx dir) (+ 2.5 (mod (sxhash (fc tick)) 2)) (vy dir))))
(setf (lifetime particle) (vec 0 (+ 3.0 (random 1.0)))))
(defmethod updateparticlestate :before ((fireworks fireworks) tick particle output)
(let ((vel (velocity particle)))
(decf (vy3 vel) 0.005)
(when (< (abs ( (vx (lifetime particle)) 2.5)) 0.05)
(let ((dir (polar>cartesian (vec3 (+ 1.5 (random 0.125)) (random (* 2 PI)) (random (* 2 PI))))))
(vsetf vel (vx dir) (vy dir) (vz dir))))
(setf (velocity output) vel)))
(defmethod newparticlecount ((fireworks fireworks) tick)
(if (= 0 (mod (fc tick) (* 10 1)))
128 0))
(defineclassshader (fireworks :vertexshader 1)
"layout (location = 1) in vec2 in_lifetime;
layout (location = 2) in vec3 location;
out vec2 lifetime;
void main(){
lifetime = in_lifetime;
}")
(defineclassshader (fireworks :fragmentshader)
"out vec4 color;
in vec2 lifetime;
void main(){
if(lifetime.x <= 2.5)
color = vec4(1);
else{
float lt = lifetime.ylifetime.x;
color = vec4(lt*2, lt, 0, 1);
}
}")
First we define an asset that holds our perparticle buffer data. To do this we simply pass along the name of the particle class we want to use, as well as the number of such instances to allocate in the buffer. We then use this, as well as a simple sphere mesh, to initialize our own particle emitter. Then come the particle update methods. For the initial state we calculate a random velocity within a cone region, using polar coordinates. This will cause the particles to shoot out at various angles. We use a hash on the current frame counter here to ensure that particles generated in the same frame get bunched together with the same initial values. We also set the lifetime to be between three and four seconds, randomly for each particle.
In the update, we only take care of the velocity change, as the rest of the work is already done for us. For this we apply some weak gravity, and then check the lifetime of the particle. If it is within a certain range, we radically change the velocity of the particle in a random, spherical direction. In effect this will cause the particles, which were bunched together until now, to spread out randomly.
For our generator, we simply create a fixed number of particles every 10 frames or so. In a fixed framerate, this should look mean a steady generation of particle batches.
Finally, in the two shader code snippets we provide the aforementioned vertex attribute binding location, and some simple colouring logic to make the particles look more like fireworks. The final result of this exercise is this:
Quite nice, I would say. With this we have a system that allows us to create very different particle effects, with relatively little code. For Leaf, I intend on using this to create 2D spritebased particle effects, such as sparks, dust clouds, and so forth. I'm sure I'll revisit this at a later date to explore these different application possibilities.
Vsevolod Dyomkin — Programming Algorithms: Data Structures
@20190805 10:40 · 19 days agoThe next several chapters will be describing the basic data structures that every programming language provides, their usage and the most important algorithms relevant to them. And we'll start with the notion of a datastructure and tuples or structs that are the most primitive and essential one.
Data Structures vs Algorithms
Let's start with a somewhat abstract question: what's more important, algorithms or data structures?
From one point of view, algorithms are the essence of many programs, while data structures may seem secondary. Besides, although a majority of algorithms rely on certain features of particular data structures, not all do. Good examples of the datastructurerelying algorithms are heapsort, search using BSTs, and unionfind. And of the second type: the sieve of Erastophenes and consistent hashing.
At the same time, some seasoned developers state that when the right data structure is found, the algorithm will almost write itself. Linus Torvalds, the creator of Linux, is quoted saying:
Bad programmers worry about the code. Good programmers worry about data structures and their relationships.
A somewhat less poignant version of the same idea is formulated in the Art of Unix Programming by Eric S. Raymond as the "Rule of Representation":
Fold knowledge into data so program logic can be stupid and robust.
Even the simplest procedural logic is hard for humans to verify, but quite complex data structures are fairly easy to model and reason about. To see this, compare the expressiveness and explanatory power of a diagram of (say) a fiftynode pointer tree with a flowchart of a fiftyline program. Or, compare an array initializer expressing a conversion table with an equivalent switch statement. The difference in transparency and clarity is dramatic.
Data is more tractable than program logic. It follows that where you see a choice between complexity in data structures and complexity in code, choose the former. More: in evolving a design, you should actively seek ways to shift complexity from code to data.
Data structures are more static than algorithms. Surely, most of them allow change of their contents over time, but there are certain invariants that always hold. This allows reasoning by simple induction: consider only two (or at least a small number of) cases, the base one(s) and the general. In other words, data structures remove, in the main, the notion of time from consideration, and change over time is one of the major causes of program complexity. In other words, data structures are declarative, while most of the algorithms are imperative. The advantage of the declarative approach is that you don't have to imagine (trace) the flow of time through it.
So, this book, like most other books on the subject, is organized around data structures. The majority of the chapters present a particular structure, its properties and interface, and explain the algorithms, associated with it, showing its realworld use cases. Yet, some important algorithms don't require a particular data structure, so there are also several chapters dedicated exclusively to them.
The Data Structure Concept
Among data structures, there are, actually, two distinct kinds: abstract and concrete. The significant difference between them is that an abstract structure is just an interface (a set of operations) and a number of conditions or invariants that have to be met. Their particular implementations, which may differ significantly in efficiency characteristics and inner mechanisms, are provided by the concrete data structures. For instance, an abstract data structure queue
has just two operations: enqueue
that adds an item to the end of the queue and dequeue
that gets an item at the beginning and removes it. There's also a constraint that the items should be dequeued in the same order they are enqueued. Now, a queue may be implemented using a number of different underlying data structures: a linked or a doublelinked list, an array or a tree. Each one having different efficiency characteristics and additional properties beyond the queue interface. We'll discuss both kinds in the book, focusing on the concrete structures and explaining their usage to implement a particular abstract interface.
The term data structures has somewhat fallen from grace, in the recent years, being often replaced by conceptually more loaded notions of types, in the context of the functional programming paradigm, or classes, in objectorientated one. Yet, both of those notions imply something more than just algorithmic machinery we're exclusively interested in, for this book. First of all, they also distinguish among primitive values (numbers, characters, etc.) that are all nondistinct, in the context of algorithms. Besides, classes form a hierarchy of inheritance while types are associated with algebraic rules of category theory. So, we'll stick to a neutral data structures term, throughout the book, with occasional mentions of the other variants where appropriate.
Contiguous and Linked Data Structures
The current computer architectures consist of a central processor (CPU), memory and peripheral inputoutput devices. The data is someway exchanged with the outside world via the IOdevices, stored in memory, and processed by the CPU. And there's a crucial constraint, called the von Neumann's bottleneck: the CPU can only process data that is stored inside of it in a limited number of special basic memory blocks called registers. So it has to constantly move data elements back and forth between the registers and main memory (using intermediate cache to speed up the process). Now, there are things that can fit in a register and those that can't. The first ones are called primitive and mostly unite those items that can be directly represented with integer numbers: integers proper, floats, characters. Everything that requires a custom data structure to be represented can't be put in a register as a whole.
Another item that fits into the processor register is a memory address. In fact, there's an important constant — the number of bits in a generalpurpose register, which defines the maximum memory address that a particular CPU may handle and, thus, the maximum amount of memory it can work with. For a 32bit architecture it's 2^32
(4 GB) and for 64bit — you've guessed it, 2^64
. A memory address is usually called a pointer, and if you put a pointer in a register, there are commands that allow the CPU to retrieve the data inmemory from where it points.
So, there are two ways to place a data structure inside the memory:
 a contiguous structure occupies a single chunk of memory and its contents are stored in adjacent memory blocks. To access a particular piece we should know the offset of its beginning from the start of the memory range allocated to the structure. (This is usually handled by the compiler). When the processor needs to read or write to this piece it will use the pointer calculated as the sum of the base address of the structure and the offset. The examples of contiguous structures are arrays and structs
 a linked structure, on the contrary, doesn't occupy a contiguous block of memory, i.e. its contents reside in different places. This means that pointers to a particular piece can't be precalculated and should be stored in the structure itself. Such structures are much more flexible at the cost of this additional overhead both in terms of used space and time to access an element (which may require several hops when there's nesting, while in the contiguous structure it is always constant). There exists a multitude of linked data structures like lists, trees, and graphs
Tuples
In most languages, some common data structures, like arrays or lists, are "builtin", but, under the hood, they will mostly work in the same way as any userdefined ones. To implement an arbitrary data structure, these languages provide a special mechanism called records, structs, objects, etc. The proper name for it would be "tuple". It is the data structure that consists of a number of fields each one holding either a primitive value, another tuple or a pointer to another tuple of any type. This way a tuple can represent any structure, including nested and recursive ones. In the context of type theory, such structures are called product types.
A tuple is an abstract data structure and its sole interface is the field accessor function: by name (a named tuple) or index (an anonymous tuple). It can be implemented in various ways, although a contiguous variant with constanttime access is preferred. However, in many languages, especially dynamic, programmers often use lists or dynamic arrays to create throwaway adhoc tuples. Python has a dedicated tuple data type, that is often for this purpose, that is a linked data structure under the hood. The following Python function will return a tuple (written in parens) of a decimal and remainder parts of the number x
[1]:
def truncate(x):
dec = int(x)
rem = x  dec
return (dec, rem)
This is a simple and not very efficient way that may have its place when the number of fields is small and the lifetime of the structure is short. However, a better approach both from the point of view of efficiency and code clarity is to use a predefined structure. In Lisp, a tuple is called "struct" and is defined with defstruct
, which uses a contiguous representation by default (although there's an option to use a linked list underthehood). Following is the definition of a simple pair data structure that has two fields (called "slots" in Lisp parlance): left
and right
.
(defstruct pair
left right)
The defstruct
macro, in fact, generates several definitions: of the struct type, its constructor that will be called makepair
and have 2 keyword arguments :left
and :right
, and field accessors pairleft
and pairright
. Also, a common printobject
method for structs will work for our new structure, as well as a readermacro to restore it from the printed form. Here's how it all fits together:
CLUSER> (makepair :left "foo" :right "bar")
#S(PAIR :LEFT "foo" :RIGHT "bar")
CLUSER> (pairright (readfromstring (prin1tostring *)))
"bar"
prin1tostring
and readfromstring
are complimentary Lisp functions that allow to print the value in a computerreadable form (if an appropriate printfunction is provided) and read it back. Good printrepresentations readable to both humans and, ideally, computers are very important to code transparency and should never be neglected.
There's a way to customize every part of the definition. For instance, if we plan to use pairs frequently we can leave out the pair
prefix by specifying (:concname nil)
property. Here is an improved pair
definition and shorthand constructor for it from RUTILS, which we'll use throughout the book. It uses :type list
allocation to integrate with destructuring macros.
(defstruct (pair (:type list) (:concname nil))
"A generic pair with left (LT) and right (RT) elements."
lt rt)
(defun pair (x y)
"A shortcut to make a pair of X and Y."
(makepair :lt x :rt y))
Passing Data Structures in Function Calls
One final remark. There are two ways to use data structures with functions: either pass them directly via copying appropriate memory areas (callbyvalue) — an approach, usually, applied to primitive types — or pass a pointer (callbyreference). In the first case, there's no way to modify the contents of the original structure in the called function, while in the second variant it is possible, so the risk of unwarranted change should be taken into account. The usual way to handle it is by making a copy before invoking any changes, although, sometimes, mutation of the original data structure may be intended so a copy is not needed. Obviously, the callbyreference approach is more general, because it allows both modification and copying, and more efficient because copying is ondemand. That's why it is the default way to handle structures (and objects) in most programming languages. In a lowlevel language like C, however, both variants are supported. Moreover, in C++ the passbyreference has two kinds: pass the pointer and pass what's actually called a reference, which is syntax sugar over pointers that allows accessing the argument with nonpointer syntax (dot instead of arrow) and adds a couple of restrictions. But the general idea, regardless of the idiosyncrasies of particular languages, remains the same.
Structs in Action: UnionFind
Data structures come in various shapes and flavors. Here, I'd like to mention one peculiar and interesting example that is both a data structure and an algorithm, to some extent. Even the name speaks about certain operations rather than a static form. Well, most of the more advanced data structures all have this feature that they are defined not only by the shape and arrangement but also via the set of operations that are applicable. UnionFind is a family of datastructurealgorithms that can be used for efficient determination of set membership in sets that change over time. They may be used for finding the disjoint parts in networks, detection of cycles in graphs, finding the minimum spanning tree and so forth. One practical example of such problems is automatic image segmentation: separating different parts of an image, a car from the background or a cancer cell from a normal one.
Let's consider the following problem: how to determine if two points of the graph have a path between them? Given that a graph is a set of points (vertices) and edges between some of the pairs of these points. A path in the graph is a sequence of points leading from source to destination with each pair having an edge that connects them. If some path between two points exists they belong to the same component if it doesn't — to two disjoint ones.
A graph with 3 disjoint components
For two arbitrary points, how to determine if they have a connecting path? The naive implementation may take one of them and start building all the possible paths (this may be done in breadthfirst or depthfirst manner, or even randomly). Anyway, such procedure will, generally, require a number of steps proportional to the number of vertices of the graph. Can we do better? This is a usual question that leads to the creation of more efficient algorithms.
UnionFind approach is based on a simple idea: when adding the items record the id of the component they belong to. But how to determine this id? Use the id associated with some point already in this subset or the current point's id if the point is in a subset of its own. And what if we have the subsets already formed? No problem, we can simulate the addition process by iterating over each vertex and taking the id of an arbitrary point it's connected to as the subset's id. Below is the implementation of this approach (to simplify the code, we'll use the pointers to `point` structs instead of ids, but, conceptually, it's the same idea):
(defstruct point
parent) ; if the parent is null the point is the root
(defun ufunion (point1 point2)
"Join the subsets of POINT1 and POINT2."
(:= (pointparent point1) (or (pointparent point2)
point2)))
(defun uffind (point)
"Determine the id of the subset that a POINT belongs to."
(let ((parent (pointparent point)))
(if parent
(uffind parent)
point)))
Just calling (makepoint)
will add a new subset with a single item in it to our set.
Note that uffind
uses recursion to find the root of the subset, i.e. the point that was added first. So, for each vertex, we store some intermediary data and, to get the subset id, each time, we'll have to perform additional calculations. This way, we managed to reduce the averagecase find time, but, still, haven't completely excluded the possibility of it requiring traversal of every element of the set. Such socalled degraded case may manifest when each item is added referencing the previously added one. I.e. there will be a single subset with a chain of its members connected to the next one like this: a > b > c > d
. If we call uffind
on a
it will have to enumerate all of the set's elements.
Yet, there is a way to improve uffind
behavior: by compressing the tree depth to make all points along the path to the root point to it, i.e squashing each chain into a wide shallow tree of depth 1.
d
^ ^ ^
  
a b c
Unfortunately, we can't do that, at once, for the whole subset, but, during each run of uffind
, we can compress one path, which will also shorten all the paths in the subtree that is rooted in the points on it! Still, this cannot guarantee that there will not be a sequence of enough unions to grow the trees faster than finds can flatten them. But there's another tweak that, combined with path compression, allows to ensure sublinear (actually, almost constant) time of both operations: keep track of the size of all trees and link the smaller tree below the larger one. This will ensure that all trees' heights will stay below (log n)
. The rigorous proof of that is quite complex, although, intuitively, we can see the tendency by looking at the base case: if we add a 2element tree and a 1element one we'll still get the tree of the height 2.
Here is the implementation of the optimized version:
(defstruct point
parent
(size 1))
(defun uffind (point)
(let ((parent (pointparent point)))
(if parent
;; here, we use the fact that the assignment will also return
;; the value to perform both path compression and find
(:= (pointparent point) (uffind parent))
point)))
(defun ufunion (point1 point2)
(with ((root1 (uffind point1))
(root2 (uffind point2))
(major minor (if (> (pointsize root1)
(pointsize root2))
(values root1 root2)
(values root2 root1))))
(:+ (pointsize major) (pointsize minor))
(:= (pointparent minor) major)))
Here, Lisp multiple values
come handy, to simplify the code. See the footnote [1] for more details about them.
The suggested approach is quite simple in implementation but complex in complexity analysis. So, I'll have to give just the final result: m
union/find operations, with tree weighting and path compression, on a set of n
objects will work in O((m + n) log* n)
(where log*
is iterated logarithm — a very slowly increasing function, that can be considered a constant, for practical purposes).
Finally, this is how to check if none of the points belong to the same subset in almost O(n)
where n
is the number of points to check[2], so in O(1)
for 2 points:
(defun ufdisjoint (points)
"Return true if all of the POINTS belong to different subsets."
(let (roots)
(dolist (point points)
(let ((root (uffind point)))
(when (member root roots)
(returnfrom ufdisjoint nil))
(push root roots))))
t)
A couple more observations may be drawn from this simple example:
 Not always the clever idea that we, initially, have works flawlessly at once. It is important to check the edge cases for potential problems.
 We've seen an example of a data structre that, directly, doesn't exist: pieces of information are distributed over individual data points. Sometimes, there's a choice between storing the information, in a centralized way, in a dedicated structure like a hashtable and distributing it over individual nodes. The latter approach is often more elegant and efficient, although it's not so obvious.
Footnotes:
[1] Moreover, Python has special syntax for destructuring such tuples: dec, rem = truncate(3.14)
. However, this is not the optimal way to handle returning the primary and one or more secondary values from a function. Lisp provides a more elegant solution called multiple values: all the necessary values are returned via the values
form: (values dec rem)
and can be retrieved with (multiplevaluebind (dec rem) (truncate 3.14) ...)
or (with ((dec rem (truncate 3.14))) ...)
. It is more elegant because secondary values may be discarded at will by calling the function in a usual way: (+ 1 (truncate 3.14)) => 4
— not possible in Python, because you can't sum a tuple with something.
[2] Actually, the complexity here is O(n^2)
due to the use of the function member
that performs set membership test in O(n)
, but it's not essential to the general idea. If ufdisjoint
is expected to be called with tens or hundreds of points the roots
structure has to be changed to a hashset that has a O(1)
membership operation.
Lispers.de — LispMeetup in Hamburg on Monday, 5th August 2019
@20190805 00:00 · 20 days agoWe meet at Ristorante Opera, Dammtorstraße 7, Hamburg, starting around 19:00 CEST on 5th August 2019.
This is an informal gathering of Lispers of all experience levels.
Christophe Rhodes — holiday hacking swankr
@20190804 11:46 · 20 days agoI'm on holiday! And holidays, as seems to be my usual, involve trains!
I was on a train yesterday, wondering how I could possibly fill two hours of leisure time (it's more straightforward these days than a few years ago, when my fellow travellers were less able to occupy their leisure time with books), when help came from a thoroughly unexpected quarter: I got a bug report for swankr.
I wrote swankr nine years ago, mostly
at ISMIR2010. I was about to start
the
Teclo phase
of my ongoing adventures, and an academic colleague had recommended
that I learn R; and the moment where it all fell into place was when I
discovered that R supported handlers and restarts: enough that writing
support for slime's SLDB was a lowereffort endeavour than learning
R's default debugger. I implemented support for the bits of slime
that I use, and also for displaying images in the REPL  so that I
can present lattice
objects as thumbnails of the rendered graph, and
have
the
canned demo of
adding two of them together to get a combined graph: generates "oo"
sounds at every outing, guaranteed! (Now that I mostly use ggplot2
,
I need to find a new demo: incrementally building and styling a plot
by adding theme, scale, legend objects and so on is actually useful
but lacks a wow factor.)
SLIME works by exchanging messages between the emacs frontend and the backend lisp; code on the backend lisp is responsible for parsing the messages, executing whatever is needed to support the request, and sending a response. The messages are, broadly, sexpbased, and the message parsing and executing in the backends is mostly portable Common Lisp, delegating to implementationspecific code for specific implementationspecific support needed.
"Mostly portable Common Lisp" doesn't mean that it'll run in R. The R
code is necessarily completely separate, implementing just enough of a
Lisp reader to parse the messages. This works fine, because the
messages themselves are simple; when the frontend sends a message
asking for evaluation for the listener, say, the message is in the
form of a sexp, but the form to evaluate is a string of the
userprovided text: so as long as we have enough of a sexp
reader/evaluator to deal with the SLIME protocol messages, the rest
will be handled by the backend's eval
function.
... except in some cases. When slime sends a form to be evaluated
which contains an embedded presentation object, the presentation is
replaced by #.(swank:lookuppresentedobjectorlose 57.)
in the
string sent for evaluation. This works fine for Common Lisp backends
 at least provided *readeval*
is true  but doesn't work
elsewhere. One regexp allows us to preprocess the string to evaluate
to rewrite this into something else, but what? Enter cunning plan
number 1: (ab)use backquote.
Backquote? Yes, R has backquote bquote
. It also has moderately
crazy function call semantics, so it's possible to: rewrite the string
ob be evaluated to contain unquotations; parse the string into a form;
funcall the bquote
function on that form (effectively performing the
unquotations, simulating the readtime evaluation), and then eval
the result. And this would be marvellous except that Philipp Marek
discovered that his own bquote
using code didn't work. Some
investigation later, and the root cause became apparent:
CLUSER> (progn (set 'a 3) (eval (readfromstring "``,a")))
`,a
compare
R> a < 3; bquote(bquote(.(a)))
bquote(3)
NO NESTED BACKQUOTES?
So, scratch the do.call("bquote", ...)
plan. What else is available
to us? It turns out that there's a substitute
function, which for
substitute(expr, env)
returns the parse tree for the (unevaluated) expression 'expr', substituting any variables bound in 'env'.
Great! Because that means we can use the regular expression to rewrite the presentation lookup function calls to be specific variable references, then substitute these variable references from the idtoobject environment that we have anyway, and Bob is your metaphorical uncle. So I did that.
The situation is not perfect, unfortunately. Here's some more of the
documentation for substitute
:
'substitute' works on a purely lexical basis. There is no guarantee that the resulting expression makes any sense.
... and here's some more from do.call
:
The behavior of some functions, such as 'substitute', will not be the same for functions evaluated using 'do.call' as if they were evaluated from the interpreter. The precise semantics are currently undefined and subject to change.
Well that's not ominous at all.
Vsevolod Dyomkin — Programming Algorithms: A Crash Course in Lisp
@20190729 08:39 · 26 days agoThe introductory post for this book, unexpectedly, received quite a lot of attention, which is nice since it prompted some questions, and one of them I planned to address in this chapter.
I expect that there will be two main audiences, for this book:
 people who'd like to advance in algorithms and writing efficient programs — the major group
 lispers, either accomplished or aspiring who also happen to be interested in algorithms
This introductory chapter is, primarily, for the first group. After reading it, the rest of the book's Lisp code should become understandable to you. Besides, you'll know the basics to run Lisp and experiment with it if will you so desire.
For the lispers, I have one comment and one remark. You might be interested to read this part just to understand my approach of utilizing the language throughout the book. Also, you'll find my stance regarding the question that was voiced several times in the comments: whether it's justified to use some 3rdparty extensions and to what extent or should the author vigilantly stick to only the tools provided by the standard.
The Core of Lisp
To effortlessly understand Lisp, you'll have to forget, for a moment, any concepts of how programming languages should work that you might have acquired from your prior experience in coding. Lisp is simpler; and when people bring their Java, C or Python approaches to programming with it, first of all, the results are suboptimal in terms of code quality (simplicity, clarity, and beauty), and, what's more important, there's much less satisfaction from the process, not to mention very few insights and little new knowledge gained.
It is much easier to explain Lisp if we begin from a blank slate. In essence, all there is to it is just an evaluation rule: Lisp programs consist of forms that are evaluated by the compiler. There are 3+2 ways how that can happen:
 selfevaluation: all literal constants (like
1
,"hello"
, etc.) are evaluated to themselves. These literal objects can be either builtin primitive types (1
) or data structures ("hello"
)  symbol evaluation: separate symbols are evaluated as names of variables, functions, types or classes depending on the context. The default is variable evaluation, i.e. if we encounter a symbol
foo
the compiler will substitute in its place the current value associated with this variable (more on this a little bit later)  expression evaluation: compound expressions are formed by grouping symbols and literal objects with parenthesis. The following form
(oper 1 foo)
is considered a "functional" expression: the operator name is situated in the first position (head), and its arguments, if any, in the subsequent positions (rest). There are 3 ways to evaluate a functional expression: there are 25 special operators that are defined in lowerlevel code and may be considered something like axioms of the language: they are predefined, always present, and immutable. Those are the building blocks, on top of which all else is constructed, and they include the sequential
block
operator, the conditional expressionif
, and the unconditional jumpgo
, to name a few. Ifoper
is the name of a special operator, the lowlevel code for this operator that deals with the arguments in its own unique way is executed  there's also ordinary function evaluation: if
oper
is a function name, first, all the arguments are evaluated with the same evaluation rule, and then the function is called with the obtained values  finally, there's macro evaluation. Macros provide a way to change the evaluation rule for a particular form. If
oper
names a macro, its code is substituted instead of our expression and then evaluated. Macros are a major topic in Lisp, and they are used to build a large part of the language, as well as provide an accessible way, for the users, to extend it. However, they are orthogonal to the subject of this book and won't be discussed in further detail here. You can delve deeper into macros in such books as On Lisp or Let Over Lambda
 there are 25 special operators that are defined in lowerlevel code and may be considered something like axioms of the language: they are predefined, always present, and immutable. Those are the building blocks, on top of which all else is constructed, and they include the sequential
It's important to note that, in Lisp, there's no distinction between statements and expressions, no special keywords, no operator precedence rules, and other similar arbitrary stuff you can stumble upon in other languages. Everything is uniform; everything is an expression in a sense that it will be evaluated and return some value.
A Code Example
To sum up, let's consider an example of the evaluation of a Lisp form. The following one implements the famous binary search algorithm (that we'll discuss in more detail in one of the following chapters):
(when (> (length vec) 0)
(let ((beg 0)
(end (length vec)))
(do ()
((= beg end))
(let ((mid (floor (+ beg end) 2)))
(if (> (? vec mid) val)
(:= beg (1+ mid))
(:= end mid))))
(values beg
(? vec beg)
(= (? vec beg) val))))
It is a compound form. In it, the socalled toplevel form is when
, which is a macro for a oneclause conditional expression: an if
with only the truebranch. First, it evaluates the expression (> (length vec) 0)
, which is an ordinary function for a logical operator >
applied to two args: the result of obtaining the length
of the contents of the variable vec
and a constant 0
. If the evaluation returns true, i.e. the length of vec
is greater than 0
, the rest of the form is evaluated in the same manner. The result of the evaluation, if nothing exceptional happens, is either false (which is called nil
, in Lisp) or 3 values returned from the last form (values ...)
. ?
is the generic access operator, which abstracts over different ways to query data structures by key. In this case, it retrieves the item from vec
at the index of the second argument. Below we'll talk about other operators shown here.
But first I need to say a few words abut RUTILS
. It is a 3rdparty library that provides a number of extensions to the standard Lisp syntax and its basic operators. The reason for its existence is that Lisp standard is not going to change ever, and, as eveything in this world, it has its flaws. Besides, our understanding of what's elegant and efficient code evolves over time. The great advantage of the Lisp standard, however, which counteracts the issue of its immutability, is that its authors had put into it multiple ways to modify and evolve the language at almost all levels starting from even the basic syntax. And this addresses our ultimate need, after all: we're not so interested in changing the standard as we're in changing the language. So, RUTILS
is one of the ways of evolving Lisp and its purpose is to make programming in it more accessible without compromising the principles of the language. So, in this book, I will use some basic extensions from RUTILS
and will explain them as needed. Surely, using 3rdparty tools is the question of preference and taste and might not be approved by some of the Lisp oldtimes, but no worries, in your code, you'll be able to easily swap them for your favorite alternatives.
The REPL
Lisp programs are supposed to be run not only in a oneoff fashion of simple scripts, but also as live systems that operate over long periods of time experiencing change not only of their data but also code. This general way of interaction with a program is called ReadEvalPrintLoop (REPL), which literally means that the Lisp compiler read
s a form, eval
uates it with the aforementioned rule, print
s the results back to the user, and loop
s over.
REPL is the default way to interact with a Lisp program, and it is very similar to the Unix shell. When you run your Lisp (for example, by entering sbcl
at the shell) you'll drop into the REPL. We'll preceede all REPLbased code interactions in the book with a REPL prompt (CLUSER>
or similar). Here's an example one:
CLUSER> (print "Hello world")
"Hello world"
"Hello world"
A curious reader may be asking why "Hello world"
is printed twice. It's a proof that everything is an expression in Lisp. :) The print
"statement", unlike in most other languages, not only prints its argument to the console (or other output stream), but also returns it as is. This comes very handy when debugging, as you can wrap almost any form in a print
not changing the flow of the program.
Obviously, if the interaction is not necessary, just the readeval part may remain. But, what's more important, Lisp provides a way to customize every stage of the process:
 at the
read
stage special syntax ("syntax sugar") may be introduced via a mechanism called reader macros  ordinary macros are a way to customize the
eval
stage  the
print
stage is conceptually the simplest one, and there's also a standard way to customize object printing via the Common Lisp Object System's (CLOS)printobject
function  and the
loop
stage can be replaced by any desired program logic
Basic Expressions
The structural programming paradigm states that all programs can be expressed in terms of 3 basic constructs: sequential execution, branching, and looping. Let's see how these operators are expressed in Lisp.
Sequential Execution
The simplest program flow is sequential execution. In all imperative languages, it is what is assumed to happen if you put several forms in a row and evaluate the resulting code block. Like this:
CLUSER> (print "hello") (+ 2 2)
"hello"
4
The value returned by the last expression is dimmed the value of the whole sequence.
Here, the REPLinteraction forms an implicit unit of sequential code. However, there are many cases when we need to explicitly delimit such units. This can be done with the block
operator:
CLUSER> (block test
(print "hello")
(+ 2 2))
"hello"
4
Such block has a name (in this example: test
). This allows to prematurely end its execution by using an operator returnfrom
:
CLUSER> (block test
(returnfrom test 0)
(print "hello")
(+ 2 2))
0
A shorthand return
is used to exit from blocks with a nil
name (which are implicit in most of the looping constructs we'll see further):
CLUSER> (block nil
(return 0)
(print "hello")
(+ 2 2))
0
Finally, if we don't even plan to ever prematurely return from a block, we can use the progn
operator that doesn't require a name:
CLUSER> (progn
(print "hello")
(+ 2 2))
"hello"
4
Branching
Conditional expressions calculate the value of their first form and, depending on it, execute one of several alternative code paths. The basic conditional expression is if
:
CLUSER> (if nil
(print "hello")
(print "world"))
"world"
"world"
As we've seen, nil
is used to represent logical falsity, in Lisp. All other values are considered logically true, including the symbol T
or t
which directly has the meaning of truth.
And when we need to do several things at once, in one of the conditional branches, it's one of the cases when we need to use progn
or block
:
CLUSER> (if (+ 2 2)
(progn
(print "hello")
4)
(print "world"))
"hello"
4
However, often we don't need both branches of the expressions, i.e. we don't care what will happen if our condition doesn't hold (or holds). This is such a common case that there are special expressions for it in Lisp — when
and unless
:
CLUSER> (when (+ 2 2)
(print "hello")
4)
"world"
4
CLUSER> (unless (+ 2 2)
(print "hello")
4)
NIL
As you see, it's also handy because you don't have to explicitly wrap the sequential forms in a progn
.
One other standard conditional expression is cond
, which is used when we want to evaluate several conditions in a row:
CLUSER> (cond
((typep 4 'string)
(print "hello"))
((> 4 2)
(print "world")
nil)
(t
(print "can't get here")))
"world"
NIL
The t
case is a catchall that will trigger if none of the previous conditions worked (as its condition is always true). The above code is equivalent to the following:
(if (typep 4 'string)
(print "hello")
(if (> 4 2)
(progn
(print "world")
nil)
(print "can't get here")))
There are many more conditional expressions in Lisp, and it's very easy to define your own with macros (it's actually, how when
, unless
, and cond
are defined), and when there arises a need to use a special one, we'll discuss its implementation.
Looping
Like with branching, Lisp has a rich set of looping constructs, and it's also easy to define new ones when necessary. This approach is different from the mainstream languages, that usually have a small number of such statements and, sometimes, provide an extension mechanism via polymorphism. And it's even considered to be a virtue justified by the idea that it's less confusing for the beginners. It makes sense to a degree. Still, in Lisp, both generic and custom approaches manage to coexist and complement each other. Yet, the tradition of defining custom control constructs is very strong. Why? One justification for this is the parallel to human languages: indeed, when
and unless
, as well as dotimes
and loop
are either directly words from the human language or are derived from natural language expressions. Our mother tongues are not so primitive and dry. The other reason is because you can™. I.e. it's so much easier to define custom syntactic extensions in Lisp than in other languages that sometimes it's just impossible to resist. :) And in many use cases they make the code much more simple and clear.
Anyway, for a complete beginner, actually, you have to know the same number of iteration constructs as in any other language. The simplest one is dotimes
that iterates the counter variable a given number of times (from 0 to ( times 1)
) and executes the body on each iteration. It is analogous to for (int i = 0; i < times; i++)
loops found in Clike languages.
CLUSER> (dotimes (i 3)
(print i))
0
1
2
NIL
The return value is nil
by default, although it may be specified in the loop header.
The most versatile (and lowlevel) looping construct, on the other hand, is do
:
CLUSER> (do ((i 0 (1+ i))
(prompt (readline) (readline)))
((> i 1) i)
(print (pair i prompt))
(terpri))
foo
(0 "foo")
bar
(1 "bar")
2
do
iterates a number of variables (zero or more) that are defined in the first part (here, i
and prompt
) until the termination condition in the second part is satisfied (here, (> i 1)
), and as with dotimes
(and other dostyle macros) executes its body — rest of the forms (here, print
and terpri
, which is a shorthand for printing a newline). readline
reads from standard input until newline is encountered and 1+
returns the current value of i
increased by 1.
All dostyle macros (and there's quite a number of them, both builtin and provided from external libraries: dolist
, dotree
, doregistergroups
, dolines
etc.) have an optional return value. In do
it follows the termination condition, here — just return the final value of i
.
Besides dostyle iteration, there's also a substantially different beast in CL ecosystem — the infamous loop
macro. It is very versatile, although somewhat unlispy in terms of syntax and with a few surprising behaviors. But elaborating on it is beyond the scope of this book, especially since there's an excellent introduction to loop
in Peter Seibel's "LOOP for Black Belts".
Many languages provide a generic looping construct that is able to iterate an arbitrary sequence, a generator and other similarbehaving things — usually, some variant of foreach
. We'll return to such constructs after speaking about sequences in more detail.
And there's also an alternative iteration philosophy: the functional one, which is based on higherorder functions (map
, reduce
and similar) — we'll cover it in more detail in the following chapters, also.
Procedures and Variables
We have covered the 3 pillars of structural programming, but one essential, in fact, the most essential, construct still remains — variables and procedures.
What if I told you that you can perform the same computation many times, but changing some parameters... OK, OK, pathetic joke. So, procedures are the simplest way to reuse computations, and procedures accept arguments, which allows to pass values into their bodies. A procedure, in Lisp, is called lambda
. You can define one like this: (lambda (x y) (+ x y))
. When used, such procedure — also often called a function, although it's quite different from what we consider a mathematical function — and, in this case, it's called an anonymous function as it doesn't have any name — will produce the sum of its inputs:
CLUSER> ((lambda (x y) (+ x y)) 2 2)
4
It is quite cumbersome to refer to procedures by their full code signature, and an obvious solution is to assign names to them. A common way to do that in Lisp is via the defun
macro:
CLUSER> (defun add2 (x y) (+ x y))
ADD2
CLUSER> (add2 2 2)
4
The arguments of a procedure are examples of variables. Variables are used to name memory cells whose contents are used more than once and may be changed in the process. They serve different purposes:
 to pass data into procedures
 as temporary placeholders for some varying data in code blocks (like loop counters)
 as a way to store computation results for further reuse
 to define program configuration parameters (like the OS environment variables, which can also be thought of as arguments to the main function of our program)
 to refer to global objects that should be accessible from anywhere in the program (like
*standardoutput*
stream)  and more
Can we live without variables? Theoretically, well, maybe. At least, there's the socalled pointfree style of programming that strongly discourages the use of variables. But, as they say, don't try this at home (at least, until you know perfectly well what you're doing :) Can we replace variables with constants, or singleassignment variables, i.e. variables that can't change over time? Such approach is promoted by the so called purely functional languages. To a certain degree, yes. But, from the point of view of algorithms development, it makes life a lot harder by complicating many optimizations if not totally outruling them.
So, how to define variables in Lisp? You've already seen some of the variants: procedural arguments and let
bindings. Such variables are called local or lexical, in Lisp parlance. That's because they are only accessible locally throughout the execution of the code block, in which they are defined. let
is a general way to introduce such local variables, which is lambda
in disguise (a thin layer of syntax sugar over it):
CLUSER> (let ((x 2))
(+ x x))
4
CLUSER> ((lambda (x) (+ x x))
2)
4
While with lambda
you can create a procedure in one place, possibly, assign it to a variable (that's what, in essence, defun
does), and then apply many times in various places, with let
you define a procedure and immediately call it, leaving no way to store it and reapply again afterwards. That's even more anonymous than an anonymous function! Also, it requires no overhead, from the compiler. But the mechanism is the same.
Creating variables via let
is called binding, because they are immediately assigned (bound with) values. It is possible to bind several variables at once:
CLUSER> (let ((x 2)
(y 2))
(+ x y))
4
However, often we want to define a row of variables with next ones using the previous ones' values. It is cumbersome to do with let
, because you need nesting (as procedural arguments are assigned independently):
(let ((len (length list)))
(let ((mid (floor len 2)))
(let ((leftpart (subseq list 0 mid))
(rightpart (subseq list mid)))
...)))
To simplify this use case, there's let*
:
(let* ((len (length list))
(mid (floor len 2))
(leftpart (subseq list 0 mid))
(rightpart (subseq list mid)))
...)
However, there are many other ways to define variables: bind multiple values at once; perform the so called "destructuring" binding when the contents of a data structure (usually, a list) are assigned to several variables, first element to the first variable, second to the second, and so on; access the slots of a certain structure etc. For such use cases, there's with
binding from RUTILS, which works like let*
with extra powers. Here's a very simple example:
(with ((len (length list))
(mid rem (floor len 2))
;; this group produces a list of 2 sublists
;; that are bound to leftpart and rightpart
;; and ; character starts a comment in lisp
((leftpart rightpart) (group mid list)))
...
In the code throughout this book, you'll only see these two binding constructs: let
for trivial and parallel bindings and with
for all the rest.
As we said, variables may not only be defined, or they'd be called "constants", instead, but also modified. To alter the variable's value we'll use :=
from RUTILS (it is an abbreviation of the standard psetf
macro):
CLUSER> (let ((x 2))
(print (+ x x))
(:= x 4)
(+ x x))
4
8
Modification, generally, is a dangerous construct as it can create unexpected actionatadistance effects, when changing the value of a variable in one place of the code effects the execution of a different part that uses the same variable. This, however, can't happen with lexical variables: each let
creates its own scope that shields the previous values from modification (just like passing arguments to a procedure call and modifying them within the call doesn't alter those values, in the calling code):
CLUSER> (let ((x 2))
(print (+ x x))
(let ((x 4))
(print (+ x x)))
(print (+ x x)))
4
8
4
Obviously, when you have two let
s in different places using the same variable name they don't affect each other and these two variables are, actually, totally distinct.
Yet, sometimes it is useful to modify a variable in one place and see the effect in another. The variables, which have such behavior, are called global or dynamic (and also special, in Lisp jargon). They have several important purposes. One is defining important configuration parameters that need to be accessible anywhere. The other is referencing generalpurpose singleton objects like the standard streams or the state of the random number generator. Yet another is pointing to some context that can be altered in certain places subject to the needs of a particular procedure (for instance, the *package*
global variable determines in what package we operate — CLUSER
in all previous examples). More advanced uses for global variables also exist. The common way to define a global variable is with defparameter
, which specifies its initial value:
(defparameter *connection* nil
"A default connection object.") ; this is a docstring describing the variable
Global variables, in Lisp, usually have socalled "earmuffs" around their names to remind the user of what they are dealing with. Due to their actionatadistance feature, it is not the safest programming language feature, and even a "global variables considered harmful" mantra exists. Lisp is, however, not one of those squeamish languages, and it finds many uses for special variables. By the way, they are called "special" due to a special feature, which greatly broadens the possibilities for their sane usage: if bound in let
they act as lexical variables, i.e. the previous value is preserved and restored upon leaving the body of a let
:
CLUSER> (defparameter *temp* 1)
*TEMP*
CLUSER> (print *temp*)
1
CLUSER> (progn
(let ((*temp* 2))
(print *temp*)
(:= *temp* 4)
(print *temp*))
*temp*)
2
4
1
Procedures in Lisp are firstclass objects. This means the one you can assign to a variable, as well as inspect and redefine at runtime, and, consequently, do many other useful things with. The RUTILS function call
[1] will call a procedure passed to it as an argument:
CLUSER> (call 'add2 2 2)
4
CLUSER> (let ((add2 (lambda (x y) (+ x y))))
(call add2 2 2))
4
In fact, defining a function with defun
also creates a global variable, although in the function namespace. Functions, types, classes — all of these objects are usually defined as global. Though, for functions there's a way to define them locally with flet
:
CLUSER> (foo 1)
;; ERROR: The function COMMONLISPUSER::FOO is undefined.
CLUSER> (flet ((foo (x) (1+ x)))
(foo 1))
2
CLUSER> (foo 1)
;; ERROR: The function COMMONLISPUSER::FOO is undefined.
Comments
Finally, there's one more syntax we need to know: how to put comments in the code. Only losers don't comment their code, and comments will be used extensively, throughout this book, to explain some parts of the code examples, inside of them. Comments, in Lisp, start with a ;
character and end at the end of a line. So, the following snippet is a comment: ; this is a comment
. There's also a common style of commenting, when short comments that follow the current line of code start with a single ;
, longer comments for a certain code block precede it, occupy the whole line or a number of lines and start with ;;
, comments for code section that include several Lisp toplevel forms (global definitions) start with ;;;
and also occupy whole lines. Besides, each global definition can have a special commentlike string, called the "docstring", that is intended to describe its purpose and usage, and that can be queried programmatically. To put it all together, this is how different comments may look like:
;;; Some code section
(defun this ()
"This has a curious docstring."
...)
(defun that ()
...
;; this is an interesting block don't you find?
(block interesting
(print "hello"))) ; it prints hello
Getting Started
I strongly encourage you to play around with the code presented in the following chapters of the book. Try to improve it, find issues with it, and come up with fixes, measure and trace everything. This will not only help you master some Lisp, but also understand much deeper the descriptions of the discussed algorithms and data structures, their pitfalls and corner cases. Doing that is, in fact, quite easy. All you need is install some Lisp (preferrably, SBCL or CCL), add Quicklisp, and, with its help, RUTILS.
As I said above, the usual way to work with Lisp is interacting with its REPL. Running the REPL is fairly straightforward. On my Mint Linux I'd run the following commands:
$ aptget install sbcl rlwrap
...
$ rlwrap sbcl
...
* (print "hello world")
"hello world"
"hello world"
*
*
is the Lisp raw prompt. It's, basically, the same as CLUSER>
prompt you'll see in SLIME. You can also run a Lisp script file: sbcl script hello.lisp
. If it contains just a single (print "hello world")
line we'll see the "hello world" phrase printed to the console.
This is a working, but not the most convenient setup. A much more advanced environment is SLIME that works inside Emacs (a similar project for vim is called SLIMV). There exists a number of other solutions: some Lisp implementations provide and IDE, some IDEs and editors provide integration.
After getting into the REPL, you'll have to issue the following commands:
* (ql:quickload :rutilsx)
* (usepackage :rutilsx)
* (namedreadtables:inreadtable rutilsxreadtable)
Well, that's enough Lisp you'll need to know, to start. We'll get acquainted with other Lisp concepts as they will become needed for the next chapters of this book. Yet, you're all set to read and write Lisp programs. They may seem unfamiliar, at first, but as you overcome the initial bump and get used to their paranthesised prefix surface syntax, I promise that you'll be able to recognize and appreciate their clarity and conciseness.
So, as they say in Lisp land, happy hacking!
Footnotes:
[1] call
is the RUTILS abbreviation of the standard funcall
. It was surely fun to be able to call a function from a variable back in the 60's, but now it has become so much more common that there's no need for the prefix ;)
Lispers.de — Berlin Lispers Meetup, Monday, 29th July 2019
@20190726 09:03 · 29 days agoWe meet again on Monday 8pm, 29th July. Our host this time is James Anderson (www.dydra.com).
Berlin Lispers is about all flavors of Lisp including Clojure, Scheme and Common Lisp.
We will have two topics this time.
Christian will talk about his time in Sierra Leone during the Ebola outbreak 2014 and where he used Common Lisp for his work. Another topic will be Lisp jobs.
We meet in the TautHaus at Engeldamm 70 in BerlinMitte, the bell is "James Anderson". It is located in 10min walking distance from U Moritzplatz or U Kottbusser Tor or Ostbahnhof. In case of questions call Christian +49 1578 70 51 61 4.
Vsevolod Dyomkin — "Programming Algorithms" Book
@20190722 14:47 · 33 days agoI'm writing a book about algorithms and Lisp. It, actually, started several years ago, but as I experience a constant shortage of quality time to devote to such side activities, short periods of writing alternated with long pauses. Now, I'm, finally, at the stage when I can start publishing it. But I intend to do that, first, gradually in this blog and then put the final version — hopefully, improved and polished thanks to the comments of the first readers — on Leanpub. The book will be freely available with a CC BYNCND license.
The book will have 16 chapters grouped into 3 parts: essential data structures, derivative ones, and advanced algorithms. I plan to publish each one approximately once a week. To finish the process by the end of the year.
I hope the book turns out to be an enlightening read for those who start their career in programming or want to level up in it. At least, I tried to accumulate inside all my experience from production algorithmic development, teaching of these topics, and Lisp programming, over the last 10+ years. Below is a short preface and an introductory chapter about Complexity.
Why Algorithms Matter
In our industry, currently, there seems to prevail a certain misunderstanding of the importance of algorithms for the working programmer. There's often a disconnect between the algorithmic questions posed at the job interviews and the everyday essence of the same job. That's why opinions are voiced that you, actually, don't have to know CS to be successful in the software developer's job. That's true, you don't, but you'd better do if you want to be in the notorious top 10% programmers. For several reasons. One is that, actually, you can find room for algorithms almost at every corner of your work — provided you are aware of their existence. To put it simply, the fact that you don't know a more efficient or elegant solution to a particular programming problem doesn't make your code less crappy. The current trend in software development is that, although the hardware becomes more performant, the software becomes slower faster. There are two reasons for that, in my humble opinion:
 Most of the application programmers don't know the inner workings of the underlying platforms. And the number of platform layers keeps increasing.
 Most of the programmers also don't know enough algorithms and algorithmic development technics to squeeze the most from their code. And often this means a loss of one or more orders of magnitude of performance.
In the book, I'll address, primarily, the second issue but will also try to touch on the first whenever possible.
Besides, learning the art of solving difficult algorithmic problems trains the brain and makes it more apt to solving various other problems, in the course of your daytoday work.
Finally, you will be speaking the same lingua franca as other advanced programmers — the tongue that transcends the mundane differences of particular programming languages. And you'll gain a more detached view of those differences, freeing your mind from the dictate of a particular set of choices exhibiting in any one of them.
One of the reasons for this gap of understanding of the value of algorithms, probably, originates from how they are usually presented in the computer science curriculum. First, it is often done in a rather theoretical or "mathematical" way with rigorous proofs and lack of connection to the real world™. Second, the audience is usually freshmen or sophomores who don't have a lot of practical programming experience and thus can't appreciate and relate how this knowledge may be applied to their own programming challenges (because they didn't have those yet) — rather, most of them are still at the level of struggling to learn well their first programming language and, in their understanding of computing, are very much tied to its choices and idiosyncrasies.
In this book, the emphasis is made on the demonstration of the use of the described data structures and algorithms in various areas of computer programming. Moreover, I anticipate that the selfselected audience will comprise programmers with some experience in the field. This makes a significant difference in the set of topics that are relevant and how they can be conveyed. Another thing that helps a lot is when the programmer has a good command of more than one programming language, especially, if the languages are from different paradigms: static and dynamic, objectoriented and functional. These factors allow bridging the gap between "theoretical" algorithms and practical coding, making the topic accessible, interesting, and inspiring.
This is one answer to a possible question: why write another book on algorithms? Indeed, there are several good textbooks and online courses on the topic, of which I'd recommend the most Steven Skienna's The Algorithm Design Manual. Yet, as I said, this book is not at all academic in presentation of the material, which is a norm for other textbooks. Except for simple arithmetic, it contains almost no "math" or proofs. And, although proper attention is devoted to algorithm complexity, it doesn't deal with theories of complexity or computation and similar scientific topics. Besides, all the algorithms and data structures come with some example practical use cases. Last, but not least, there's no book on algorithms in Lisp, and, in my opinion, it's a great topic to introduce the language. The next chapter will provide a crash course to grasp the basic ideas, and then we'll discuss various Lisp programming approaches alongside the algorithms they will be used to implement.
This is an introductory book, not a bible of algorithms. It will draw a comprehensive picture and cover all topics necessary for further advancement of your algorithms knowledge. However, it won't go too deep into the advanced topics, such as persistent or probabilistic data structures, advanced tree, graph, and optimization algorithms, as well as algorithms for particular fields, such as Machine Learning, Cryptography or Computational Geometry. All of those fields require (and usually have) separate books of their own.
A Few Words about Lisp
For a long time, I've been contemplating writing an introductory book on Lisp, but something didn't add up, I couldn't see the coherent picture, in my mind. And then I got a chance to teach algorithms with Lisp. From my point of view, it's a perfect fit for demonstrating data structures and algorithms (with a caveat that students should be willing to learn it), while discussing the practical aspects of those algorithms allows to explain the language naturally. At the same time, this topic requires almost no endeavor into the adjacent areas of programming, such as architecture and program design, integration with other systems, user interface, and use of advanced language features, such as types or macros. And that is great because those topics are overkill for an introductory text and they are also addressed nicely and in great detail elsewhere (see Practical Common Lisp and ANSI Common Lisp).Why Lisp is great for algorithmic programs? One reason is that the language was created with such use case in mind. It has support for all the proper basic data structures, such as arrays, hashtables, linked lists, strings, and tuples. It also has a numeric tower, which means no overflow errors and, so, a much saner math. Next, it's created for the interactive development style, so the experimentation cycle is very short, there's no compilewaitrunrevise red tape, and there are no unnecessary constraints, like the need for additional annotations (a.k.a. types), prohibition of variable mutation or other stuff like that. You just write a function in the REPL, run it and see the results. In my experience, Lisp programs look almost like pseudocode. Compared to other languages, they may be slightly more verbose at times but are much more clear, simple, and directly compatible with the algorithm's logical representation.
But why not choose a popular programming language? The short answer is that it wouldn't have been optimal. There are 4 potential mainstream languages that could be considered for this book: C++, Java, Python, and JavaScript. (Surely, there's already enough material on algorithms that uses them). The first two are staticallytyped, which is, in itself, a big obstacle to using them as teaching languages. Java is also too verbose, while C++ — too lowlevel. These qualities don't prevent them from being used in the majority of production algorithm code, in the wild, and you'll, probably, end up dealing with such code sooner than later if not already. Besides, their standard libraries provide great examples of practical algorithm implementation. But, I believe that gaining good conceptual understanding will allow to easily adapt to one of these languages if necessary while learning them in parallel with diving into algorithms creates unnecessary complexity. Python and JS are, in many ways, the opposite choices: they are dynamic and provide some level of an interactive experience (albeit inferior compared to Lisp), but those languages are in many ways antialgorithmic. Trying to be simple and accessible, they hide too much from the programmer and don't give enough control of the concrete data. Teaching algorithms, using their standard libraries, seems like cheating to me as their basic data structures often are not what they claim to be. Lisp is in the middle: it is both highly interactive and gives enough control of the environment, while not being too verbose and demanding. And the price to pay — the unfamiliar syntax — is really small, in my humble opinion.
Yet, there's another language that is rapidly gaining popularity and is considered by many to be a good choice for algorithmic development — Rust. It's also a static language, although not so ceremonial as Java or C++. However, neither am I an expert in Rust, nor intend to become one. Moreover, I think the same general considerations regarding static languages apply to it.
Algorithmic Complexity
Complexity is a point that will be mentioned literally on every page of this book; the discussion of any algorithm or data structure can't avoid this topic. After correctness, it is the second most important quality of every algorithm — moreover, often correctness alone doesn't matter if complexity is neglected, while the opposite is possible: to compromise correctness somewhat in order to get significantly better complexity. By and large, algorithm theory differs from other subjects of CS in that it concerns not about presenting a working (correct) way to solve some problem but about finding an efficient way to do it. Where efficiency is understood as the minimal (or admissible) number of operations performed and occupied memory space.
In principle, the complexity of an algorithm is the dependence of the number of operations that will be performed on the size of the input. It is crucial to the computer system's scalability: it may be easy to solve the programming problem for a particular set of inputs, but how will the solution behave if the input is doubled, increased tenfold or millionfold? This is not a theoretical question, and an analysis of any generalpurpose algorithm should have a clear answer to it.
Complexity is a substantial research topic: a whole separate branch of CS — Complexity Theory — exists to study it. Yet, throughout the book, we'll try to utilize the end results of such research without delving deep into rigorous proofs or complex math, especially since, in most of the cases, measuring complexity is a matter of simple counting. Let's look at the following illustrative example:
(defun matmax (mat)
(let (max)
(dotimes (i (arraydimension mat 0))
(dotimes (j (arraydimension mat 1))
(when (or (null max)
(> (aref mat i j) max))
(:= max (aref mat i j)))))
max))
This function finds the maximum element of a twodimensional array (matrix):
CLUSER> (matmax #2A((1 2 3) (4 5 6)))
6
What's its complexity? To answer, we can just count the number of operations performed: at each iteration of the inner loop, there are 2 comparisons involving 1 array access, and, sometimes, if the planets align we perform another access for assignment. The inner loop is executed (arraydimension mat 1)
times (let's call it m
where m=3
), and the outer one — (arraydimension mat 0)
(n=2
, in the example). If we sum this all up we'll get: n * m * 4
as an upper limit, for the worst case when each sequent array element is larger then the previous. As a rule of thumb, each loop adds multiplication to the formula, and each sequential block adds a plus sign.
In this calculation, there are two variables (array dimensions n
and m
) and one constant (the number of operations performed for each array element). There exists a special notation — BigO — used to simplify the representation of end results of such complexity arithmetic. In it, all constants are reduced to 1, and thus m * 1
becomes just m
, and also since we don't care about individual array dimension differences we can just put n * n
instead of n * m
. With such simplification, we can write down the final complexity result for this function: O(n^2)
. In other words, our algorithm has quadratic complexity (which happens to be a variant of a broader class called "polynomial complexity") in array dimensions. It means that by increasing the dimensions of our matrix ten times, we'll increase the number of operations of the algorithm 100 times. In this case, however, it may be more natural to be concerned with the dependence of the number of operations on the number of elements of the matrix, not its dimensions. We can observe that n^2
is the actual number of elements, so it can also be written as just n
— if by `n` we mean the number of elements, and then the complexity is linear in the number of elements (O(n)
). As you see, it is crucial to understand what `n` we are talking about!
There are just a few more things to know about BigO complexity before we can start using it to analyze our algorithms.
1. There are 6 major complexity classes of algorithms:
 constanttime (
O(1)
)  sublinear (usually, logarithmic —
O(log n)
)  linear (
O(n)
) and superlinear (O(n * log n)
)  higherorder polynomial (
O(n^c)
, wherec
is some constant greater than 1)  exponential (
O(с^n)
, whereс
is usually 2 but, at least, greater than 1)  and just plain lunatic complex (
O(n!)
and so forth) — I call themO(mg)
, jokingly
Each class is a stepfunction change in performance, especially, at scale. We'll talk about each one of them as we'll be discussing the particular examples of algorithms falling into it.
2. Worstcase vs. averagecase behavior. In this example, we saw that there may be two counts of operations: for the average case, we can assume that approximately half of the iterations will require assignment (which results in 3,5 operations in each inner loop), and, for the worst case, the number will be exactly 4. As BigO reduces all numbers to 1, for this example, the difference is irrelevant, but there may be others, for which it is much more drastic and can't be discarded. Usually, for such algorithms, both complexities should be mentioned (alongside with ways to avoid worstcase scenarios): a good example is quicksort algorithm described in the subsequent chapter.
3. We have also seen the socalled "constant factors hidden by the BigO notation". I.e., from the point of view of algorithm complexity, it doesn't matter if we need to perform 3 operations in the inner loop or 30. Yet, it is quite important in practice, and we'll also discuss it below when examining binary search. Even more, some algorithms with better theoretical complexity may be worse in many practical applications due to these hidden factors (for example, until the dataset reaches a certain size).
4. Finally, besides execution time complexity, there's also space complexity, which instead of the number of operations measures the amount of storage space used proportional to the size of the input. In general, similar approaches are applied to its estimation.
Quicklisp news — aserve renaming breaks stuff
@20190719 11:56 · 36 days agoupdate This has been resolved, for now, by reverting the change.
Nicolas Hafner — A Month of Daily Gamedev  Confession 86
@20190716 09:13 · 39 days ago
That didn't feel as long as it was. A month ago I promised to do daily streams of game development. So far I've held true to that. I'm sure I won't be able to hold that true forever, but I'll try to keep going for as long as I can. In this one month alone, a lot has changed for the game though!
A new architecture for map organisation was implemented, including a new save file format for that. As part of that work I also completely rewrote the tilemap rendering as well. The game has a lighting system now, too, based on signed distance functions to compute precise light areas. In terms of physics, collision detection was revamped to properly support slopes and moving platforms, giving much more freedom for level design.
All of the art assets that existed previously were also dropped and replaced with new ones. I'm still working on that part, since I'm not quite happy with the current set of tiles. I'll also have to add more animations, and of course repeat the animation and character design work for any NPC I might add to the game. That's a bit of a ways out though, as I'll need to think about world building and story writing first before I can really get into that. Now there's a hard challenge!
Finally, in the last week I designed a new language for writing dialogue with branching, choices, looping, and so forth. To support this I extended the syntax of Markless, and added a compiler to transform the Markless AST into a simple assembly language, which is then executed in a simple, suspendable VM. Added on to that there's now a quest system that should be general enough to allow writing any kind of quest I'll need. Currently though it's lacking a way to conveniently write these quests, so that's a task to work on in the near future.
I intend on writing a simple UI to create and edit these quests. I'm not sure yet what I'll use for that, though. I'm most wellversed with Qt, but maybe I should finally cave in and give CLIM a shot. Or perhaps LTK. We'll have to see.
There's still about two months left in my summer break before university resumes. I hope to keep going with this until then, so expect further daily streams and more progress! As before, the stream still happens every day at 20:00 CEST, on https://stream.shinmera.com, or https://twitch.tv/shinmera. I've tremendously appreciated all the people that stop by in chat to watch, and even talk with me during work. It's made things so much more enjoyable for me. Really, thank you so much!
Quicklisp news — July 2019 Quicklisp dist update now available
@20190711 20:02 · 44 days ago adopt — Simple, flexible, UNIXstyle option parsing. — MIT
 bike — Common Lisp .Net Core Interop — MIT
 binpack — Rectangle packer for sprite/texture atlases — MIT
 clipfsapi2 — Bindings for the IPFS HTTP API. — GPLv3
 clkeycloak — Describe clkeycloak here — GPLv3
 cllzlib — lzip (LZMA) (de)compression using binding to lzlib — GPL3
 clsteamworks — Generator for the lowlevel steamworks bindings. — zlib
 csv — Read CSV into lists natively. Convert CSV into lists dangerously. — GNU GPL, version 3
 datumcomments — datum #;(comments) for common lisp — Public Domain (Unlicense)
 fiveamasdf — Library to integrate FiveAM testing with ASDF TESTOP and TESTSYSTEM — Lisp LGPL
 lastfm — Interface for the Last.fm API (https://www.last.fm/api/) — GPLv3
 lyrics — Song lyrics with local database — GPLv3
 methodhooks — simple qualifiable hooks defined like methods with the option to modify the dispatch method and how dispatch happens — Mozilla Public License Version 2.0
 origin — A native Lisp graphics math library with an emphasis on performance and correctness. — MIT
 patchwork — A spritesheet packer for games. — MIT
 youtube — Play youtube urls with or without video using mpv — GPLv3
 zbucium — last.fm music player with lyrics — GPLv3
To get this update, use (ql:updatedist "quicklisp").
Enjoy!
Lispers.de — LispMeetup in Hamburg on Monday, 1st July 2019
@20190628 00:00 · 58 days agoWe meet at Ristorante Opera, Dammtorstraße 7, Hamburg, starting around 19:00 CEST on 1st July 2019.
This is an informal gathering of Lispers of all experience levels.
Lispers.de — Berlin Lispers Meetup, Monday, 24th June 2019
@20190619 21:15 · 66 days agoWe meet again on Monday 8pm, 24th June. Our host this time is James Anderson (www.dydra.com).
Berlin Lispers is about all flavors of Lisp including Common Lisp and Emacs Lisp, Clojure and Scheme.
We will have one talk by James Anderson this time.
The title of the talk is "Using the compiler to walk code".
We meet in the TautHaus at Engeldamm 70 in BerlinMitte, the bell is "James Anderson". It is located in 10min walking distance from U Moritzplatz or U Kottbusser Tor or Ostbahnhof. In case of questions call Christian +49 1578 70 51 61 4.
Nicolas Hafner — Daily Game Development Streams  Confession 85
@20190617 21:23 · 68 days ago
Starting tomorrow I will be doing daily live game development streams for my current game project, Leaf. The streams will happen from 20:00 to 22:00 CEST, at https://stream.shinmera.com. There's also a short summary and countdown page here.
The streams will first primarily focus on rewriting some of the internals of the game and discussing viable designs for the level and tilemap representation. After that it'll probably fan out into some art design and rendering techniques. The streams won't be contiguous, so I'll be working on things offstream as well, but I'll try to work in brief recaps at the beginning to ensure everyone is up to speed on the current progress.
If you would like to study the source code yourself, the two primary projects are Leaf itself, and the game engine Trial. Both are open source, and I push changes regularly. Feedback or discussion on the code and design is of course very much welcome.
I'm doing this regular stream mostly to force myself out of a terrible depressive episode I've been having, and this should provide enough external pressure to get me out of it and back into my regular, productive schedule. Of course, the more people are interested in this, the greater the pressure, so I hope I'll see at least some people in chat during the streams. I would really appreciate it tremendously, even if you don't have much gamedev or technical expertise, or are just there to shitpost. Really, I genuinely mean it!
That said, I hope to see you there!
Nicolas Hafner — Improving Portability  Confession 84
@20190606 09:26 · 79 days ago
Common Lisp has a big standard, but sometimes that standard still doesn't cover everything one might need to be able to do. And so, various implementations offer various extensions in various ways to help people get their work done. A lot of these extensions are similar and compatible with each other, just with different interfaces. Thus, to make libraries work on as many implementations as possible, portability libraries have been made, and are still being made.
Unfortunately, writing one of these libraries to work across all common implementations, let alone all existing implementations, is a tremendous amount of work, especially since the implementations that are still maintained change over time as well to add, remove, and improve their extensions. Thus, quite a few of the portability libraries that exist do not support all implementations that they could, or not as well as they could. Similarly, not all implementations provide all extensions that they could, and probably should (like PackageLocal Nicknames).
In hopes of aiding this process, I have decided to create a small project that documents the state of the various portability libraries and implementations in a way that shows the information at a glance. You can see the state here and help improve it if you see a missing or incorrectly labelled library on GitHub.
It would make me overjoyed to see people help out the various projects to support more implementations, and to lobby their favourite implementation to take up support for new extensions*. I'm the maintainer of a few of the libraries listed on the page as well (atomics, definitions, dissect, floatfeatures, trivialarguments), and naturally I would very much welcome improvements to them. It would be especially great if the support for LispWorks in specific could be added, as I can't seem to get it to run on my system.
In any case, I hope that this project will nudge people towards caring a bit more about portability in the Lisp ecosystem. Maybe one day we'll be able to just pick and choose our implementations and libraries without having to worry about whether that specific combination will work or not. I think that would be fantastic.
*I'm looking at you, Allegro and LispWorks. Please support PackageLocal Nicknames!
On a side note, I know I haven't been writing a lot of stuff here lately. I hope to correct that with a couple of entries coming in the next few days. As always, thanks for reading!
Lispers.de — LispMeetup in Hamburg on Monday, 3rd June 2019
@20190530 00:00 · 87 days agoWe meet at Ristorante Opera, Dammtorstraße 7, Hamburg, starting around 19:00 CEST on 3rd June 2019.
This is an informal gathering of Lispers of all experience levels.
Lispers.de — Berlin Lispers Meetup, Monday, 27th May 2019
@20190526 14:15 · 90 days agoWe meet again on Monday 8pm, 27th May. Our host this time is James Anderson (www.dydra.com).
Berlin Lispers is about all flavors of Lisp including Scheme, Common Lisp, Emacs Lisp, Clojure.
We will have one talk this time.
Ingo Mohr concludes his series of talks telling us about "Das Ende von Lisp und LispMaschine in Ostdeutschland".
We meet in the TautHaus at Engeldamm 70 in BerlinMitte, the bell is "James Anderson". It is located in 10min walking distance from U Moritzplatz or U Kottbusser Tor or Ostbahnhof. In case of questions call Christian +49 1578 70 51 61 4.
Quicklisp news — May 2019 Quicklisp dist update now available
@20190521 13:19 · 95 days ago arrival — Classical planning plan validator written in modern Common Lisp — LLGPL
 assertp — A library of assertions written in Common Lisp. — GPLv3
 assertionerror — Error pattern for assertion libraries in Common Lisp. — GNU General Public License v3.0
 clcxx — Common Lisp Cxx Interoperation — MIT
 cldigikarutilities — A utility library, primarily intended to provide an easy interface to vectors and hashtables. — MIT
 cljustgetoptparser — Getoptlike parser for commandline options and arguments — Creative Commons CC0 (public domain dedication)
 clpostgresdatetime — Date/time integration for clpostgres that uses LOCALTIME for types that use time zones and SIMPLEDATE for those that don't — BSD3Clause
 clrdkafka — CFFI bindings for librdkafka to enable interaction with a Kafka cluster. — GPLv3
 clstream — Stream classes for Common Lisp — MIT
 numpyfileformat — Read and write Numpy .npy and .npz files. — MIT
 py4cl — Call Python libraries from Common Lisp — MIT
 sealablemetaobjects — A CLOSsy way to trade genericity for performance. — MIT
 simpleactors — Port of banker.scm from Racket — BSD
 trivialleftpad — Ports the functionality of the very popular leftpad from npm. — MIT
Removed projects: cffiobjects, clblapack, commonlispstat, fnv, lispmatrix, qtoolscommons, simplegui, trivia.balland2006.
A number of projects stopped working because of internal updates to SBCL  projects which used them have not been updated to use supported interfaces instead.
To get this update, use (ql:updatedist "quicklisp"). Enjoy!
Lispers.de — LispMeetup in Hamburg on Monday, 6th May 2019
@20190430 00:00 · 117 days agoWe meet at Ristorante Opera, Dammtorstraße 7, Hamburg, starting around 19:00 CET on 6th May 2019.
Christian was at ELS and will report, and we will talk about our attempts at a little informal language benchmark.
This is an informal gathering of Lispers of all experience levels.
Update: the fine folks from stkhamburg.de will be there and talk about their Lispbased work!
Paul Khuong — Fractional Set Covering With Experts
@20190423 22:05 · 123 days agoLast winter break, I played with one of the annual capacitated vehicle routing problem (CVRP) “Santa Claus” contests. Real world family stuff took precedence, so, after the obvious LKH with Concorde polishing for individual tours, I only had enough time for one diversification moonshot. I decided to treat the high level problem of assembling prefabricated routes as a set covering problem: I would solve the linear programming (LP) relaxation for the mincost set cover, and use randomised rounding to feed new starting points to LKH. Add a lot of luck, and that might just strike the right balance between solution quality and diversity.
Unsurprisingly, luck failed to show up, but I had ulterior motives: I’m much more interested in exploring first order methods for relaxations of combinatorial problems than in solving CVRPs. The routes I had accumulated after a couple days turned into a set covering LP with 1.1M decision variables, 10K constraints, and 20M nonzeros. That’s maybe denser than most combinatorial LPs (the aspect ratio is definitely atypical), but 0.2% nonzeros is in the right ballpark.
As soon as I had that fractional set cover instance, I tried to solve it with a simplex solver. Like any good Googler, I used Glop... and stared at a blank terminal for more than one hour.
Having observed that lack of progress, I implemented the toy I really wanted to try out: first order online “learning with experts” (specifically, AdaHedge) applied to LP optimisation. I let this notparticularlyoptimised serial CL code run on my 1.6 GHz laptop for 21 hours, at which point the first order method had found a 4.5% infeasible solution (i.e., all the constraints were satisfied with \(\ldots \geq 0.955\) instead of \(\ldots \geq 1\)). I left Glop running long after the contest was over, and finally stopped it with no solution after more than 40 days on my 2.9 GHz E5.
Given the shape of the constraint matrix, I would have loved to try an interior point method, but all my licenses had expired, and I didn’t want to risk OOMing my workstation. Erling Andersen was later kind enough to test Mosek’s interior point solver on it. The runtime was much more reasonable: 10 minutes on 1 core, and 4 on 12 cores, with the sublinear speedup mostly caused by the serial crossover to a simplex basis.
At 21 hours for a naïve implementation, the “learning with experts” first order method isn’t practical yet, but also not obviously uninteresting, so I’ll write it up here.
Using online learning algorithms for the “experts problem” (e.g., Freund and Schapire’s Hedge algorithm) to solve linear programming feasibility is now a classic result; Jeremy Kun has a good explanation on his blog. What’s new here is:
 Directly solving the optimisation problem.
 Confirming that the parameterfree nature of AdaHedge helps.
The first item is particularly important to me because it’s a simple modification to the LP feasibility metaalgorithm, and might make the difference between a tool that’s only suitable for theoretical analysis and a practical approach.
I’ll start by reviewing the experts problem, and how LP feasibility is usually reduced to the former problem. After that, I’ll cast the reduction as a surrogate relaxation method, rather than a Lagrangian relaxation; optimisation should flow naturally from that point of view. Finally, I’ll guess why I had more success with AdaHedge this time than with Multiplicative Weight Update eight years ago.^{1}
The experts problem and LP feasibility
I first heard about the experts problem while researching dynamic sorted set data structures: Igal Galperin’s PhD dissertation describes scapegoat trees, but is really about online learning with experts. Arora, Hazan, and Kale’s 2012 survey of multiplicative weight update methods. is probably a better introduction to the topic ;)
The experts problem comes in many variations. The simplest form sounds like the following. Assume you’re playing a binary prediction game over a predetermined number of turns, and have access to a fixed finite set of experts at each turn. At the beginning of every turn, each expert offers their binary prediction (e.g., yes it will rain today, or it will not rain today). You then have to make a prediction yourself, with no additional input. The actual result (e.g., it didn’t rain today) is revealed at the end of the turn. In general, you can’t expect to be right more often than the best expert at the end of the game. Is there a strategy that bounds the “regret,” how many more wrong prediction you’ll make compared to the expert(s) with the highest number of correct predictions, and in what circumstances?
Amazingly enough, even with an omniscient adversary that has access to your strategy and determines both the experts’ predictions and the actual result at the end of each turn, a stream of random bits (hidden from the adversary) suffice to bound our expected regret in \(\mathcal{O}(\sqrt{T}\,\lg n)\), where \(T\) is the number of turns and \(n\) the number of experts.
I long had trouble with that claim: it just seems too good of a magic trick to be true. The key realisation for me was that we’re only comparing against invidivual experts. If each expert is a move in a matrix game, that’s the same as claiming you’ll never do much worse than any pure strategy. One example of a pure strategy is always playing rock in RockPaperScissors; pure strategies are really bad! The trick is actually in making that regret bound useful.
We need a more continuous version of the experts problem for LP feasibility. We’re still playing a turnbased game, but, this time, instead of outputting a prediction, we get to “play” a mixture of the experts (with nonnegative weights that sum to 1). At the beginning of each turn, we describe what weight we’d like to give to each experts (e.g., 60% rock, 40% paper, 0% scissors). The cost (equivalently, payoff) for each expert is then revealed (e.g., \(\mathrm{rock} = 0.5\), \(\mathrm{paper} = 0.5\), \(\mathrm{scissors} = 0\)), and we incur the weighted average from our play (e.g., \(60\% \cdot 0.5 + 40\% \cdot 0.5 = 0.1\)) before playing the next round.^{2} The goal is to minimise our worstcase regret, the additive difference between the total cost incurred by our mixtures of experts and that of the a posteriori best single expert. In this case as well, online learning algorithms guarantee regret in \(\mathcal{O}(\sqrt{T} \, \lg n)\)
This line of research is interesting because simple algorithms achieve that bound, with explicit constant factors on the order of 1,^{3} and those bounds are known to be nonasymptotically tight for a large class of algorithms. Like dense linear algebra or fast Fourier transforms, where algorithms are often compared by counting individual floating point operations, online learning has matured into such tight bounds that worstcase regret is routinely presented without Landau notation. Advances improve constant factors in the worst case, or adapt to easier inputs in order to achieve “better than worst case” performance.
The reduction below lets us take any learning algorithm with an additive regret bound, and convert it to an algorithm with a corresponding worstcase iteration complexity bound for \(\varepsilon\)approximate LP feasibility. An algorithm that promises low worstcase regret in \(\mathcal{O}(\sqrt{T})\) gives us an algorithm that needs at most \(\mathcal{O}(1/\varepsilon\sp{2})\) iterations to return a solution that almost satisfies every constraint in the linear program, where each constraint is violated by \(\varepsilon\) or less (e.g., \(x \leq 1\) is actually \(x \leq 1 + \varepsilon\)).
We first split the linear program in two components, a simple domain (e.g., the nonnegative orthant or the \([0, 1]\sp{d}\) box) and the actual linear constraints. We then map each of the latter constraints to an expert, and use an arbitrary algorithm that solves our continuous version of the experts problem as a black box. At each turn, the black box will output a set of nonnegative weights for the constraints (experts). We will average the constraints using these weights, and attempt to find a solution in the intersection of our simple domain and the weighted average of the linear constraints. We can do so in the “experts problem” setting by consider each linear constraint’s violation as a payoff, or, equivalently, satisfaction as a loss.
Let’s use Stigler’s Diet Problem with three foods and two constraints as a small example, and further simplify it by disregarding the minimum value for calories, and the maximum value for vitamin A. Our simple domain here is at least the nonnegative orthant: we can’t ingest negative food. We’ll make things more interesting by also making sure we don’t eat more than 10 servings of any food per day.
The first constraint says we mustn’t get too many calories
\[72 x\sb{\mathrm{corn}} + 121 x\sb{\mathrm{milk}} + 65 x\sb{\mathrm{bread}} \leq 2250,\]
and the second constraint (tweaked to improve this example) ensures we ge enough vitamin A
\[107 x\sb{\mathrm{corn}} + 400 x\sb{\mathrm{milk}} \geq 5000,\]
or, equivalently,
\[107 x\sb{\mathrm{corn}}  400 x\sb{\mathrm{milk}} \leq 5000,\]
Given weights \([¾, ¼]\), the weighted average of the two constraints is
\[27.25 x\sb{\mathrm{corn}}  9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\]
where the coefficients for each variable and for the righthand side were averaged independently.
The subproblem asks us to find a feasible point in the intersection of these two constraints: \[27.25 x\sb{\mathrm{corn}}  9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]
Classically, we claim that this is just Lagrangian relaxation, and find a solution to
\[\min 27.25 x\sb{\mathrm{corn}}  9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]
In the next section, I’ll explain why I think this analogy is wrong and worse than useless. For now, we can easily find the minimum one variable at a time, and find the solution \(x\sb{\mathrm{corn}} = 0\), \(x\sb{\mathrm{milk}} = 10\), \(x\sb{\mathrm{bread}} = 0\), with objective value \(92.5\) (which is \(530\) less than \(437.5\)).
In general, three things can happen at this point. We could discover that the subproblem is infeasible. In that case, the original nonrelaxed linear program itself is infeasible: any solution to the original LP satisfies all of its constraints, and thus would also satisfy any weighted average of the same constraints. We could also be extremely lucky and find that our optimal solution to the relaxation is (\(\varepsilon\))feasible for the original linear program; we can stop with a solution. More commonly, we have a solution that’s feasible for the relaxation, but not for the original linear program.
Since that solution satisfies the weighted average constraint and payoffs track constraint violation, the black box’s payoff for this turn (and for every other turn) is nonpositive. In the current case, the first constraint (on calories) is satisfied by \(1040\), while the second (on vitamin A) is violated by \(1000\). On weighted average, the constraints are satisfied by \(\frac{1}{4}(3 \cdot 1040  1000) = 530.\) Equivalently, they’re violated by \(530\) on average.
We’ll add that solution to an accumulator vector that will come in handy later.
The next step is the key to the reduction: we’ll derive payoffs (negative costs) for the black box from the solution to the last relaxation. Each constraint (expert) has a payoff equal to its level of violation in the relaxation’s solution. If a constraint is strictly satisfied, the payoff is negative; for example, the constraint on calories is satisfied by \(1040\), so its payoff this turn is \(1040\). The constraint on vitamin A is violated by \(1000\), so its payoff this turn is \(1000\). Next turn, we expect the black box to decrease the weight of the constraint on calories, and to increase the weight of the one on vitamin A.
After \(T\) turns, the total payoff for each constraint is equal to the sum of violations by all solutions in the accumulator. Once we divide both sides by \(T\), we find that the divided payoff for each constraint is equal to its violation by the average of the solutions in the accumulator. For example, if we have two solutions, one that violates the calories constraint by \(500\) and another that satisfies it by \(1000\) (violates it by \(1000\)), the total payoff for the calories constraint is \(500\), and the average of the two solutions does strictly satisfy the linear constraint by \(\frac{500}{2} = 250\)!
We also know that we only generated feasible solutions to the relaxed subproblem (otherwise, we’d have stopped and marked the original LP as infeasible), so the black box’s total payoff is \(0\) or negative.
Finally, we assumed that the black box algorithm guarantees an additive regret in \(\mathcal{O}(\sqrt{T}\, \lg n)\), so the black box’s payoff of (at most) \(0\) means that any constraint’s payoff is at most \(\mathcal{O}(\sqrt{T}\, \lg n)\). After dividing by \(T\), we obtain a bound on the violation by the arithmetic mean of all solutions in the accumulator: for all constraint, that violation is in \(\mathcal{O}\left(\frac{\lg n}{\sqrt{T}}\right)\). In other words, the number of iteration \(T\) must scale with \(\mathcal{O}\left(\frac{\lg n}{\varepsilon\sp{2}}\right)\), which isn’t bad when \(n\) is in the millions but \(\varepsilon \approx 0.01\).
Theoreticians find this reduction interesting because there are concrete implementations of the black box, e.g., the multiplicative weight update (MWU) method with nonasymptotic bounds. For many problems, this makes it possible to derive the exact number of iterations necessary to find an \(\varepsilon\)feasible fractional solution, given \(\varepsilon\) and the instance’s size (but not the instance itself).
That’s why algorithms like MWU are theoretically useful tools for fractional approximations, when we already have subgradient methods that only need \(\mathcal{O}\left(\frac{1}{\varepsilon}\right)\) iterations: stateoftheart algorithms for learning with experts explicit nonasymptotic regret bounds that yield, for many problems, iteration bounds that only depend on the instance’s size, but not its data. While the iteration count when solving LP feasibility with MWU scales with \(\frac{1}{\varepsilon\sp{2}}\), it is merely proportional to \(\lg n\), the log of the the number of linear constraints. That’s attractive, compared to subgradient methods for which the iteration count scales with \(\frac{1}{\varepsilon}\), but also scales linearly with respect to instancedependent values like the distance between the initial dual solution and the optimum, or the Lipschitz constant of the Lagrangian dual function; these values are hard to bound, and are often proportional to the square root of the number of constraints. Given the choice between \(\mathcal{O}\left(\frac{\lg n}{\varepsilon\sp{2}}\right)\) iterations with explicit constants, and a looser \(\mathcal{O}\left(\frac{\sqrt{n}}{\varepsilon}\right)\), it’s obvious why MWU and online learning are powerful additions to the theory toolbox.
Theoreticians are otherwise not concerned with efficiency, so the usual answer to someone asking about optimisation is to tell them they can always reduce linear optimisation to feasibility with a binary search on the objective value. I once made the mistake of implementing that binary search last strategy. Unsurprisingly, it wasn’t useful. I also tried another theoretical reduction, where I looked for a pair of primal and dual feasible solutions that happened to have the same objective value. That also failed, in a more interesting manner: since the two solution had to have almost the same value, the universe spited me by sending back solutions that were primal and dual infeasible in the worst possible way. In the end, the second reduction generated fractional solutions that were neither feasible nor superoptimal, which really isn’t helpful.
Direct linear optimisation with experts
The reduction above works for any “simple” domain, as long as it’s convex and we can solve the subproblems, i.e., find a point in the intersection of the simple domain and a single linear constraint or determine that the intersection is empty.
The set of (super)optimal points in some initial simple domain is still convex, so we could restrict our search to the search of the domain that is superoptimal for the linear program we wish to optimise, and directly reduce optimisation to the feasibility problem solved in the last section, without binary search.
That sounds silly at first: how can we find solutions that are superoptimal when we don’t even know the optimal value?
Remember that the subproblems are always relaxations of the original linear program. We can port the objective function from the original LP over to the subproblems, and optimise the relaxations. Any solution that’s optimal for a realxation must have an optimal or superoptimal value for the original LP.
Rather than treating the black box online solver as a generator of Lagrangian dual vectors, we’re using its weights as solutions to the surrogate relaxation dual. The latter interpretation isn’t just more powerful by handling objective functions. It also makes more sense: the weights generated by algorithms for the experts problem are probabilities, i.e., they’re nonnegative and sum to \(1\). That’s also what’s expected for surrogate dual vectors, but definitely not the case for Lagrangian dual vectors, even when restricted to \(\leq\) constraints.
We can do even better!
Unlike Lagrangian dual solvers, which only converge when fed (approximate) subgradients and thus make us (nearly) optimal solutions to the relaxed subproblems, our reduction to the experts problem only needs feasible solutions to the subproblems. That’s all we need to guarantee an \(\varepsilon\)feasible solution to the initial problem in a bounded number of iterations. We also know exactly how that \(\varepsilon\)feasible solution is generated: it’s the arithmetic mean of the solutions for relaxed subproblems.
This lets us decouple finding lower bounds from generating feasible solutions that will, on average, \(\varepsilon\)satisfy the original LP. In practice, the search for an \(\varepsilon\)feasible solution that is also superoptimal will tend to improve the lower bound. However, nothing forces us to evaluate lower bounds synchronously, or to only use the experts problem solver to improve our bounds.
We can find a new bound from any vector of nonnegative constraint weights: they always yield a valid surrogate relaxation. We can solve that relaxation, and update our best bound when it’s improved. The Diet subproblem earlier had
\[27.25 x\sb{\mathrm{corn}}  9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]
Adding the original objective function back yields the linear program
\[\min 0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\] subject to \[27.25 x\sb{\mathrm{corn}}  9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10,\]
which has a trivial optimal solution at \([0, 0, 0]\).
When we generate a feasible solution for the same subproblem, we can use any valid bound on the objective value to find the most feasible solution that is also assuredly (super)optimal. For example, if some oracle has given us a lower bound of \(2\) for the original Diet problem, we can solve for
\[\min 27.25 x\sb{\mathrm{corn}}  9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\leq 2\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]
We can relax the objective value constraint further, since we know that the final \(\varepsilon\)feasible solution is a simple arithmetic mean. Given the same best bound of \(2\), and, e.g., a current average of \(3\) solutions with a value of \(1.9\), a new solution with an objective value of \(2.3\) (more than our best bound, so not necessarily optimal!) would yield a new average solution with a value of \(2\), which is still (super)optimal. This means we can solve the more relaxed subproblem
\[\min 27.25 x\sb{\mathrm{corn}}  9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\leq 2.3\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]
Given a bound on the objective value, we swapped the constraint and the objective; the goal is to maximise feasibility, while generating a solution that’s “good enough” to guarantee that the average solution is still (super)optimal.
For boxconstrained linear programs where the box is the convex domain, subproblems are bounded linear knapsacks, so we can simply stop the greedy algorithm as soon as the objective value constraint is satisfied, or when the knapsack constraint becomes active (we found a better bound).
This last tweak doesn’t just accelerate convergence to \(\varepsilon\)feasible solutions. More importantly for me, it pretty much guarantees that out \(\varepsilon\)feasible solution matches the best known lower bound, even if that bound was provided by an outside oracle. Bundle methods and the Volume algorithm can also mix solutions to relaxed subproblems in order to generate \(\varepsilon\)feasible solutions, but the result lacks the last guarantee: their fractional solutions are even more superoptimal than the best bound, and that can make bounding and variable fixing difficult.
The secret sauce: AdaHedge
Before last Christmas’s CVRP set covering LP, I had always used the multiplicative weight update (MWU) algorithm as my black box online learning algorithm: it wasn’t great, but I couldn’t find anything better. The two main downsides for me were that I had to know a “width” parameter ahead of time, as well as the number of iterations I wanted to run.
The width is essentially the range of the payoffs; in our case, the potential level of violation or satisfaction of each constraints by any solution to the relaxed subproblems. The dependence isn’t surprising: folklore in Lagrangian relaxation also says that’s a big factor there. The problem is that the most extreme violations and satisfactions are initialisation parameters for the MWU algorithm, and the iteration count for a given \(\varepsilon\) is quadratic in the width (\(\mathrm{max}\sb{violation} \cdot \mathrm{max}\sb{satisfaction}\)).
What’s even worse is that the MWU is explicitly tuned for a specific iteration count. If I estimate that, give my worstcase width estimate, one million iterations will be necessary to achieve \(\varepsilon\)feasibility, MWU tuned for 1M iterations will need 1M iterations, even if the actual width is narrower.
de Rooij and others published AdaHedge in 2013, an algorithm that addresses both these issues by smoothly estimating its parameter over time, without using the doubling trick.^{4} AdaHedge’s loss (convergence rate to an \(\varepsilon\)solution) still depends on the relaxation’s width. However, it depends on the maximum width actually observed during the solution process, and not on any explicit worstcase bound. It’s also not explicily tuned for a specific iteration count, and simply keeps improving at a rate that roughly matches MWU. If the instance happens to be easy, we will find an \(\varepsilon\)feasible solution more quickly. In the worst case, the iteration count is never much worse than that of an optimally tuned MWU.
These 400 lines of Common Lisp implement AdaHedge and use it to optimise the set covering LP. AdaHedge acts as the online black box solver for the surrogate dual problem, the relaxed set covering LP is a linear knapsack, and each subproblem attempts to improve the lower bound before maximising feasibility.
When I ran the code, I had no idea how long it would take to find a feasible enough solution: covering constraints can never be violated by more than \(1\), but some points could be covered by hundreds of tours, so the worst case satisfaction width is high. I had to rely on the way AdaHedge adapts to the actual hardness of the problem. In the end, \(34492\) iterations sufficed to find a solution that was \(4.5\%\) infeasible.^{5} This corresponds to a worst case with a width of less than \(2\), which is probably not what happened. It seems more likely that the surrogate dual isn’t actually an omniscient adversary, and AdaHedge was able to exploit some of that “easiness.”
The iterations themselves are also reasonable: one sparse matrix / dense vector multiplication to convert surrogate dual weights to an average constraint, one solve of the relaxed LP, and another sparse matrix / dense vector multiplication to compute violations for each constraint. The relaxed LP is a fractional \([0, 1]\) knapsack, so the bottleneck is sorting double floats. Each iteration took 1.8 seconds on my old laptop; I’m guessing that could easily be 1020 times faster with vectorisation and parallelisation.
In another post, I’ll show how using the same surrogate dual optimisation algorithm to mimick Lagrangian decomposition instead of Lagrangian relaxation guarantees an iteration count in \(\mathcal{O}\left(\frac{\lg \#\mathrm{nonzero}}{\varepsilon\sp{2}}\right)\) independently of luck or the specific linear constraints.

Yes, I have been banging my head against that wall for a while.↩

This is equivalent to minimising expected loss with random bits, but cleans up the reduction.↩

When was the last time you had to worry whether that log was natural or base2?↩

The doubling trick essentially says to start with an estimate for some parameters (e.g., width), then adjust it to at least double the expected iteration count when the parameter’s actual value exceeds the estimate. The sum telescopes and we only pay a constant multiplicative overhead for the dynamic update.↩

I think I computed the \(\log\) of the number of decision variables instead of the number of constraints, so maybe this could have gone a bit better.↩
Lispers.de — Berlin Lispers Meetup, Monday, 15th April 2019
@20190409 12:30 · 137 days agoWe meet again on Monday 8pm, 15th April. Our host this time is James Anderson (www.dydra.com).
Berlin Lispers is about all flavors of Lisp including Emacs Lisp, Common Lisp, Clojure, Scheme.
We will have two talks this time.
Hans Hübner will tell us about "Reanimating VAX LISP  A CLtL1 implementation for VAX/VMS".
And Ingo Mohr will continue his talk "About the Unknown East of the Ancient LISP World. History and Thoughts. Part II: Eastern Common LISP and a LISP Machine."
We meet in the TautHaus at Engeldamm 70 in BerlinMitte, the bell is "James Anderson". It is located in 10min walking distance from U Moritzplatz or U Kottbusser Tor or Ostbahnhof. In case of questions call Christian +49 1578 70 51 61 4.
Didier Verna — Quickref 2.0 "Be Quick or Be Dead" is released
@20190408 00:00 · 139 days agoSurfing on the energizing wave of ELS 2019, the 12 European Lisp Symposium, I'm happy to announce the release of Quickref 2.0, codename "Be Quick or Be Dead".
The major improvement in this release, justifying an increment of the major version number (and the very appropriate codename), is the introduction of parallel algorithms for building the documentation. I presented this work last week in Genova so I won't go into the gory details here, but for the brave and impatient, let me just say that using the parallel implementation is just a matter of calling the BUILD
function with :parallel t :decltthreads x :makeinfothreads y
(adjust x and y as you see fit, depending on your architecture).
The second featured improvement is the introduction of an author index, in addition to the original one. The author index is still a bit shaky, mostly due to technical problems (calling asdf:findsystem
almost two thousand times simply doesn't work) and also to the very creative use that some library authors have of the ASDF author
and maintainer
slots in the system descriptions. It does, however, a quite decent job for the majority of the authors and their libraries'reference manuals.
Finally, the repository now has a fully functional continuous integration infrastructure, which means that there shouldn't be anymore lags between new Quicklisp (or Quickref) releases and new versions of the documentation website.
Thanks to Antoine Hacquard, Antoine Martin, and Erik Huelsmann for their contribution to this release! A lot of new features are already in the pipe. Currently documenting 1720 libraries, and counting...
Lispers.de — LispMeetup in Hamburg on Monday, 1st April 2019
@20190328 19:17 · 149 days agoWe meet at Ristorante Opera, Dammtorstraße 7, Hamburg, starting around 19:00 CET on 1st April 2019.
This is an informal gathering of Lispers. Svante will talk a bit about the implementation of lispers.de. You are invited to bring your own topics.
Lispers.de — Berlin Lispers Meetup, Monday, 25th March 2019
@20190320 10:30 · 157 days agoWe meet again on Monday 8pm, 25th March. Our host this time is James Anderson (www.dydra.com).
Berlin Lispers is about all flavors of Lisp including Common Lisp, Scheme, Dylan, Clojure.
We will have a talk this time. Ingo Mohr will tell us "About the Unknown East of the Ancient LISP World. History and Thoughts. Part I: LISP on Punchcards".
We meet in the TautHaus at Engeldamm 70 in BerlinMitte, the bell is "James Anderson". It is located in 10min walking distance from U Moritzplatz or U Kottbusser Tor or Ostbahnhof. In case of questions call Christian +49 1578 70 51 61 4.
Quicklisp news — March 2019 Quicklisp dist update now available
@20190307 14:52 · 170 days ago bobbin — Simple (word) wrapping utilities for strings. — MIT
 clmango — A minimalist CouchDB 2.x database client. — BSD3
 clnetpbm — Common Lisp support for reading/writing the netpbm image formats (PPM, PGM, and PBM). — MIT/X11
 clskkserv — skkserv with Common Lisp — GPLv3
 cltorrents — This is a little tool for the lisp REPL or the command line (also with a readline interactive prompt) to search for torrents and get magnet links — MIT
 commonlispjupyter — A Common Lisp kernel for Jupyter along with a library for building Jupyter kernels. — MIT
 conf — Simple configuration file manipulator for projects. — GNU General Public License v3.0
 eventbus — An event bus in Common Lisp. — GPLv3
 openlocationcode — Open Location Code library. — Modified BSD License
 piggybackparameters — This is a configuration system that supports local file and database based parameter storage. — MIT
 quilc — A CLI frontend for the Quil compiler — Apache License 2.0 (See LICENSE.txt)
 qvm — An implementation of the Quantum Abstract Machine. — Apache License 2.0 (See LICENSE.txt)
 restrictedfunctions — Reasoning about functions with restricted argument types. — MIT
 simplet — Simple test runner in Common Lisp. — GPLv3
 skeletoncreator — Create projects from a skeleton directory. — GPLv3
 solidengine — The Common Lisp stackbased application controller — MIT
 spell — Spellchecking package for Common Lisp — BSD
 trivialcontinuation — Provides an implementation of function call continuation and combination. — MIT
 trivialhashtableserialize — A simple method to serialize and deserialize hashtables. — MIT
 trivialjsoncodec — A JSON parser able to identify class hierarchies. — MIT
 trivialmonitoredthread — Trivial Monitored Thread offers a very simple (aka trivial) way of spawning threads and being informed when one any of them crash and die. — MIT
 trivialobjectlock — A simple method to lock object (and slot) access. — MIT
 trivialpooleddatabase — A DB multithreaded connection pool. — MIT
 trivialtimer — Easy scheduling of tasks (functions). — MIT
 trivialvariablebindings — Offers a way to handle associations between a placeholder (aka. variable) and a value. — MIT
 ucons — Unique conses and functions for working on them. — MIT
 wordnet — Common Lisp interface to WordNet — CCBY 4.0
Removed projects: mgl, mglmat.
To get this update, use: (ql:updatedist "quicklisp")
Enjoy!
For older items, see the Planet Lisp Archives.
Last updated: 20190822 13:28