Planet Lisp

Nick Levine: I'm looking for work

· 2 days ago

I've been an enthusiastic Common Lisp implementor, developer and advocate for over 25 years, in particular with ten years at Harlequin and more recently as a consultant for Ravenbrook. I've worked with various lisps on a range of projects, from servers and GUI applications down to the LispWorks garbage collector. Most of the software I've written doesn't belong to me and so I can't publish it, but you'll find links to some of the libraries I've worked on here, along with a few papers that I've authored.

My full CV is here.

Working with Ravenbrook makes me part of a team. We're highly qualified mathematicians and computer scientists, we have a lot of lisp experience, and between us we can cover a lot of ground. So if you have (for instance) a temporary need for more weight on one of your projects, please come to us. If we can't help then we'll put you in touch with people who can.

ndl@ravenbrook.com

Paul Khuong: Number systems for implicit data structures

· 3 days ago

Implicit data structures are data structures with negligible space overhead compared to storing the data in a flat array: auxiliary information is mostly represented by permuting the elements cleverly. For example, a sorted vector combined with binary search is an implicit in-order representation of a binary search tree. I believe the seminal implicit data structure is the binary heap of Williams and Floyd, usually presented in the context of heapsort.

I find most developers are vastly more comfortable dealing with pointer-based data structures than with implicit ones. I blame our courses, which focus on the former and completely fail to show us how to reason about the latter. For example, the typical presentation of the binary heap introduces an indexing scheme to map from parent to children – the children of heap[i] are heap[2 * i] and heap[2 * i + 1], with one-based indices – that is hard to generalise to ternary or k-ary heaps (Knuth’s presentation, with parents at \(\lfloor i/2 \rfloor\), is no better). The reason it’s so hard to generalise is that the indexing scheme hides a simple bijection between paths in k-way trees and natural numbers.

I find it ironic that I first encountered the idea of describing data structures or algorithms in terms of number systems, through Okasaki’s and Hinze’s work on purely functional data structures: that vantage point seems perfectly suited to the archetypal mutable data structure, the array! I’ll show how number systems help us understand implicit data structures with two examples: a simple indexing scheme for k-ary heaps and compositions of specially structured permutations to implement in-place bit-reversal and (some) matrix transpositions.

Simple k-ary max heaps

The classical way to present implicit binary heaps is to work with one-based array indices and to root an implicit binary tree at heap[1]. For any node heap[i], the children live at heap[2 * i] and heap[2 * i + 1]. My issue with this presentation is that it’s unclear how to extend the scheme to ternary or k-ary trees: if the children of heap[1] are heap[3 * 1 ... 3 * 1 + 2], i.e., heap[3, 4, 5], what do we do with heap[2]? We end up with k - 1 parallel k-ary trees stashed in the same array. Knuth’s choice of mapping children to parents with a floored integer division suffers the same fate.

One-based arrays hide the beauty of the binary indexing scheme. With zero-based arrays, the children of heap[i] are heap[2 * i + 1] and heap[2 * i + 2]. This is clearly isomorphic to the one-based scheme for binary heaps. The difference is that the extension to k-way trees is obvious: the children of heap[i] are heap[k * i + 1 ... k * i + k].
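
In code, the index arithmetic is a one-liner each way. A quick Common Lisp sketch (the helper names are mine):

(defun kth-child (k i c)
  "Index of the Cth child (1 <= C <= K) of the node at index I."
  (+ (* k i) c))

(defun heap-parent (k i)
  "Index of the parent of the node at index I (I > 0)."
  (floor (1- i) k))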

A couple of small examples fail to show any problem with this scheme. However, even a large number of tests is no proof. Thinking in terms of number systems leads to a nice demonstration that the scheme creates a bijection between infinite k-ary trees and the naturals.

There is a unique finite path from the root to any vertex in an infinite k-ary tree. This path can be described as a finite sequence \((c\sb{1}, c\sb{2}, \ldots, c\sb{n})\) of integers between 1 and k (inclusively). If \(c\sb{1} = 1\), we first went down the first child of the root node; if \(c\sb{1} = 2\), we instead went down the second child; etc. If \(c\sb{2} = 3\), we then went down the third child, and so on. We can recursively encode such finite paths as naturals (in pidgin ML):

path_to_nat [] = 0
path_to_nat [c : cs] = k * path_to_nat cs + c

Clearly, this is an injection from finite paths in k-ary trees to the naturals. There’s only a tiny difference with the normal positional encoding of naturals in base k: there is no 0, and digits instead include k. This prevents us from padding a path with zeros, which would map multiple paths to the same natural.

We only have to show that path_to_nat is a bijection between finite paths and naturals. I’ll do that by constructing an inverse that is total on the naturals.

nat_to_path 0 = []
nat_to_path n = let c = pop n
                in c : nat_to_path ((n - c) / k)

where pop is a version of mod k that returns k instead of 0:

pop n = let c = n `mod` k
        in if c != 0
           then c
           else k
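
Before the induction argument, here's a direct Common Lisp transcription of the two functions, with k passed explicitly (the hyphenated names are mine); it's handy for spot-checking the round trip, e.g. (nat-to-path 3 14) yields (2 1 1) and (path-to-nat 3 '(2 1 1)) is 14 again.

(defun path-to-nat (k path)
  "Encode a path (a list of integers in [1, k]) as a natural."
  (if (null path)
      0
      (+ (* k (path-to-nat k (rest path))) (first path))))

(defun nat-to-path (k n)
  "Inverse of PATH-TO-NAT."
  (if (zerop n)
      '()
      (let* ((r (mod n k))
             (c (if (zerop r) k r))) ; pop: return k instead of 0
        (cons c (nat-to-path k (/ (- n c) k))))))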

The base case is nat_to_path 0 = [].

In the induction step, we can assume that path_to_nat cs = n and that nat_to_path n = cs. We only have to show that, for any \(1 \leq c \leq k\), nat_to_path (path_to_nat (c:cs)) = c:cs. Let n' = path_to_nat (c:cs) = k * n + c.

\[n\sp{\prime} = kn + c \equiv c\quad \mod k,\] so pop n' will correctly return c (and k rather than 0). It’s then a tiny bit of algebra to show that (n' - c) / k = n, and we fall back to the induction hypothesis.

This scheme is so simple that I wound up coding a version of heapsort(3) that lets callers choose the heap’s arity at runtime. Higher arity heaps perform more comparisons but fewer swaps than binary heaps; the tradeoff is profitable when sorting large items. It seems to me that, for decades, we’ve been presenting implicit heaps and heapsort in a way that marginally simplifies the binary case at the expense of obscuring the elegant general case.
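
To make the arity-generic scheme concrete, here is a minimal k-ary sift-down over the zero-based layout; it's only an illustrative sketch, not the heapsort(3) implementation mentioned above.

(defun sift-down (heap k root end less-than)
  "Restore the max-heap property for the subtree at ROOT, looking only
at indices below END; LESS-THAN orders the elements."
  (loop
    (let ((largest root)
          (first-child (+ (* k root) 1)))
      ;; Find the largest among ROOT and its (at most K) children.
      (loop for c from first-child below (min (+ first-child k) end)
            when (funcall less-than (aref heap largest) (aref heap c))
              do (setf largest c))
      (when (= largest root)
        (return heap))
      (rotatef (aref heap root) (aref heap largest))
      (setf root largest))))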

Array permutations as algebra on positional number systems

Bit reversing an array of length \(2\sp{l}\) sends the element x[i] to x[j], where the binary representation of i (including padding up to l bits) is the reverse of j. For example, in an array of length 16, 3 = 0011 becomes 1100 = 12.
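
A small helper to compute the reversed index (the name is mine; it's reused in the sketches below):

(defun reverse-bits (i l)
  "Reverse the low L bits of I, e.g. (reverse-bits 3 4) => 12."
  (let ((j 0))
    (dotimes (bit l j)
      (setf j (logior (ash j 1)
                      (logand (ash i (- bit)) 1))))))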

Reversing a fixed-width integer’s binary representation is self-inverse, so bit reversing an array is a sequence of swaps. This means that the permutation can be performed in-place, as a series of independent swaps. Bit reversal used to be slow on cached machines: contiguous elements (with indices that only vary in their low order bits) swap with far-off elements (indices that only vary in their high order bits). Worse, the stride between the latter elements is a large power of two, which causes all sorts of aliasing issues. Workarounds (see Zhang 99 (PDF)) mostly end up implementing a software cache with explicit buffers. Nowadays, even L1 caches have such a high associativity that aliasing is a much less important issue.

Napa-FFT3 implements bit reversals by calling a few specialised functions that only swap the lower and higher bits; the main routine iterates over an array of precomputed middle bit reversals (similar to various publications of Elster’s, but recursing on the middle bits first). In this implementation, the number of L1 cache misses incurred by bit reversing an array is only slightly greater than the compulsory misses. Bit reversal isn’t free, but it’s also not clear that autosorting FFTs are quicker than out-of-order FFTs followed by a bit reversal pass.

Bit reversal is the only array permutation I’ve seen described in terms of its effect on indices. I think it’s a fruitful avenue for other in-place permutations.

For example, the viewpoint makes it clear how to transpose a matrix of dimensions \(2\sp{m} \times 2\sp{n}\) with a sequence of in-place bit reversals. Write the row-major index of the element in row \(i\) and column \(j\) as the bit string \(i\sb{1} i\sb{2} \ldots i\sb{m}\, j\sb{1} j\sb{2} \ldots j\sb{n}\) (each \(i\sb{k}\) and \(j\sb{l}\) is a bit in the index’s binary representation); transposition sends that element to the index whose bits are \(j\sb{1} j\sb{2} \ldots j\sb{n}\, i\sb{1} i\sb{2} \ldots i\sb{m}\).

For a row-major layout, this decomposition corresponds to three in-place passes (a small consistency check in Common Lisp follows the list):

  1. bit reverse each row of length \(2\sp{n}\) in place;
  2. bit reverse the whole vector of length \(2\sp{m + n}\) in place;
  3. bit reverse each new row of length \(2\sp{m}\) in place.
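
Here's the promised consistency check of the three passes, reusing reverse-bits from above (the function names are mine, not Napa-FFT3's):

(defun bit-reverse-subvectors (vec chunk bits)
  "Bit-reverse each contiguous chunk of length CHUNK = 2^BITS in place."
  (loop for base from 0 below (length vec) by chunk
        do (dotimes (i chunk)
             (let ((j (reverse-bits i bits)))
               (when (< i j)
                 (rotatef (aref vec (+ base i)) (aref vec (+ base j))))))))

(defun transpose-by-reversals (vec m n)
  "Transpose the 2^M x 2^N row-major matrix stored in VEC, in place."
  (bit-reverse-subvectors vec (expt 2 n) n)         ; each row
  (bit-reverse-subvectors vec (length vec) (+ m n)) ; the whole vector
  (bit-reverse-subvectors vec (expt 2 m) m)         ; each new row
  vec)

On a small \(2\sp{2} \times 2\sp{3}\) test matrix, the result agrees with an explicit out-of-place transpose.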

Bit reversal, like all other permutations, is a reversible linear operation. We can thus change the order of operations if we want to. For example, it’s not necessarily preferable to bit-reverse contiguous rows first. We could also flip the high-order bits of the indices: rather than swapping scalars, we would swap rows. Separately bit reversing contiguous rows works best when each row fits in cache. Bit reversing columns instead amortises the bad access pattern inherent to bit reversal by spending more time on each swap: swapping rows is slower than swapping scalars, but also very efficient with regard to (streaming!) I/O.
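
A sketch of the row-swapping variant, again reusing reverse-bits (the function name is mine):

(defun bit-reverse-rows (vec rows rowlen bits)
  "Bit-reverse the row index by swapping whole rows: ROWS = 2^BITS rows
of ROWLEN elements each, stored row-major in VEC."
  (dotimes (r rows vec)
    (let ((s (reverse-bits r bits)))
      (when (< r s)
        (dotimes (i rowlen)
          (rotatef (aref vec (+ (* r rowlen) i))
                   (aref vec (+ (* s rowlen) i))))))))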

This is interesting because in-place transposition of rectangular matrices is hard, and transposition is already a bad fit for caches. Transposing matrices with a sequence of bit reversals might just be practical. In fact, that’s what I intend to do in Napa-FFT3 for multi-dimensional DFTs: we can fuse all but the middle whole-vector bit reversal with mixed-radix FFTs (and the latter might similarly benefit from operating on [sub-]rows rather than scalars).

One obvious question now appears: can we generalise the trick to general dimensions? It’s pretty clear that we can do it for any other base \(b\) and matrices of dimension \(b\sp{m} \times b\sp{n}\) (it’s interesting how highly composite dimensions are easy to transpose and, IIRC, so are coprime ones). What if there’s no such factorisation? The best I can do is “more or less.”

For arbitrary matrix dimensions \(m \times n\), I think it’s best to decompose indices in a mixed radix (but still positional) number system. For example, a \(63 \times 21\) matrix might have indices in radix \(3,7\ |\ 3,7,3\). Given this number system, matrix transposition is again a permutation of digit positions: the group of digits that encodes the row index and the group that encodes the column index simply swap places.
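
For concreteness, here's a tiny mixed-radix decode/encode pair (digits and undigits are my names); a digit-permutation view of transposition needs little more than this plus swaps.

(defun digits (index radices)
  "Mixed-radix digits of INDEX, least significant radix first.
E.g. radices (3 7 3 7 3) cover indices in [0, 1323)."
  (loop for r in radices
        collect (mod index r)
        do (setf index (floor index r))))

(defun undigits (digits radices)
  "Inverse of DIGITS for the same list of radices."
  (loop with acc = 0
        for d in (reverse digits)
        for r in (reverse radices)
        do (setf acc (+ (* acc r) d))
        finally (return acc)))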

It’s a small generalisation to let the radices be \(a,b\ |\ a,b,a\), for a matrix of dimension \(ab \times a\sp{2}b\). We can then perform most of a matrix transposition by swapping positions of identical weight: first a full mixed-radix digit reversal (the weights are palindromic), followed by another mixed-radix reversal on the first three positions.

This leaves the last chunk \(b\sb{2},a\sb{3}\), which should instead be \(a\sb{3},b\sb{2}\). That’s another rectangular matrix transpose, but smaller than the original one. It might be practical to execute that last step with a straightforward out-of-place transpose: a smaller transpose needs less scratch space and may fit in cache. We can also apply the same trick as for bit reversals and apply the transpose before everything else, by permuting rows rather than scalars. The simplest way to do that is to transpose a matrix of pointers before replicating the permutation on the actual data (glibc’s mergesort references Knuth vol. 3, exercise 5.2-10).

Finally, this also works for \(a\sp{2}b \times ab\): matrix transposition is its own inverse, so we only have to execute the inverse of each step, in reverse order.

Definitely mixed results, but at least we have some intuition on why general rectangular transpositions are hard to perform in place: they’re hard to express as sequences of swaps.

Next: C code and cycle counts!

This post is the more theoretical prologue to a low-level look at qsort(3): I really wanted to make sure the nice implicit tree layout in the first section had the space it deserves.

I tried to make in-order implicit trees fit in this number system approach. I can’t see how. The problem is that in-order trees associate ranges (rather than indices) with nodes; for example, at what depth is index 1000? It depends on the size of the search tree. It might be the root (in a tree of 2000 vertices) or a leaf (in a tree of 1001 vertices).

Colin Lupton: Self-seeding Context Added to CL-ISAAC

· 4 days ago

As recommended by Bob Jenkins, the original author of the ISAAC cryptographic random number generator algorithms, self-seeding ISAAC is a useful technique for increasing the cryptographic strength of the random numbers generated from a given ISAAC context; i.e., using random values generated by ISAAC to seed a new ISAAC context. This may not seem particularly valuable for one-off random values such as the session tokens generated in CL-ISAAC’s documented Quick Recipes, but when you need to generate millions of cryptographically-strong random numbers from a single context—such as for a One-Time Pad cipher—you notice the extra strength that self-seeding provides.

CL-ISAAC v1.0.4 is now available on GitHub, which includes the self-seeding context. It will be available in the April distribution of Quicklisp.

Usage

Using the Self-seed context is similar to the other seeds already available; the function supports both ISAAC-32 and ISAAC-64 algorithms, and provides one additional keyword parameter, count, which specifies the number of rounds for which your ISAAC context will be self-seeded. The default value is 1, but a count greater than 3 is recommended.

Usage is as straightforward as the other contexts. To create a 512-bit hexadecimal string token using the ISAAC-64 algorithm from a self-seeded context with 3 rounds:

* (ql:quickload "cl-isaac")
...
* (defvar *self-ctx* (isaac:init-self-seed :count 3 :is64 t))
* (format nil "~64,'0x" (isaac:rand-bits-64 *self-ctx* 512))

The Self-seeding context is necessarily heavier than the kernel and cl:random seeds, by a factor of approx. 5n+1, where n is the number of self-seeding rounds. Specifically, for every round there is an additional context created, as well as an additional scramble.

Enjoy!


Ben Hyde: Lisp on the BeagleBone Black

· 6 days ago

Somebody left an orphan in a basket on my doorstep. A BeagleBone Black. Inspired by Patrick Stein’s assurances that it was easy to get CCL working on another tiny ARM board, I gave it a try. It was tedious, but not hard. I’ve put my notes and some rough, half-tested scripts on github.

Ben Hyde: cl-xmpp

· 8 days ago

Cl-xmpp has, sadly, fallen into disrepair. It can’t connect to anything I tried to connect to, and it’s hard to find its development list archives. Common-lisp.net is working through some troubles. FYI – you can subscribe to common-lisp.net mailing lists by adding +subscribe to the list’s name, as in cl-xmpp-devel+subscribe. I built my own index to the archives like so:


cl-user> (loop for y from 2008 to 2014
               do (loop for m in '("January" "February" "March" "April"
                                   "May" "June" "July" "August"
                                   "September" "October" "November" "December")
                        as u = (format nil "http://lists.common-lisp.net/pipermail/cl-xmpp-devel/~A-~A/thread.html" y m)
                        do (multiple-value-bind (a b) (drakma:http-request u :method :head)
                             (declare (ignore a))
                             (when (= 200 b)
                               (print u)))))

"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2008-January/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2008-June/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2008-July/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2008-August/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2009-January/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2009-September/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2010-March/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2010-August/thread.html" 
"http://lists.common-lisp.net/pipermail/cl-xmpp-devel/2011-March/thread.html" 
nil

Some random patches mentioned there … I’ll have to try those.

And all I wanted to do was send an IM whenever the buoy outside of Boston harbor is forecast to have big waves.

Patrick Stein: Clozure Common Lisp on the Raspberry Pi

· 8 days ago

I finally ordered a Raspberry Pi last week. It arrived yesterday. Last night, I installed the Raspbian (Debian for Raspberry Pi) distro. Then, I followed Rainer Joswig’s excellent instructions on how to get Clozure Common Lisp running on the Raspberry Pi.

The only thing that I would add to Rainer’s presentation is that Raspbian didn’t come with m4(1) out of the box, but it is needed when you run make all to rebuild Clozure.

Hacks and glory await!

Also, Linux pro-tip: If you are thinking of renaming your only account on the machine, make sure you add the new name to a group with sudo privileges and make sure you get both /etc/passwd and /etc/shadow in the same sudo.

Edit: Rainer correctly points out that his article does say you will need to install m4 if you haven’t already. I just missed it.

Paul Khuong: Refactoring with LZ77: compression is compilation (?)

· 17 days ago

This post was written under the influence of coffee ice cream and espresso. It’s a magical drink ;)

I don’t really follow the compression scene, and only pay minimal attention to machine learning. Nevertheless, the “Compression is Learning” slogan feels more and more right to me. In this post, I’ll explore another relationship that I find surprising: one between compression and compilation.

Five years ago, I took Marc Feeley’s compiler class, and he let us choose our term project. I settled on generating traces from recursive algorithms (typical of cache-oblivious algorithms) and reordering them to get iterative code that was better suited to the first level of cache. I came up with a gnarly CL-accented research-quality prototype, but the result was surprisingly effective. Sadly, I never really managed to recover loops or function calls from the fully inlined trace, so even medium-size kernels would exhaust the instruction cache.

I believe I now see a practical and scalable solution, thanks to Artur Jez’s work on “A really simple approximation of smallest grammar.” His work might lead to a function-oriented analogue of trace compilation. Trace compilers are notoriously weak on recursion (recursive function calls don’t map well to loops), so it would be nifty to have an alternative that identifies functions rather than loops.

This makes me quip that “(Trace) Compilation is Compression.” We can see trace compilers as lazily unpacking traces from the source program, rewriting them a bit, and recompressing traces in an executable format. Admittedly, the analogy is a bit of a stretch for classical compilers: they are less aggressive in decompressing source code and directly translate from one compressed representation (a small source file may describe billions of execution paths) to another.

Anyway... this is extremely hypothetical and Jez’s work is fun independently of my weekend speculations.

How to fail with LZ77

Once we cast a program trace as a sequence of opcodes (perhaps without operands, to expose more redundancy), it’s obvious that reducing code size is “just” compression, and LZ-type algorithms quickly come to mind: they compress strings by referring to earlier substrings.

The heart of LZ77 is a loop that streams through the input sequence and finds the longest common subsequence earlier in the input. In practice, the search usually considers a fixed-size window; when I write LZ77 here, I instead refer to the theoretical approach with an unbounded search window. Repeated LCS searches on an unbounded window sounds slow, but, in sophisticated implementations, the bottleneck is sorting the input string to generate a suffix array.

I don’t feel like being sophisticated and will implement a quadratic-time LCS search.

(defun longest-subseq (seq start)
  "Find the longest subseq of seq[start...] that begins before start."
  (let ((best 1)
        (index nil))
    (dotimes (i start (values index best))
      (let ((length (mismatch seq seq :start1 i :start2 start
                                      :test #'equal)))
        (assert length)
        (decf length i)
        (when (> length best)
          (setf best length
                index i))))))

Some backreferences clearly look like function calls.

CL-USER> (longest-subseq "abab" 0)
NIL ; There is no match to the left of abab
1   ;                                  ^
CL-USER> (longest-subseq "abab" 2)
0   ; There is a match for the second ab, starting
2   ; at index 0 and spanning 2 characters.

It’s a bit surprising at first, but some also correspond to repeat loops.

CL-USER> (longest-subseq "abababa" 2)
0 ; There is a match for "ababa", starting
5 ; at index 0 and spanning 5 characters.

This last patch is strange because it’s self-referential. However, if we decompress character by character, its meaning becomes obvious: we replicate the substring in [0, 1] to generate exactly 5 characters. This is basically a loop; we can handle partial copies with, e.g., rotation and entering the middle of the loop (like Duff’s device).

Lempel and Ziv’s result tells us that we can look for references greedily in an unbounded window and obtain asymptotically optimal (proportional to the string’s entropy) compression. Again, there are algorithms to do that in linear time – via radix sort – but I’ll just bruteforce my way in cubic time.

(defstruct factor start length subseq)

(defun factorise (seq)
  (let ((index 0)
        (length (length seq)))
    (loop while (< index length)
          collect
          (multiple-value-bind (start length)
              (longest-subseq seq index)
            (prog1
                (if (null start)
                    (elt seq index)
                    (make-factor :start start :length length
                                 :subseq (subseq seq start (+ start length))))
              (incf index length))))))

Here’s the problem with converting an LZ77 factorisation into calls and loops: there is no guarantee that patches nest sanely.

CL-USER> (factorise "ababac||bac")
(#\a #\b #S(FACTOR :START 0 :LENGTH 3 :SUBSEQ "aba") #\c #\| #\|
 #S(FACTOR :START 3 :LENGTH 3 :SUBSEQ "bac"))

This factorisation encodes “ababac||bac” by repeating “ab” two and a half times, inserting “c||”, and then repeating “bac.” The issue is that we wish to convert “ababa” into a loop, but the last patch would then need to enter that loop in its last iteration. We could also have the opposite problem, with a patch that only covers a prefix of an earlier patch (a few iterations of a loop). We might even have to deal with both issues in the same factor.
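
Decompression, at least, is easy even with overlapping factors: copying one element at a time lets a factor read characters it has just produced. A quick sketch, assuming the factor struct defined above (defactorise is my name, not part of the post's code); (defactorise (factorise "ababac||bac")) gives back the original string.

(defun defactorise (factors)
  "Expand a list of literal characters and FACTORs back into a string."
  (let ((out (make-array 0 :element-type 'character
                           :adjustable t :fill-pointer 0)))
    (dolist (f factors (coerce out 'string))
      (if (factor-p f)
          ;; Copy one character at a time so that a factor may read
          ;; characters it has just produced (self-reference).
          (loop repeat (factor-length f)
                for i upfrom (factor-start f)
                do (vector-push-extend (aref out i) out))
          (vector-push-extend f out)))))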

Jez’s solution

Jez’s paper analyses a method to recover a CFL with a single member, the string we wish to compress. The CFL is described by a fully deterministic (it produces exactly one string) CFG in Chomsky normal form, i.e., a straight-line program.

This “really simple” approach takes a step back from LZ compression and instead starts with the easiest, greedy, way to generate a straight-line program from a string: just pair characters as you encounter them, left-to-right.

For example, on “abab,” this would pair “(ab)(ab)” and then ”((ab)(ab)),” and generate the following grammar:

G -> XX
X -> ab

We got lucky: the repetition synced up with the greedy pairing. That’s not always the case; for example, “abcab” becomes “(ab)(ca)b” – which exposes no redundancy – rather than “(ab)c(ab).”

Jez addresses this issue by using LZ77 factorisation as an oracle to synchronise pairings.

First, we simplify the algorithm by assuming that there are no factors that begin exactly one character before their output. Such factors correspond to repetitions of that one character (e.g., “aaaa” = “a” x 4) and we can easily turn them into repetitions of a pair (e.g., “aa” x 2).

It’s then simply a matter of not pairing when it would prevent the next character from synchronising with its backreference. For example, in “abcab,” we’d skip pairing “ca” so that “ab” could pair like the first occurrence did. We also don’t force a pair when the backreference only includes the first letter, but not the second. Finally, we opportunistically match consecutive unpaired letters.

Each such pass creates a new, shorter, string of literals and production rules. We iteratively apply the guided pairing loop until we’re left with a single production rule.

That’s it. Jez’s paper is mostly concerned with showing that we only have to run LZ compression once and with proving suboptimality bounds. The key is that we can convert factors for the input into factors for the output, and that each pass shrinks the string by a multiplicative factor. I’ll simplify things further by recomputing a factorisation (in cubic time!) from scratch in each pass.

Now, the code

;; specials for shared state (quick-and-dirty REPL style)

;; For each character position, whether it's the first or
;; second of a pair, or just a singleton (nil)
(defvar *state*)
;; The sequence to compress.
(defvar *sequence*)
;; Cursor in the sequence.
(defvar *index*)

(defmacro with-sequence ((seq) &body body)
  (let ((temp (gensym "SEQ")))
    `(let* ((,temp ,seq)
            (*sequence* ,temp)
            (*state* (make-array (length *sequence*)
                                 :initial-element nil))
            (*index* 0))
       ,@body)))

;; No pair begins here, but we might still fuse with the previous
;; singleton.
(defun single ()
  (let ((i *index*))
    (when (and (plusp i)
               (null (aref *state* (1- i))))
      (setf (aref *state* (1- i)) 'first
            (aref *state* i) 'second)))
  (incf *index*))

;; Force a pair.
(defun pair ()
  (let ((i *index*))
    ;; we mustn't be at the end of the sequence
    (assert (< (1+ i) (length *sequence*)))
    (setf (aref *state* i) 'first)
    (setf (aref *state* (1+ i)) 'second)
    (incf *index* 2)))

(defun sync-and-pair ()
  (multiple-value-bind (begin length)
      (longest-subseq *sequence* *index*)
    (cond ((null begin)
           ;; No factor; advance and merge greedily.
           (single))
          ((= begin (1- *index*))
           ;; Single-character repetition.
           (loop repeat length do (single)))
          (t
           (let ((end (+ begin length)))
             (assert (<= end (length *sequence*)))
             (assert (< begin (1- *index*)))
             (when (eql 'second (aref *state* begin))
               ;; The first character is the second half of a pair;
               ;; leave it by itself.
               (single)
               (incf begin))
             ;; Mimic the pairing decisions of the backref.
             (loop with expected = (+ *index* (- end begin))
                   for i from begin below end
                   do (ecase (aref *state* i)
                        (second)
                        (first (if (= (1+ i) end) ; Last character
                                   (single)       ; can't force
                                   (pair)))       ; a pair.
                        ((nil) (single)))
                   finally (assert (= *index* expected))))))))

My implementation is exceedingly naïve and runs in cubic time in the worst case. With a lot more work, the bottleneck (LZ77 factorisation) runs at the speed of sort... but constructing a suffix array would only get in the way of this post. Let’s just see what pairing the code generates.

CL-USER> (with-sequence ("abcab")
           (loop while (< *index* (length *sequence*))
                 do (sync-and-pair)
                 finally (return *state*)))
#(FIRST SECOND NIL FIRST SECOND)
CL-USER> (with-sequence ("abababa")
           (loop while (< *index* (length *sequence*))
                 do (sync-and-pair)
                 finally (return *state*)))
#(FIRST SECOND FIRST SECOND FIRST SECOND NIL)
CL-USER> (with-sequence ("ababac||bac")
           (loop while (< *index* (length *sequence*))
                 do (sync-and-pair)
                 finally (return *state*)))
#(FIRST SECOND FIRST SECOND FIRST SECOND FIRST SECOND NIL FIRST SECOND)

In short, “abcab” becomes “(ab)c(ab)”, “abababa” “(ab)(ab)(ab)a”, and “ababac||bac” “(ab)(ab)(ac)(||)b(ac)”.

Programs from pairs

So far, we’ve only made pairing decisions. The next step is to translate them into production rules (i.e., functions), and to merge equivalent rules to actually save space.

(defvar *counter* 0)
(defvar *rules* (make-array 0 :adjustable t :fill-pointer 0))

(defstruct rule
  (id (incf *counter*))
  a b)

;; Hash-consing naturally merges equivalent rules.
(fare-memoization:define-memo-function rule (a b)
  (let ((rule (make-rule :a a :b b)))
    (vector-push-extend rule *rules*)
    rule))

(defun reset-rules ()
  (setf *counter* 0)
  ;; Sorry, this is ugly.
  (fare-memoization:unmemoize 'rule)
  (fare-memoization:memoize 'rule)
  (setf *rules* (make-array 0 :adjustable t :fill-pointer 0)))

(defun pair-cfg (sequence state)
  (loop with acc = '()
        for i upfrom 0
        for character across sequence
        for choice across state
        do (ecase choice
             (second)
             (first (push (rule character (aref sequence (1+ i)))
                          acc))
             ((nil) (push character acc)))
        finally (return (coerce (nreverse acc) 'simple-vector))))

(defun compress-1 (sequence)
 (with-sequence (sequence)
    (loop with length = (length sequence)
          while (< *index* length)
          do (sync-and-pair)
          finally (return (pair-cfg *sequence* *state*)))))

On the examples above, we find:

CL-USER> (reset-rules)
#()
CL-USER> (compress-1 "abcab")
#(#S(RULE :ID 1 :A #\a :B #\b) #\c #S(RULE :ID 1 :A #\a :B #\b))
CL-USER> (compress-1 "abababa")
#(#S(RULE :ID 1 :A #\a :B #\b) #S(RULE :ID 1 :A #\a :B #\b)
  #S(RULE :ID 1 :A #\a :B #\b) #\a)
CL-USER> (compress-1 "ababac||bac")
#(#S(RULE :ID 1 :A #\a :B #\b) #S(RULE :ID 1 :A #\a :B #\b)
  #S(RULE :ID 2 :A #\a :B #\c) #S(RULE :ID 3 :A #\| :B #\|) #\b
  #S(RULE :ID 2 :A #\a :B #\c))

We only have to pair production rules further until the sequence consists of a single rule (or literal).

(defun refactor (sequence)
  (if (<= (length sequence) 1)
      sequence
      (refactor (compress-1 sequence))))
CL-USER> (refactor "ababac||bac")
#(#S(RULE
     :ID 8
     :A #S(RULE
           :ID 7
           :A #S(RULE
                 :ID 4
                 :A #S(RULE :ID 1 :A #\a :B #\b)
                 :B #S(RULE :ID 1 :A #\a :B #\b))
           :B #S(RULE
                 :ID 5
                 :A #S(RULE :ID 2 :A #\a :B #\c)
                 :B #S(RULE :ID 3 :A #\| :B #\|)))
     :B #S(RULE :ID 6 :A #\b :B #S(RULE :ID 2 :A #\a :B #\c))))

The grammar’s “call graph” shows there’s a lot of sharing for such a short string.
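
It's also easy to check that the shared rules still expand back to the original string (expand is a helper of mine, for the character-string grammars, not part of the post's code):

(defun expand (x)
  "Expand a rule (or literal character) into the string it derives."
  (if (rule-p x)
      (concatenate 'string (expand (rule-a x)) (expand (rule-b x)))
      (string x)))

For example, (expand (elt (refactor "ababac||bac") 0)) returns "ababac||bac".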

Enough with the strings

I started by writing about compiling program traces, but so far I’ve only been working with strings of characters. Let’s pretend the following PostScript file was written by someone who’s never heard of functions or loops.

%!PS
270 315 translate
1.5 setlinewidth

gsave
newpath
  0 36 moveto
 72 0  rlineto
  0 72 rlineto
-72 0  rlineto
closepath
stroke
grestore

45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore

showpage

Phew, programming is hard! That’s a lot of copy-paste to produce a simple pattern.

Let’s see what our (simplified) implementation of Jez’s algorithm can do.

CL-USER> (reset-rules)
#()
;; READ, the poor man's tokenizer.
CL-USER> (refactor #(
270 315 translate
1.5 setlinewidth

gsave
newpath
  0 36 moveto
 72 0  rlineto
  0 72 rlineto
-72 0  rlineto
closepath
stroke
grestore

45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore
45 rotate gsave newpath 0 36 moveto 72 0 rlineto 0 72 rlineto -72 0 rlineto closepath stroke grestore

showpage
))
#(#S(RULE
     :ID 35
   ...))

35 functions that each contain two literals or function calls. Not bad (: It’s an actual SMOP to convert this grammar into a PostScript program.

(defun emit-one-rule (stream rule)
  (flet ((emit-subrule (x)
           (if (rule-p x)
               (format nil "R~A" (rule-id x))
               (format nil "~(~a~)" x))))
    (format stream "/R~A { ~A ~A } def~%"
            (rule-id rule)
            (emit-subrule (rule-a rule))
            (emit-subrule (rule-b rule)))))

(defun emit-rules (stream rules)
  (map nil (lambda (rule)
             (emit-one-rule stream rule))
       rules)
  (format stream "R~A~%" (rule-id (elt rules (1- (length rules))))))
CL-USER> (emit-rules *standard-output* *rules*)
/R1 { 270 315 } def
/R2 { translate 1.5 } def
/R3 { setlinewidth gsave } def
/R4 { newpath 0 } def
/R5 { 36 moveto } def
/R6 { 72 0 } def
/R7 { rlineto 0 } def
/R8 { 72 rlineto } def
/R9 { -72 0 } def
/R10 { rlineto closepath } def
/R11 { stroke grestore } def
/R12 { 45 rotate } def
/R13 { R1 R2 } def
/R14 { R3 R4 } def
/R15 { R5 R6 } def
/R16 { R7 R8 } def
/R17 { R9 R10 } def
/R18 { R11 R12 } def
/R19 { gsave R4 } def
/R20 { R11 showpage } def
/R21 { R13 R14 } def
/R22 { R15 R16 } def
/R23 { R17 R18 } def
/R24 { R17 R20 } def
/R25 { R21 R22 } def
/R26 { R23 R19 } def
/R27 { R22 R24 } def
/R28 { R25 R26 } def
/R29 { R22 R26 } def
/R30 { R28 R29 } def
/R31 { R29 R29 } def
/R32 { R29 R27 } def
/R33 { R30 R31 } def
/R34 { R31 R32 } def
/R35 { R33 R34 } def
R35

This second PostScript program comprises 211 tokens, rather than the original’s 156, but 105 of those are “{ } def” noise to define functions. Arguably, the number of meaningful tokens was reduced from 156 to slightly more than 100. Crucially, the output remains the same.

Stack programs seem particularly well suited to this rewriting approach: explicit operands (e.g., registers) introduce trivial discrepancies. In a VM, I would consider “compressing” the opcode stream separately from the operands. Compiling only the former to native code would already reduce interpretative overhead, and extracting function calls from the instruction stream should avoid catastrophic size explosions.

There are several obvious deficiencies in the direct LZ77-guided pairing. Just breaking Chomsky normalisation to inline single-use functions would already help. It might also make sense to special-case repetitions as repeat n loops, instead of hiding that in \(\log n\) levels of recursion. In a real program, it would also be useful to inline the bottom-most rules so as not to drown in control flow. The thing is, destroying structure in the name of performance is well-understood compiler tech; recovering it efficiently was the hard part.

Hans Hübner: No ECLM in 2014

· 21 days ago
As many of you may know, there was a plan to hold the European Common Lisp Meeting in Berlin in October 2014. I was part of the organizing team, and we tried our best to find high-quality speakers to make the event at least as good as the previous ECLMs.

Unfortunately, as you will have gathered from the title and the wording above, we failed. We simply did not find enough interesting commercial projects that have not already been presented at one of the previous ECLMs. As we do not want to compromise on the content, we have decided that there will be no ECLM this year. We hope to learn about new and exciting developments in the course of this year so that we can hold an ECLM in 2015.

Paul Khuong: What I look for in GSoC proposals

· 24 days ago

EDIT: Added a link to a solid proposal from last year (PDF).

It’s already Google Summer of Code season; like spring, it’s slightly earlier this year. For multiple reasons – including the accelerated schedule and my transition to having a Real Job – I did not manage to guide any of the students who sent in applications. After an initial ranking, we’re left with a fair number of decent proposals for a Summer of SBCL. They’re of comparable quality, and all leave me a bit uneasy. In this post, I’ll try to explain what I look for in GSoC proposals... or, really, in action plans in general.

I care about the quality of GSoC plans for two reasons. First, because it reassures me that the applicant understands the difficulties they’ll have to overcome to complete their project. Second, because, like TDD, a good plan makes it easier to feel when we’re making concrete progress. To me, these two factors are necessary conditions for a good success rate.

We’ll probably allow additional time to update the proposals we have already received, so hopefully this post can be immediately useful. I have a single vote, and other mentors no doubt have different criteria. I only hope that this post can give ideas to applicants looking to review their proposal.

Why? Software development can’t be planned!

This is a common objection to any form of planning for software projects. I don’t understand where it comes from.

Software development should be easier to plan than most endeavours: we have to deal with relatively few people, and, when we do, it’s rarely in an adversarial setting... and I’m willing to bet most of us can better forecast the behaviour of programs than that of people. There’s currently an electoral campaign in Québec, and I’m certain that, before the date was even announced, every single party had a precise plan for each day, at the riding, regional, and provincial levels. Politicians can do it; why can’t software developers?

Of course, there will be constant adjustments to the initial plan. The reason I think it’s important to spend time and thought on that plan isn’t to stick to it, but primarily because it forces us to systematically consider all sorts of potential issues, challenges, failure modes, etc. This exercise becomes useful when problems crop up: rather than making decisions in the heat of the moment, we can rely on our past (well rested and composed) self’s research, notes, and ideas.

I also find plans tend to become much more handwavy as we get farther in the future. That’s a normal tendency. I think we’re usually pretty good at short-term planning; most people can cook a meal without sitting down to figure out how they’ll cause awesome chemical reactions to occur without burning their house down. It’s a different matter when it comes to month-long camping trips or feeding a thousand people.

The difference is one of scale, and it’s exactly when we’re out of our comfort zone that mindful planning becomes useful. I expect an action plan to be as thorough in its last week as in its first. Most of it will be fiction, but, as I wrote above, the thinking involved in writing that fiction is what I’m after.

The overall structure I’m used to

There are tons of ways to describe a plan. I like this one, but the main thing is what I have in mind when I read each section. For instance, last year’s proposal for register allocation via graph colouring (PDF) had:

  1. the project’s aim, with a description of the high level goal (better register allocation) and of mechanisms to implement (new allocation heuristic, live range splitting and coalescing, etc.);
  2. a plan section, describing each proposed step (find examples of bad allocation, write a new simple allocator, write a visualisation tool, ...) in detail;
  3. a schedule at a 2-week granularity, with references to the detailed descriptions and deliverables for each step;
  4. a list of risks and challenges in the project;
  5. a short cv;
  6. why the student felt motivated and suitable for the project.

It’s a different structure than what I suggest below, but both mostly present the same information. My wish is to feel confident that the student has done the work to have that information.

First, a description of the goal. SBCL offers prefab proposals, and it’s tempting to just copy-paste. However, I feel much more confident in the intrinsic motivation of a student who’s made the project their own. Many of our suggestions are skeletons for projects; it should be easy to come up with ways to extend and better specify them according to the student’s interests. Others are already fleshed out; those are a bit harder, but even small variations on the initial theme are encouraging. In all cases, if I don’t feel like the student owns the project, I probably won’t rank the proposal highly, and certainly won’t be inclined to mentor them. The problem statement is a simple way to demonstrate one’s understanding of – and interest for – the project.

Second, a review of the current environment. What currently exists, in SBCL or elsewhere? Can we exploit interesting research or code? Has there been previous efforts, and how/why did they not pan out? Perhaps there’s a relevant post-mortem from a similar project that will help understand the problem domain. What potential difficulties do we know of? What don’t we know (known unknowns)? How bad do we estimate unknown unknowns to be, and why? What persons or resources could be helpful? To a certain extent, this is simply demonstrating that one’s done their due diligence. However, it also tells other people (e.g., potential mentors) how they can best work with the applicant. This section should be a good resource to re-read when bad things happen.

Third, the suggested course of action at a high level. Not only what would happen in a perfect execution, but, ideally, ways to address (ahead of time or when/if they occur) some of the difficulties and unknowns listed in the previous section.

Fourth, success indicators, a translation of the initial goal description into a few checkboxes. Ideally, the criteria are falsifiable... much like an experiment, with the null hypothesis being project failure. I find that checking even a few of these boxes gives me a good sense of closure on a project.

Finally, the calendar, to show how one might execute the third section and satisfy the success indicators in the fourth section. The first few weeks are usually easy to envision, but proposals tend to become fuzzier with time. I’m more confident in proposals that only use one- or two-week periods, with actions and a few testable milestones for each period. Shorter periods make fuzziness more obvious, but, more importantly, they combine with the milestones to let us tell whether we’re (or aren’t) getting traction.

In the unfortunate case that we’re somehow not making the expected progress, frequent milestones mean we can tell more quickly that something is off. We only have to make up for a one-week setback rather than for a month, i.e., one quarter of the GSoC program. The odds of outright project failure are lower, and we don’t risk feeling (as) down for wasting a month of work.

One thing that I find particularly scary in project calendars is “Magic Happens” steps. This seems more common with longer planning periods. Many proposals are quite detailed with respect to preliminary work and research, and to final integration and documentation. The issue is that most of the work is stashed in a 3-5 week interval during which the actual project somehow gets completed. It’s not clear what the student will do to get there (or even who will do the work ;). Planning each week individually makes such steps obvious.

I know that it’s hard to determine how we’ll achieve a goal when we’ve never done something similar before. But we can pretend. I want to feel that an applicant has a clear idea how to execute the task they propose, if only in an artificial perfect world. Otherwise, why should I believe they can do so in reality?

Planning is fiction

That’s it. I’m told that writers should show rather than tell. I feel the same about plans. I like to be shown how goals could be achieved in ideal and less-than-ideal worlds rather than told when steps will be completed. I think the most important point is that I don’t appreciate detailed plans because we should (or can) stick to them. Rather, I believe that the exercise of coming up with such a plan is an efficient way to ensure we can fruitfully respond to what’ll actually happen when we execute it.

The regalloc proposal was one of the longest ones last year, but I felt confident that the student had a good idea of what challenges made the task non-trivial. In the end, we followed only a small part of the plan: it’s hard to come up with a suite of bad regalloc examples, visualisation turned out not to scale to real code, and there was too little time to work on live range splitting. However, every time there was a change, Alexandra had no problem determining how to adjust the schedule and figuring out the next step. On my end, I could easily follow the modified schedule and see that we were making progress.

This planning exercise is not easy, and it’s certainly not quick. (FWIW, I find it useful to get someone else, even without relevant experience, to look at the plan and ask questions.) Action plans have helped me countless times: when I needed to react to problems; when I felt like I was running in circles but could tell for sure that I was making progress; when a project forcibly came to an end but I still had a sense of closure... Hopefully, they can do the same for SBCL GSoC students.

Hans HübnerBerlin Lispers Meetup: Tuesday March 25th, 2014, 8.00pm

· 26 days ago
You are kindly invited to the next "Berlin Lispers Meetup", an informal gathering
for anyone interested in Lisp, beer or coffee:

Berlin Lispers Meetup
Tuesday, March 25th, 2014
8 pm onwards

St. Oberholz, Rosenthaler Straße 72, 10119 Berlin
U-Bahn Rosenthaler Platz

We will try to occupy a large table on the first floor, but in case you don't see us,
please contact Christian: 0157 87 05 16 14.

Please join for another evening of parentheses!

Quicklisp newsMarch 2014 dist update now available

· 30 days ago
New projects: chirp, cl-flowd, defvariant, delta-debug, random.

Updated projects: access, alexandria, architecture.hooks, asdf-dependency-grovel, asdf-encodings, babel, caveman, cffi, cffi-objects, cl+ssl, cl-6502, cl-algebraic-data-type, cl-ana, cl-autorepo, cl-autowrap, cl-blapack, cl-charms, cl-colors, cl-data-format-validation, cl-dbi, cl-emb, cl-erlang-term, cl-indeterminism, cl-isaac, cl-lastfm, cl-launch, cl-murmurhash, cl-nxt, cl-olefs, cl-plplot, cl-prime-maker, cl-qprint, cl-quickcheck, cl-randist, cl-rethinkdb, cl-sdl2, cl-sendmail, cl-smtp, cl-syslog, cl-yaclyaml, cl-zmq, clache, clack, clack-errors, cletris, clfswm, climc, climon, clinch, closer-mop, clsql, coleslaw, com.informatimago, common-lisp-stat, dartsclhashtree, define-json-expander, defmacro-enhance, defstar, drakma, elf, ernestine, esrap-liquid, fare-memoization, fare-quasiquote, fare-utils, firephp, fnv, fset, gbbopen, gendl, graph, gtk-cffi, hu.dwim.def, hu.dwim.logger, hu.dwim.util, inferior-shell, integral, iolib, jsown, lambda-reader, lisp-matrix, lisp-unit2, lisp-zmq, lla, lquery, mcclim, monkeylib-prose-diff, more-conditions, optima, paiprolog, pgloader, postmodern, projectured, qmynd, repl-utilities, rutils, scribble, secure-random, single-threaded-ccl, slime, stumpwm, sxql, teepeedee2, trivial-channels, trivial-gray-streams, trivial-ldap, vgplot, weblocks, weblocks-examples, weblocks-stores, weblocks-utils, wookie, xml.location.

To update your libraries, use (ql:update-dist "quicklisp"). Enjoy!

ABCL Devabcl-1.3.0 released

· 31 days ago
abcl-1.3.0 is a feature release.

http://abcl.org/releases/1.3.0

15-MAR-2014

## Features

* Make LispStackFrame.UNAVAILABLE_ARG a singleton object,
and lazily create the little used portions of the Lisp stack.

Aggressively cache and control the use of memory by the underlying
Lisp stack frame representation by introducing the private
LispThread.StackFrame and LispThread.StackSegments classes.

Contributed by Dmitry Nadezhin.

LispStackFrame objects are allocated on every
LispThread.execute(...) call; however, they are seldom accessed
(verified by inspection of the stack trace). This patch delays
allocation of LispStackFrame objects until they are requested. Raw
information about stack frames is stored in the stack, which is an
Object[] array (more precisely, a list of 4 MiB Object[] arrays).

ME: We are going to need a way to grab 4 MiB chunks less aggressively
in low-memory situations.

Memory profiling of ABCL shows that the classes with largest
allocation count are org.armedbear.lisp.LispStackFrame and
org.armedbear.lisp.LispStackFrame.UnavailableArgument.

Contributed by Dmitry Nadezhin.

[r14572]: http://abcl.org/trac/changeset/14572
[r14579]: http://abcl.org/trac/changeset/14579

* ASDF 3.0.1.94 shipped with the implementation

* Per-function-call stack and memory exception handler in CL:COMPILE

Inline calls to jrun-exception-protected (used by handler-bind to
catch out-of-memory conditions). This commit saves the generation of
roughly 50 cls files.

[r14552]: http://abcl.org/trac/changeset/14552

* SYS:SHA256 audited

The functionality of the SYS:SHA256 algorithm has been audited for
use on inputs of single files with recently shipping ORCL Java 7
implementations (through jdk-1.7.0_51).

[r14582]: http://abcl.org/trac/changeset/14582

* Connect to NetBeans controlled JDWP via SLIME

The Netbeans IDE configuration now includes a way to connect to
the running-under-jdb ABCL via SLIME. One needs a version of
SLIME that can be loaded from its 'swank.asd' definition.

* Install 'abcl.jar' and 'abcl-contrib.jar' locally as Maven artifacts

The Ant `abcl.mvn.install` target now installs build artifacts
into the local Maven repository (Olof-Joachim Frahm)

[r14606]: http://abcl.org/trac/changeset/14606

## Compatibility

* CL:DIRECTORY

The default for the implementation-specific :RESOLVE-SYMLINKS
argument to the ANSI DIRECTORY function has been changed to NIL.
This implements behavior closer to SBCL's and guarantees that a
DIRECTORY operation will not signal a file error.

[r14619]: http://abcl.org/trac/changeset/14619
[ticket-340]: http://abcl.org/trac/ticket/340
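
A quick illustration (the pathname here is made up; :RESOLVE-SYMLINKS is
ABCL-specific, and passing T presumably restores the previous behaviour):

(directory #p"/var/log/*.log" :resolve-symlinks t)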

## Fixes

* Fix CL:SLEEP for intervals less than a millisecond.

For intervals less than or equal to a nanosecond, including an
interval of zero, the current thread merely yields execution to
other threads.

[r14632]: http://abcl.org/trac/changeset/14632


## Tested

### "Java_HotSpot(TM)_64-Bit_Server_VM-Oracle_Corporation-1.7.0_51-b13" "x86_64-Mac_OS_X-10.9.1"

### "Java_HotSpot(TM)_64-Bit_Server_VM-Oracle_Corporation-1.8.0-b129" "x86_64-Mac_OS_X-10.9.2"

## Contrib

#### abcl-asdf

* Now working with both Maven 3.0.x and 3.1.x. Thanks to Anton for
the help!

[ticket-328]: http://abcl.org/trac/ticket/328

* Cache Maven dependency resolution to avoid repeated lookups.

Instead of calling ABCL-ASDF:RESOLVE in both the ASDF COMPILE-OP
and LOAD-OP, we now cache the result of the invocation in COMPILE-OP
and add this value in the LOAD-OP phase. Contributed by Cyrus
Harmon.

[r14631]: http://abcl.org/trac/changeset/14631

#### jna

Now references jna-4.0.0. There is some incompatibility with CFFI
([a fix is in progress upstream][cffi-easye]).

[cffi-easye]: http://github.com/easye/cffi/


Christophe Rhodeshttp-content-negotiation-and-generalized-specializers

· 34 days ago

I promised a non-trivial example of a use for generalized specializers a while ago. Here it is: automatic handling of HTTP (RFC 2616) Content-Type negotiation in computed responses.

In RESTful services and things of that ilk, a client indicates that it wants to apply a verb (GET, for example) to a particular resource (named by a URN, or possibly identified by a URI). This resource is a conceptual object; it will have zero or more concrete manifestations, and content negotiation provides a way for the client to indicate which of those manifestations it would prefer to receive.

That's all a bit abstract. To take a concrete example, consider the woefully incomplete list of books in my living room at openlibrary. A user operating a graphical web browser to access that resource is presumed to want to retrieve HTML and associated resources, in order to view a shiny representation of the information associated with that resource (a "web page", if you will). But the human-oriented representation of the information is not the only possible one, and it is common practice in some circles to provide machine-readable representations as well as human-oriented ones, at the same URL; for example, try:

curl -H 'Accept: application/json' https://openlibrary.org/people/csrhodes/lists

and observe the difference between that and visiting the same URL in a graphical browser.

How does the web server know which representation to send? Well, the example has given away the punchline (if the links above to RFC sections haven't already). The graphical web browser will send an Accept header indicating that it prefers to receive objects with presentational content types - text/html, image/* and so on; the browser I have to hand sends text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 as its Accept header, meaning "please give me text/html or application/xhtml+xml, or failing that application/xml, or failing that anything else". If the server has more than one representation for a given resource, it can use this client-supplied information to provide the best possible content; if the client has particular requirements - for example, attempting to find machine-readable content for further processing - it can declare this by specifying particular acceptable content-types in its Accept header.
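
For concreteness, here is a minimal sketch (purely illustrative, and not the code behind the accept-generic-function machinery discussed below) of how such a header can be decomposed into media-type/quality pairs:

(defun parse-accept (accept)
  ;; Split ACCEPT on commas into entries, and each entry on semicolons
  ;; into a media type plus parameters; a missing q parameter means 1.0.
  (labels ((split (string char)
             (loop for start = 0 then (1+ end)
                   for end = (position char string :start start)
                   collect (string-trim " " (subseq string start end))
                   while end)))
    (loop for entry in (split accept #\,)
          collect (let* ((parts (split entry #\;))
                         (q-param (find-if (lambda (p)
                                             (and (> (length p) 2)
                                                  (string-equal "q=" p :end2 2)))
                                           (rest parts))))
                    (cons (first parts)
                          (if q-param
                              ;; fine for a sketch; real code would parse
                              ;; the number more defensively
                              (read-from-string (subseq q-param 2))
                              1.0))))))

;; (parse-accept "text/html,application/xml;q=0.9,*/*;q=0.8")
;; => (("text/html" . 1.0) ("application/xml" . 0.9) ("*/*" . 0.8))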

For a resource for which more than one representation exists, then, the server must dispatch between them based on the client Accept header. And this is exactly a non-standard dispatch of the kind I've been discussing. Consider a resource http://foo.example/ which is implemented by sending the return value of the generic function foo back to the client:

(defgeneric foo (request)
  (:generic-function-class accept-generic-function))

The default behaviour is somewhat a matter of taste, but one reasonable choice is that, if no content-type matches, we should use the HTTP status code defined for this purpose (406 Not Acceptable) to indicate that the responses we could generate are not acceptable to the client:

(defmethod foo ((request t))
  (http:406 request))

Maybe we have a couple of presentational representations for the resource:

(defmethod foo ((request (accept "text/plain")))
  "Foo")

(defmethod foo ((request (accept "text/html")))
  "<!DOCTYPE html>
<html>
 <head><title>Foo</title></head>
 <body><p>Foo</p></body>
</html>")

And we might have some machine-readable representations:

(defmethod foo ((request (accept "text/turtle")))
  "@prefix foo: <http://example.org/ns#> .
@prefix : <http://other.example.org/ns#> .
foo:bar foo: : .")

(defmethod foo ((request (accept "application/rdf+xml")))
  "<?xml version=\"1.0\"?>
<rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\"
         xmlns:foo=\"http://example.org/ns#\">
  <rdf:Description about=\"http://example.org/ns#bar\">
    <foo:>
      <rdf:Description about=\"http://other.example.org/ns#\"/>
    </foo:>
  </rdf:Description>
</rdf:RDF>")

(I apologize to any fans of XML/RDF if I have mangled that).

Now a graphical web browser sending an accept header of text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 as above will cause the server to send the HTML version, as that is the most specific applicable method to that accept string. Given this, it is perfectly possible to construct specialized clients with alternative preferences expressed in the accept header. A terminal-based client might prioritize text/plain over text/html (though in fact neither w3m nor lynx does that, at least in the versions I have installed). A client for the Semantic Web might instead accept data in serialized RDF, preferring more modern serializations, by sending an accept string of text/turtle,application/rdf+xml;q=0.9. All these clients could each be served the resource in their preferred format.
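
For instance, mirroring the curl invocation earlier, such a Semantic Web client might issue something like the following (a hypothetical command against the made-up http://foo.example/ resource from above) and receive the Turtle representation:

curl -H 'Accept: text/turtle,application/rdf+xml;q=0.9' http://foo.example/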

The case of serving one of a set of alternative files hosted on the filesystem in response to a request with an arbitrary accept string is different; in this case, it doesn't make sense to do the dispatch through methods. If we were to try to do so, it would look something like

(defmethod filesystem-handler (url (content-type (accept "text/html")))
  (let ((prospect (pathname-for-url url "html")))
    (if (probe-file prospect)
        (serve-file prospect "text/html")
        (call-next-method))))

but we would need to define one such method per possible mime-type we might want to serve: doable, but unnecessary compared with the alternative strategy of finding all content-types servable for a given url, then choosing the one with the highest score:

(defun filesystem-handler (url accept)
  (do* ((prospects (files-for url) (cdr prospects))
        (mime-types (mapcar #'mime-type prospects) (cdr mime-types))
        result mime-type (q 0))
       ((null prospects) (serve-file result mime-type))
     (when (> (q (car mime-types) accept) q)
       (setf result (car prospects) 
             mime-type (car mime-types) 
             q (q (car mime-types) accept)))))

(the set of files on the filesystem effectively already defines a set of methods for a given url; it doesn't make sense to try to mirror that as a set of reified methods on a generic function. Also, I've written this out using do* largely to keep the do*-is-not-that-bad society alive.)
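
The filesystem-handler above assumes a q function that scores a concrete mime-type against the client's accept string (files-for, mime-type and serve-file remain undefined here). A minimal sketch of q, reusing the hypothetical parse-accept from earlier and ignoring wildcard entries such as text/* and */*, which a real implementation would have to handle:

(defun q (mime-type accept)
  ;; Return the client-supplied quality value for MIME-TYPE in the
  ;; ACCEPT header string, or 0 if the client did not list it at all.
  (let ((entry (assoc mime-type (parse-accept accept) :test #'string-equal)))
    (if entry (cdr entry) 0)))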

Anyway. There's an interesting detail I've elided so far; not only do response-generating functions have to generate the content they wish to send in the response; they also have to indicate what content-type they are actually sending. Our accept-generic-function already handles dispatching on content-type; can it also take responsibility for setting the content-type of the response?

Why yes! The way to do this is using a method combination; it might look something like this:

(defvar *actual-content-type*)

(defgeneric handle-content-type (request))

(define-method-combination content-negotiation/or ()
  ((around (:around))
   (primary () :required t))
  (:arguments request)
  (labels ((transform/1 (method)
             `(make-method
               (progn
                 (let ((s (car (sb-mop:method-specializers ,method))))
                   (when (typep s 'accept-specializer)
                     (setf *actual-content-type* (media-type s))))
                 (call-method ,method))))
           (transform (primaries)
             (mapcar #'(lambda (x) `(call-method ,(transform/1 x)))
                     primaries))
           (wrap (form)
             `(let ((*actual-content-type*))
                (multiple-value-prog1
                    ,form
                  (handle-content-type ,request)))))
    (let ((form (if (rest primary)
                    `(or ,@(transform primary))
                    `(call-method ,(transform/1 (car primary))))))
      (if around
          (wrap `(call-method ,(first around)
                              (,@(rest around) (make-method ,form))))
          (wrap form)))))

This behaves just like the or built-in method-combination, except that when calling a primary method whose specializer for the first argument is one of our accept-specializers, the content-type of the specializer is stored in a special variable; the last thing the effective method does is to call the new handle-content-type generic function, passing it the original generic function's first argument.

Now let's redefine our foo generic function to have the new method combination, and a method on handle-content-type:

(defgeneric foo (request)
  (:generic-function-class accept-generic-function)
  (:method-combination content-negotiation/or))

(defmethod handle-content-type ((x string))
  (format t "Content-Type: ~A~%" *actual-content-type*))

and now, finally, we can try it all out:

SPECIALIZABLE> (foo "text/plain,text/html;q=0.9,*/*;q=0.8")
Content-Type: text/plain
"Foo"

SPECIALIZABLE> (foo "text/turtle,application/rdf+xml;q=0.9")
Content-Type: text/turtle
"@prefix foo: <http://example.org/ns#> .
@prefix : <http://other.example.org/ns#> .
foo:bar foo: : ."

SPECIALIZABLE> (foo "audio/mp3")
406

OK, but by what magic do these accept-specializer objects exist and function? I wrote a paper about that, with Jan Moringen and David Lichteblau: as part of my ongoing open access experimentation, the version we submitted to the European Lisp Symposium is viewable at Goldsmiths' e-prints repository or on arXiv. The ELS Chairs have just announced a deadline extension, so there's still time (until March 23) for anyone to submit technical papers or abstracts for tutorials and demonstration sessions: please do, and I hope to see many of my readers there.

Ben Hydetidy up the output of lisp macros

· 35 days ago

For some reason it makes my teeth hurt to have my macros generate code that I wouldn’t have written by hand.  For example it’s not hard to get code like this out of a macro expansion.

(let ()
  (progn
    (if (fp x)
      (progn 
         (f1 x)
         (f2 x)))))

v.s. what I might like:

(when (fp x)
   (f1 x)
   (f2 x))

I probably ought to just relax and ignore it, but instead I often revise macros so the code they generate is nicer to look at.   So that:

`(let ,vars ,@body)

becomes

(if vars
    `(let ,vars ,@body)
    `(progn ,@body))

This is silly!   Now I have ugly macros instead of ugly output.   I’m just moving the ugly bits around.

So I’ve started doing this:

(tidy-expression `(let ,vars ,@body))

where tidy-expression is something like this:

(defun tidy-expression (x)
  (match x
    (`(or ,a (or ,@b)) `(or ,a ,@b))
    (`(progn ,a (progn ,@b)) `(progn ,a ,@b))
    (`(progn ,x) x)
    (`(and ,a (and ,@b)) `(and ,a ,@b))
    (`(if ,a (progn ,@b)) `(when ,a ,@b))
    (`(if ,a (progn ,@b) (progn ,@c)) `(cond (,a ,@b) (t ,@c)))
    (`(if ,a (progn ,@b) ,c) `(cond (,a ,@b) (t ,c)))
    (`(if ,a ,b (progn ,@c)) `(cond (,a ,b) (t ,@c)))
    (`(let ,vs (progn ,@body)) (tidy-expression `(let ,vs ,@body)))
    (`(let nil ,@body) (tidy-expression `(progn ,@body)))
    (_ x)))

It’s another chapter in my crush on optima.

I write these tidy up functions as necessary.

That example only chews on the top of the form.   If you wanted something to clean up the first example you’d need to write tidy-expression-all, along the lines sketched after the example below.

(tidy-expression-all
 '(progn
   (if (fp x)
       (progn 
         (f1 x)
         (f2 x)))))
-->
(when (fp x) (f1 x) (f2 x))
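
A minimal sketch of such a tidy-expression-all, assuming plain proper-list structure (a more careful version would leave quoted data alone and know about binding forms):

(defun tidy-expression-all (x)
  ;; Tidy the children first, then give TIDY-EXPRESSION a chance to
  ;; simplify the rebuilt form, repeating until nothing changes.
  (if (consp x)
      (let* ((walked (mapcar #'tidy-expression-all x))
             (tidied (tidy-expression walked)))
        (if (equal tidied walked)
            walked
            (tidy-expression-all tidied)))
      x))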

This all reminds me of Warren Teitelman’s programmer’s assistant in Interlisp.  It reminds me of some of the things that flycheck in Emacs does for other programming languages.  It reminds me that I’ve been wondering what a lint for Common Lisp would look like.

I bet somebody already wrote a generalized tidy-expression and I just don’t know where to look.

Quicklisp newsUsing Quicklisp with an authenticated proxy

· 38 days ago
Scott Turner has written a guide to using Quicklisp with an authenticated proxy. Thanks, Scott!

LispjobsTelecommuting Common Lisp Developer, Chatsubo.net

· 41 days ago

Chatsubo.net is looking for a part-time, telecommute developer who has the following skills:

- ANSI Common Lisp
- Javascript & Node.js
- Unix systems programming
- Web application development
- Couchdb

Would be nice to have
- Database system implementation
- Experience with performance aspects of large memory mapped files

Chatsubo.net is currently working with an innovative startup company on an interesting refiguring of the online shopping experience.  The application is built on Common Lisp, Node.js, Couchdb and our own VivaceGraph.  We are looking for someone to help us take both this application and our in-house graph database (VG) from prototype to production-quality status.  The work would be part-time telecommute to start, with the possibility of becoming full time in the future. Please contact raison@chatsubo.net to discuss.


Nick LevineLW-P4: LispWorks / Perforce bindings

· 42 days ago
These bindings allow access from the LispWorks GUI to the Perforce SCM.
control-x p e  "P4 Edit"
control-x p S  "P4 Submit"
control-x p o  "P4 Opened"
control-x p r  "P4 Revert"

It's been some years since I released version 0.2. Since then I've thrown out the COM bit and now it no longer crashes, even a little bit. Also, it's no longer confined to Windows.

I've been using this (or its predecessor) daily for years. If I have any other users out there I'll be pleased to hear from them as it might improve the chances of there ever being a 0.4.

LispjobsSenior Common Lisp developers, Clozure Associates, Seattle, Boston or remote.

· 43 days ago

Senior Common Lisp Developer

Clozure Associates is seeking senior Common Lisp developers to join our team of engineers on an ambitious startup project.

Candidates should have significant professional or open source experience programming in modern Common Lisp.  A strong math background is desirable.  Other valuable skills and background experience include:

- Embedded systems
- Real-time systems
- Distributed systems
- Signal processing
- Linear systems
- Cryptography
- Communications theory
- Network Protocol Stacks
- Linux device drivers

Locations: Seattle Washington, Boston Massachusetts preferred. Remote work may be possible for highly qualified candidates.


Luís OliveiraSLIME 2.4

· 45 days ago
SLIME 2.4 has been released without any exciting release management, but with extensive release notes! :-)

Ben Hyde,@ v.s. ,.

· 46 days ago

I’m surprised that I didn’t know about the ,. construct in Common Lisp’s backquote syntax.  It is equivalent to ,@ except that it licenses the implementation to destructively modify the tail of the list being inlined.

cl-user> (defparameter *x* '(1 2 3))
*x*
cl-user> *x*
(1 2 3)
cl-user> `(a b ,@*x* c d)
(a b 1 2 3 c d)
cl-user> *x*
(1 2 3)
cl-user> `(a b ,.*x* c d)
(a b 1 2 3 c d)
cl-user> *x*
(1 2 3 c d)
cl-user>
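
One caveat: *x* above names a quoted literal, and destructively modifying literal data is technically undefined behaviour, so the transcript is best read as illustrative. The construct is clearly safe when the spliced list is freshly consed, along these lines:

cl-user> (let ((fresh (list 1 2 3)))
           `(a b ,.fresh c d))
(a b 1 2 3 c d)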

Christophe Rhodessbcl release management in the air

· 47 days ago

Just because I'm attending mobile world congress doesn't mean that everything else stops. It's the end of the month, so it must be time to release sbcl. This month's is a little unsatisfying, because we've only achieved one of the two things I'd hoped for: we've been cautious and conservative after last month's landing of the new register allocator, but we haven't sorted out what's going on to cause the less active architectures to fail to build. There are some workarounds that have been mooted; for one reason and another no-one has had the time to check whether they actually work, and while there's partial progress on identifying the root cause of the build failure on sparc it is only partial.

Nevertheless, minor architectures have been broken before, and no-one particularly benefits from not releasing, so 1.1.16 it is. Actually making the release is a little more challenging than usual: I aim to release by the end of the month, and in practice that means it must be done today, 28th February. However, this is the day that I am returning from Barcelona, so I am less in control of laptop power and connectivity than usual for a release day. And to add to the challenge, I am trying this time to address the valid complaints that the binaries built on my laptop don't actually run on released versions of Linux, thanks to the change in the semantics of memcpy (glibc changed the implementation in its version 2.14 to exploit the freedom given to return undefined results if the source and destination overlap) and the behaviour of the linker and versioned symbols.

So over breakfast I dusted off my squeeze chroot (that doesn't sound dodgy at all), and tried to work out how to get a runnable version of SBCL in there at all (answer: build using clisp and link to the chroot's libc). At lunchtime, I used the café's wireless to check for any last-minute changes, and in the airport I found a power socket, so I started the release build. Of course it didn't finish before the gate opened, and in any case I wasn't going to chance uploading sbcl binaries over the free airport wifi (15 minutes, woo!)

I've performed some release stunts before. SBCL 0.9 was released by William Harold Newman and simultaneously announced by me at the first European Common Lisp Meeting in Amsterdam in 2005. Last year, I released SBCL 1.1.8 "live" as part of a lightning talk at the European Lisp Symposium in Madrid. But I think this is the first time that an SBCL release was even partially effected from the air. Next stop, space?

Cyrus Harmonwhoops

· 48 days ago

Elias Martenson kindly pointed out that I forgot a progn around ,@body:

(defmacro with-java-stack-trace (&body body)  
  `(handler-case   
    (progn  
      ,@body)  
    (java:java-exception (e)  
     (print (#"getMessage" e))))) 

Cyrus Harmonwith-java-stack-trace

· 49 days ago

Following up on the java stack trace post, here's a little macro that can be used to wrap the lisp code that (eventually) triggers the java-side error:

(defmacro with-java-stack-trace (&body body)  
  `(handler-case   
    ,@body  
    (java:java-exception (e)  
     (print (#"getMessage" e))))) 

Note the use of java:java-exception in the handler-case clause so that we don't inadvertently trap lisp errors.
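
For example (an illustrative sketch, reusing the NPE-triggering call from the ABCL Error Handling post below):

(with-java-stack-trace
  ;; this will throw an NPE; the macro prints the exception message
  ;; via #"getMessage" instead of letting the error escape unexplained
  (java:jstatic-raw "getenv" "java.lang.System" nil))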

Cyrus HarmonABCL Error Handling

· 49 days ago

There's probably a better way to do this, but I have been having a difficult time trying to track down, from the lisp side of things, the cause of errors signaled from java code.

It turns out that we can use lisp's normal error handling facilities to work with java errors. The following snippet triggers a java NullPointerException and if we just evaluate this in SLIME we don't actually see the java backtrace (or at least I don't see it -- of course it would be nice if there were a way to do so).

(handler-case  
    ;; this will throw an NPE  
    (java:jstatic-raw "getenv" "java.lang.System" nil)  
  (error (e)  
    ;; this prints the stack trace to the jvm's standard out, which  
    ;; when running under slime, is our *inferior-lisp* buffer.  
    (print (#"printStackTrace" (java:java-exception-cause e))))) 

But this isn't so great as the stack trace is printed to the inferior-lisp buffer. To see it in SLIME's output buffers, we can use the exception's getMessage method as follows:

(handler-case  
    ;; this will throw an NPE  
    (java:jstatic-raw "getenv" "java.lang.System" nil)  
  (error (e)  
    ;; this prints the exception type and the stack trace to SLIME's STDOUT  
    (print (#"getMessage" e)))) 

Having this certainly makes it easier to find the source of errors in java code called from ABCL.

Christophe Rhodessbcl summer of code 2014 participation

· 50 days ago

I received notification yesterday that SBCL was accepted as a "mentoring organization" to Google's Summer of Code 2014 programme. What this means is that eligible students can think about applying to be mentored by members of the SBCL team to work on a development project - and to be paid a stipend for that work.

The next steps for students are:

The period for student applications to be mentored as part of SBCL's participation is from 10th March until 21st March 2014; for best results, get engaged with the community and potential mentors well before the cutoff point.

Last year two projects were funded, and both completed successfully, even if merging them into the mainline is/was slower than one might like. I'm looking forward to seeing what this year brings!

Michael ForsterInstalling LispWorks 6.1 Professional (32 bit) on Slackware64 14.1

· 51 days ago

I installed LispWorks 6.1 Professional (32-bit) on multilib Slackware64 14.1.

The Slackware64 multilib installation includes the required GTK+ 2 (>=2.4). However, attempting to run LispWorks gives the following error message:

$ /usr/local/lib/LispWorks/lispworks-6-1-0-x86-linux

(lispworks:30162): GdkPixbuf-WARNING **: Cannot open
   pixbuf loader module file 
   '/usr/lib/gdk-pixbuf-2.0/2.10.0/loaders.cache': No such
   file or directory

This likely means that your installation is broken.
Try running the command
    gdk-pixbuf-query-loaders \
      > /usr/lib/gdk-pixbuf-2.0/2.10.0/loaders.cache
to make things work again for the time being.
...

To remedy this, I ran the 32-bit version of gdk-pixbuf-query-loaders(1):

$ su -
# gdk-pixbuf-query-loaders-32 \
       > /usr/lib/gdk-pixbuf-2.0/2.10.0/loaders.cache

Following that, LispWorks started without incident.

Luís OliveiraCall for SLIME testers

· 52 days ago
As you may have heard, SLIME's recently moved to GitHub where, for the last 3 months or so, code has been refactored, bugs have been fixed and some new features have been introduced.

[Graph: SLIME's commit activity over the last 12 months]

The last SLIME release has a very long beard, so we'd like to make a new one next weekend. To make it a good release, it'd be really helpful if fellow SLIME users grabbed the latest code from github (via either git or the ZIP download link), took it for a spin and, if need be, used the issue tracker to report any issues that pop up.

Thanks!

Hans HübnerBerlin Lispers Meetup: Tuesday February 25th, 2014, 6.30pm

· 53 days ago
This month we will attend the talk

 "A Glance at Clojure" by Stefan Kamphausen.

The event is hosted by Acrolinx and is organized
by the Java Usergroup Berlin-Brandenburg.

You will find Acrolinx on the 6th floor in
Friedrichstr. 100, 10117 Berlin. The doors are
open from 6.30pm onwards, the talk starts at 7pm.

http://www.jug-berlin-brandenburg.de/blog/2014/02/10/clojure-list-fur-die-jvm/

In case you don't find us, please contact
Christian: 0157 87 05 16 14.

LispjobsExperienced Clojure/Datomic Engineer. New startup. Remote.

· 55 days ago

Experienced Clojure/Datomic Engineer

Anywhere in the world
Full time or contracting
Listora
Reply to henry.ec@gmail.com
New startup (3 months old) with a highly experienced (and awesome) team of researchers, engineers and designers. Although the project is still very much in its infancy (very greenfield), we have a big vision of being the global leader in structured events data, which will help to drive the event discovery experiences of the future.
In order to achieve this we’re developing a platform along with applications that meet the needs of event organisers, publishers and developers. It’s going to be a great challenge and we need exceptional and ambitious people to solve the problems that we’ll come up against.
We’re developing our core platform with Clojure and are also using Datomic – because they are the right tools for the problems that we’re solving. We have a very advanced stack that any passionate engineer would be excited about working on. We also regularly work in the open source community and are already releasing libraries.
We want to work with fantastic engineers and passionate people. This is one of the things that motivates us. We don’t mind where in the world you’re based, how many days a week you want to work or whether you want a permanent or contracting relationship.
In terms of product, we have an upfront investment in high quality and disciplined qual and quant research before we go into our prototyping phase. We are very open in what we do and have a toolset to build our products very closely with our customers. We have an advanced understanding of the behavioural needs of the market that we’re working in. We don’t work to deadlines, but we do spend a lot of time thinking about the prioritisation of what we are and could be building.
We are looking to add two engineers to our team that can hit the ground running. If you’re the right fit with our team we will work hard to look after you – from giving you the flexibility to explore ideas and technologies for the job at hand to remuneration.
If you have any questions, we would love to hear from you.


For older items, see the Planet Lisp Archives.


Last updated: 2014-04-14 12:20