Planet Lisp

Quicklisp newsJuly 2018 Quicklisp dist update now available

· 4 days ago
Hi everyone! There's a new Quicklisp update for July, and regular updates should resume on a monthly schedule.

I'm using a new release system that involves Docker for easier build server setup and management. It took a while to get going but it should (eventually) make it easier for others to run things in an environment similar to mine. For example, it has all the required foreign libraries needed to compile and load everything in Quicklisp.

Here's the info for the new update:

New projects:
  • april — April is a subset of the APL programming language that compiles to Common Lisp. — Apache-2.0
  • aws-foundation — Amazon AWS low-level utilities — BSD
  • binary-io — Library for reading and writing binary data. — BSD
  • cl-bip39 — A Common Lisp implementation of BIP-0039 — MIT
  • cl-bnf — A simple BNF parser. — MIT
  • cl-generator — cl-generator, a generator implementation for common lisp — MIT
  • cl-patterns — Pattern library for algorithmic music composition and performance in Common Lisp. — GNU General Public License v3.0
  • cl-progress-bar — Display progress bars directly in REPL. — MIT
  • clad — The CLAD System. — BSD
  • concrete-syntax-tree — Library for parsing Common Lisp code into a concrete syntax tree. — FreeBSD
  • definitions — General definitions reflection library. — Artistic
  • eclector — A Common Lisp reader that can adapt to different implementations, and that can return Concrete Syntax Trees — BSD
  • flute — A beautiful, easilly composable HTML5 generation library — MIT
  • froute — An Http routing class that takes advantage of the MOP — MIT
  • language-codes — A small library mapping language codes to language names. — Artistic
  • lichat-ldap — LDAP backend for the Lichat server profiles. — Artistic
  • multilang-documentation — A drop-in replacement for CL:DOCUMENTATION providing multi-language docstrings — Artistic
  • multiposter — A small application to post to multiple services at once. — Artistic
  • sandalphon.lambda-list — Lambda list parsing and usage — WTFPL
  • sel — Programmatic modification and evaluation of software — GPL 3
  • system-locale — System locale and language discovery — Artistic
  • taglib — Pure Lisp implementation to read (and write, perhaps, one day) tags — UNLICENSE 
  • terrable — Terragen TER file format reader — Artistic
  • tooter — A client library for Mastodon instances. — Artistic
  • trace-db — Writing, reading, storing, and searching of program traces — GPL 3
  • umbra — A library of reusable GPU shader functions. — MIT
Updated projects3d-matrices3d-vectorsagnostic-lizardalexaaws-sign4base-blobsbodge-blobs-supportbodge-chipmunkbodge-nanovgbodge-nuklearbordeaux-threadsbt-semaphorecavemanceplcepl.drm-gbmcerberuschirpcl-algebraic-data-typecl-asynccl-autorepocl-cognitocl-conllucl-darkskycl-flowcl-formscl-gamepadcl-geoscl-gobject-introspectioncl-gophercl-hamcrestcl-interpolcl-liballegrocl-libsvm-formatcl-mechanizecl-mpicl-muthcl-online-learningcl-pslibcl-pythoncl-random-forestcl-readlinecl-rediscl-rulescl-sdl2cl-strcl-tomlcl-yesqlclackclawclipcloser-mopclosure-htmlclssclxcodexcoleslawcommon-lisp-actorsconfiguration.optionscroatoancurry-compose-reader-macrosdelta-debugdeploydexadordjuladmldocumentation-utilsdocumentation-utils-extensionsdoubly-linked-listdufydynamic-mixinselffare-scriptsfemlispfftflareflexi-streamsforfreebsd-sysctlfxmlgamebox-frame-managergamebox-mathglsl-specglsl-toolkitglyphsgolden-utilsgraphgsllharmonyhelambdaphunchensocketironcladjoselichat-protocollichat-serverliblichat-tcp-serverlisp-chatlistopiamaidenmcclimmedia-typesmitoninevehomer-countopticloverlordoxenfurtparachuteparseqparser.inipathname-utilspgloaderphysical-quantitiesplumppostmodernppathpythonic-string-readerqbase64qlotqt-libsqtoolsrandom-samplerovertg-maths-dot2serapeumshadowsimple-flow-dispatcherslimespinneretstaplestring-casestumpwmsxqlthe-cost-of-nothingtriviatrivial-ldaptrivial-mmapubiquitousuiopunit-formulavarjowebsocket-driverwhofieldsxhtmlambdaxlsxxml-emitter.

Removed projects: binge, black-tie, cl-ctrnn, cl-directed-graph, cl-ledger, cl-scan, readable, spartns.

The projects removed either didn't build (cl-directed-graph) or are no longer available for download that I could find (everything else).

To get this update, use (ql:update-dist "quicklisp").


Paul KhuongThe Confidence Sequence Method: A Computer-age Test for Statistical SLOs

· 9 days ago

This post goes over some code that I pushed to github today. All the snippets below should be in the repo, which also includes C and Python code with the same structure.

I recently resumed thinking about balls and bins for hash tables. This time, I’m looking at large bins (on the order of one 2MB huge page). There are many hashing methods with solid worst-case guarantees that unfortunately query multiple uncorrelated locations; I feel like we could automatically adapt them to modern hierarchical storage (or address translation) to make them more efficient, for a small loss in density.

In theory, large enough bins can be allocated statically with a minimal waste of space. I wanted some actual non-asymptotic numbers, so I ran numerical experiments and got the following distribution of global utilisation (fill rate) when the first bin fills up.

It looks like, even with one thousand bins of thirty thousand values, we can expect almost 98% space utilisation until the first bin saturates. I want something more formal.

Could I establish something like a service level objective, “When distributing balls randomly between one thousand bins with individual capacity of thirty thousand balls, we can utilise at least 98% of the total space before a bin fills up, x% of the time?”

The natural way to compute the “x%” that makes the proposition true is to first fit a distribution on the observed data, then find out the probability mass for that distribution that lies above 98% fill rate. Fitting distributions takes a lot of judgment, and I’m not sure I trust myself that much.

Alternatively, we can observe independent identically distributed fill rates, check if they achieve 98% space utilisation, and bound the success rate for this Bernoulli process.

There are some non-trivial questions associated with this approach.

  1. How do we know when to stop generating more observations... without fooling ourselves with \(p\)-hacking?
  2. How can we generate something like a confidence interval for the success rate?

Thankfully, I have been sitting on a software package to compute satisfaction rate for exactly this kind of SLO-type properties, properties of the form “this indicator satisfies $PREDICATE x% of the time,” with arbitrarily bounded false positive rates.

The code takes care of adaptive stopping, generates a credible interval, and spits out a report like this : we see the threshold (0.98), the empirical success rate estimate (0.993 ≫ 0.98), a credible interval for the success rate, and the shape of the probability mass for success rates.

This post shows how to compute credible intervals for the Bernoulli’s success rate, how to implement a dynamic stopping criterion, and how to combine the two while compensating for multiple hypothesis testing. It also gives two examples of converting more general questions to SLO form, and answers them with the same code.

Credible intervals for the Binomial

If we run the same experiment \(n\) times, and observe \(a\) successes (\(b = n - a\) failures), it’s natural to ask for an estimate of the success rate \(p\) for the underlying Bernoulli process, assuming the observations are independent and identically distributed.

Intuitively, that estimate should be close to \(a / n\), the empirical success rate, but that’s not enough. I also want something that reflects the uncertainty associated with small \(n\), much like in the following ridge line plot, where different phrases are assigned not only a different average probability, but also a different spread.

I’m looking for an interval of plausible success rates \(p\) that responds to both the empirical success rate \(a / n\) and the sample size \(n\); that interval should be centered around \(a / n\), be wide when \(n\) is small, and become gradually tighter as \(n\) increases.

The Bayesian approach is straightforward, if we’re willing to shut up and calculate. Once we fix the underlying success rate \(p = \hat{p}\), the conditional probability of observing \(a\) successes and \(b\) failures is

\[P((a, b) | p = \hat{p}) \sim \hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b},\]

where the right-hand side is a proportion1, rather than a probability.

We can now apply Bayes’s theorem to invert the condition and the event. The inversion will give us the conditional probability that \(p = \hat{p}\), given that we observed \(a\) successes and \(b\) successes. We only need to impose a prior distribution on the underlying rate \(p\). For simplicity, I’ll go with the uniform \(U[0, 1]\), i.e., every success rate is equally plausible, at first. We find

\[P(p = \hat{p} | (a, b)) = \frac{P((a, b) | p = \hat{p}) P(p = \hat{p})}{P(a, b)}.\]

We already picked the uniform prior, \(P(p = \hat{p}) = 1\,\forall \hat{p}\in [0,1],\) and the denominator is a constant with respect to \(\hat{p}\). The expression simplifies to

\[P(p = \hat{p} | (a, b)) \sim \hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b},\]

or, if we normalise to obtain a probability,

\[P(p = \hat{p} | (a, b)) = \frac{\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}}{\int\sb{0}\sp{1} \hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\, d\hat{p}} = \textrm{Beta}(a+1, b+1).\]

A bit of calculation, and we find that our credibility estimate for the underlying success rate follows a Beta distribution. If one is really into statistics, they can observe that the uniform prior distribution is just the \(\textrm{Beta}(1, 1)\) distribution, and rederive that the Beta is the conjugate distribution for the Binomial distribution.

For me, it suffices to observe that the distribution \(\textrm{Beta}(a+1, b+1)\) is unimodal, does peak around \(a / (a + b)\), and becomes tighter as the number of observations grows. In the following image, I plotted three Beta distributions, all with empirical success rate 0.9; red corresponds to \(n = 10\) (\(a = 9\), \(b = 1\), \(\textrm{Beta}(10, 2)\)), black to \(n = 100\) (\(\textrm{Beta}(91, 11)\)), and blue to \(n = 1000\) (\(\textrm{Beta}(901, 101)\)).

We calculated, and we got something that matches my intuition. Before trying to understand what it means, let’s take a detour to simply plot points from that un-normalised proportion function \(\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\), on an arbitrary \(y\) axis.

Let \(\hat{p} = 0.4\), \(a = 901\), \(b = 101\). Naïvely entering the expression at the REPL yields nothing useful.

CL-USER> (* (expt 0.4d0 901) (expt (- 1 0.4d0) 101))

The issue here is that the un-normalised proportion is so small that it underflows double floats and becomes a round zero. We can guess that the normalisation factor \(\frac{1}{\mathrm{Beta}(\cdot,\cdot)}\) quickly grows very large, which will bring its own set of issues when we do care about the normalised probability.

How can we renormalise a set of points without underflow? The usual trick to handle extremely small or large magnitudes is to work in the log domain. Rather than computing \(\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\), we shall compute

\[\log\left[\hat{p}\sp{a} \cdot (1 - \hat{p})\sp{b}\right] = a \log\hat{p} + b \log (1 - \hat{p}).\]

CL-USER> (+ (* 901 (log 0.4d0)) (* 101 (log (- 1 0.4d0))))
CL-USER> (exp *)

That’s somewhat better: the log-domain value is not \(-\infty\), but converting it back to a regular value still gives us 0.

The \(\log\) function is monotonic, so we can find the maximum proportion value for a set of points, and divide everything by that maximum value to get plottable points. There’s one last thing that should change: when \(x\) is small, \(1 - x\) will round most of \(x\) away. Instead of (log (- 1 x)), we should use (log1p (- x)) to compute \(\log (1 + -x) = \log (1 - x)\). Common Lisp did not standardise log1p, but SBCL does have it in internals, as a wrapper around libm. We’ll just abuse that for now.

CL-USER> (defun proportion (x) (+ (* 901 (log x)) (* 101 (sb-kernel:%log1p (- x)))))
CL-USER> (defparameter *points* (loop for i from 1 upto 19 collect (/ i 20d0)))
CL-USER> (reduce #'max *points* :key #'proportion)

We have to normalise in the log domain, which is simply a subtraction: \(\log(x / y) = \log x - \log y\). In the case above, we will subtract \(-327.49\ldots\), or add a massive \(327.49\ldots\) to each log proportion (i.e., multiply by \(10\sp{142}\)). The resulting values should have a reasonably non-zero range.

CL-USER> (mapcar (lambda (x) (cons x (exp (- (proportion x) *)))) *points*)
((0.05d0 . 0.0d0)
 (0.1d0 . 0.0d0)
 (0.35d0 . 3.443943164733533d-288)
 (0.8d0 . 2.0682681158181894d-16) 
 (0.85d0 . 2.6252352579425913d-5)
 (0.9d0 . 1.0d0)
 (0.95d0 . 5.65506756824607d-10))

There’s finally some signal in there. This is still just an un-normalised proportion function, not a probability density function, but that’s already useful to show the general shape of the density function, something like the following, for \(\mathrm{Beta}(901, 101)\).

Finally, we have a probability density function for the Bayesian update of our belief about the success rate after \(n\) observations of a Bernoulli process, and we know how to compute its proportion function. Until now, I’ve carefully avoided the question of what all these computations even mean. No more (:

The Bayesian view assumes that the underlying success rate (the value we’re trying to estimate) is unknown, but sampled from some distribution. In our case, we assumed a uniform distribution, i.e., that every success rate is a priori equally likely. We then observe \(n\) outcomes (successes or failures), and assign an updated probability to each success rate. It’s like a many-world interpretation in which we assume we live in one of a set of worlds, each with a success rate sampled from the uniform distribution; after observing 900 successes and 100 failures, we’re more likely to be in a world where the success rate is 0.9 than in one where it’s 0.2. With Bayes’s theorem to formalise the update, we assign posterior probabilities to each potential success rate value.

We can compute an equal-tailed credible interval from that \(\mathrm{Beta}(a+1,b+1)\) posterior distribution by excluding the left-most values, \([0, l)\), such that the Beta CDF (cumulative distribution function) at \(l\) is \(\varepsilon / 2\), and doing the same with the right most values to cut away \(\varepsilon / 2\) of the probability density. The CDF for \(\mathrm{Beta}(a+1,b+1)\) at \(x\) is the incomplete beta function, \(I\sb{x}(a+1,b+1)\). That function is really hard to compute (this technical report detailing Algorithm 708 deploys five different evaluation strategies), so I’ll address that later.

The more orthodox “frequentist” approach to confidence intervals treats the whole experiment, from data colleaction to analysis (to publication, independent of the observations 😉) as an Atlantic City algorithm: if we allow a false positive rate of \(\varepsilon\) (e.g., \(\varepsilon=5\%\)), the experiment must return a confidence interval that includes the actual success rate (population statistic or parameter, in general) with probability \(1 - \varepsilon\), for any actual success rate (or underlying population statistic / parameter). When the procedure fails, with probability at most \(\varepsilon\), it is allowed to fail in an arbitrary manner.

The same Atlantic City logic applies to \(p\)-values. An experiment (data collection and analysis) that accepts when the \(p\)-value is at most \(0.05\) is an Atlantic City algorithm that returns a correct result (including “don’t know”) with probability at least \(0.95\), and is otherwise allowed to yield any result with probability at most \(0.05\). The \(p\)-value associated with a conclusion, e.g., “success rate is more than 0.8” (the confidence level associated with an interval) means something like “I’m pretty sure that the success rate is more than 0.8, because the odds of observing our data if that were false are small (less than 0.05).” If we set that threshold (of 0.05, in the example) ahead of time, we get an Atlantic City algorithm to determine if “the success rate is more than 0.8” with failure probability 0.05. (In practice, reporting is censored in all sorts of ways, so...)

There are ways to recover a classical confidence interval, given \(n\) observations from a Bernoulli. However, they’re pretty convoluted, and, as Jaynes argues in his note on confidence intervals, the classical approach gives values that are roughly the same2 as the Bayesian approach... so I’ll just use the Bayesian credibility interval instead.

See this stackexchange post for a lot more details.

Dynamic stopping for Binomial testing

The way statistics are usually deployed is that someone collects a data set, as rich as is practical, and squeezes that static data set dry for significant results. That’s exactly the setting for the credible interval computation I sketched in the previous section.

When studying the properties of computer programs or systems, we can usually generate additional data on demand, given more time. The problem is knowing when it’s ok to stop wasting computer time, because we have enough data... and how to determine that without running into multiple hypothesis testing issues (ask anyone who’s run A/B tests).

Here’s an example of an intuitive but completely broken dynamic stopping criterion. Let’s say we’re trying to find out if the success rate is less than or greater than 90%, and are willing to be wrong 5% of the time. We could get \(k\) data points, run a statistical test on those data points, and stop if the data let us conclude with 95% confidence that the underlying success rate differs from 90%. Otherwise, collect \(2k\) fresh points, run the same test; collect \(4k, \ldots, 2\sp{i}k\) points. Eventually, we’ll have enough data.

The issue is that each time we execute the statistical test that determines if we should stop, we run a 5% risk of being totally wrong. For an extreme example, if the success rate is exactly 90%, we will eventually stop, with probability 1. When we do stop, we’ll inevitably conclude that the success rate differs from 90%, and we will be wrong. The worst-case (over all underlying success rates) false positive rate is 100%, not 5%!

In my experience, programmers tend to sidestep the question by wasting CPU time with a large, fixed, number of iterations... people are then less likely to run our statistical tests, since they’re so slow, and everyone loses (the other popular option is to impose a reasonable CPU budget, with error thresholds so lax we end up with a smoke test).

Robbins, in Statistical Methods Related to the Law of the Iterated Logarithm, introduces a criterion that, given a threshold success rate \(p\) and a sequence of (infinitely many!) observations from the same Bernoulli with unknown success rate parameter, will be satisfied infinitely often when \(p\) differs from the Bernoulli’s success rate. Crucially, Robbins also bounds the false positive rate, the probability that the criterion be satisfied even once in the infinite sequence of observations if the Bernoulli’s unknown success rate is exactly equal to \(p\). That criterion is

\[{n \choose a} p\sp{a} (1-p)\sp{n-a} \leq \frac{\varepsilon}{n+1},\]

where \(n\) is the number of observations, \(a\) the number of successes, \(p\) the threshold success rate, and \(\varepsilon\) the error (false positive) rate. As the number of observation grows, the criterion becomes more and more stringent to maintain a bounded false positive rate over the whole infinite sequence of observations.

There are similar “Confidence Sequence” results for other distributions (see, for example, this paper of Lai), but we only care about the Binomial here.

More recently, Ding, Gandy, and Hahn showed that Robbins’s criterion also guarantees that, when it is satisfied, the empirical success rate (\(a/n\)) lies on the correct side of the threshold \(p\) (same side as the actual unknown success rate) with probability \(1-\varepsilon\). This result leads them to propose the use of Robbins’s criterion to stop Monte Carlo statistical tests, which they refer to as the Confidence Sequence Method (CSM).

(defun csm-stop-p (successes failures threshold eps)
  "Pseudocode, this will not work on a real machine."
  (let ((n (+ successes failures)))
    (<= (* (choose n successes) 
           (expt threshold successes)
           (expt (- 1 threshold) failures))
        (/ eps (1+ n)))))

We may call this predicate at any time with more independent and identically distributed results, and stop as soon as it returns true.

The CSM is simple (it’s all in Robbins’s criterion), but still provides good guarantees. The downside is that it is conservative when we have a limit on the number of observations: the method “hedges” against the possibility of having a false positive in the infinite number of observations after the limit, observations we will never make. For computer-generated data sets, I think having a principled limit is pretty good; it’s not ideal to ask for more data than strictly necessary, but not a blocker either.

In practice, there are still real obstacles to implementing the CSM on computers with finite precision (floating point) arithmetic, especially since I want to preserve the method’s theoretical guarantees (i.e., make sure rounding is one-sided to overestimate the left-hand side of the inequality).

If we implement the expression well, the effect of rounding on correctness should be less than marginal. However, I don’t want to be stuck wondering if my bad results are due to known approximation errors in the method, rather than errors in the code. Moreover, if we do have a tight expression with little rounding errors, adjusting it to make the errors one-sided should have almost no impact. That seems like a good trade-off to me, especially if I’m going to use the CSM semi-automatically, in continuous integration scripts, for example.

One look at csm-stop-p shows we’ll have the same problem we had with the proportion function for the Beta distribution: we’re multiplying very small and very large values. We’ll apply the same fix: work in the log domain and exploit \(\log\)’s monotonicity.

\[{n \choose a} p\sp{a} (1-p)\sp{n-a} \leq \frac{\varepsilon}{n+1}\]


\[\log {n \choose a} + a \log p + (n-a)\log (1-p) \leq \log\varepsilon -\log(n+1),\]

or, after some more expansions, and with \(b = n - a\),

\[\log n! - \log a! - \log b! + a \log p + b \log(1 - p) + \log(n+1) \leq \log\varepsilon.\]

The new obstacle is computing the factorial \(x!\), or the log-factorial \(\log x!\). We shouldn’t compute the factorial iteratively: otherwise, we could spend more time in the stopping criterion than in the data generation subroutine. Robbins has another useful result for us:

\[\sqrt{2\pi} n\sp{n + ½} \exp(-n) \exp\left(\frac{1}{12n+1}\right) < n! < \sqrt{2\pi} n\sp{n + ½} \exp(-n) \exp\left(\frac{1}{12n}\right),\]

or, in the log domain,

\[\log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n + \frac{1}{12n+1} < \log n! < \log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n +\frac{1}{12n}.\]

This double inequality gives us a way to over-approximate \(\log {n \choose a} = \log \frac{n!}{a! b!} = \log n! - \log a! - \log b!,\) where \(b = n - a\):

\[\log {n \choose a} < -\log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n +\frac{1}{12n} - \left(a + \frac{1}{2}\right)\log a +a - \frac{1}{12a+1} - \left(b + \frac{1}{2}\right)\log b +b - \frac{1}{12b+1},\]

where the right-most expression in Robbins’s double inequality replaces \(\log n!\), which must be over-approximated, and the left-most \(\log a!\) and \(\log b!\), which must be under-approximated.

Robbins’s approximation works well for us because, it is one-sided, and guarantees that the (relative) error in \(n!\), \(\frac{\exp\left(\frac{1}{12n}\right) - \exp\left(\frac{1}{12n+1}\right)}{n!},\) is small, even for small values like \(n = 5\) (error \(< 0.0023\%\)), and decreases with \(n\): as we perform more trials, the approximation is increasingly accurate, thus less likely to spuriously prevent us from stopping.

Now that we have a conservative approximation of Robbins’s criterion that only needs the four arithmetic operations and logarithms (and log1p), we can implement it on a real computer. The only challenge left is regular floating point arithmetic stuff: if rounding must occur, we must make sure it is in a safe (conservative) direction for our predicate.

Hardware usually lets us manipulate the rounding mode to force floating point arithmetic operations to round up or down, instead of the usual round to even. However, that tends to be slow, so most language (implementations) don’t support changing the rounding mode, or do so badly... which leaves us in a multi-decade hardware/software co-evolution Catch-22.

I could think hard and derive tight bounds on the round-off error, but I’d rather apply a bit of brute force. IEEE-754 compliant implementations must round the four basic operations correctly. This means that \(z = x \oplus y\) is at most half a ULP away from \(x + y,\) and thus either \(z = x \oplus y \geq x + y,\) or the next floating point value after \(z,\) \(z^\prime \geq x + y\). We can find this “next value” portably in Common Lisp, with decode-float/scale-float, and some hand-waving for denormals.

(defun next (x &optional (delta 1))
  "Increment x by delta ULPs. Very conservative for
   small (0/denormalised) values."
  (declare (type double-float x)
           (type unsigned-byte delta))
  (let* ((exponent (nth-value 1 (decode-float x)))
         (ulp (max (scale-float double-float-epsilon exponent)
    (+ x (* delta ulp))))

I prefer to manipulate IEEE-754 bits directly. That’s theoretically not portable, but the platforms I care about make sure we can treat floats as sign-magnitude integers.

  (declaim (inline %float-bits %bits-float next prev))
  (defun %float-bits (x)
    "Convert a double float x to sign-extended sign/magnitude, and
     then to 2's complement."
    (declare (type double-float x))
    (let* ((hi (sb-kernel:double-float-high-bits x))
           (lo (sb-kernel:double-float-low-bits x))
           (word (+ (ash (ldb (byte 31 0) hi) 32) lo)))
      ;; hi is the high half of the 64 bit sign-magnitude
      ;; representation... in two's complement. Extract the significand,
      ;; and then apply the sign bit. We want to preserve signed zeros,
      ;; so return -1 - word instead of -word.
      ;; (- -1 word) = (lognot word) = (logxor word -1).
      (logxor word (ash hi -32))))

  (defun %bits-float (bits)
    "Convert 2's complement to sign-extended sign/magnitude, then
     double float."
    (declare (type (signed-byte 64) bits))
    ;; convert back to sign-magnitude: if bits is negative, all but the
    ;; sign bit must be flipped again.
    (let ((bits (logxor bits
                        (ldb (byte 63 0) (ash bits -64)))))
      (sb-kernel:make-double-float (ash bits -32)
                                   (ldb (byte 32 0) bits))))

  (defun next (x &optional (delta 1))
    "Increment x by delta ULPs."
    (declare (type double-float x)
             (type unsigned-byte delta))
    (%bits-float (+ (%float-bits x) delta)))

  (defun prev (x &optional (delta 1))
    "Decrement x by delta ULPs."
    (declare (type double-float x)
             (type unsigned-byte delta))
    (%bits-float (- (%float-bits x) delta))))
CL-USER> (double-float-bits pi)
CL-USER> (double-float-bits (- pi))

The two’s complement value for pi is one less than (- (double-float-bits pi)) because two’s complement does not support signed zeros.

CL-USER> (eql 0 (- 0))
CL-USER> (eql 0d0 (- 0d0))
CL-USER> (double-float-bits 0d0)
CL-USER> (double-float-bits -0d0)

We can quickly check that the round trip from float to integer and back is an identity.

CL-USER> (eql pi (bits-double-float (double-float-bits pi)))
CL-USER> (eql (- pi) (bits-double-float (double-float-bits (- pi))))
CL-USER> (eql 0d0 (bits-double-float (double-float-bits 0d0)))
CL-USER> (eql -0d0 (bits-double-float (double-float-bits -0d0)))

We can also check that incrementing or decrementing the integer representation does increase or decrease the floating point value.

CL-USER> (< (bits-double-float (1- (double-float-bits pi))) pi)
CL-USER> (< (bits-double-float (1- (double-float-bits (- pi)))) (- pi))
CL-USER> (bits-double-float (1- (double-float-bits 0d0)))
CL-USER> (bits-double-float (1+ (double-float-bits -0d0)))
CL-USER> (bits-double-float (1+ (double-float-bits 0d0)))
CL-USER> (bits-double-float (1- (double-float-bits -0d0)))

The code doesn’t handle special values like infinities or NaNs, but that’s out of scope for the CSM criterion anyway. That’s all we need to nudge the result of the four operations to guarantee an over- or under- approximation of the real value. We can also look at the documentation for our libm (e.g., for GNU libm) to find error bounds on functions like log; GNU claims their log is never off by more than 3 ULP. We can round up to the fourth next floating point value to obtain a conservative upper bound on \(\log x\).

(declaim (type (unsigned-byte 31) *libm-error-limit*))
(defvar *libm-error-limit* 4
  "Assume libm is off by less than 4 ULPs.")

(declaim (inline log-up log-down))
(defun log-up (x)
  "Conservative upper bound on log(x)."
  (declare (type double-float x))
  (next (log x) *libm-error-limit*))

(defun log-down (x)
  "Conservative lower bound on log(x)."
  (declare (type double-float x))
  (prev (log x) *libm-error-limit*))

  (declaim (inline log1p-up log1p-down))
  (defun log1p-up (x)
    "Convervative upper bound on log(1 + x)."
    (declare (type double-float x))
    (next (sb-kernel:%log1p x) *libm-error-limit*))

  (defun log1p-down (x)
    "Conservative lower bound on log(1 + x)"
    (declare (type double-float x))
    (prev (sb-kernel:%log1p x) *libm-error-limit*)))

I could go ahead and use the building blocks above (ULP nudging for directed rounding) to directly implement Robbins’s criterion,

\[\log {n \choose a} + a \log p + b\log (1-p) + \log(n+1) \leq \log\varepsilon,\]

with Robbins’s factorial approximation,

\[\log {n \choose a} < -\log\sqrt{2\pi} + \left(n + \frac{1}{2}\right)\log n -n +\frac{1}{12n} - \left(a + \frac{1}{2}\right)\log a +a - \frac{1}{12a+1} - \left(b + \frac{1}{2}\right)\log b +b - \frac{1}{12b+1}.\]

However, even in the log domain, there’s a lot of cancellation: we’re taking the difference of relatively large numbers to find a small result. It’s possible to avoid that by re-associating some of the terms above, e.g., for \(a\):

\[-\left(a + \frac{1}{2}\right) \log a + a - a \log p = -\frac{\log a}{2} + a (-\log a + 1 - \log p).\]

Instead, I’ll just brute force things (again) with Kahan summation. Shewchuk’s presentation in Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates highlights how the only step where we may lose precision to rounding is when we add the current compensation term to the new summand. We can implement Kahan summation with directed rounding in only that one place: all the other operations are exact!

“kahan summation”
;;; Kahan-style summation.
;;; Represent the accumulator as an evaluated sum of two doubles. As
;;; long as the compensation term is initially 0, the result is a safe
;;; upper bound on the real value, and the two terms are
;;; "non-overlapping."  For more details, see "Adaptive Precision
;;; Floating-Point Arithmetic and Fast Robust Geometric Predicates",
;;; Shewchuk, 1997; Technical report CMU-CS-96-140R / Discrete & Comp
;;; Geom 18(3), October 1997. Theorem 6 in particular.

(declaim (inline sum-update-up sum-update-finish))
(defun sum-update-up (accumulator compensation term &optional ordered)
  "Given an evaluated sum
     (accumulator + compensation),
   return a new unevaluated sum for an upper bound on
     (accumulator + compensation + term).

   If ordered, assume
     term < accumulator,
     accumulator = compensation = 0."
  (declare (type double-float accumulator compensation
  (when (and (not ordered)
             (< (abs accumulator) (abs term)))
    (rotatef accumulator term))
  (let* ((rest-1 (next (+ compensation term))) ; safe upper bound on c + t
         (rest (if (<= compensation 0d0)       ; tighter, still safe.
                   (min term rest-1)
         ;; Perform a Dekker sum of accumulator + rest. The result is
         ;; exact, so no need for next/prev here.
         ;; Precondition: |accumulator| >= |rest| (or accumulator = 0).
         (a accumulator)
         (b rest)
         (x (+ a b))
         (b-virtual (- x a))     ; b-virtual = value really added to a
         (y (- b b-virtual)))
    (values x y)))

(defun sum-update-finish (accumulator compensation)
  "Return a conservative upper bound for accumulator + compensation.

   In theory, (+ accumulator compensation) is equal to accumulator.
   In practice, it doesn't hurt to do this right. The second return
   value is the new compensation term (should never be positive)."
  (declare (type double-float accumulator compensation))
  (let* ((raw-sum (next (+ accumulator compensation)))
         (sum (if (> compensation 0d0)
                  ;; if compensation <= 0, acc is already an upper
                  ;; bound.
                  (min accumulator raw-sum)))
         (delta (- sum accumulator)))
    (assert (>= delta compensation))
    (values sum (- compensation delta))))

(declaim (ftype (function (&rest double-float)
                          (values double-float double-float &optional))
(defun sum-up (&rest values)
  "Conservative upper bound for the sum of values, with a Kahan
   summation loop."
  (let ((acc 0d0)
        (err 0d0))
    (dolist (value values (sum-update-finish acc err))
      (setf (values acc err)
            (sum-update-up acc err value)))))

We need one last thing to implement \(\log {n \choose a}\), and then Robbins’s confidence sequence: a safely rounded floating-point value approximation of \(-\log \sqrt{2 \pi}\). I precomputed one with computable-reals:

CL-USER> (computable-reals:-r
           (computable-reals:sqrt-r computable-reals:+2pi-r+)))
CL-USER> (computable-reals:ceiling-r 
          (computable-reals:*r *
                               (ash 1 53)))
CL-USER> (* -8277062471433908 (expt 2d0 -53))
CL-USER> (computable-reals:-r (rational *)

We can safely replace \(-\log\sqrt{2\pi}\) with -0.9189385332046727d0, or, equivalently, (scale-float -8277062471433908.0d0 -53), for an upper bound. If we wanted a lower bound, we could decrement the integer significand by one.

;;; Upper bound for log c(n, s).

(declaim (type double-float *minus-log-sqrt-2pi*))
(defvar *minus-log-sqrt-2pi* -0.9189385332046727d0
  "Smallest double precision value > -log sqrt(2pi).")

(declaim (ftype (function ((unsigned-byte 49) (unsigned-byte 49))
                          (values double-float double-float &optional))
(defun robbins-log-choose (n s)
  "Compute a conservative upper bound on log c(n, s) based on
   Robbins's bounds for k!."
  (check-type n (unsigned-byte 49)) ;; ensure 53 bit arith is exact.
  (check-type s (unsigned-byte 49))
  (assert (<= 0 s n))
  ;; Handle easy cases, where c(n, s) is 1 or n.
  (when (or (= n s)
            (zerop s))
    (return-from robbins-log-choose (values 0d0 0d0)))
  (when (or (= s 1)
            (= s (1- n)))
    (return-from robbins-log-choose (values (log-up (float n 1d0))
  (let* ((n (float n 1d0))
         (s (float s 1d0))
         (n-s (float (- n s) 1d0))
         (l1 (next (* (+ n .5d0) (log-up n)))) ; (+ n .5d0) is exact.
         (l2 (next (- (* (+ s .5d0) (log-down s)))))
         (l3 (next (- (* (+ n-s .5d0) (log-down n-s)))))
         (r1 (next (/ (* 12d0 n))))          ; (* 12d0 n) is exact.
         (r2 (next (- (/ (1+ (* 12d0 s)))))) ; also exact.
         (r3 (next (- (/ (1+ (* 12d0 n-s)))))))
    (sum-up *minus-log-sqrt-2pi*
            l1 l2 l3
            r1 r2 r3)))

We can quickly check against an exact implementation with computable-reals and a brute force factorial.

CL-USER> (defun cr-log-choose (n s)
            (computable-reals:log-r (alexandria:factorial n))
            (computable-reals:log-r (alexandria:factorial s))
            (computable-reals:log-r (alexandria:factorial (- n s)))))
CL-USER> (computable-reals:-r (rational (robbins-log-choose 10 5))
                              (cr-log-choose 10 5))
CL-USER> (computable-reals:-r (rational (robbins-log-choose 1000 500))
                              (cr-log-choose 1000 500))
CL-USER> (computable-reals:-r (rational (robbins-log-choose 1000 5))
                              (cr-log-choose 1000 5))

That’s not obviously broken: the error is pretty small, and always positive.

Given a function to over-approximate log-choose, the Confidence Sequence Method’s stopping criterion is straightforward.

(declaim (ftype (function ((unsigned-byte 49)
                           (real (0) (1))
                           (unsigned-byte 49)
                          (values boolean double-float &optional))
(defun csm (n alpha s log-eps)
  "Given n trials and s sucesses, are we reasonably sure that the
  success rate is *not* alpha (with a false positive rate < exp(log-eps))?

  Answer that question with Ding, Gandy, and Hahn's confidence
  sequence method (CSM). The second return value is an estimate of the
  false positive target rate we would need to stop here. This value
  should only be used for reporting; the target rate eps should always
  be fixed before starting the experiment."
  (check-type n (unsigned-byte 49))
  (check-type alpha (real (0) (1)))
  (check-type s (unsigned-byte 49))
  (check-type log-eps real)
  (assert (<= 0 s n))
  (let* ((log-choose (robbins-log-choose n s))
         (n (float n 1d0))
         (alpha (float alpha 1d0))
         (s (float s 1d0))
         (log-eps (float log-eps 1d0))
         (log-level (sum-up (log-up (1+ n))
                            (next (* s (log-up alpha)))
                            (next (* (- n s) (log1p-up (- alpha)))))))
    (values (< log-level log-eps) log-level)))

The other, much harder, part is computing credible (Bayesian) intervals for the Beta distribution. I won’t go over the code, but the basic strategy is to invert the CDF, a monotonic function, by bisection3, and to assume we’re looking for improbable (\(\mathrm{cdf} < 0.5\)) thresholds. This assumption lets us pick a simple hypergeometric series that is normally useless, but converges well for \(x\) that correspond to such small cumulative probabilities; when the series converges too slowly, it’s always conservative to assume that \(x\) is too central (not extreme enough).

That’s all we need to demo the code. Looking at the distribution of fill rates for the 1000 bins @ 30K ball/bin facet in

it looks like we almost always hit at least 97.5% global density, let’s say with probability at least 98%. We can ask the CSM to tell us when we have enough data to confirm or disprove that hypothesis, with a 0.1% false positive rate.

Instead of generating more data on demand, I’ll keep things simple and prepopulate a list with new independently observed fill rates.

CL-USER> (defparameter *observations* '(0.978518900
CL-USER> (defun test (n)
           (let ((count (count-if (lambda (x) (>= x 0.975))
                                  :end n)))
             (csm:csm n 0.98d0 count (log 0.001d0))))
CL-USER> (test 10)
CL-USER> (test 100)
CL-USER> (test 1000)
CL-USER> (test 2000)
CL-USER> (test 4000)

We can also use the inverse Beta CDF to get a 99.9% credible interval. After 4000 trials, we found 3972 successes.

CL-USER> (count-if (lambda (x) (>= x 0.975))
                   :end 4000)

These values give us the following lower and upper bounds on the 99.9% CI.

CL-USER> (csm:beta-icdf 3972 (- 4000 3972) 0.001d0)
CL-USER> (csm:beta-icdf 3972 (- 4000 3972) 0.001d0 t)

And we can even re-use and extend the Beta proportion code from earlier to generate this embeddable SVG report.

There’s one small problem with the sample usage above: if we compute the stopping criterion with a false positive rate of 0.1%, and do the same for each end of the credible interval, our total false positive (error) rate might actually be 0.3%! The next section will address that, and the equally important problem of estimating power.

Monte Carlo power estimation

It’s not always practical to generate data forever. For example, we might want to bound the number of iterations we’re willing to waste in an automated testing script. When there is a bound on the sample size, the CSM is still correct, just conservative.

We would then like to know the probability that the CSM will stop successfully when the underlying success rate differs from the threshold rate \(p\) (alpha in the code). The problem here is that, for any bounded number of iterations, we can come up with an underlying success rate so close to \(p\) (but still different) that the CSM can’t reliably distinguish between the two.

If we want to be able to guarantee any termination rate, we need two thresholds: the CSM will stop whenever it’s likely that the underlying success rate differs from either of them. The hardest probability to distinguish from both thresholds is close to the midpoint between them.

With two thresholds and the credible interval, we’re running three tests in parallel. I’ll apply a Bonferroni correction, and use \(\varepsilon / 3\) for each of the two CSM tests, and \(\varepsilon / 6\) for each end of the CI.

That logic is encapsulated in csm-driver. We only have to pass a success value generator function to the driver. In our case, the generator is itself a call to csm-driver, with fixed thresholds (e.g., 96% and 98%), and a Bernoulli sampler (e.g., return T with probability 97%). We can see if the driver returns successfully and correctly at each invocation of the generator function, with the parameters we would use in production, and recursively compute an estimate for that procedure’s success rate with CSM. The following expression simulates a CSM procedure with thresholds at 96% and 98%, the (usually unknown) underlying success rate in the middle, at 97%, a false positive rate of at most 0.1%, and an iteration limit of ten thousand trials. We pass that simulation’s result to csm-driver, and ask whether the simulation’s success rate differs from 99%, while allowing one in a million false positives.

CL-USER> (labels ((bernoulli (i &aux; (p 0.97d0))
                    (declare (ignore i))
                    (< (random 1d0) p))
                  (generator (i &aux; (p 0.97d0)
                                     (alpha 0.96d0) (alpha-hi 0.98d0)
                                     (eps 1d-3) (max-count 10000))
                    (declare (ignore i))
                    (multiple-value-bind (success success-hi estimate)
                        (csm:csm-driver #'bernoulli alpha eps
                                        :alpha-hi alpha-hi
                                        :max-count max-count)
                      ;; check that the CSM succeeds, and that it does so
                      ;; with correct estimates.
                      (let ((correct-alpha (if (< p alpha)
                                               (< estimate alpha)
                                               (> estimate alpha)))
                            (correct-hi (if (< p alpha-hi)
                                            (< estimate alpha-hi)
                                            (> estimate alpha-hi))))
                        (cond ((and success success-hi)
                               (and correct-alpha correct-hi))
           (csm:csm-driver #'generator 0.99d0 1d-6))

We find that yes, we can expect the 96%/98%/0.1% false positive/10K iterations setup to succeed more than 99% of the time. The code above is available as csm-power, with a tighter outer false positive rate of 1e-9. If we only allow 1000 iterations, csm-power quickly tells us that, with one CSM success in 100 attempts, we can expect the CSM success rate to be less than 99%.

CL-USER> (csm:csm-power 0.97d0 0.96d0 1000 :alpha-hi 0.98d0 :eps 1d-3 :stream *standard-output*)
         1 0.000e+0 1.250e-10 10.000e-1 1.699e+0
        10 0.000e+0 0.000e+0 8.660e-1 1.896e+1
        20 0.000e+0 0.000e+0 6.511e-1 3.868e+1
        30 0.000e+0 0.000e+0 5.099e-1 5.851e+1
        40 2.500e-2 5.518e-7 4.659e-1 7.479e+1
        50 2.000e-2 4.425e-7 3.952e-1 9.460e+1
        60 1.667e-2 3.694e-7 3.427e-1 1.144e+2
        70 1.429e-2 3.170e-7 3.024e-1 1.343e+2
        80 1.250e-2 2.776e-7 2.705e-1 1.542e+2
        90 1.111e-2 2.469e-7 2.446e-1 1.741e+2
       100 1.000e-2 2.223e-7 2.232e-1 1.940e+2
100 iterations, 1 successes (false positive rate < 1.000000e-9)
success rate p ~ 1.000000e-2
confidence interval [2.223495e-7, 0.223213    ]
p < 0.990000    
max inner iteration count: 816


SLO-ify all the things with this Exact test

Until now, I’ve only used the Confidence Sequence Method (CSM) for Monte Carlo simulation of phenomena that are naturally seen as boolean success / failures processes. We can apply the same CSM to implement an exact test for null hypothesis testing, with a bit of resampling magic.

Looking back at the balls and bins grid, the average fill rate seems to be slightly worse for 100 bins @ 60K ball/bin, than for 1000 bins @ 128K ball/bin. How can we test that with the CSM?

First, we should get a fresh dataset for the two setups we wish to compare.

CL-USER> (defparameter *100-60k* #(0.988110167
CL-USER> (defparameter *1000-128k* #(0.991456281
CL-USER> (alexandria:mean *100-60k*)
CL-USER> (alexandria:mean *1000-128k*)
CL-USER> (- * **)

The mean for 1000 bins @ 128K ball/bin is slightly higher than that for 100 bins @ 60k ball/bin. We will now simulate the null hypothesis (in our case, that the distributions for the two setups are identical), and determine how rarely we observe a difference of 0.00117 in means. I only use a null hypothesis where the distributions are identical for simplicity; we could use the same resampling procedure to simulate distributions that, e.g., have identical shapes, but one is shifted right of the other.

In order to simulate our null hypothesis, we want to be as close to the test we performed as possible, with the only difference being that we generate data by reshuffling from our observations.

CL-USER> (defparameter *resampling-data* (concatenate 'simple-vector *100-60k* *1000-128k*))
CL-USER> (length *100-60k*)
CL-USER> (length *1000-128k*)

The two observation vectors have the same size, 10000 values; in general, that’s not always the case, and we must make sure to replicate the sample sizes in the simulation. We’ll generate our simulated observations by shuffling the *resampling-data* vector, and splitting it in two subvectors of ten thousand elements.

CL-USER> (let* ((shuffled (alexandria:shuffle *resampling-data*))
                (60k (subseq shuffled 0 10000))
                (128k (subseq shuffled 10000)))
           (- (alexandria:mean 128k) (alexandria:mean 60k)))

We’ll convert that to a truth value by comparing the difference of simulated means with the difference we observed in our real data, \(0.00117\ldots\), and declare success when the simulated difference is at least as large as the actual one. This approach gives us a one-sided test; a two-sided test would compare the absolute values of the differences.

CL-USER> (csm:csm-driver 
          (lambda (_)
            (declare (ignore _))
            (let* ((shuffled (alexandria:shuffle *resampling-data*))
                   (60k (subseq shuffled 0 10000))
                   (128k (subseq shuffled 10000)))
              (>= (- (alexandria:mean 128k) (alexandria:mean 60k))
          0.005 1d-9 :alpha-hi 0.01 :stream *standard-output*)
         1 0.000e+0 7.761e-11 10.000e-1 -2.967e-1
        10 0.000e+0 0.000e+0 8.709e-1 -9.977e-1
        20 0.000e+0 0.000e+0 6.577e-1 -1.235e+0
        30 0.000e+0 0.000e+0 5.163e-1 -1.360e+0
        40 0.000e+0 0.000e+0 4.226e-1 -1.438e+0
        50 0.000e+0 0.000e+0 3.569e-1 -1.489e+0
        60 0.000e+0 0.000e+0 3.086e-1 -1.523e+0
        70 0.000e+0 0.000e+0 2.718e-1 -1.546e+0
        80 0.000e+0 0.000e+0 2.427e-1 -1.559e+0
        90 0.000e+0 0.000e+0 2.192e-1 -1.566e+0
       100 0.000e+0 0.000e+0 1.998e-1 -1.568e+0
       200 0.000e+0 0.000e+0 1.060e-1 -1.430e+0
       300 0.000e+0 0.000e+0 7.207e-2 -1.169e+0
       400 0.000e+0 0.000e+0 5.460e-2 -8.572e-1
       500 0.000e+0 0.000e+0 4.395e-2 -5.174e-1
       600 0.000e+0 0.000e+0 3.677e-2 -1.600e-1
       700 0.000e+0 0.000e+0 3.161e-2 2.096e-1
       800 0.000e+0 0.000e+0 2.772e-2 5.882e-1
       900 0.000e+0 0.000e+0 2.468e-2 9.736e-1
      1000 0.000e+0 0.000e+0 2.224e-2 1.364e+0
      2000 0.000e+0 0.000e+0 1.119e-2 5.428e+0


We tried to replicate the difference 2967 times, and did not succeed even once. The CSM stopped us there, and we find a CI for the probability of observing our difference, under the null hypothesis, of [0, 0.007557] (i.e., \(p < 0.01\)). Or, for a graphical summary, . We can also test for a lower \(p\)-value by changing the thresholds and running the simulation more times (around thirty thousand iterations for \(p < 0.001\)).

This experiment lets us conclude that the difference in mean fill rate between 100 bins @ 60K ball/bin and 1000 @ 128K is probably not due to chance: it’s unlikely that we observed an expected difference between data sampled from the same distribution. In other words, “I’m confident that the fill rate for 1000 bins @ 128K ball/bin is greater than for 100 bins @ 60K ball/bins, because it would be highly unlikely to observe a difference in means that extreme if they had the same distribution (\(p < 0.01\))”.

In general, we can use this exact test when we have two sets of observations, \(X\sb{0}\) and \(Y\sb{0}\), and a statistic \(f\sb{0} = f(X\sb{0}, Y\sb{0})\), where \(f\) is a pure function (the extension to three or more sets of observations is straightforward).

The test lets us determine the likelihood of observing \(f(X, Y) \geq f\sb{0}\) (we could also test for \(f(X, Y) \leq f\sb{0}\)), if \(X\) and \(Y\) were taken from similar distributions, modulo simple transformations (e.g., \(X\)’s mean is shifted compared to \(Y\)’s, or the latter’s variance is double the former’s).

We answer that question by repeatedly sampling without replacement from \(X\sb{0} \cup Y\sb{0}\) to generate \(X\sb{i}\) and \(Y\sb{i}\), such that \(|X\sb{i}| = |X\sb{0}|\) and \(|Y\sb{i}| = |Y\sb{0}|\) (e.g., by shuffling a vector and splitting it in two). We can apply any simple transformation here (e.g., increment every value in \(Y\sb{i}\) by \(\Delta\) to shift its mean by \(\Delta\)). Finally, we check if \(f(X\sb{i}, Y\sb{i}) \geq f\sb{0} = f(X\sb{0}, Y\sb{0})\); if so, we return success for this iteration, otherwise failure.

The loop above is a Bernoulli process that generates independent, identically distributed (assuming the random sampling is correct) truth values, and its success rate is equal to the probability of observing a value for \(f\) “as extreme” as \(f\sb{0}\) under the null hypothesis. We use the CSM with false positive rate \(\varepsilon\) to know when to stop generating more values and compute a credible interval for the probability under the null hypothesis. If that probability is low (less than some predetermined threshold, like \(\alpha = 0.001\)), we infer that the null hypothesis does not hold, and declare that the difference in our sample data points at a real difference in distributions. If we do everything correctly (cough), we will have implemented an Atlantic City procedure that fails with probability \(\alpha + \varepsilon\).

Personally, I often just set the threshold and the false positive rate unreasonably low and handwave some Bayes.

That’s all!

I pushed the code above, and much more, to github, in Common Lisp, C, and Python (probably Py3, although 2.7 might work). Hopefully anyone can run with the code and use it to test, not only SLO-type properties, but also answer more general questions, with an exact test. I’d love to have ideas or contributions on the usability front. I have some throw-away code in attic/, which I used to generate the SVG in this post, but it’s not great. I also feel like I can do something to make it easier to stick the logic in shell scripts and continuous testing pipelines.

When I passed around a first draft for this post, many readers that could have used the CSM got stuck on the process of moving from mathematical expressions to computer code; not just how to do it, but, more fundamentally, why we can’t just transliterate Greek to C or CL. I hope this revised post is clearer. Also, I hope it’s clear that the reason I care so much about not introducing false positive via rounding isn’t that I believe they’re likely to make a difference, but simply that I want peace of mind with respect to numerical issues; I really don’t want to be debugging some issue in my tests and have to wonder if it’s all just caused by numerical errors.

The reason I care so much about making sure users can understand what the CSM codes does (and why it does what it does) is that I strongly believe we should minimise dependencies whose inner working we’re unable to (legally) explore. Every abstraction leaks, and leakage is particularly frequent in failure situations. We may not need to understand magic if everything works fine, but, everything breaks eventually, and that’s when expertise is most useful. When shit’s on fire, we must be able to break the abstraction and understand how the magic works, and how it fails.

This post only tests ideal SLO-type properties (and regular null hypothesis tests translated to SLO properties), properties of the form “I claim that this indicator satisfies $PREDICATE x% of the time, with false positive rate y%” where the indicator’s values are independent and identically distributed.

The last assumption is rarely truly satisfied in practice. I’ve seen an interesting choice, where the service level objective is defined in terms of a sample of production requests, which can replayed, shuffled, etc. to ensure i.i.d.-ness. If the nature of the traffic changes abruptly, the SLO may not be representative of behaviour in production; but, then again, how could the service provider have guessed the change was about to happen? I like this approach because it is amenable to predictive statistical analysis, and incentivises communication between service users and providers, rather than users assuming the service will gracefully handle radically new crap being thrown at it.

Even if we have a representative sample of production, it’s not true that the service level indicators for individual requests are distributed identically. There’s an easy fix for the CSM and our credible intervals: generate i.i.d. sets of requests by resampling (e.g., shuffle the requests sample) and count successes and failures for individual requests, but only test for CSM termination after each resampled set.

On a more general note, I see the Binomial and Exact tests as instances of a general pattern that avoids intuitive functional decompositions that create subproblems that are harder to solve than the original problem. For example, instead of trying to directly determine how frequently the SLI satisfies some threshold, it’s natural to first fit a distribution on the SLI, and then compute percentiles on that distribution. Automatically fitting an arbitrary distribution is hard, especially with the weird outliers computer systems spit out. Reducing to a Bernoulli process before applying statistics is much simpler. Similarly, rather than coming up with analytical distributions in the Exact test, we brute-force the problem by resampling from the empirical data. I have more examples from online control systems... I guess the moral is to be wary of decompositions where internal subcomponents generate intermediate values that are richer in information than the final output.

Thank you Jacob, Ruchir, Barkley, and Joonas for all the editing and restructuring comments.

  1. Proportions are unscaled probabilities that don’t have to sum or integrate to 1. Using proportions instead of probabilities tends to make calculations simpler, and we can always get a probability back by rescaling a proportion by the inverse of its integral.

  2. Instead of a \(\mathrm{Beta}(a+1, b+1)\), they tend to bound with a \(\mathrm{Beta}(a, b)\). The difference is marginal for double-digit \(n\).

  3. I used the bisection method instead of more sophisticated ones with better convergence, like Newton’s method or the derivative-free Secant method, because bisection already adds one bit of precision per iteration, only needs a predicate that returns “too high” or “too low,” and is easily tweaked to be conservative when the predicate declines to return an answer.

Eugene ZaikonnikovA tiny Lisp bytecode interpreter in Z-80 assembly

· 13 days ago

It all started with a raid on a long abandoned hosting service. Seen a mention of it in the news, leading to a vague recollection of using it for something. Email address associated with the account was long defunct, and the service itself changed ownership a few times in the past two decades. But incredibly, I could recall login credentials and they worked still.

Amazingly, in a pile of abandoned HTML templates, obsolete software archives and Under Construction GIFs there was a source file for a project I long considered lost. It's a minimal Lisp bytecode interpreter written in assembly for ZX Spectrum along the lines of MIT AIM-514. Save for address locations and maybe a couple ROM calls for error reporting it's generic Z-80 code.

It was a part of bigger project that should have included a primitive REPL, but no trace of that was found. Also, am quite sure there is a henious bug lurking in the mark&sweep GC. Should really find time to finally debug that!

Paul KhuongAn Old Conjecture on Stream Transducers

· 20 days ago

I’ve been thinking about stream processing again, and came back to an old pick-two-of-three conjecture of mine: for stream processing without dynamic allocation, “arbitrary outputs per input, multiple consumers, multiple producers: choose two.”

The question is interesting because stream processing in constant space is a subset of L (or FL), and thus probably not P-complete, let alone Turing complete. Having easily characterisable subsets of stream processing that can be implemented in constant space would be a boon for the usability of stream DSLs.

I think I find this academic trope as suspicious as @DRMavIver does, so I have mixed feelings about the fact that this one still feels true seven years later.

The main reason I believe in this conjecture is the following example, F(S(X), X), where S is the function that takes a stream and ouputs every other value. Or, more formally, \(F\sb{i} = f(X\sb{2i}, X\sb{i})\).

Let’s say X is some stream of values that can’t be easily re-computed (e.g., each output value is the result of a slow computation). How do we then compute F(S(X), X) without either recomputing the stream X, or buffering an unbounded amount of past values from that stream? I don’t see a way to do so, not just in any stream processing DSL (domain specific language), but also in any general purpose language.

For me, the essence of the problem is that the two inputs to F are out of sync with respect to the same source of values, X: one consumes two values of X per invocation of F, and the other only one. This issue could also occur if we forced stream transducers (processing nodes) to output a fixed number of value at each invocation: let S repeat each value of X twice, i.e., interleave X with X (\(F\sb{i} = f(X\sb{\lfloor i / 2\rfloor}, X\sb{i})\)).

Forcing each invocation of a transducer to always produce exactly one value is one way to rule out this class of stream processing network. Two other common options are to forbid either forks (everything is single-use or subtrees copied and recomputed for each reuse) or joins (only single-input stream processing nodes).

I don’t think this turtle-and-hare desynchronisation problem is a weakness in stream DSLs, I only see a reasonable task that can’t be performed in constant space. Given the existence of such tasks, I’d like to see stream processing DSLs be explicit about the tradeoffs they make to balance performance guarantees, expressiveness, and usability, especially when it comes to the performance model.

Christophe Rhodessbcl method tracing

· 30 days ago

Since approximately forever, sbcl has advertised the possibility of tracing individual methods of a generic function by passing :methods t as an argument to trace. Until recently, tracing methods was only supported using the :encapsulate nil style of tracing, modifying the compiled code for function objects directly.

For a variety of reasons, the alternative :encapsulate t implementation of tracing, effectively wrapping the function with some code to run around it, is more robust. One problem with :encapsulate nil tracing is that if the object being traced is a closure, the modification of the function's code will affect all of the closures, not just any single one - closures are distinct objects with distinct closed-over environments, but they share the same execuable code, so modifying one of them modifies all of them. However, the implementation of method tracing I wrote in 2005 - essentially, finding and tracing the method functions and the method fast-functions (on which more later) - was fundamentally incompatible with encapsulation; the method functions are essentially never called by name by CLOS, but by more esoteric means.

What are those esoteric means, I hear you ask?! I'm glad I can hear you. The Metaobject Protocol defines a method calling convention, such that method calls receive as two arguments firstly: the entire argument list as the method body would expect to handle; and secondly: the list of sorted applicable next methods, such that the first element is the method which should be invoked if the method uses call-next-method. So a method function conforming to this protocol has to:

  1. destructure its first argument to bind the method parameters to the arguments given;
  2. if call-next-method is used, reconstruct an argument list (in general, because the arguments to the next method need not be the same as the arguments to the existing method) before calling the next method's method-function with the reconstructed argument list and the rest of the next methods.

But! For a given set of actual arguments, for that call, the set of applicable methods is known; the precedence order is known; and, with a bit of bookkeeping in the implementation of defmethod, whether any individual method actually calls call-next-method is known. So it is possible, at the point of calling a generic-function with a set of arguments, to know not only the first applicable method, but in fact all the applicable methods, their ordering, and the combination of those methods that will actually get called (which is determined by whether methods invoke call-next-method and also by the generic function's method combination).

Therefore, a sophisticated (and by "sophisticated" here I mean "written by the wizards at Xerox PARC)") implementation of CLOS can compile an effective method for a given call, resolve all the next-method calls, perform some extra optimizations on slot-value and slot accessors, improve the calling convention (we no longer need the list of next methods, but only a single next effective-method, so we can spread the argument list once more), and cache the resulting function for future use. So the one-time cost for each set of applicable methods generates an optimized effective method, making use of fast-method-functions with the improved calling convention.

Here's the trick, then: this effective method is compiled into a chain of method-call and fast-method-call objects, which call their embedded functions. This, then, is ripe for encapsulation; to allow method tracing, all we need to do is arrange at compute-effective-method time that those embedded functions are wrapped in code that performs the tracing, and that any attempt to untrace the generic function (or to modify the tracing parameters) reinitializes the generic function instance, which clears all the effective method caches. And then Hey Bob, Your Uncle's Presto! and everything works.

(defgeneric foo (x)
  (:method (x) 3))
(defmethod foo :around ((x fixnum))
  (1+ (call-next-method)))
(defmethod foo ((x integer))
  (* 2 (call-next-method)))
(defmethod foo ((x float))
  (* 3 (call-next-method)))
(defmethod foo :before ((x single-float))
(defmethod foo :after ((x double-float))

Here's a generic function foo with moderately complex methods. How can we work out what is going on? Call the method tracer!

CL-USER> (foo 2.0d0)
  0: (FOO 2.0d0)
    1: ((SB-PCL::COMBINED-METHOD FOO) 2.0d0)
      2: ((METHOD FOO (FLOAT)) 2.0d0)
        3: ((METHOD FOO (T)) 2.0d0)
        3: (METHOD FOO (T)) returned 3
      2: (METHOD FOO (FLOAT)) returned 9
      2: ((METHOD FOO :AFTER (DOUBLE-FLOAT)) 2.0d0)
    1: (SB-PCL::COMBINED-METHOD FOO) returned 9
  0: FOO returned 9

This mostly works. It doesn't quite handle all cases, specifically when the CLOS user adds a method and implements call-next-method for themselves:

(add-method #'foo
            (make-instance 'standard-method
             :qualifiers '()
             :specializers (list (find-class 'fixnum))
             :lambda-list '(x)
             :function (lambda (args nms) (+ 2 (funcall (sb-mop:method-function (first nms)) args (rest nms))))))
CL-USER> (foo 3)
  0: (FOO 3)
      2: ((METHOD FOO (FIXNUM)) 3)
      2: (METHOD FOO (FIXNUM)) returned 8
    1: (METHOD FOO :AROUND (FIXNUM)) returned 9
  0: FOO returned 9

In this trace, we have lost the method trace from the direct call to the method-function, and calls that that function makes; this is the cost of performing the trace in the effective method, though a mitigating factor is that we have visibility of method combination (through the (sb-pcl::combined-method foo) line in the trace above). It would probably be possible to do the encapsulation in the method object itself, by modifying the function and the fast-function, but this requires rather more book-keeping and (at least theoretically) breaks the object identity: we do not have licence to modify the function stored in a method object. So, for now, sbcl has this imperfect solution for users to try (expected to be in sbcl-1.4.9, probably released towards the end of June).

(I can't really believe it's taken me twelve years to do this. Other implementations have had this working for years. Sorry!)

Quicklisp newsNo May 2018 Quicklisp dist update

· 44 days ago
The computer on which I make Quicklisp builds stopped working a little while ago, and I haven't had time to dive in and work on it. As soon as it's fixed, I'll prepare and release a new dist. Sorry about the inconvenience!

Christophe Rhodessbcl method-combination fixes

· 50 days ago

At the 2018 European Lisp Symposium, the most obviously actionable feedback for SBCL from a presentation was from Didier's remorseless deconstruction of SBCL's support for method combinations (along with the lack of explicitness about behavioural details in the ANSI CL specification and the Art of the Metaobject Protocol). I don't think that Didier meant to imply that SBCL was particularly bad at method combinations, compared with other available implementations - merely that SBCL was a convenient target. And, to be fair, there was a bug report from a discussion with Bruno Haible back in SBCL's history - May/June 2004, according to my search - which had languished largely unfixed for fourteen years.

I said that I found the Symposium energising. And what better use to put that energy than addressing user feedback? So, I spent a bit of time earlier this month thinking, fixing and attempting to work out what behaviours might actually be useful. To be clear, SBCL's support for define-method-combination was (probably) standards-compliant in the usual case, but if you follow the links from above, or listen to Didier's talk, you will be aware that that's not saying all that much, in that almost nothing is specified about behaviours under redefinition of method combinations.

So, to start with, I solved the cache invalidation (one of the hardest problems in Computer Science), making sure that discriminating functions and effective methods are reset and invalidated for all affected generic functions. This was slightly complicated by the strategy that SBCL has of distinguishing short and long method-combinations with distinct classes (and distinct implementation strategies for compute-effective-method); but this just needed to be methodical and careful. Famous last words: I think that all method-combination behaviour in SBCL is now coherent and should meet user expectations.

More interesting, I think, was coming up with test cases for desired behaviours. Method combinations are not, I think, widely used in practice; whether that is because of lack of support, lack of understanding or lack of need of what they provide, I don't know. (In fact in conversations at ELS we discussed another possibility, which is that everyone is more comfortable customising compute-effective-method instead - both that and define-method-combination provide ways for inserting arbitrary code for the effect of a generic function call with particular arguments. But what this means is that there isn't, as far as I know at least, a large corpus of interesting method combinations to play with.

One interesting one which came up: Bike on #lisp designed an implementation using method-combinations of finite state machines, which I adapted to add to SBCL's test suite. My version looks like:

(define-method-combination fsm (default-start)
    ((primary *))
    (:arguments &key start)
  `(let ((state (or ,start ',default-start)))
         (,@(mapcar (lambda (m) `(,(first (method-qualifiers m))
                                  (lambda ()
                                    (setq state (call-method ,m))
                                    (if (and (typep state '(and symbol (not null)))
                                             (find-restart state))
                                        (invoke-restart state)
       (invoke-restart state))))

and there will be more on this use of restart-bind in a later post, I hope. Staying on the topic of method combinations, how might one use this fsm method combination? A simple example might be to recognize strings with an even number of #\a characters:

;;; first, define something to help with all string parsing
(defclass info ()
  ((string :initarg :string)
   (index :initform 0)))
;;; then the state machine itself
(defgeneric even-as (info &key &allow-other-keys)
  (:method-combination fsm :yes))
(defmethod even-as :yes (info &key)
  (with-slots ((s string) (i index)) info
    (cond ((= i (length s)) t) ((char= (char s i) #\a) (incf i) :no) (t (incf i) :yes))))
(defmethod even-as :no (info &key)
  (with-slots ((s string) (i index)) info
    (cond ((= i (length s)) nil) ((char= (char s i) #\a) (incf i) :yes) (t (incf i) :no))))

(Exercise for the reader: adapt this to implement a Turing Machine)

Another example of (I think) an interesting method combination was one which I came up with in the context of generalized specializers, for an ELS a while ago: the HTTP Request method-combination to be used with HTTP Accept specializers. I'm interested in more! A github search found some examples before I ran out of patience; do you have any examples?

And I have one further question. The method combination takes arguments at generic-function definition time (the :yes in (:method-combination fsm :yes)). Normally, arguments to things are evaluated at the appropriate time. At the moment, SBCL (and indeed all other implementations I tested, but that's not strong evidence given the shared heritage) do not evaluate the arguments to :method-combination - treating it more like a macro call than a function call. I'm not sure that is the most helpful behaviour, but I'm struggling to come up with an example where the other is definitely better. Maybe something like

(let ((lock (make-lock)))
  (defgeneric foo (x)
    (:method-combination locked lock)
    (:method (x) ...)))

Which would allow automatic locking around the effective method of FOO through the method combination? I need to think some more here.

In any case: the method-combination fixes are in the current SBCL master branch, shortly to be released as sbcl-1.4.8. And there is still time (though not very much!) to apply for the many jobs advertised at Goldsmiths Computing - what better things to do on a Bank Holiday weekend?

Marco AntoniottiSome updates: bugs fixing and CLAD.

· 59 days ago
Hello there,

it has been a very long time since I posted here, but most recently, thanks to a couple of pesky bug reports on HEΛP by Mirko Vukovic, and because I had a couple of days relatively free, I was able to go back to do some programming, fix (some of) the bugs and post here.

Here is the story.  There were two bugs which I had to deal with (there are more, of course).
  1. A bug triggered by CCL and its implementation of the Common Lisp Reader algorithm.
  2. A buglet due to missing supporting data (.css files in this case) in the deployment of the HEΛP generated documentation.
The first bug was quite difficult to track down and it boiled down to CCL bailing out on READ in an unexpected way (that is, with respect to other implementations).  As an aside, this is a problem of the standard non having a predefined condition for "error caused by the reader because it does not find a package"; LW has CONDITIONS:PACKAGE-NOT-FOUND-READER, but that is not standard, and some implementation just signal READER-ERROR or PACKAGE-ERROR.  The error was easy to "fix" once diagnosed: just don't process files that you know will be problematic, and HEΛP can already "exclude" such files.

The second bug was easier to diagnose, but the fix was more complicated (especially due to the NIH syndrome I suffer from).  The problem is that ASDF moves the compiled code around, but not auxiliary data, like in my case, .css files.  I could have followed what ASDF does, but I decided to go another way and came up with a small library I called Common Lisp Application Data (CLAD, because you need to "dress" your code).


By now, at least on Windows 10 and Mac OS X (and Linux), there is a a notion of an Application and Data Folder. The user version of this folder (as opposed to the system one) is ~/Library/ on Mac OS X and %USERPROFILE%\AppData\Roaming\ (this is the "roaming" profile in W10 parlance).  For Linux there are several semi-standards, one of them is the XDG base directory specification; in this case a common solution is to use the ~/.config/ folder.

CLAD assumes these "fixed" locations to create per-application or per-library subfolders of a "Common Lisp" one.  That is, CLAD ensures the presence of the following folders in your account.
  • Mac Os X: ~/Library/Common Lisp/
  • Windows 10: %USERPROFILE%\AppData\Roaming\Common Lisp\
  • Linux/Unix: ~/.config/common-lisp/ (in this case, I am following ASDF lead)
The library exports three simple functions,
  1. user-cl-data-folder, which returns the pathnames of the folders above.
  2. ensure-app-or-library-data-folder, which, as the name implies, ensures that a subfolder exists in the proper location.
  3. app-or-library-data-folder, which returns the pathname associated to a library or app.
A library or app can now set itself up by doing something like

(defparameter *helambdap-data-folder*
  (clad:ensure-app-or-library-data-folder "HELambdaP")
  "The user HELambdaP data folder.")

On Mac OS X, this results in the folder ~/Library/Common Lisp/HELambdaP; a library or an application can now rely on a clear space where to store "common" data files.  For HEΛP it solved the problem of where to find the .css files in a reliable place.

Trivial? Yes.
NIH syndrome? Of course.
Complete? No.
Useful? You be the judge of that.


LispjobsLisp Developer, 3E, Brussels, Belgium

· 61 days ago


You join a team of developers, scientists, engineers and business developers that develop, operate and commercialize SynaptiQ worldwide.

You work in a Linux-based Java, Clojure and Common Lisp environment. Your focus is on the development, maintenance, design and unit testing of SynaptiQ's real-time aggregation and alerting engine that processes time-series and events. This data engine is Common Lisp based.

The objective is to own the entire lifecycle of the platform, that is from the architecture and development of new features to the deployment and operation of the platform in production environment. The position is open to candidates with no knowledge of LISP if they have a good affinity and experience in functional languages.

Christophe Rhodesalgorithms and data structures term2

· 68 days ago

I presented some of the work on teaching algorithms and data structures at the 2018 European Lisp Symposium

Given that I wanted to go to the symposium (and I'm glad I did!), the most economical method for going was if I presented research work - because then there was a reasonable chance that my employer would fund the expenses (spoiler: they did; thank you!). It might perhaps be surprising to hear that they don't usually directly fund attending events where one is not presenting; on the other hand, it's perhaps reasonable on the basis that part of an academic's job as a scholar and researcher is to be creating and disseminating new knowledge, and of course universities, like any organizations, need to prioritise spending money on things which bring value or further the organization's mission.

In any case, I found that I wanted to write about the teaching work that I have been doing, and in particular I chose to write about a small, Lisp-related aspect. Specifically, it is now fairly normal in technical subjects to perform a lot of automated testing of students; it relieves the burden on staff to assess things which can be mechanically assessed, and deliver feedback to individual students which can be delivered automatically; this frees up staff time to perform targeted interventions, give better feedback on more qualitative aspects of the curriculum, or work fewer weekends of the year. A large part of my teaching work for the last 18 months has been developing material for these automated tests, and working on the infrastructure underlying them, for my and colleagues' teaching.

So, the more that we can test automatically and meaningfully, the more time we have to spend on other things. The main novelty here, and the lisp-related hook for the paper I submitted to ELS, was being able to give meaningful feedback on numerical answer questions which probed whether students were developing a good mental model of the meaning of pseudocode. That's a bit vague; let's be specific and consider the break and continue keywords:

x ← 0
for 0 ≤ i < 9
  x ← x + i
  if x > 17
  end if
  x ← x + 1
end for
return x

The above pseudocode is typical of what a student might see; the question would be "what does the above block of pseudocode return?", which is mildly arithmetically challenging, particularly under time pressure, but the conceptual aspect that was being tested here was whether the student understood the effect of continue. Therefore, it is important to give the student specific feedback; the more specific, the better. So if a student answered 20 to this question (as if the continue acted as a break), they would receive a specific feedback message reminding them about the difference between the two operators; if they answered 45, they received a message reminding them that continue has a particular meaning in loops; and any other answers received generic feedback.

Having just one of these questions does no good, though. Students will go to almost any lengths to avoid learning things, and it is easy to communicate answers to multiple-choice and short-answer questions among a cohort. So, I needed hundreds of these questions: at least one per student, but in fact by design the students could take these multiple-chocie quizzes multiple times, as they are primarily an aid for the students themselves, to help them discover what they know.

Now of course I could treat the above pseudocode fragment as a template, parameterise it (initial value, loop bounds, increment) and compute the values needing the specific feedback in terms of the values of the parameters. But this generalizes badly: what happens when I decide that I want to vary the operators (say to introduce multiplication) or modify the structure somewhat (e.g. by swapping the two increments before and after the continue)? The parametrization gets more and more complicated, the chances of (my) error increase, and perhaps most importantly it's not any fun.

Instead, what did I do? With some sense of grim inevitability, I evolved (or maybe accreted) an interpreter (in emacs lisp) for a sexp-based representation of this pseudocode. At the start of the year, it's pretty simple; towards the end it has developed into an almost reasonable mini-language. Writing the interpreter is straightforward, though the way it evolved into one gigantic case statement for supported operators rather than having reasonable semantics is a bit of a shame; as a bonus, implementing a pretty-printer for the sexp-based pseudocode, with correct indentation and keyword highlighting, is straightforward. Then armed with the pseudocode I will ask the students to interpret, I can mutate it in ways that I anticipate students might think like (replacing continue with break or progn) and interpret that form to see which wrong answer should generate what feedback.

Anyway, that was the hook. There's some evidence in the paper that the general approach of repeated micro-assessment, and also the the consideration of likely student mistakes and giving specific feedback, actually works. And now that the (provisional) results are in, how does this term compare with last term? We can look at the relationship between this term's marks and last term's. What should we be looking for? Generally, I would expect marks in the second term's coursework to be broadly similar to the marks in the first term - all else being equal, students who put in a lot of effort and are confident with the material in term 1 are likely to have an easier time integrating the slightly more advanced material in term 2. That's not a deterministic rule, though; some students will have been given a wake-up call by the term 1 marks, and equally some students might decide to coast.

plot of term 2 marks against term 1: a = 0.82, R² = 0.67

I've asked R to draw the regression line in the above picture; a straight line fit seems reasonable based on the plot. What are the statistics of that line?

R> summary(lm(Term2~Term1, data=d))

lm(formula = Term2 ~ Term1, data = d)

        Min      1Q  Median      3Q     Max 
    -41.752  -6.032   1.138   6.107  31.155 

Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.18414    4.09773   0.777    0.439
Term1        0.82056    0.05485  14.961   <2e-16 ***
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.46 on 107 degrees of freedom
  (32 observations deleted due to missingness)
Multiple R-squared:  0.6766,    Adjusted R-squared:  0.6736 
F-statistic: 223.8 on 1 and 107 DF,  p-value: < 2.2e-16

Looking at the summary above, we have a strong positive relationship between term 1 and term 2 marks. The intercept is approximately zero (if you got no marks in term 1, you should expect no marks in term 2), and the slope is less than one: on average, each mark a student got in term 1 tended to convert to 0.8 marks in term 2 - this is plausibly explained by the material being slightly harder in term 2, and by the fact that some of the assessments were more explicitly designed to allow finer discrimination at the top end - marks in the 90s. (A note for international readers: in the UK system, the pass mark is 40%, excellent work is typically awarded a mark in the 70% range - marks of 90% should be reserved for exceptional work). The average case is, however, only that: there was significant variation from that average line, and indeed (looking at the quartiles) over 50% of the cohort was more than half a degree class (5 percentage points) away from their term 2 mark as "predicted" from their mark for term 1.

All of this seems reasonable, and it was a privilege to work with this cohort of students, and to present the sum of their interactions on this course to the audience I had. I got the largest round of applause, I think, for revealing that as part of the peer-assessment I had required that students run each others' code. I also had to present some of the context for the work; not only because this was an international gathering, with people in various university systems and from industry, but also because of the large-scale disruption caused by industrial action over the Universities Superannuation Scheme (the collective, defined benefit pension fund for academics at about 68 Universities and ~300 other bodies associated with Higher Education). Perhaps most gratifyingly, students were able to continue learning despite being deprived of their tuition for three consecutive weeks; judging by their performance on the various assessments so far,

And now? The students will sit an exam, after which I and colleagues will look in detail at those results and the relationship with the students' coursework marks (as I did last year). I will continue developing this material (my board for this module currently lists 33 todo items), and adapt it for next year and for new cohorts. And maybe you will join me? The Computing department at Goldsmiths is hiring lecturers and senior lecturers to come and participate in research, scholarship and teaching in computing: a lecturer in creative computing, a lecturer in computer games, a lecturer in data science, a lecturer in physical and creative computing, a lecturer in computer science and a senior lecturer in computer science. Anyone reading this is welcome to contact me to find out more!

Victor Anyakin&#8220;A new way of blogging about Common Lisp&#8221; by Yehonathan Sharvit

· 69 days ago

Seen this post mentioned several times on Twitter, but not on Planet Lisp yet. So, here it is:

A new way of blogging about Common Lisp by Yehonathan Sharvit (@viebel).

Christophe Rhodesels2018 reflections

· 72 days ago

A few weeks ago, I attended the 2018 European Lisp Symposium.

If you were at the 2017 iteration, you might (as I do) have been left with an abiding image of greyness. That was not at all to fault the location (Brussels) or the organization (co-located with ) of the symposium; however, the weather was unkind, and that didn't help with my own personal low-energy mood at the time.

This year, the event was organized by Ravenpack in Marbella. And the first thing that I noticed on landing was the warmth. (Perhaps the only "perk" of low-cost airlines is that it is most likely that one will disembark on to terra firma rather than a passenger boarding bridge. And after quite a miserable winter in the UK, the warmth and the sunshine at the bus stop, while waiting for the bus from Malagá airport to Marbella, was very welcome indeed. The sense of community was strong too; while waiting for the bus, I hopped on to #lisp IRC and Nic Hafner volunteered to meet me at the Marbella bus station and walk with me to my hotel; we ended up going for drinks at a beachside bar that evening with Dimitri Fontaine, Christian Schafmeister, Mikhail Raskin and others - and while we were sitting there, enjoying the setting, I saw other faces I recognised walking past along the beach promenade.

The setting for the conference itself, the Centro Cultural Cortijo de Miraflores, was charming. We had the use of a room, just about big enough for the 90ish delegates; the centre is the seat of the Marbella Historical Archives, and our room was next to the olive oil exhibition.

2018 European Lisp Symposium opening

We also had an inside courtyard for the breaks; sun, shade, water and coffee were available in equal measure, and there was space for good conversations - I spoke with many people, and it was great to catch up and reminisce with old friends, and to discuss the finer points of language implementation and release management with new ones. (I continue to maintain that the SBCL time-boxed monthly release cadence of master, initiated by Bill Newman way back in the SBCL 0.7 days in 2002 [!], has two distinct advantages, for our situation, compared with other possible choices, and I said so more than once over coffee.)

The formal programme, curated by Dave Cooper, was great too. Zach's written about his highlights; I enjoyed hearing Jim Newton's latest on being smarter about typecase, and Robert Smith's presentation about quantum computing, including a walkthrough of a quantum computing simulator. As well as quantum computing, application areas mentioned in talks this year included computational chemistry, data parallel processing, pedagogy, fluid dynamics, architecture, sentiment analysis and enterprise resource planning - and there were some implementation talks in there too, not least from R. "it's part of the brand" Matthew Emerson, who gave a paean to Clozure Common Lisp. Highly non-scientific SBCL user feedback from ELS presenters: most of the speakers with CL-related talks or applications at least mentioned SBCL; most of those mentions by speaker were complimentary, but it's possible that most by raw count referred to bugs - solely due to Didier Verna's presentation about method combinations (more on this in a future blog post).

Overall, I found this year's event energising. I couldn't be there any earlier than the Sunday evening, nor could I stay beyond Wednesday morning, so I missed at least the organized social event, but the two days I had were full and stress-free (even when chairing sessions and for my own talk). The local organization was excellent; Andrew Lawson, Nick Levine and our hosts at the foundation did us proud; we had a great conference dinner, and the general location was marvellous.

Next year's event will once again be co-located with <Programming>, this time in Genova (sunny!) on 1st-2nd April 2019. Put it in your calendar now!

Didier VernaLisp, Jazz, Aikido, 10 years later

· 75 days ago

10 years ago, I published a short blog entitled "Lisp, Jazz, Aikido", barely scratching the surface of what I found to be commonalities between the 3 disciplines. At the time, I had the intuition that those ideas were the tip of a potentially big iceberg, and I ended the blog with the following sentence: "I'd like to write a proper essay about these things when I find the time... someday."

Well, 10 years later, I did. The essay, which is 50 pages long, has been published in the Art, Science, and Engineering of Programming Journal, and actually received the Reviewers'Choice Award 2018. I'm not the bragging type, far from it, but I had to mention this because this essay is so personal, and I invested so much in its preparation (more than 300 hours) that I am as deeply touched by the award as I would have been hurt, had it been negatively received...

The live presentation has unfortunately not been recorded, but I took the time to make a screencast afterwards, which is now available on YouTube. Just like the essay, this presentation is not in the typical setting that you'd expect at a scientific conference...

If you've got an artistic fiber, if you're sensitive to the aesthetic dimension in what you do, you may enjoy this work...

Zach BeaneThoughts on ELS 2018

· 75 days ago

Matt Emerson opened the conference keynote talk on Clozure CL. He also touched on using and celebrating CL. It provided a jolt of enthusiasm and positivity to kick off a fun conference. And it made me want to use Clozure CL more often and learn how to effectively hack on and with it.

Nicolas Hafner's talk on shaders was interesting, but partway through the talk he revealed that the slideshow system itself was an interactive Common Lisp program. Along with standard slideshow behavior, it displayed and animated the classic teapot, with interactive recompilation and redisplay of the shaders.

Ethan Schwartz's talk on Google's tweaks to SBCL's GC was a reminder that you don't have to settle for the system-provided customization knobs. When source is available and you're willing to customize very specifically for your application's workload, you can get nice improvements.

(Also his passing mention of Docker inspired me to work on a Docker image with all the Quicklisp-required foreign libraries baked into it. I think it will be helpful to me for easier testing, and maybe distributed builds. It should be helpful to others for testing everything in Quicklisp without the pain of gathering all the build requirements for its many projects.)

You should check out Douglas Katzman's lightning talk, "Self-modifying code (for fun and profit)". He's added a neat thing in SBCL where the system can be disassembled completely and reassembled into a new binary with various bits of optimization or instrumentation. 

There were many other great talks and discussions!

It's rejuvenating to hear about all the cool things people are working on. There's stuff people are making that you might make use of in your own projects, or inspiration to hack your own things to share with others.

Also: great weather, great city, fun excursion to Ronda, seeing old friends from past conferences, meeting new friends, putting online names to real-life faces, hearing directly from people who use Quicklisp. And Dmitri Fontaine told me about tic80, which my kid and I used to make a retro game in just a few days.

Next ELS conference is in Genova, Italy on April 1, 2019. Start planning today, and I'll see you there!

Quicklisp newsQuicklisp dist update for April, 2018 now available

· 76 days ago
New projects:
  • bst — Binary search tree — GPL-3
  • cl-colors2 — Simple color library for Common Lisp — Boost Software License - Version 1.0
  • cl-env — Easily parse .env files. That's it! — MIT
  • cl-gopher — Gopher protocol library — BSD 2-Clause
  • clunit2 — CLUnit is a Common Lisp unit testing framework. — BSD
  • dml — Diagram Make Language for common lisp — MIT License
  • external-symbol-not-found — Portability library for detecting reader ~ errors coming from reading non-existing or non-external symbols in packages — Unlicense
  • golden-utils — Auxiliary utilities (AU). — MIT
  • linedit — Readline-style library. — MIT
  • lisp-interface-library — Long name alias for lil — MIT
  • ppath — A Common Lisp path handling library based on Python's os.path module — BSD
  • practical-cl — All code from Practical Common Lisp. — BSD
  • quux-hunchentoot — Thread pooling for hunchentoot — MIT
  • s-dot2 — Render Graphviz graphs from within Lisp — BSD-style
Updated projects3d-matrices3d-vectorsagnostic-lizardarchitecture.builder-protocolceplcl-anacl-bootstrapcl-conllucl-cpuscl-dbicl-factoringcl-fadcl-feedparsercl-fixturescl-flowcl-fluent-loggercl-fondcl-gbmcl-glfw3cl-graylogcl-mlepcl-notebookcl-packcl-pythoncl-random-forestcl-readlinecl-sdl2cl-strcl-yesqlclawclemcloser-mopclwebclxcroatoandefenumdjuladocumentation-utils-extensionsdufydynaesrapfare-memoizationfemlispgamebox-dgengamebox-frame-managergamebox-mathglsl-specglsl-toolkithousehu.dwim.delicoinkwellironcladjonathanjoselacklisp-chatlistopialocal-time-durationmaidenmatlispmcclimmitonibblesninevehoclcloverlordoxenfurtparser.common-rulesparsleyperlrepngloadpostmodernqlotrovertg-mathserapeumshadowstaplestumpwmuniversal-configunix-optsutmvarjoxmls.

Removed projects: cl-openid, cl-sendmail, xmls-tools.

The removed projects depend on XMLS, which was recently updated with a slightly different API, and they were never updated to match.

To get this update, use (ql:update-dist "quicklisp").


TurtleWareMcCLIM demo - Gadgets

· 79 days ago

My second video where I explain CLIM gadget abstraction. This is a live coding session which takes around 1h30m. Enjoy!

If you like it - let me know. If you hate it - let me know. :-)

LispjobsSenior Lisp Developer, RavenPack, Marabella, Spain

· 90 days ago


As a Common Lisp Developer, you will be part of the Analytics team which is in charge of the development and maintenance of applications that, among other things, extract data from incoming news and deliver user and machine-friendly analytics to customers.

You will be reporting directly to the Analytics Manager and will work with an international team of developers skilled in Common Lisp, Java, Python and SQL.

The ability to communicate effectively in English both in writing and verbally is a must. Knowledge of Spanish is not a business requirement. European Union legal working status is required. Competitive compensation and a fun working environment. Relocation assistance is available, but remote working is not a possibility for this position.

LispjobsJunior Lisp Developer, RavenPack, Marbella, Spain

· 90 days ago


At RavenPack we are searching for a Junior Common Lisp Developer to join RavenPack's Development Team.

As a Junior Common Lisp Developer, you will be part of the Analytics team which is in charge of the development and maintenance of applications that, among other things, extract data from incoming news and delivers machine-friendly analytics to customers.

You will be reporting directly to the Analytics Manager and will work with an international team of developers skilled in Common Lisp, Java, Python and SQL.

The ability to communicate effectively in English both in writing and verbally is a must. Knowledge of Spanish is not a business requirement. European Union legal working status is required. Competitive compensation and a fun working environment. Relocation assistance available, but working remotely is not possible for this position.

Luís OliveiraReddit 1.0

· 104 days ago
As many Lisp programmers may remember, Reddit was initially written in Common Lisp. Then, in late 2005, it was rewritten in Python. Much discussion ensued. Steve Huffman (aka spez, current CEO of Reddit) wrote about why they switched to Python. The late Aaron Swartz also wrote Rewriting Reddit, focusing on the Python side of things.

Last week, that original, Common Lisp version of Reddit was published on GitHub: reddit-archive/reddit1.0. There's no documentation, but we know from the articles above that it ran on CMUCL. reddit.asd is perhaps a good entry point, where we can see it used TBNL, CL-WHO and CLSQL, a common toolset at the time.

It's missing various bits of static HTML, CSS and JS, so I don't think you can actually run this website without a fair bit of scrapping and stitching from the Wayback Machine. The code is quite readable and clean. It's not terribly interesting, since it's a fairly simple website. Still, it's a website that grew to become the 6th most visited worldwide (with 542 million monthly visitors) and now has 230 employees, says Wikipedia, so it's nice to see what it looked like at the very beginning.

Quicklisp newsQuicklisp dist update for March 2018 now available

· 109 days ago
New projects:
  • algebraic-data-library — A library of common algebraic data types. — BSD 3-clause
  • bodge-nanovg — Wrapper over nanovg library for cl-bodge system — MIT
  • can — A role-based access right control library — BSD 2-Clause
  • cl-darksky — Get weather via Dark Sky — BSD 2-clause
  • cl-gimei — random japanese name and address generator — MIT
  • cl-libsvm-format — A fast LibSVM data format reader for Common Lisp — MIT
  • cl-notebook — A notebook-style in-browser editor for Common Lisp — AGPL3
  • cl-octet-streams — In-memory octet streams — GPL-3
  • cl-postgres-plus-uuid — Common Lisp library providing a cl-postgres SQL reader for the PostgreSQL UUID type. — MIT
  • documentation-utils-extensions — Set of extensions for documentation-utils. — MIT
  • fact-base — Simple implementation of fact-base data storage for Common Lisp — AGPL3
  • house — Custom asynchronous HTTP server for the Deal project. — AGPL3
  • inkwell — An API client for the Splatoon 2 Splatnet. — Artistic
  • oxenfurt — A client for the Oxford dictionary API. — Artistic
  • rove — Small testing framework — BSD 3-Clause
Updated projectsarray-operationscellsceplcepl.spaceschecklchipzcl+sslcl-algebraic-data-typecl-bplustreecl-colorscl-dbicl-feedparsercl-fondcl-hamcrestcl-ledgercl-libuvcl-marshalcl-mpicl-online-learningcl-openglcl-openstack-clientcl-protobufscl-randomcl-random-forestcl-rmathcl-scsucl-selenium-webdrivercl-storecl-tomlcl-unicodeclackcloser-mopcomputable-realscroatoandbd-oracledexadoreasy-audioeazy-gnuplotesrapextended-realsfare-memoizationfemlispflexi-streamsfolio2harmonyhumblerlasslatex-tablelispbuilderlistopiallalucernemaidenmatlispmcclimmitoninevehomer-countopticloverlordparachuteparser.common-rulesparser.inipostmodernqlotrandom-statertg-mathscalplserapeumshadowstaplestumpwmtrivial-gray-streamstrivialib.bddutils-ktvarjo.

Removed projects: cl-cudd, lisp-interface-library, plain-odbc, quux-hunchentoot.

The removed projects no longer build.

To get this update, use (ql:update-dist "quicklisp").


Nicolas HafnerGeometry Clipmaps - Gamedev

· 126 days ago

A while back I attempted to implement a technique called Geometry Clipmaps in Trial. This technique is used to stream in terrain data for scenery where the terrain is far too large to be stored in memory all at once, let alone be rendered at full resolution. In this entry I'll go over the idea of the technique and the problems I had implementing it.

First, the technique is described in two papers: Geometry Clipmaps: Terrain Rendering Using Nested Regular Grids by Hoppe and Losasso [1], and Terrain Rendering Using GPU-Based Geometry Clipmaps by Hoppe and Asirvatham [2]. Neither of the papers offer any public source code, or much of any code at all to describe the details of the technique, especially in relation to the update mechanics. This is a problem, because a lot of the details of the papers are very strange indeed.

Before I get into that though, let me summarise the idea of the technique, which is fairly simple: since we cannot draw the full terrain, we need to reduce the detail somehow. Luckily for us, how much detail we can actually see in a rendered scene decreases with distance thanks to perspective skewing. Thus, the idea is that, the farther away the terrain geometry is from the camera, the coarser we render it.

To do this we create a square ring, whose inner sides are half the size of its outer sides. This gives us a power of two scaling, where we can simply render the same ring within itself over and over, scaling it down by two every time. Once we have sufficiently many rings, we simply fill the hole with a remaining square grid. Here's an image to illustrate the idea:


As you can see, the detail gets exponentially more coarse the farther away from the centre you go. If we look at this from the perspective of a camera that is properly centred, we can see the effect even better:


Despite the geometry becoming more coarse, it takes up a similar amount of actual pixels on screen. Now, the geometry I used here is rather simple. In the papers, they use a much more complex layout in order to save on vertices even more aggressively:


This, especially the interior trim, makes things a lot more painful to deal with. I'm quite sure the trim is a necessary evil in their scheme, as the ring is no longer a power of two reduction, but they still need the vertices to align properly. I can see no other reason for the trim to exist. Worse than that though, I can absolutely not understand why they specify four variants of the trim, one for each corner, when the largest trim is going to offset the rest of the geometry so much that even if every other ring uses the opposite trim, you cannot correct the shift enough to make things exactly centre again. If someone understands this, please let me know.

Anyway, now that we have these rings established, we need to look at how to actually render geometry with them. The idea here is to use height maps and dynamically shift the vertices' height in the vertex shader according to the brightness of the corresponding pixel in the height map. Since each ring has the same vertex resolution, we can use a fixed resolution texture for each level as well, saving on texture memory.

Let's look at an example for that. In order to generate the textures, we need a base heightmap at a resolution of (n*2^(r-1))^2, where n is the resolution of a clipmap, and r is the number of rings. In this example I used a clipmap resolution of 64 and 4 levels, meaning I needed a base heightmap resolution of 512^2. For the innermost level you simply crop out the center at the clipmap resolution. For the next level you crop out twice as much, and then scale to half, and so on.


The above are scaled up again using nearest-neighbour to make it better visible to the human eye. Displacing the rings by each corresponding map results in the following image. I've increased the clipmap resolution to 64 to match that of the textures.


Looks pretty good, eh? You might notice some issues arising on the edges between clipmap levels though. Namely, some vertices don't match up quite exactly. This is due to smoothing. Since each clipmap level has the same resolution, some of the pixels at the border are going to be too precise. In order to fix this we need to blend vertices as they get closer to the border. To do this we look at the distance of the vertex to the center and, depending on its magnitude, linearly blend between the current level and the next outer one. This means that for each clipmap level we draw, we need to have access to both the current level's heightmap and the next level's.

In the original paper they do this with an insane trick: they encode each heightmap value as a float and stuff the current level's height into the float's integer part, and the next level's into the fractional part. I don't know how this could ever be a good idea given floating point imprecision. I could understand stuffing both numbers into a single int, but this is crazy.

The way I do this in my current approach is different. I represent all the clipmap levels as a single 2D array texture, where each level in the texture represents one clipmap level. We can then easily access two levels, at the cost of two texture lookups of course. I could also see the encoding strategy working by limiting the depth to 16 bit and using 32 bit integer textures to stuff both levels into one. For now though, I'm not that concerned about performance, so optimisations like that seem folly.

Now, keep in mind that in order to figure out the proper smoothing, we need to get the exact position of the vertex of the inner clipmap in the texture of the outer clipmap. This is not that big of a deal if your clipmaps are perfectly aligned and centred, as you can simply scale by 0.5. However, if you implement the clipmap geometry like they suggest in the paper, with the weird trim and all, now suddenly your geometry is slightly shifted. This caused me no end of grief and I was never able to get things to match up perfectly. I don't know why, but I suspect it's something about not doing the calculation quite exactly right, or something about the interpolation causing slight errors that nevertheless remain visible.

I might rewrite the clipmaps using perfectly centred rings as I outlined above, just to see whether I can get it working perfectly, as I really cannot see that huge of an advantage in their approach. To reiterate, the reason why they do it they way they do is, according to my understanding, to find a good balance between number of vertices to keep in memory, and number of draw calls to issue to draw a single geometry layer. If I used perfectly centred rings I would have to draw either 12 smallish square blocks, or one big ring. This is opposed to their technique, which requires 14 draw calls and three different sets of geometry.             Wait what? If that makes you confused, I'm with you. I don't understand it either.

Either way, once you have the interpolation between levels going you're almost there. You can now draw a static terrain with very low computational effort using geometry clipmaps. However, usually you want to have things move around, which means you need to update the clipmaps. And this is where I stopped, as I could not figure out what exactly to do, and could not find any satisfactory information about the topic anywhere else either.

First of all, in the papers they only use a map for the coarsest level layer, and predict the finer levels based on interpolation with a residual. This has two consequences: one, you need to perform a costly ahead of time calculation to compute the residual for the entire terrain. Two, I do not believe that this interpolatory scheme will give you sufficient amount of control over details in the map. It will allow you to save texture space, by only requiring the coarsest level plus a single residual layer, but again I do not believe that such extreme optimisations are necessary.

Second, they use a format called PCT (a precursor of JPEG, I believe) which allows region of interest decoding from a file, meaning you can only read out the parts you really need in order to update the clipmap textures. This is nice, but I'm not aware of any CL library that supports this format, or supports JPEG and allows region of interest decoding.

In order to explain the third problem, let's first look at what happens when we need to update the terrain. Say we have a clipmap resolution of 1cm, meaning each grid cell is 1cm long at the finest level. The player now moves by +3cm in x. This means we need to shift the innermost layer by -3cm in x direction and load in the new data for that into the texture. We also need to update the next layer, as its resolution is 2cm, and we have thus stepped into the next grid cell for it as well. We don't need to step any of the other layers, as they are 4cm, and 8cm resolution respectively, and are too coarse for an update yet. This is fine because they are farther and farther away anyhow and we don't notice the lack of detail change. The edge interpolation helps us out here too as it bridges over the gaps between the layers even when they don't match up exactly.

Now, the way they do this updating in the papers is by drawing into the clipmap texture. They keep a lookup position for the texture which keeps up with the camera position in the game world. When the camera moves they fill in the new detail ahead of the camera's movement direction by drawing two quads. The magic of this idea is that texture lookup happens with the repeat border constraint, meaning that if you go beyond the standard texture boundary, you wrap around to the other end and continue on. Thus you only ever need to update a tiny region of the texture map for the data that actually changed, without having to shift the rest of the data around in the texture. Since this is probably quite hard to imagine, here's another picture from their paper that illustrates the idea:


The region "behind" the camera is updated, but because how the map is addressed happens through wraparound, it appears in front of it. Still, this is quite complicated, especially when you keep in mind that each level is, according to the paper's geometry, shifted by the trim and all. I'm also wondering whether, with my array texture technique, it would be faster to simply replace an entire clipmap texture level and save yourself the cost of having to first upload the two squares to textures, and then perform a draw call into the proper clipmap texture to update it. Since you know that each level has a fixed size, you could even allocate all the storage ahead of time and use a file format that could be directly memory mapped in or something like that, and then just use glSubImage2D to upload it all at once. This is just speculation and musing on my part, so I don't know whether that would be a feasible simplification on current hardware.

So all in all I have my problems with how the update is described in the paper. There is another source for geometry clipmaps, namely this talk by Marcin Gollent about how terrain rendering works in REDEngine 3 for the Witcher 3. He only very, very briefly talks about the clipmaps (5:55-6:55), but it still confuses me to this day, especially the way the clipmaps are illustrated. Let me try to outline what I understand from it:

Each clipmap level has a resolution of 1024² vertices, and we have five levels in total. The map is divided up into tiles of a uniform size, each tile representing 512² vertices on the finest level, meaning you need four tiles to render the innermost level fully. So far so good, but my question is now: what happens if you need to shift, say, the fourth level clipmap by one vertex in x and y? If each tile maps to a separate file on disk, it would mean we need to load in, at worst due to border conditions, 64 files. That seems like a whole lot to me. It would make more sense if you keep the resolution of a tile the same for each level, meaning you always need at worst 8 tiles to update a level. Either way, you end up with a tonne of files. Even just representing their map at the finest level would require 46²=2116 files.

I'm not sure if that's just me being flabbergasted at AAA games, or if there's something I truly don't understand about their technique. Either way, this approach, too, looks like a whole boatload of work to get going, though it does look more feasible than that of the original GPU clipmap paper. So far I haven't implemented either, as I've just been stuck on not knowing what to do. If there's something I'm legitimately not understanding about these techniques I'd rather not dive head first into implementing a strategy that just turns out to be stupid in the end.

I'd love to ask some people in the industry that actually worked on this sort of stuff to clear up my misunderstandings, but I couldn't for the life of me find a way to contact Marcin Gollent. I'm not surprised they don't put their email addresses up publicly, though.

If you have some ideas or questions of your own about this stuff, please do let me know, I'd be very interested in it. I definitely want to finish my implementation some day, and perhaps even develop some tooling around it to create my own heightmaps and things in Trial.

McCLIMSheets as ideal forms

· 133 days ago

CLIM operates on various kinds of objects. Some are connected by an inheritance, other by a composition and some are similar in a different sense.

As programmers, we often deal with the inheritance and the composition, especially since OOP is a dominating paradigm of programming (no matter if the central concept is the object or the function). Not so often we deal with the third type of connection, that is the Form and the phenomena which are merely a shadow mimicking it[1].

Let us talk about sheets. The sheet is a region[2] with an infinite resolution and potentially infinite extent on which we can draw. Sheets may be arranged into hierarchies creating a windowing system with a child-parent relation. Sheet is the Form with no visual appearance. What we observe is an approximation of the sheet which may be hard to recognize when compared to other approximations.

[1]: Theory of forms.

[2]: And many other things.

Physical devices

Sheet hierarchies may be manipulated without a physical medium but to make them visible we need to draw them. CLIM defines ports and grafts to allow rooting sheets to a display server. medium is defined to draw sheet approximation on the screen[3].

How should look a square with side length 10 when drawn on a sheet? Since it doesn't have a unit we can't tell. Before we decide on a unit we choose we need to take into consideration at least the following scenarios:

  1. If we assume device-specific unit results may drastically differ (density may vary from say 1200dpi on a printer down to 160dpi on a desktop[4]!. The very same square could have 1cm on a paper sheet, 7cm and 10cm on different displays and 200cm on a terminal. From the perspective of a person who uses the application, this may be confusing because physical objects don't change size without a reason.

  2. Another possible approach is to assume the physical world distances measured in millimeters (or inches if you must). Thanks to that object will have a similar size on different mediums. This is better but still not good. We have to acknowledge that most computer displays are pixel based. Specifying distances in millimeters will inevitably lead to worse drawing quality[5] (compared to the situation when we use pixels as the base unit). Moreover, conversion from millimeter values to the pixel will most inevitably lead to work on floats.

  3. Some applications may have specific demands. For instance, application is meant to run on a bus stop showing arrival times. Space of the display is very limited and we can't afford approximation from the high-density specification (in pixels or millimeters) to 80x40 screen (5 lines of 8 characters with 5x8 dots each). We need to be very precise and the ability to request a particular unit size for a sheet is essential. Of course, we want to test such application on a computer screen.

I will try to answer this question in a moment. First, we have to talk about the CLIM specification and limitations imposed by McCLIM's implementation of grafts.

[3]: See some general recommendations.

[4]: Technically it should be PPI, not DPI (pixels per inch).

[5]: Given programmer specifies sheet size in integers (like 100x100).

Ports, grafts, and McCLIM limitations

If a port is a physical connection to a display server then graft is its screen representation. The following picture illustrates how the same physical screen may be perceived depending on its settings and the way we look at it.

Graft drawing

As we can see graft has an orientation (:default starts at the top-left corner like a paper sheet and :graphics should start at the bottom left corner like on the chart). Moreover, it has units. Currently, McCLIM will recognize :device, :inches, :millimeters and :screen-sized [11].

That said McCLIM doesn't implement graft in a useful way. Everything is measured in pixels (which :device units are assumed to be) and only the :default orientation is implemented. By now we should already know that pixels are a not a good choice for the default unit. Also, programmer doesn't have means to request unit for a sheet (this is the API problem).

[11]: Screen size unit means that the coordinate [1/2, 1/2] is exactly in the middle of the screens.

Physical size and pixel size compromise

We will skip the third situation, for now, to decide what unit should we default to. There are cognitive arguments for units based on a real-world distance but there are also and practical reasons against using millimeters - poor mapping to pixels and the fact that CLIM software which is already written is defined with the assumption that we operate on something comparable to a pixel.

Having all that in mind the default unit should be device-independent pixel. One of McCLIM long-term goals is to adhere to Material Design guidelines - that's why we will use dip[6] unit. 100dp has absolute value 15.875mm. On 160dpi display it is 100px, on 96dpi display it is 60px, on 240dpi display it is 150px etc. This unit gives us some nice characteristics: we are rather compatible with the existing applications and we preserving absolute distances across different screens.

[6]: Density-independent pixel.

How to draw a rectangle on the medium

Application medium may be a pixel-based screen, paper sheet or even a text terminal. When the programmer writes the application he operates on dip units which have absolute value 0.15875mm. It is the McCLIM responsibility to map these units onto the device. To be precise each graft needs to hold an extra transformation which is applied before sending content to the display server.

Now we will go through a few example mappings of two rectangle borders[7] drawn on a sheet. The violet rectangle coordinates are [5,5], [22,35] and the cyan rectangle coordinates are [25,10], [30,15].

  • MDPI display device units are dip and they match native units of our choosing. No transformation is required.

Graft drawing

  • Some old displays have density 72PPI. Not all coordinates map exactly to pixels - we need to round them[3]. Notice that border is thicker and that proportions are a little distorted. On the other hand despite a big change in resolution size of the object is similar in real-world values.

Windows Presentation Foundation declared 96PPI screen's pixel being the device-independent pixel because such displays were pretty common on desktops. Almost all coordinates map perfectly on this screen. Notice the approximation of the right side of the violet rectangle.

Lower DPI

  • Fact that the screen has higher density doesn't mean that coordinates mapping perfectly on a lower density screen will map well to a higher density one. Take this HDPI screen. Almost all coordinates are floats while on the MDPI display they had all integer values.


  • Higher resolution makes rectangles look better (borderline is thinner and distortions are less visible to the eye). Here is XXXHDPI:


  • Some printers have a really high DPI, here is imaginary 2560 DPI printer. Funnily enough its accuracy exceeds our screen density so the red border which is meant to show the "ideal" rectangle is a little off (it's fine if we scale the image though).

HQ Printer

  • Until now we've seen some screens with square pixels (or dots). Let's take a look at something with a really low density - a character terminal. To make the illustration better we assume an absurd terminal which has 5x8 DP per character (too small to be seen by a human eye). Notice that the real size of the rectangles is still similar.

Character terminal

It is time to deal with graphics orientation (Y-axis grows towards the top). An imaginary plotter with 80DPI resolution will be used to illustrate two solutions (the first one is wrong!). Knowing the plotter screen height is important to know where we start the drawing.

  • Graft reverts Y axis and sends the image to the plotter. Do you see what is wrong with this picture? We have defined both rectangles in default orientation, so our drawing should look similar disregarding the medium we print on. We do preserve the real size but we don't preserve the image orientation - cyan rectangle should be higher on the plot.

Plotter (bad transformation)

  • Correct transformation involves reverting Y axis and translating objects by the screen height. See the correct transformation (on 80DPI and on MDPI plotter).

80DPI and MDPI plotters

[7]: the Ideal border is composed of lines which are 1-dimensional objects which doesn't occupy any space. Red border in drawings marks "ideal" object boundaries. Points are labeled in device units (with possible fractions).

Sheets written with a special device in mind

There is still an unanswered question - how can we program applications with a specific device limitations in mind? As we have discussed earlier default sheet unit should be dip and default sheet orientation is the same as a paper sheet's[8].

Writing an application for a terminal is different than writing an application for a web browser. The number of characters which fit on the screen is limited, drawing shapes is not practical etc. To ensure that the application is rendered correctly we need a special kind of sheet which will operate on units being characters. Take a look at the following example.

Sheets with different units.

The first sheet base unit is a character of a certain physical size. We print some information on it in various colors. Nothing prevents us from drawing a circle[9] but that would miss the point. We use a character as a unit because we don't want rounding and approximation. Background and foreground colors are inherited.

The second sheet base unit is dip. It has two circles, solid grey background and a few strings (one is written in italic).

Ideally, we want to be able to render both sheets on the same physical screen. The graft and the medium will take care of the sheet approximation. The effect should look something like this.

Different kinds of screens.

The first screen is a terminal. Circles from the second sheet are approximated with characters and look ugly but the overall effect resembles the original application. Proportions are preserved (to some degree). We see also the terminal-specialized sheet which looks exactly as we have defined it.

The second screen is mDPI display. The second sheet looks very much like something we have defined. What is more interesting though is our first sheet - it looks exactly the same as on the terminal.

[8]: Providing means to change defaults requires additional thought and is a material for a separate chapter. McCLIM doesn't allow it yet.

[9]: As we have noted sheets are ideal planar spaces where line thickness is 0 and there is nothing preventing us from using fractions as coordinates.

Port and graft protocols

Now we know what we want. Time to think about how to achieve it. Let me remind you what kind of objects we are dealing with:

  • Port is a logical connection to a display server. For instance, it may contain a foreign handler which is passed to the external system API. It is responsible for the communication - configuring, writing to and reading[10] from a device, we are connected to.

  • Graft is a logical screen representation on which we draw. It is responsible for all transformations necessary to achieve the desired effect on the physical screen. The same port may have many associated grafts for applications with different units and orientations.

  • Medium is a representation of the sheet on a physical device. Sheet is the Form which is a region and may be drawn - it doesn't concern itself with physical limitations.

In the next post I will show how to implement the port and the graft (and a bit of the medium) for the charming backend. I will cover only bare minimum for mediums important to verify that graft works as expected. Complete medium implementation is the material for a separate post.

[10]: Sometimes we deal with devices which we can't take input from - for instance a printer, a PDF render or a display without other peripherals.

Quicklisp newsDownload stats for February 2018

· 133 days ago
Here are the raw download stats for projects in February.

20308  alexandria
16702 closer-mop
15724 babel
15563 cl-ppcre
15531 split-sequence
15002 trivial-features
14598 iterate
13926 trivial-gray-streams
13822 bordeaux-threads
13786 anaphora
12948 let-plus
12777 cffi
12329 trivial-garbage
11742 flexi-streams
10768 nibbles
10530 puri
10329 usocket
9602 trivial-backtrace
9316 more-conditions
9051 chunga
8828 cl-fad
8794 cl+ssl
8641 cl-base64
8295 esrap
8228 chipz
8011 utilities.print-items
7860 named-readtables
7510 drakma
6918 local-time
6803 cl-yacc
6634 ironclad
5743 asdf-flv
5588 parse-number
5555 fiveam
5208 cl-json
5189 closure-common
5171 cxml
5151 log4cl
4556 optima
4476 parser.common-rules
4372 architecture.hooks
4204 cl-unicode
4101 plexippus-xpath
3920 cl-interpol
3779 lparallel
3551 lift
3520 cl-clon
3459 cl-dot
3299 trivial-types
3257 slime
3155 cl-syntax
3143 cl-annot
3136 cxml-stp
3098 cl-store
3000 fare-utils
2997 fare-quasiquote
2973 xml.location
2971 cl-utilities
2853 utilities.print-tree
2853 trivial-utf-8
2850 static-vectors
2814 fare-mop
2812 inferior-shell
2741 quri
2718 metabang-bind
2686 trivial-indent
2664 md5
2638 documentation-utils
2549 fast-io
2545 uuid
2542 array-utils
2378 plump
2275 djula
2275 symbol-munger
2271 cl-slice
2232 collectors
2225 gettext
2190 arnesi
2189 hunchentoot
2162 access
2151 cl-parser-combinators
2146 cl-locale
2028 simple-date-time
2019 ieee-floats
1889 prove
1831 yason
1603 asdf-system-connections
1588 metatilities-base
1574 cl-containers
1525 rfc2388
1471 postmodern
1460 fast-http
1441 trivial-mimes
1358 salza2
1338 monkeylib-binary-data
1334 cl-colors
1305 jonathan
1292 proc-parse
1277 xsubseq
1253 cl-ansi-text

Nicolas HafnerPipeline Allocation - Gamedev

· 134 days ago

This is a follow-up to my previous entry, where I talked about the shader pipeline system. However, I specifically excluded the way resources for shaders are allocated in that article. We'll get to that now, since I finally got around to rewriting that part of the engine to be more complete.

As you may recall, each rendering step is segregated into a shader stage. This stage may either render all on its own, render a scene directly, or influence the render behaviour of the scene. However, each stage will need buffers to draw to that can act as the inputs to future stages. This means that we have an allocation problem on our hands: ideally we'd like to minimise the number of buffers needed to render the entire pipeline, without having any of the stages accidentally draw into a buffer it shouldn't.

Let's look at the following pipeline, which consists of five stages:


The first stage renders the scene as normal, and produces a colour and depth buffer as a result. The second stage renders the scene but every object is drawn pitch black. It only produces a colour buffer, as depth is irrelevant for this. We then blend the two colour buffers together while radially blurring the black one. This results in a "god ray" effect. The next stage only renders transparent objects. This is usually done in order to avoid having to perform expensive transparency calculations on opaque objects. Finally the two stages are rendered together, using the depth and alpha information to blend. This produces the final image.

In a naive implementation, this would require seven buffers, five colour buffers and two depth buffers. However, you can do with just three colour buffers instead, by re-using the colour buffers from the first two stages for the fourth and fifth stages. This kind of allocation problem, where a graph consists of nodes with discrete input and output ports, has cropped up in multiple places for me. It is a form of a graph colouring problem, though I've had to invent my own algorithm as I couldn't find anything suitable. It works as follows:

  1. The nodes in the DAG are topologically sorted. We need to do this anyway to figure out the execution order, so this step is free.
  2. Count the number of ports in the whole graph and create a list of colours, where each colour is marked as being available.
  3. Iterate through the nodes in reverse order, meaning we start with the last node in the execution order.
  4. For each already coloured port in the node, if the port is noted as performing in-place updates, mark the colour as available. This is never the case for OpenGL textures, but it is useful for other allocation problems.
  5. For each input port in the node:
    1. If the port on the other side of the connection isn't coloured yet, pick the next available colour from the colour list.
    2. Set this colour as the colour of the other port.
    3. Mark the colour as unavailable.
  6. For each output port in the node:
    1. If the port isn't coloured yet, pick the next available colour from the colour list.
    2. Set this colour as the colour of the port.
    3. Mark the colour as unavailable.
  7. For each port in the node:
    1. If the node is coloured, mark the colour as available again.

This algorithm needs to be repeated separately for each incompatible buffer type. In our case, we need to run it once for the depth buffers, and once for the colour buffers. Since the algorithm might be hard to understand in text, here's a small animation showing it in action:


As far as I can tell, this algorithm performs the optimal allocation strategy in every case. I don't think it's amazingly fast, but since our graphs will be relatively small, and won't change very often, this is a negligible cost.

Unfortunately this allocation is not everything yet. In order to be able to figure out which buffers are compatible, we need to perform another analysis step beforehand. Namely, we need to figure out which texture specifications are "joinable". Some shader passes might have special requirements about their buffers, such as specific resolutions, internal formats, or interpolation behaviour. Thus, in order to know whether two buffers can actually potentially be shared, we first need to figure out whether the requirements can be joined together. This turns out to be a tricky problem.

Before we can get into this, I think it's worth explaining a bit more about how this stuff works in OpenGL. Two components are essential to this, namely textures and frame buffers. Any OpenGL context has a default frame buffer, which is typically the window you draw to. This frame buffer has a colour buffer, and a depth and stencil buffer if you activate those features. However, since the output of this frame buffer is internal to OpenGL, you can't capture it. If you need to re-process the contents, you'll have to use custom frame buffers. Each frame buffer has a number of attachments, to which you need to bind fitting textures. When the frame buffer is bound, a rendering call will then draw into the bound textures, rather than to the window.

Previously Trial simply automatically determined what kind of texture to use based on the window dimensions and the attachment the output was supposed to be for. Thus allocation was trivial as I just needed to run it once for each kind of attachment. However, this is not optimal. Certain effects, such as HDR require other features in textures, such as a specific internal format. Once you give the user the ability to completely customise all texture options things become difficult. Textures have the following attributes:

  • width The width of the texture. This applies to all texture targets.
  • height The height of the texture. This only applies to 2D, 3D, or array textures.
  • depth The depth of the texture. This only applies to 3D, or 2D array textures.
  • target The texture target, which encodes the number of dimensions, whether it's an array, and whether it has multisampling.
  • samples If it is a multisample texture, how many samples it supports.
  • internal format The internal pixel storage format. I'll mention this specifically later.
  • pixel format The external pixel format when loading pixel data into the texture. Closely related to the internal format.
  • pixel type The external pixel type when loading pixel data into the texture. Closely related to the pixel format.
  • magnification filter What algorithm to use when the texture is scaled up.
  • minification filter What algorithm to use when the texture is scaled down.
  • anisotropy What level of downscaling anisotropy to use.
  • wrapping How access outside of the normalised texture coordinates is handled.
  • storage Whether the texture storage can be resized or not.

That's quite a few properties! Fortunately most properties are either trivially joinable, or simply incompatible. For instance, the following are incompatible:

target, pixel format, pixel type, magnification filter, minification filter, wrapping, storage

And the following are trivially joinable by just choosing the maximum:

samples, anisotropy

The biggest problems are the dimensions, and the internal format. I'm as of yet undecided what to do with the dimension constraints, since in some cases the constraint the user writes down might just be a lower bound advice and could be upgraded to a bigger size, but in others an exact size might be expected.

The internal format however is the real kick in the dick. OpenGL has a plethora of different internal formats that may or may not be joinable. Here's a list of some:

red, r8i, r16-snorm, rgb32ui, rgba16f

Not... too bad, right? Well, unfortunately that's not all. How about these formats?

rgba2, rgb5-a1, r3-g3-b2, rgb9-e5, rgb10-a2ui, srgb8-alpha8

Or how about these?

compressed-srgb-alpha-bptc-unorm, compressed-signed-rg-rgtc2

That's not even all of them, and even worse, the complete list does not cover all possible combinations! For instance, there is no rgb2 even though there's an rgba2. There's also no r6-g6-b5, and no rgb16-e9, etc. Some of these formats are also simply insane to consider. I mean, how is a r11f-g11f-b10f format supposed to work? That's supposedly 11 bits for red and green each, and 10 bits for blue, but as floats. What??

Putting the insanity of the formats aside, the task of joining them is, fortunately, somewhat clear: a format can be joined with another of equal or greater number of colour components. A format can be joined with another of equal or greater bits for each shared component. A format can be joined with another if the required colour type matches, and the required features such as compression and srgb match. To put this into examples:

  • redred = red
  • redrg = rg
  • redr16 = r16
  • rgr32 = rg32
  • rgb32frgb8i = ∅
  • compressed-redred = ∅

You could argue that a non-compressed format could be joined with a compressed one, but for now I've taken the stance that, if the user really requests such a thing, the use is probably pretty special-purpose, so I'd rather not muck with it.

In order to do this in a somewhat sane manner that isn't explicitly listing all possible upgrades, I've written a machinery that can decompose the internal formats into a sensible plist that documents the precise requirements of the format. Based on this, a joining can then be performed much more easily. Finally, the resulting joined format is then recomposed into a matching internal format again. This last step is currently not optimal, as sometimes a join might result in an inexistent format, in which case further upgrading of the join might be necessary. However, for my current needs it is good enough, and I honestly doubt I'll ever reach the point where I need to touch this again.

Alright, so now that we know how to join internal formats, we can join texture specifications. Thus, as a step before the allocations, the system gathers all texture specifications from the ports, normalises them, and then repeatedly joins every spec with every other spec, replacing the two inputs with the result if it succeeds, or keeping it if it doesn't. At some point the set of specs won't change anymore, at which point we have the minimal texture spec set.

We then run the algorithm for each spec in this set, only considering ports whose spec can be joined with it. This results in a perfect allocation that respects the texture constraints each pass might have. A last, additional step could be performed here that I do not do (yet): since some ports might not get shared due to the allocation constraints, their texture specs could be relaxed again, so we would need to recompute the join for all ports in the same colouring group.

I've rewritten this system as part of the asset rewrite, since I thought that if I break things I might as well break them hard. Trial is not quite yet back on its feet right now, but I can at least compile a minimal pipeline and run a window, so it's very close to being functional again.

As a next step from here, I think I'll work some more on UI stuff. I know I promised an article on geometry clipmaps, and that is coming, but it'll be another while. Things have been surprisingly busy recently, and I've also been surprisingly lazy as of late. Hopefully both of those will improve soon!

LispjobsSoftware Engineer (Clojure), Chatterbox Labs, UK

· 135 days ago

From this post:

Machine Learning; Artificial Intelligence; Natural Language Processing; Functional Programming; Java; Clojure; Lisp

Chatterbox Labs are looking for a Software Engineer to join our, UK-based team to help both implement our cognitive technologies. You should hold an MSc in the field of Computer Science (or a related/similar field) and have experience in a commercial software environment.

You will join our technical team who focus on the research and development of our multi-lingual Cognitive Engine for short form and long form text classification & our Synthetic Data Generator, alongside our Data Scientists.

This diverse role will allow you to both collaborate on machine learning research and to also implement commercial software.

  • Develop, extend and maintain underlying machine learning within the Cognitive Engine & Synthetic Data Generator
  • Implement & test Clojure & Java code for commercial release
  • Research, explore & evaluate new methods for classification & data synthesis
  • Co-create new Intellectual Property alongside the Data Science team
  • Test and evaluate current technologies and make recommendations for improvements
Required Qualifications
  • MSc in Computer Science or related field
  • Demonstrable experience of using a functional language, e.g. Clojure, Scala, Racket, Common Lisp, Haskell, Erlang, etc
  • Commercial experience using a Java Virtual Machine language (Java, Clojure, Groovy, etc)
  • Experience using collaborative software management tools (git, subversion, etc)
  • Strong problem solving and algorithm development skills
  • The desire to build robust, proven and tested technologies
Desirable Qualifications
  • Experience of developing machine learning software in a commercial environment
  • Experience either researching or applying:
    • Reinforcement Learning
    • Deep learning techniques
  • Experience dealing with large datasets
  • Experience with large-scale processing frameworks (e.g. Hadoop, Spark)
Minimum Education Level

MSc in Computer Science or related field

Language Requirements
  • Spoken English language is essential
  • A second language is useful
To apply

Please send an email to with a cover note introducing yourself and a bit about you, and attach a comprehensive CV in pdf format.

Quicklisp newsQuicklisp dist update for February 2018 now available

· 137 days ago
New projects:
  • base-blobs — Base system foreign library collection — MIT
  • bodge-glad — OpenGL 4.6 Core GLAD wrapper for cl-bodge system — MIT
  • bodge-glfw — Wrapper over glfw3 library — MIT
  • bodge-nuklear — Wrapper over nuklear IM GUI library for cl-bodge system — MIT
  • bodge-ode — Thin wrapper over Open Dynamics Engine — MIT
  • bodge-openal — Thin wrapper over OpenAL cross-platform 3D audio API — MIT
  • bodge-sndfile — Wrapper over libsndfile for cl-bodge system — MIT
  • cl-flow — Data-flow driven concurrency model for Common Lisp — MIT
  • cl-muth — Multithreading utilities — MIT
  • cl-selenium-webdriver — cl-selenium-webdriver is a binding library to the Selenium 2.0 — MIT
  • cl-toml — TOML v0.4.0 parser and encoder — MIT
  • clutz — Cross-platform utility toolkit for supporting simple Common Lisp OpenGL applications — MIT
  • glad-blob — GLAD foreign library collection — MIT
  • glfw-blob — GLFW foreign library collection — MIT
  • hdf5-cffi — hdf5-cffi is a CFFI wrapper for the HDF5 library. — BSD
  • listopia — This is no official port of Haskell package Data.List — LLGPL
  • matlisp — Matlisp is a scientific computation library for Common Lisp. — LLGPL
  • nanovg-blob — Foreign library collection of nanovg 2d drawing library — MIT
  • nuklear-blob — Nuklear IM GUI foreign library collection — MIT
  • ode-blob — Foreign library collection of ODE 3d physics library — MIT
  • openal-blob — OpenAL Soft foreign library collection — MIT
  • simple-flow-dispatcher — Reference implementation of a dispatcher for cl-flow library — MIT
  • sndfile-blob — SNDFILE foreign library collection — MIT
Updated projects3bgl-shaderahungry-fleecearchitecture.service-providerasd-generatorbodge-blobs-supportbodge-chipmunkcellsceplcepl.glopcepl.sdl2cepl.spacescl-algebraic-data-typecl-anacl-ansi-termcl-charmscl-conllucl-csvcl-emojicl-gobject-introspectioncl-mongo-idcl-oclapicl-openglcl-pcgcl-permutationcl-readlinecl-skkservcl-soilcl-strcl-svgclawclmlcloser-mopclwebclxcommon-lisp-statconfiguration.optionscroatoandeflatedocumentation-utilsdoubly-linked-listdufyfare-scriptsfiascofiveamflac-metadataflac-parserforgamebox-dgengamebox-ecsgamebox-frame-managergamebox-mathgamebox-sprite-packergenieglsl-specinquisitorironcladiteratejonathanjsonrpckenzolisp-chatlispbuilderlocal-timemaidenmcclimmd5mglmitoninevehoclclosicatoverlordpaiprologparse-numberparsleyplumppngloadpostmodernpp-tomlpsychiqpuriqtools-uiquux-hunchentootremote-jsrtg-mathrutilssanitized-paramsserapeumsha3shadowshortysimple-loggerskitterspecialization-storesplit-sequencestumpwmtype-rubiquitousvarjoverbosewith-setf.

Removed projects: algebraic-data-library, cl-curlex, cl-forest, cl-gpu, cl-indeterminism, cl-larval, cl-read-macro-tokens, cl-secure-read, cl-stm, defmacro-enhance, fs-utils, funds, lol-re, stump-touchy-mode-line, yaclanapht.

There is some extra churn this month. It's mostly because of the failure of cl-curlex, a library that uses SBCL internals and which no longer works as of the latest SBCL. Many projects (all by the same author) depend on cl-curlex, and they all broke. There hasn't been any response to my bug report yet.

The other removals are due to author request, neglect, or other breakage. If you'd like to restore a removed project to Quicklisp, let me know - I'm happy to help.

To get this update, use (ql:update-dist "quicklisp").


Zach BeaneParadigms of AI Programming is now available as a free PDF

· 137 days ago

Peter Norvig has published Paradigms of AI Programming on github. There are PDFs of the entire book along with a plain-text version and separate files with the code.

I’m not exactly sure how this came about (Robert Smith hints at it on twitter) but it’s a fantastic development. I recommend Paradigms of AI Programming to anyone who wants to learn Common Lisp, and having it freely available will help a great many people.

I avoided this book for a while because I had little interest in AI. But PAIP isn’t really about AI. The book is about programming effectively, with Common Lisp as the language and classical AI problems as the context. The tools and techniques you take away from careful study are applicable in many, many situations.

McCLIMMcCLIM 0.9.7 "Imbolc" release

· 150 days ago

After 10 years we have decided that it is time to make a new release - the first one since 2008, which was McCLIM 0.9.6, St. George's Day. Imbolc is a Gaelic traditional festival marking the beginning of spring held between the winter solstice and the spring equinox.

Due to a long period of time, the number of changes is too big to list in full detail and we will thus note only major changes made during the last eleven iterations (though many important changes were done before that). For more information please check out previous iteration reports on McCLIM blog, git log and the issue tracker. We'd like to thank all present and past contributors for their time, support and testing.

  • Bug fix: tab-layout fixes.
  • Bug fix: formatting-table fixes.
  • Bug fix: scrolling and viewport fixes and refactor.
  • Feature: raster image draw backend extension.
  • Feature: bezier curves extension.
  • Feature: new tests and demos in clim-examples.
  • Feature: truetype rendering is now default on clx.
  • Feature: additions to region, clipping rectangles and drawing.
  • Feature: clim-debugger and clim-listener improvmenets.
  • Feature: mop is now done with CLOSER-MOP.
  • Feature: threading is now done with BORDEAUX-THREADS.
  • Feature: clx-fb backend (poc of framebuffer-based backend).
  • Feature: assumption that all panes must be mirrored has been removed.
  • Cleanup: many files cleaned up from style warnings and such.
  • Cleanup: removal of PIXIE.
  • Cleanup: removal of CLIM-FFI package.
  • Cleanup: changes to directory structure and asd definitions.
  • Cleanup: numerous manual additions and corrections.
  • Cleanup: broken backends has been removed.
  • Cleanup: goatee has been removed in favour of Drei.
  • Cleanup: all methods have now corresponding generic function declarations.

We also have a bounty program financed with money from the fundraiser. We are grateful for financial contributions which allow us to attract new developers and reward old ones with bounties. Currently active bounties (worth $2650) are available here.

As Imbolc marks the beginning of spring we hope this release will be one of many in the upcoming future.

Nicolas HafnerShader Pipelines and Effects - Gamedev

· 158 days ago

In the last entry I talked about Trial's way of handling shaders and how they are integrated into the object system to allow inheritance and composition. This time we'll take a look at how that system is extended to allow entire pipelines of effects and rendering systems.

Back in the olden days you could get away with a single pass to render everything. In fact, the fixed-function pipeline of GL 2.1 works that way, as do many simple rendering systems like primitive forward rendering, or most 2D games. However, more complex systems and effects require multiple passes. For instance, a deferred rendering system requires at least two passes, one that emits a "gbuffer", and another that combines the information from it to produce the final image. Another example would be a "god-rays" effect, which requires rendering the scene with all objects black, applying a radial blur to that, and then finally blending it with a rendering of the scene as normal.

These kinds of pipelines can become quite complex, especially when you start to consider that each pipeline might require multiple kinds of textures to use as buffers, inputs, and outputs. The management and allocation of these resources is a pain to do manually, so Trial includes a system to reduce the process to a graph definition. This allows you to write complicated pipelines without having to worry about any of the underlying buffers. The pipeline part is only one hand of the equation, but we'll get to the rest later.

To do the rendering, a stage will require a set of textures to output to, and a set of textures as inputs to the stage. A stage might offer additional parameters aside from that to tweak its behaviour, but those do not require allocation, so we can disregard them for now.

To give a bit of context I'll explain how these textures are attached to the shaders in OpenGL. I've glossed over that in the previous entry as it wasn't directly relevant. Let's consider a simple fragment shader that simply copies the colour from an input texture to an output texture:

uniform sampler2D input;
out vec4 color;

void main(){
  color = texelFetch(input, ivec2(int(gl_FragCoord.x), int(gl_FragCoord.y)));

Every GLSL file begins with a set of variable declarations. Each of these global variables has a storage qualifier that tells OpenGL where it comes from. uniforms are parameters that can be set from the CPU before the rendering is performed and are the same for all of the stages in the rendering. outs are variables that are sent to the next stage. The last important qualifier that isn't mentioned above is in, which takes variables from a previous stage. Every file must also contain a main function that acts as the entry point for the computation. In this case we perform a texture read at the current fragment position and store it into the output color - effectively copying it.

The way in which outputs and inputs differ should become apparent here. Inputs are declared as uniforms, which need to be set by the application before each rendering call. The outputs on the other hand are part of the framebuffer target, which is much more implicit. A frame buffer is a special kind of object that points to a set of textures to which the rendering is output. The default frame buffer will draw to your window so that things can actually be seen. This discrepancy between inputs and outputs is a bit of a pain, but it is simply a consequence of how OpenGL evolved. If I had to design a graphics API like this today I would definitely treat input and output buffers in a much more similar manner.

Whatever the case may be, we have the ability to fix this in Trial by giving the user a uniform view over both inputs and outputs to a shader stage. For instance, it would be nice if we could simply say something like the following:

(define-shader-pass renderer ()
  ((:output color)))

(define-shader-pass blur ()
  ((:input previous)
   (:output color)))

Then when we need to define an actual shader pipeline we could just do something like this:

(let ((pipeline (make-pipeline))
      (renderer (make-instance 'renderer))
      (blur (make-instance 'blur)))
  (connect (output color renderer) (input previous blur) pipeline)
  (pack pipeline))

What you see above, you can do in Trial, almost exactly. The syntax is a bit different, but the concept is much the same. The pack step analyses the graph formed by the pipeline and automatically allocates the minimal number of texture buffers needed, re-using them where possible. This is especially useful if you have a long chain of post-effects.

As the texture allocation system is implemented right now it's unfortunately pretty suboptimal. There's some severe limitations that I need to fix, but I have a good idea on how to do that, so I'll probably get around to doing that soon. Still, for most of my work so far the current approach has sufficed, and the general idea is clear.

With the pipeline and allocation system in place, one half of the equation is solved. The other half requires a bit more ingenuity. So far the passes we've looked at did one of two things: either simply render the scene to a buffer, or perform some kind of buffer to buffer transformation like blurring. However, most actually interesting shaders will want to do more complex things than that - they'll want to not only render objects, but also influence how they are rendered.

But, hold on, don't the objects themselves decide how they are rendered? And now the pass should suddenly decide what's going on? Well yes, and yes. Some passes will want to completely influence how every object is drawn, some will want to influence how objects are drawn, and yet some will not care how the objects are drawn at all. Allowing this is the second part of the shader pipeline protocol. As you know, modelling protocols with CLOS is my favourite thing to do, so strap in.

(define-shader-entity shader-pass () ())

(defgeneric register-object-for-pass (pass object))
(defgeneric shader-program-for-pass (pass object))
(defgeneric make-pass-shader-program (pass class))
(defgeneric coerce-pass-shader (pass class type spec))

The above class definition is not quite accurate, but for simplicity's sake I've shortened it to something that will be clear enough for our purposes here, and will link back to the previous entry appropriately. Each shader pass is a shader entity, and as such allows you to attach shader code to the class definition.

The first function, register-object-for-pass, is used to notify a shader pass about an object that would like to draw in the current scene. This is necessary for any shader pass that would like to influence the drawing of an object, or leave it up to the object entirely. In both of those cases, the pass will compute a new effective shader for the object and compile that into a shader program to use. This program can then be retrieved by the object with shader-program-for-pass. Thus, whenever an object is painted onto a stage, it can use this function to retrieve the shader program and set the necessary uniforms with it. Stages that would like to supplant their own rendering method can simply ignore these two functions.

The make-pass-shader-program function is responsible for combining the shader programs from the given class and pass, and to produce a useful shader program resource object as a result. This function, by default, simply does some book keeping and calls coerce-pass-shader to do the actual combination. coerce-pass-shader in turn merely retrieves the corresponding shader code from the pass and once again uses glsl-toolkit's merge-shader-sources to combine it with the existing shader.

That might be a bit much to put together in your head, so let's look at an example. We'll have an object that simply draws itself textured, and a shader pass that should apply a sort of fade out effect where things farther away become more transparent.

(define-shader-entity cube (vertex-entity textured-entity) ())

(define-shader-pass fade (render-pass) ())

(define-class-shader (fade :fragment-shader)
  "out vec4 color;
void main(){
  color.a *= (1 - gl_FragCoord.z);

(defmethod setup-scene ((main main) scene)
  (enter (make-instance 'fade) scene)
  (enter (make-instance 'cube) scene))

Similar to how we did it in the previous entry, we define a new entity that is composed of vertices and has textured surfaces. I omitted some details like what vertex data and textures to use to make it simpler. Then we define a new shader pass, which is based on the existing render-pass that is a per-object-pass with a default color and depth output. Next we have to add the shader logic to our pass. Doing what we want is actually quite simple - we simply multiply the colour's alpha value with the depth value. The depth needs to be inverted, as by default 1 is far away and 0 is close by. Thus, with this shader, the farther away the fragment is, the more transparent it will now become.

Finally we define a method that hooks into the rest of Trial and establishes a scene to be drawn. We first enter a new instance of our fade pass into it, which will automatically get added to the underlying shader pipeline. We then add an instance of the cube to it, at which point several things will happen:

  1. The cube is added to the scene graph
  2. register-object-for-pass is called on the pass and cube instances.
  3. This will call determine-effective-shader-class on the cube. This function is used as a caching strategy to avoid duplicating shaders for classes that don't change anything from their superclass.
  4. If the determined class is not yet registered, make-pass-shader-program is called.
  5. make-pass-shader-program computes the effective shader definitions for each shader type.
  6. coerce-pass-shader is called for each type of shader definition that is used, and combines the object's shader source with that of the pass.
  7. The resulting shader program resource object is returned and stored in the pass.

The fragment shader produced by coerce-pass-shader will look something like this:

in vec2 texcoord;
out vec4 color;
uniform sampler2D texture_image;

void _GLSLTK_main_1(){
  color *= texture(texture_image, texcoord);

void _GLSLTK_main_2(){
  color.a *= (1 - gl_FragCoord.z);

void main(){

Again, this neatly combines the two effects into one source file that can be transparently used in the system. Once it comes to actually drawing the scene, something like the following course of events is going to take place:

  1. paint is called on the scene with the pipeline.
  2. paint-with is called on the fade pass with the scene.
  3. paint is called on the cube with the fade pass.
  4. shader-program-for-pass is called on the fade pass and cube. The previously cached program is retrieved and returned.
  5. The textured-entity part of the cube uses the shader program to set the texture uniform.
  6. The vertex-entity part of the cube uses the shader program to set the transformation matrices.
  7. The vertex-entity part of the cube calls the OpenGL vertex drawing function to finally run the rendering process.
  8. The color output from the fade pass is blitted onto the window for the user to marvel at.

And that is pretty much the entire process. Obviously some steps will be repeated as needed if you add more objects and passes. The pipeline system ensures that the passes are run in the correct order to satisfy dependencies. In case you're wondering about the paint and paint-with inversion, the latter function is useful so that the shader pass gets to have absolute control over what is rendered and how if it so desires. For instance, it could completely ignore the scene and draw something else entirely.

That about concludes the explanation of Trial's shader system as it stands today. As I've mentioned over these two entries there are a few minor places where improvements have to be made, but overall the design seems quite solid. At least I hope I won't suddenly realise a severe limitation somewhere down the line, forcing me to rethink and rewrite everything again.

Next I think I will take a bit of a break from talking about engine details, and instead talk about geometry clipmaps and the adventure I had trying to implement them. Until then~

Quicklisp newsQuicklisp implementation stats for 2017+

· 158 days ago
Here are some raw HTTP request stats for each CL implementation supported by Quicklisp from 2017-01-01 onward. Each request corresponds roughly to downloading a single project.
  • SBCL: 4,518,774 requests
  • Clozure CL: 488,057 
  • CLISP: 34,767
  • LispWorks: 25,700
  • ABCL: 23,913 
  • Allegro: 19,501
  • ECL: 19,027
  • CLASP: 3,335
  • CMUCL: 965
  • MKCL: 79
  • Scieneer: 0
I gathered this info to check Scieneer activity levels to plan future support. Turns out nobody with Scieneer has used Quicklisp since 2013. Bummer!

For older items, see the Planet Lisp Archives.

Last updated: 2018-07-11 15:34