Scott L. Burson: FSet 1.4.0 released (repost)

· 10 days ago

[Reposting so it will show up at the top of Planet Lisp]

Greetings FSet users,

For several years I was too busy to do much with Common Lisp, but having left my last job a few months ago, I am now working on a project in CL.  I'm using FSet, of course, and so I've been reminded that it needed some TLC; there were some bugs to fix, and the documentation was very old and possibly hard to find.  So I've put some time into it and prepared a new release.

The first thing I did was to review all the commits Paul Dietz made back in 2020.  These were more extensive than I had realized; he greatly expanded the test suite and fixed a number of bugs.  I have tried to thank him for his work, but he seems to have retired from GrammaTech; I have not been able to reach him.  If anyone is in touch with him, please convey my thanks.

One bug Paul noticed but didn't fix, probably because he thought someone might be depending on the current behavior, was that compare on maps and seqs was not comparing the default; if two maps or seqs had the same contents but different defaults, they would nonetheless be reported as equal.  There is indeed a chance of breaking existing code by fixing this, but I think it's small; in any case, I've decided to risk it — the behavior was clearly a bug.

The only other possibly breaking change I've made is to revamp the APIs of list-relation and query-registry.  I wrote these classes some time ago, specifically for the project I was working on (and have now resumed); they're not well documented, and I'll be surprised if anyone is using them, especially in the case of query-registry.  If I'm wrong and you are using them, post a comment and I'll explain how to convert your code, if it's not obvious.  (I did remove some methods from query-registry that I was no longer using; I can restore them if necessary.)

I've also collected the FSet documentation into one place, and freshened it a little.

As part of this work I have also updated Misc-Extensions, which contains some macros that I like to use (and are used in FSet).  In particular, I made some improvements to GMap, my iteration macro (we all have our own iteration macros, right?), and wrote a README for the system that should make it a lot easier for people to see what's in it.

 

Joe Marshall: Monkeys vs. Shakespeare

· 16 days ago

Google's golang was designed for mediocre programmers that aren't “capable of understanding a brilliant language”. It omits features that are hard to understand or hard to use, like conditional expressions, macros, exceptions, and object systems.

Lisp, on the other hand, was originally designed for smart programmers in order to research artificial intelligence. It contains language constructs like conditional expressions, a powerful macro system, a highly customizable error handling system, and a sophisticated object system.

A million monkeys typing on a million keyboards will eventually produce the works of Shakespeare. Lisp is a language for Shakespeares, while golang is a language for monkeys.

What is the fundamental problem we are solving? If the problem is simply an insufficient amount of software, then we need to write as much software as possible as rapidly as possible. Hire more monkeys. If the problem is gainfully employing all the monkeys quickly, give them crude tools that are easy to use. But if the problem is lack of quality software, then we need the tools to write the best software possible. Hire more Shakespeares and give them powerful tools.

Hiring monkeys is easier and cheaper than hiring Shakespeares. So why not just keep hiring monkeys until you have a Shakespeare equivalent? Experience shows how well this works. Just think of all the projects that were brought in on time and under budget when they doubled the number of monkeys. Just think of any project that was brought in on time and under budget when they doubled the number of monkeys. Surely there must be one?

Joe Marshall: A YouTuber's First Look at Lisp

· 22 days ago

There is a guy who is making a series of videos where he takes a “first look” at various programming languages. I watched his “first look” at Lisp. Now he wasn’t going into it completely cold — he had looked at Scheme previously and had seen some Lisp — but it was still an early experience for him.

He gave it a good go. He didn’t have a pathological hatred of parentheses and seemed willing to give it a fair shake. He downloaded SBCL and started playing with it.

Since he had allocated only an hour to it, he didn't cover much. But he encountered a few pitfalls that Lisp newbies often fall into. I found myself wanting to talk back at the screen and point out where he was going wrong when simple typos were causing errors.

For example, he was trying to format some output, but he forgot the closing double quote on the format string. The reader kept waiting for him to finish the string and simply gobbled up the format args and closing parens as part of the string. He was confused as to why nothing was happening. When he finally typed Control-C, he was faced with a spew of stack trace and a debugger prompt that was of no help to him.
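
To make the trap concrete, here is a hypothetical reconstruction (not the actual code from the video): with the closing double quote missing, the argument and the closing paren are read as part of the string, and the reader just sits there waiting for more input.

;; Roughly the kind of typo described above (shown as a comment because it is
;; deliberately malformed):
;;   (format t "The answer is ~a~% answer)
;; The corrected call, for comparison:
(format t "The answer is ~a~%" 42)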

A second problem he encountered was when he typed (print (first (*mylist*))). Any experienced Lisp hacker will obviously see the problem right away: *mylist* is not a function. But the error message was buried in a 30 level deep stack trace that wound back through the reader and REPL and completely obscured the problem.

The Lisp debugger is amazing, but it can be a bit overwhelming to a newbie. When I was TAing Lisp classes, I would often find that students were intimidated by the debugger. They felt that while they were in the debugger there was some sort of impending doom hanging over them and that they had to exit and get back to the normal REPL as quickly as possible. I think if I were teaching a Lisp course now, I would start off by deliberately causing errors and then showing students how to use the debugger to find the problem and recover from it.

PLT Scheme has an interesting “beginner” mode that disables some of the more advanced features of Lisp. Many typos in Lisp cause errors not because they are invalid, but because they mean something advanced but unintended. For example, the beginner mode disables first-class functions, so accidentally putting in extra parentheses becomes a syntax error instead of an attempt to funcall something inappropriate. Perhaps a “training wheels” approach would work with Lisp beginners.

But SBCL generates pretty good error messages. It was a case of TLDR. The reviewer just didn't take the time to read and understand what the error message said. He was too busy trying to figure out how to get back to the REPL.

All in all, the reviewer gave it a good go and came away with a positive impression of Lisp. He also took the time to research the language and provided some examples of where Lisp is used today. The video is at https://www.youtube.com/watch?v=nVhzHu2zSEQ if you are interested.

Marco Antoniotti: Helping HEΛP Again!

· 29 days ago

In a flurry of ... free time, I also went back to HEΛP and fixed a few bugs that were exposed by some of the things I did with CLAST. Recording the documentation strings from the pesky

  (setf (documentation 'foo 'function) "Foo Fun!")
  

is now working as expected, at least at top-level and within PROGN-like constructs, e.g., EVAL-WHEN.

Meanwhile, I also updated the documentation and the web page adding a few caveats about how to run the DOCUMENT function, and how to work around issues I have seen in my (not so) extensive tests.

Of course, I put my money where my mouth is: the HEΛP documentation web pages are built with HEΛP.


(cheers)

Marco Antoniotti: CLAST reworked

· 30 days ago

Prompted by a post on one of the various Common Lisp fora, I finally got my act together and went back to CLAST, i.e., the Common Lisp Abstract Syntax Tree library that I had in the works for ... some time.

The library has an interesting origin, which I will recount in a different post. Suffice to say that eventually I needed a code walker which did a few complicated things. NIH syndrome immediately kicked in.

The main thing I needed was functions for inspecting code, as in the example below.

cl-prompt> (clast:find-free-variables '(let ((x 42)) (+ x y)))
(Y)

To achieve this (apparently) simple goal, a (mostly) portable environment library had to be developed and a full set of AST node structures had to be provided.

The result is now finally ready for prime time. In LispWorks you can also see how things actually get parsed. As an example, the picture below shows the result of the following command.

cl-prompt> (clast:parse '(loop for i in '(1 2 3)
                               count (oddp (+ qd i)) into odds))
#<LOOP-FORM  24ECE77F>
NIL

Please try it, report bugs, blast my design choices and suggest improvements.

Thank you


(cheers)

Joe Marshall: Goto not that harmful

· 30 days ago

I use goto all the time. When I have a loop, I goto the top of the loop after each iteration. When a function delegates to another function, I just goto the other function.

Gotos that just transfer control have the problem that the context is implicit. It isn’t obvious from the code what parts of the context are expected to persist across the control transfer. But if you explicitly pass arguments along with your control transfer, then you can see exactly what is carried across the control transfer.

A tail call isn’t “optimized” to a goto, it is a goto. It is a goto that passes arguments. Gotos that pass arguments aren’t harmful.
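
A minimal sketch of what “a goto that passes arguments” looks like in practice (my example, not from the post): the tail call to ITER jumps back to the top of the loop, and the arguments I and ACC are exactly the context that survives the transfer.

(defun sum-to (n)
  "Sum the integers from 1 to N using a tail call as the loop's goto."
  (labels ((iter (i acc)
             (if (> i n)
                 acc
                 (iter (1+ i) (+ acc i)))))  ; the "goto" back to the top, with arguments
    (iter 1 0)))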

Paolo Amoroso: Adding an Exec command and File Browser support to Insphex

· 31 days ago

I implemented the last features originally planned for Insphex, my hex dump tool in Common Lisp for Medley Interlisp.

The first new feature is an Exec command for invoking the program. The command HD works the same way as the function INSPHEX:HEXDUMP and accepts the same arguments, a file name and an optional boolean flag to indicate whether the output should go to a separate window:

← HD FILENAME [NEWIN-P]

The other feature is the addition to the File Browser menu of the Hexdump command, which shows the hex dump of the selected files in as many separate windows:

Hexdump File Browser command of the Insphex hex dump tool on Medley Interlisp.

For other commands that produce output in windows, the File Browser lets the user view one window at a time, with menu options for skipping through the windows. Insphex doesn't do anything so elaborate, though.

Implementing the features was easy as the relevant Interlisp APIs are well documented and I have experience with adding an Exec command to Stringscope.

The Medley Lisp library modules manual covers the File Browser API from page 115 of the PDF, with the explanation of how to add commands on page 118. It's as simple as registering a callback function the command invokes, INSPHEX::FB-HEXDUMP for Insphex.

An issue I bumped into is that instead of 4 arguments as the manual says, the callback actually requires 5. The last, undocumented argument was likely introduced since the publication of the manual.

#insphex #CommonLisp #Interlisp #Lisp


Joe Marshall: Embrace the Suck

· 32 days ago
The key point here is our programmers are Googlers, they're not researchers. They're typically fairly young, fresh out of school, probably learned Java, maybe learned C or C++, probably learned Python. They're not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.
— Rob Pike

How sad. Google used to hire top-notch programmers. Now it appears that they have hired a bunch of mediocre programmers who can't even learn how to properly use exceptions or ternary conditionals. Programmers that need “training wheels” on their C.

And how insulting to Googlers to be told that they are not capable of understanding a brilliant language. To be seen as merely a fungible resource to be “used” to build software. I'm sure some of them are capable of understanding a brilliant language. I'm sure that some are capable of learning how to use Haskell or OCaml or Clojure (or F# or Rust or Erlang or Scala), whatever language is best for the task.

Does success come to those that strive for excellence or those that embrace mediocrity?

Scott L. Burson: On the time complexity of functional collections

· 36 days ago

Clojure made functional collections popular.  Rich Hickey, its inventor, deserves a lot of credit for that.  However, he also propagated an inaccurate way of describing their time complexity on several common operations such as looking up a key in a map.  I don't know exactly what phrase he used at first, but I've seen people describe the time complexity of these operations as "near-constant" or "effectively constant", or sometimes shouting: "effectively constant".  He also seems to have originated the practice I see in the Clojure community of speaking as if the base of the logarithm mattered: "O(log32 n)".  (The "32" should be a subscript, but I don't see an affordance for subscripts in this Blogger UI.)

All of these locutions are wrong.  The only correct way to describe the time complexity of the operations in question is as "O(log n)" or "logarithmic time" ("log time" for short).  Time complexity describes how the time to perform the operation grows as the size of the input (in this case, the collection) grows without bound.  Because the Hash Array-Mapped Trie (HAMT) — the very clever data structure invented by Phil Bagwell — is a tree, the worst-case time to access an element in the tree must be proportional to the depth of the tree, which is proportional to the logarithm of the number of elements (provided that the tree is balanced, which it will be if the hash function is well distributed).  The base of the logarithm is the radix (branching factor) of the tree, which in Clojure's case is 32, but this has no bearing on its time complexity; as everyone knows, logarithms of different bases differ only by a constant factor, and big-O notation ignores constant factors.
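
To make the constant factor concrete (my arithmetic, not from the post): log32 n = (log2 n) / (log2 32) = (log2 n) / 5, so a map of a million entries has depth about log32 10^6 ≈ 4, versus log2 10^6 ≈ 20 for a binary tree.  The factor of 5 is real and welcome, but it is a constant, so both are O(log n).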

I think part of what is going on here is a bit of confusion between the time complexity of an algorithm and its real-world performance.  Consider this sentence from Hickey's HOPL 2020 paper, A History of Clojure:

Performance was excellent, more akin to O(1) than the theoretical bounds of O(logN).

You don't find the time complexity of an algorithm by measurement, but by analyzing the algorithm.  While it's not 100% clear, this sentence certainly gives the impression that he didn't quite understand that.

Let me speculate a little.  The performance of a lookup on a map, implemented as an HAMT, using string keys, has two components: the time to hash the key, and the time to walk the HAMT, using the hash value, to find the map entry containing that key.  I'm going to guess that for the string keys that Rich tried in his testing, the tree-walking time was less than or comparable to the string-hashing time up to a depth of maybe 3 or 4, or maybe larger.  32^4 is 1,048,576, which might be larger than any map he tested; so it's entirely plausible that he just didn't test any collections large enough to see the logarithmic behavior emerge.

If that's right, it certainly speaks well for the performance of the HAMT design.  Let me acknowledge at this point that Rich also had a marketing problem to deal with: he had to convince potential Clojure users that its functional collections would not make their programs unusably slow.  O(1) or "near-constant" certainly sounds better than O(log n).  I can understand the temptation he faced.

But again: time complexity is about how the time grows as the size of the input grows without bound.  And clearly, in this case, there will be some point at which the tree-walking time will begin to be larger than the hashing time.  This will happen sooner for short keys than long ones, and soonest if the keys are integers hashed by the identity function (or maybe by folding a 64-bit integer into a 32-bit hash; probably one or two instructions).  But it will happen.

— That is, it will happen as long as the algorithm doesn't run out of hash bits.  Clojure uses a 32-bit hash; since each tree level consumes 5 bits, that gives it 6.4 levels.  As the tree starts to fill up, the number of collisions will begin to become significant.  I'm not sure what Clojure does with collisions.  Bagwell suggested rehashing to obtain more bits, but I don't know that Clojure does that; it might just do linear search over collision buckets.  In the latter case, the time complexity would actually be linear (O(n)) rather than logarithmic; the linear behavior won't begin to emerge until the map has billions of entries, but again, time complexity isn't about that.

The other point worth making here is that while time complexity is an important fact about the performance of an algorithm, it is not the only important fact.  The amount of time it takes on small instances can also matter; depending on the use case, it can be more important than the time complexity.  There are algorithms in the CS literature (called "galactic algorithms"; TIL!) which have state-of-the-art time complexity, but are not used in practice because their constant factors are too large (I guess in practice this means they have complicated initializations to perform before getting to the meat of the computation).

None of this is intended as a criticism of Hickey's choice of HAMTs for Clojure.  The only reason FSet doesn't use HAMTs is that I wasn't aware of their existence when I was writing it.  Probably I will rectify this at some point, though that's not a trivial thing to do because the change can't be perfectly compatible; FSet's trees are comparison-based, while HAMTs are hash-based, requiring a change to how user-defined classes are interfaced to the library.  Still, I expect HAMTs would be substantially faster in many applications.

Joe Marshall: Decode a Float (Solution)

· 40 days ago

We can multiply or divide a floating point number by 2 without changing the bits in the mantissa. So we can rescale the number to be in the range [1, 2) by repeatedly multiplying or dividing by 2. The leftmost bit of the mantissa is always 1, but the next bit determines whether the number is in the top half of the range or the bottom half. So if the number is equal to or greater than 1.5, the next bit is 1, otherwise it is 0.

If the number is in the range [1, 2), we can subtract 1 from it. The remaining bits will be shifted left until the leftmost bit is 1. If the number is greater than or equal to 1.5, then subtracting 1 will shift the bits left by one. But if the number is in [1, 1.5), we won’t know how many zero bits will be shifted out when we subtract 1. What we’ll do is add .5 to the number to turn on the next bit, then subtract 1 to shift the bits left by one.

Here is the code in Common Lisp:

(defun float->bits (float)
  ;; The hidden leading 1, followed by the 52 explicit fraction bits of a double.
  (cons 1 (read-bits float 0)))

(defun read-bits (float count)
  (cond ((>= count 52) nil)
        ((>= float 2.0d0) (read-bits (/ float 2.0d0) count))  ; rescale into [1, 2)
        ((< float 1.0d0) (read-bits (* float 2.0d0) count))
        ((>= float 1.5d0) (cons 1 (read-bits (- float 1.0d0) (1+ count))))
        (t (cons 0 (read-bits (- (+ float .5d0) 1.0d0) (1+ count))))))

Note that all these operations are exact and do not cause round off.
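
A quick check of the code above (my example, not from the post): 6.5d0 is 1.101 × 2^2 in binary, and (float->bits 6.5d0) returns a list beginning (1 1 0 1 0 0 ...), the hidden bit followed by the fraction bits.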

Joe Marshall: Decode a Float

· 40 days ago

The leftmost bit of any positive binary number is always 1. So if you were to left-justify a positive binary number, the top bit would always be 1. If the top bit is always 1, there is no need to implement it. Floating point numbers use this trick.

You can determine the bits of an integer using only arithmetic operations by repeatedly dividing by two and collecting the remainders. Today’s puzzle is to determine the bits of a floating point number using only arithmetic operations (no decode-float or integer-decode-float).
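
For reference, here is a minimal sketch of the integer version described above (my code, not part of the puzzle): FLOOR returns both the quotient and the remainder of dividing by two, and the remainders are the bits, least significant first.

(defun integer->bits (n)
  "Return the bits of the non-negative integer N, least significant first."
  (if (zerop n)
      '()
      (multiple-value-bind (quotient remainder) (floor n 2)
        (cons remainder (integer->bits quotient)))))

;; (integer->bits 6) => (0 1 1)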

Scott L. Burson: FSet 1.4.0 Released

· 48 days ago

Greetings FSet users,

For several years I was too busy to do much with Common Lisp, but having left my last job a few months ago, I am now working on a project in CL.  I'm using FSet, of course, and so I've been reminded that it needed some TLC; there were some bugs to fix, and the documentation was very old and possibly hard to find.  So I've put some time into it and prepared a new release.

The first thing I did was to review all the commits Paul Dietz made back in 2020.  These were more extensive than I had realized; he greatly expanded the test suite and fixed a number of bugs.  I have tried to thank him for his work, but he seems to have retired from GrammaTech; I have not been able to reach him.  If anyone is in touch with him, please convey my thanks.

One bug Paul noticed but didn't fix, probably because he thought someone might be depending on the current behavior, was that compare on maps and seqs was not comparing the default; if two maps or seqs had the same contents but different defaults, they would nonetheless be reported as equal.  There is indeed a chance of breaking existing code by fixing this, but I think it's small; in any case, I've decided to risk it — the behavior was clearly a bug.

The only other possibly breaking change I've made is to revamp the APIs of list-relation and query-registry.  I wrote these classes some time ago, specifically for the project I was working on (and have now resumed); they're not well documented, and I'll be surprised if anyone is using them, especially in the case of query-registry.  If I'm wrong and you are using them, post a comment and I'll explain how to convert your code, if it's not obvious.  (I did remove some methods from query-registry that I was no longer using; I can restore them if necessary.)

I've also collected the FSet documentation into one place, and freshened it a little.

As part of this work I have also updated Misc-Extensions, which contains some macros that I like to use (and are used in FSet).  In particular, I made some improvements to GMap, my iteration macro (we all have our own iteration macros, right?), and wrote a README for the system that should make it a lot easier for people to see what's in it.

 

Joe Marshall: D-day, 80 years ago today

· 50 days ago
More than 150,000 troops, 5,000 ships, 13,000 aircraft in the largest amphibious assault in history.

Joe Marshall: Multithreading and Immutable Data

· 51 days ago

I was amusing myself by looking at Lisp tutorials. They used the idea of a Tic-Tac-Toe service as a motivating example. You’d be able to play Tic-Tac-Toe against the computer or another opponent.

My immediate thought went to the issue of multithreading. If you were going to serve hundreds of people at once, you’d need to have a multi-threaded service. Multi-threaded code is hard to write and debug, and it is much better if you have a plan before you start than if you try to retrofit it later (that trick never works).

The magic bullet for multi-threading is immutable data. Immutable data is inherently thread-safe. It doesn’t need synchronization or locks. If all your data are immutable, you can pretty much ignore multi-threading issues and your code will just work.

Using a 2D array to represent a Tic-Tac-Toe board is the obvious thing that first comes to mind, but not only are arrays mutable, they virtually require mutation to be of any use. The Lisp tutorials I was looking at all used arrays to represent the board, none of them locked the board or used atomic operations to update it, and all had the potential for race conditions if two threads tried to update the board at the same time. Arrays are essentially thread-unsafe.

I thought about alternative representations for the board. Different representations are more or less amenable for writing code that avoids mutation. I came up with a few ideas:

  • Use a 2d array, but copy it before each mutation. This is horribly inefficient, but it is simple.
  • Use a 1d array, again copying it before each mutation. This isn’t much different from the 2d array, but iterating over the cells in the board is simpler.
  • Keep a list of moves. Each move is a pair of player and position. To determine the state of the board, you iterate over the list of moves and apply them in order. This is a bit more complicated than the array representations, but it is inherently immutable. It also has the advantage that you can rewind the board to any prior position.
  • Encode the board as a pair of bitmaps, one for each player.
  • Encode the board as a single bitmap, with each cell represented by two bits.
  • There are only 3^9 ways to fill out a Tic-Tac-Toe grid, so you could represent the board as an integer.

Each one of these representations has pros and cons. I wrote up some sample code for each representation and I found that the representation had a large influence on the character of the code that used that representation. In other words, there wasn’t a single general Tic-Tac-Toe program that ended up being specialized to each representation, but rather there were six different Tic-Tac-Toe programs each derived from its own idiosyncratic representation.
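
As an illustration of how much the representation shapes the code, here is a minimal sketch of the “list of moves” version (the names are mine, not from the post): making a move conses a new pair onto the front of the list, so every earlier board state remains intact and can be shared freely across threads.

(defun make-board ()
  "An empty board is just an empty list of moves."
  '())

(defun make-move (board player position)
  "Return a new board with PLAYER's move at POSITION added; BOARD is unchanged."
  (cons (cons player position) board))

(defun cell-contents (board position)
  "Return the player occupying POSITION, or NIL if it is empty."
  (car (find position board :key #'cdr)))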

In conclusion, it is a good idea to plan on using immutable data when you might be working with a multi-threaded system, and it is worth brainstorming several different representations of your immutable data rather than choosing the first one that comes to mind.

Scott L. Burson: Functional Collections in Programming Languages

· 53 days ago
As I was updating the documentation for FSet, my functional collections library for Common Lisp, I wondered about the history of functional collections in programming languages.  I have found a couple of interesting early examples, but before I present those, I want to address this question: exactly what does it mean for a collection type (or indeed, any type) to be "functional" in an imperative language with assignment?

When I speak of "functional" types in imperative languages, I mean more specifically types with value semantics as opposed to reference semantics.  The distinction is more subtle than that between mutable and immutable types, which it is often conflated with.  Let's consider for a moment a simple type that we all understand: integers in C.  For example:

int a = 2;
int b = a;
++a;

After execution of these statements, a will be 3 and b will be 2.  Note in particular that the assignment of a to b is a value assignment: it copies the value of a into b; it does not make b an alias of a.  If it had made b an alias of a, then b would also have been 3 afterwards.

For contrast, consider this example in Java:

int[] a = {15, 37};
int[] b = a;
a[0] = 42;

After this, both a and b will contain {42, 37}.  Here the assignment is a reference assignment: b doesn't just wind up with the same array value as a; instead, it gets aliased to a, so that changes made to either one are reflected in the other.

These observations suggest a definition of the distinction between value semantics and reference semantics: if a type has value semantics, then assignment of it does not cause aliasing, while with reference semantics, assignment does cause aliasing.

Now let's look at a C++ example:

std::string a = "foo";
std::string b = a;
a.insert(1, "x");

Particularly if you don't know C++, I invite you to puzzle over this for a moment.  What is the value of b?  In fact, it is "foo"; the assignment is value assignment, implemented by copying the contents of the string, so the a.insert operation affects only a.  Strings in C++ have value semantics, as indeed do the STL collection types (vector etc.).  Of course, in C++, you can create a reference or pointer to any type, so you can always get reference semantics when you want it, even for built-in types such as integers.

So now I have a couple of questions for you.  First, are C++ strings mutable?  Given that they have operations like insert, one would be inclined to call them mutable, don't you think?  Let's say that they are.  Okay, then are C/C++ integers also mutable?  We don't usually think of integers as being mutable; we usually think of an operation like ++a as assigning a new value to a, not as incrementing a mutable integer object.  But as we see here, integers and strings have the same behavior vis-a-vis assignment and modification; if we consider strings to be mutable, seems like we have to consider integers mutable as well.

But my purpose isn't really to get you to call integers mutable.  My point is, rather, that the mutable/immutable distinction doesn't capture everything that's going on here.  The more useful distinction is between value semantics and reference semantics.  When I speak of "functional collections", I mean that they have value semantics, not necessarily that they have nothing that looks like a mutating operation.

So the question I wondered about was, what were the first programming languages to provide collections with value semantics?

A case can be made that Lisp was the first, in 1958.  Although Lisp lists are mutable, they are usually treated as immutable once fully constructed.  For example, a function constructing and returning a list might call nreverse on it just before returning it, to destructively reverse it (a common idiom, because constructing a list using cons starts at the end); but usually, the caller of such a function, having received the returned list, would not subsequently mutate it.  Certainly there are and have always been exceptions, but my impression is that, even in Lisp's early years, significant bodies of code were written in which the vast majority of lists were treated functionally (once fully constructed).
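
The idiom looks like this (a generic example, not from any particular codebase): the list is built by pushing onto the front, which reverses it, and nreverse destructively fixes the order once, just before the list is handed to the caller, after which it is treated as immutable.

(defun squares (n)
  "Return a list of the squares of 0 through N-1."
  (let ((result '()))
    (dotimes (i n (nreverse result))
      (push (* i i) result))))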

But the first language in which collections were treated functionally by definition, rather than by usual convention, appears to have been APL in 1966.  Interestingly, given the very great difference the choice of value or reference semantics makes to the way in which one writes programs in a language, I can't find a clear statement in the APL manual that I'm looking at as to which semantics APL uses for its arrays.  It seems to be something that people expect to go without saying (one reason I'm writing about it).  But I downloaded and compiled GNU APL and tried it out, and, sure enough, it uses value semantics (the left arrow is the assignment operator; user input is in bold):

V←1 2 3 4
V
1 2 3 4
W←V
V[2]←7
V
1 7 3 4
W
1 2 3 4

Another early value-semantics language was SETL.  From Robert B. K. Dewar's The SETL Programming Language (1979):

One important point is that SETL treats tuples as values when it comes to assignments.  Consider the following sections of code:
abc := 12;
cde := abc;
abc := abc + 2;    $ cde still = 12

abc := [1,2,3];
cde := abc;
abc(2) := 0;       $ cde still = [1,2,3]

In SETL the two sequences have similar effects.  If you expected cde to change in the second sequence, then study it carefully.  If not, then you have the correct idea already.

Here, just as I have done above, Dewar is demonstrating how the value semantics of SETL tuples is like that of integers.

SETL strongly influenced a little-known, proprietary language called Refine, which I worked in from 1987 to 2003.  Refine was originally developed at Kestrel Institute; it had functional collections and was implemented on top of Common Lisp.  It was my experience with Refine that motivated me to write FSet.

Other languages with functional collections appeared in the 1980s and 1990s, including the ML family (primarily Standard ML and OCaml) and Haskell.  No doubt there were others of which I am not aware.  But the language that has probably done the most to popularize functional collections is Rich Hickey's Clojure.

All of which brings me back to FSet.  FSet, of course, has value semantics:

FSET-USER> (isetq x (map ('a (seq 47 33)) ('b (seq 17 3 99))))
#{| (A #[ 47 33 ]) (B #[ 17 3 99 ]) |}    ; a map whose range values are seqs
FSET-USER> (isetq y x)
#{| (A #[ 47 33 ]) (B #[ 17 3 99 ]) |}
FSET-USER> (setf (@ (@ x 'a) 1) 1)        ; sets element 1 of the seq for 'a
1
FSET-USER> x
#{| (A #[ 47 1 ]) (B #[ 17 3 99 ]) |}
FSET-USER> y
#{| (A #[ 47 33 ]) (B #[ 17 3 99 ]) |}

(Here isetq is an "interactive setq" that suppresses undefined-variable warnings some Lisp implementations issue when you interactively set a previously unknown variable.)  You can see that the update to x doesn't change y; but how does FSet manage this?  Common Lisp's setf macro was designed to support such cases.  In the CLHS sec. 5.1.2.2, we see:

A function form can be used as a place [the first operand of setf] if it falls into one of the following categories:
[...]
· A function call form whose first element is the name of any one of the functions in the next figure [which are ldb, mask-field, and getf], provided that the supplied argument to that function is in turn a place form; in this case, the new place has stored back into it the result of applying the supplied "update" function.

So for instance, you can update bitfields of integer variables:

CL-USER> (setq *print-base* 16)  ; hexadecimal output makes this easier to understand
10
CL-USER> (setq x #x1000)
1000
CL-USER> (setf (ldb (byte 8 4) x) #x32)
32
CL-USER> x
1320

Since it's an integer that's being updated here, clearly setf has the ability to handle updates to types with value semantics.  If you're curious how that works, see the CLHS sec. 5.1.1.2.


Joe Marshall: Roll Your Own Syntax

· 55 days ago

Unlike most languages, Lisp represents its programs as data structures. A Lisp program is a set of nested lists. We can look at a Lisp program as a tree, with each nested list as a node in the tree. The first element of each list indicates the kind of node it is. For instance, a sublist beginning with LET binds local variables, a sublist beginning with IF is a conditional, and so on.

In most languages, it is difficult or impossible to add new node types to the syntax tree. The syntax is wired into the language parser, and even if you can add new syntax, you have to carefully modify the parser to recognize it. In Lisp, adding new node types is quite easy: you just mention them.

To give an example, suppose you wanted to add a new node to the syntax tree called COMMENT, which would have a string and a subexpression as components. You'd use it like this:

(comment "Avoid fencepost error" (+ x 1))

Here's how you could define the semantics of a COMMENT node in Lisp:

(defmacro comment (string expr)
  expr)

That's it. You can now insert arbitrary COMMENT nodes into your Lisp programs.

Compare that to what you would have to do in a language like Java to add a new kind of node to the syntax tree. You'd have to modify the parser, the lexer, the AST classes, and probably a bunch of other stuff. It's nontrivial.

For a more complex example, consider adding transactions to a language. In a language like Java, you'd have to modify the parser to recognize the new syntax, and then you'd have to modify the compiler to generate the correct code. In Lisp, you can just define a new node type:

(defmacro with-transaction (&body body)
  <complex macro elided>)

And then use it like this:

(with-transaction
  (do-something)
  (do-something-else))

Now obviously you should put some thought into doing this. Adding dozens of random new node types to the language would be a bad idea: readers of the code wouldn't be expecting them. But in some cases, a new node type can be just what is called for to abstract out complexity or boilerplate. Lisp gives you that option.

Joe Marshall: If I Were in Charge

· 59 days ago

If I were in charge of Python development, here are a few things I would do:

  • Add (optional) tail recursion. This would make it easier to write pure functional code. It would also make it possible to effectively program in continuation passing style. Making tail recursion optional should placate those that feel that stack traces are important for debugging.
  • Add macros. I am thinking of Lisp-like macros that do code transformation, not C-like macros that simply do token substitution. A good macro system would allow advanced users to create new syntactic forms for the language and provide a way to abstract boilerplate.
  • Allow a way to use statements inside expressions, or beef up the expression syntax to have exception expressions, loop expressions, etc. This, too, would make it easier to write pure functional code.
  • Optional end-of-block markers. These would allow you to automatically fix indentation errors and recover indentation when it is lost.
  • Use true lexical scoping. Changing this might break legacy code that depends on the current scoping quirks, though.
  • Use modern interpretation techniques to get the performance up to a more reasonable level. Performance doesn't matter that much, but Python is notably slow.
  • Get rid of the global interpreter lock so that multithreading works better. Probably easier said than done.

I don't believe any of these necessarily involve fundamental changes to the language. They'd just make the language more flexible, though I'm sure many people would disagree with me.

But, perhaps for the best, they're not going to put me in charge.

Paolo Amoroso: Building a GUI for Insphex

· 59 days ago

I added a GUI to Insphex, the hex dump tool I'm writing in Common Lisp on the Medley Interlisp environment.

The initial version printed the hex dump only to the standard output; now the output can optionally go to a separate TEdit window with a command menu. The menu has items for displaying the next page of output, redisplaying from the beginning of the file, and exiting the program.

Window and command menu of the Insphex hex dump tool for Medley Interlisp.

Most window, menu, and other Medley GUI facilities, like the TEdit rich text editor, provide Interlisp APIs in the IL package that Common Lisp programs such as Insphex can access. However, since the APIs usually rely on Interlisp records, from Common Lisp it's often necessary to write quite a few package qualifiers like this example to create a menu record:

(IL:CREATE IL:MENU
           IL:ITEMS IL:← '(ITEM1 ITEM2 ITEM3)
           IL:MENUFONT IL:← '(IL:MODERN 12)
           IL:TITLE IL:← "Menu"
           IL:CENTERFLG IL:← T)

The XCL:DEFINE-RECORD macro helps reduce package qualifiers by wrapping Interlisp records in equivalent Common Lisp structures with ordinary structure accessors, setters, predicates, and constructors. The structures can be in any package, not just IL like Interlisp symbols. XCL:DEFINE-RECORD is described on page 7-3 (page 143 of the PDF) of the Medley 1.0 release notes.

This way Common Lisp blends well with Interlisp and reduces verbosity. For example, this is the Insphex Common Lisp function that creates the output window:

(DEFUN CREATE-HEX-WINDOW (FILE)
   "Create and return a window to display the hex dump of FILE."
   (LET* ((IN (OPEN FILE :DIRECTION :INPUT :ELEMENT-TYPE '(UNSIGNED-BYTE 8)))
          (COMMANDS (IL:MENUWINDOW (MAKE-MENU :ITEMS '(("Next" :NEXT "Show the next page.")
                                                       ("Reread" :REREAD "Reread the input file.")
                                                       ("Exit" :EXIT "Quit the program."))
                                          :MENUFONT
                                          '(IL:MODERN 12)
                                          :TITLE "Commands" :CENTERFLG T :WHENSELECTEDFN 
                                          #'HANDLE-MENU)))
          (OUT (IL:OPENTEXTSTREAM))
          (TEDIT-PROC (IL:TEDIT OUT))
          (WINDOW (IL:WFROMDS OUT)))
     (IL:ATTACHWINDOW COMMANDS WINDOW 'IL:TOP 'IL:LEFT)
     (IL:WINDOWPROP WINDOW 'INSTREAM IN)
     (IL:WINDOWPROP WINDOW 'OUTSTREAM OUT)
     (IL:WINDOWPROP WINDOW 'BLOCK-OFFSET 0)
     (IL:WINDOWPROP WINDOW 'IL:TITLE (FORMAT NIL "Insphex ~A" FILE))
     (NEXT-HEX-PAGE WINDOW)
     WINDOW))

The INSPHEX::MAKE-MENU constructor creates a Common Lisp INSPHEX::MENU structure that wraps the Interlisp IL:MENU record.

Most of the Insphex GUI functionality is in place but I need to work on a couple of tweaks.

First, the Insphex window should be read-only whereas now the user can type into the editor buffer. Next, I need to clean up all the allocated resources when the user quits the program via various interaction flows, such as closing the window instead of clicking the Exit menu item.

#insphex #CommonLisp #Interlisp #Lisp


Joe Marshall: Exception Handling for Control Flow

· 61 days ago

Back when I was taking a Software Engineering course we used a language called CLU. CLU was an early object-oriented language. A feature of CLU was that if you wrote your code correctly, the compiler could enforce completely opaque abstract data types. A good chunk of your grade depended on whether you were able to follow instructions and write your data types so that they were opaque.

CLU did not have polymorphism, but it did have discriminated type unions. You could fake simple polymorphism by using a discriminated type union as the representation of an opaque type. Methods on the opaque type would have to dispatch on the union subtype. Now to implement this correctly, you should write your methods to check the union subtype even if you “know” what it is. The course instructors looked specifically for this and would deduct from your grade if you didn't check.

One subproject in the course was a simple symbolic math system. You could put expressions in at the REPL, substitute values, and differentiate them. The core of the implementation was a term data type that represented a node in an expression tree. A term was a type union of a numeric constant, a symbolic variable, a unary expression, or a binary expression. Unary and binary expressions recursively contained subterms.

The term data type would therefore need four predicates to determine what kind of term it was. It would also need methods to extract the constant value if it was a constant, extract the variable name if it was symbolic, and extract the operator and subterms if it was a unary or binary expression. The way to use the data type was obvious: you'd have a four-way branch on the kind of term where each branch would immediately call the appropriate selectors. But if you wrote your selectors as you were supposed to, the first thing they should do is check that they are called on the appropriate union subtype. This is a redundant check because you just checked this in the four-way branch. So your code was constantly double checking the data. Furthermore, it has to handle the case should the check fail, even though the check obviously can never fail.

I realized that what I needed was a discrimination function that had four continuations, one for each subtype. The discrimination function would check the subtype and then call the appropriate continuation with the subcomponents of the term. This would eliminate the double checking. The code was still type safe because the discrimination function would only invoke the continuation for the correct subtype.
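
The original was in CLU, but the idea translates directly; here is a sketch in Common Lisp (the names and the TERM structure are mine, not the course's): the subtype is checked exactly once, and each continuation receives just the subcomponents that are valid for that subtype.

(defstruct term kind value name operator operand left right)

(defun discriminate-term (term if-constant if-variable if-unary if-binary)
  "Check the kind of TERM once, then call the matching continuation
with that subtype's components."
  (ecase (term-kind term)
    (:constant (funcall if-constant (term-value term)))
    (:variable (funcall if-variable (term-name term)))
    (:unary    (funcall if-unary (term-operator term) (term-operand term)))
    (:binary   (funcall if-binary (term-operator term) (term-left term) (term-right term)))))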

One way to introduce alternative continuations into the control flow is through exceptions. The discrimination function could raise one of four exceptions, each with the appropriate subcomponents as arguments. The caller would not expect a return value, but would have four catch handlers, one for each subtype. The caller would use the try/except syntax, but it would act like a switch statement.

The TA balked at this use of exceptions, but I appealed to the professor who saw what I was trying to do and approved.

There is some debate about whether exceptions should be used for control flow. Many people think that exceptions should only be used for “exceptional” situations and that it is poor form to use them for normal control flow. I think they are taking too narrow a view. Exceptions are a way to introduce alternative paths of control flow. One can use them to, for instance, handle an exceptional situation by jumping to a handler, but that's not the only way to use them.

Of course you should think hard about whether exceptions are the right way to introduce alternative control flow in your use case. Exception syntax is usually kind of klunky and you may need to rewrite the code to insert the exception handling at the right point. Readers of your code are likely to be surprised or baffled by the use of exceptions for control flow. There is often a significant performance penalty if an exception is thrown. But sometimes it is just the trick.

Joe Marshall: ECS in CLOS

· 62 days ago

Object systems make a natural fit for the game programming domain, but over time people have found that the object-oriented model provided by their favorite language doesn't always fit the use case. So developers have come up with “entity-component systems” (ECS) of varying complexity to fill the gap.

The basic idea is that a game object, called an “entity”, contains or refers to a set of “components” that define its behavior. Entities are too varied to be captured by a single class hierarchy, so we abandon inheritance in favor of composition. An entity that can be displayed on the screen has a “sprite” component, an entity that can move has “position” and “velocity” components, an entity that you can attack has a “hitbox” component and a “health” component. An entity that can attack you has an “attackbox” component. You can mix and match the components to customize the behavior of the entity. We don't have a hierarchy of components because each component can be added more or less independently of the others.

An ECS is an alternative or an augmentation to the built-in object system of a language, but it is an admission that the built-in object system is insufficient.

CLOS provides an elegant way to implement an ECS without abandoning CLOS's built-in object system. We define a component as a “mixin” class that can be inherited from using multiple inheritance. We define mixin classes for each component, and then we define entity classes that inherit from the mixin classes. We would define a “sprite” mixin class, a “position” mixin class, etc. So the class of enemy entities would inherit from “sprite”, “position”, “velocity”, “hitbox”, “health”, and “attackbox” classes. A trap entity would inherit from “sprite”, “position”, and “attackbox” classes. A container entity that you could smash open would inherit from “sprite”, “position”, and “hitbox” classes. Etc.

Mixin classes aren't intended to be instantiated on their own, but instead provide slots to the classes that inherit from them. Furthermore, methods can be specialized on mixin classes so that instances derived from the mixin will respond to the method. This allows you to inherit from a mixin providing the particular desired behavior. For example, the “health” mixin class would provide a slot containing the entity's health and a “damage” method to decrease the health. Any entity inheriting from the “health” mixin will react to the damage method. Mixin classes can provide functionality similar to interfaces.
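
A minimal sketch of the approach (class and slot names are illustrative, not from the post):

(defclass health-mixin ()
  ((health :initarg :health :accessor health)))

(defclass attackbox-mixin ()
  ((attackbox :initarg :attackbox :accessor attackbox)))

(defgeneric damage (entity amount)
  (:documentation "Reduce the health of any entity that mixes in HEALTH-MIXIN."))

(defmethod damage ((entity health-mixin) amount)
  (decf (health entity) amount))

;; An enemy composes its behavior from mixins rather than from a deep hierarchy.
(defclass enemy (health-mixin attackbox-mixin)
  ())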

Mixin classes are an exception to Liskov's Substitution Principle, but they are a useful exception. Entities that inherit from a mixin do not have an “is-a” relationship with the mixin, but rather a “has-behavior-of” relationship. An entity inheriting from the “health” mixin is not a “health”, but it has the “health” behavior.

One feature of an ECS is that you can dynamically change the components of an object. For example, once an enemy is defeated, you can remove the “health” and “attackbox” components and add a “corpse” component. In CLOS you could accomplish this by changing the class of the entity object to a class that doesn't have those mixins.
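
Continuing the sketch above (again with illustrative names), the component change is just a class change:

(defclass corpse-mixin () ())

(defclass defeated-enemy (corpse-mixin)
  ())

;; (change-class some-enemy 'defeated-enemy)  ; drops the health/attackbox behavior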

Of course care must be taken or you will end up with CLOS “soup”. But if you are careful, CLOS can provide a powerful and flexible system for implementing an ECS.

Joe Marshall: Why I Want Tail Recursion

· 63 days ago

The reason I want tail recursion is not to write loops (I can do that with a while loop), but to write in continuation passing style if I need to. Continuation passing style allows you to implement any control flow pattern you can imagine, not just the ones intrinsic to the language. You don’t want to use it all the time, but it’s a valuable fallback when you need some ad hoc advanced control flow. Without tail recursion, any non-trivial use of continuation passing style risks blowing the stack.
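
A small continuation-passing sketch (my example, not from the post): the “what to do next” is an explicit argument, so every call is in tail position, and with tail calls eliminated the recursion consumes no stack; the pending work lives in the chain of closures instead.

(defun tree-sum-cps (tree k)
  "Sum the leaves of TREE, delivering the result to the continuation K."
  (cond ((null tree) (funcall k 0))
        ((atom tree) (funcall k tree))
        (t (tree-sum-cps (car tree)
                         (lambda (left-sum)
                           (tree-sum-cps (cdr tree)
                                         (lambda (right-sum)
                                           (funcall k (+ left-sum right-sum)))))))))

;; (tree-sum-cps '(1 (2 3) 4) #'identity) => 10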

Iteration is just the special case of linear recursion that doesn’t accumulate state. 99 percent of the time, you know beforehand that you are looping and can use a looping construct, but sometimes you have the general case where whether you loop or not depends on the data at runtime. If you have tail recursion, you just write the code recursively and the tail recursion mechanism will turn it into a loop if it notices that you aren’t accumulating state.

If your language has lambda and tail recursion, it can implement any other control flow that might have been overlooked by the language designer. If it doesn’t, you're limited to the control flow the language designer bothered to implement.

Joe Marshall: Rewrite rules

· 64 days ago

I like rewrite rules. They are simple to understand and easy to implement. If you have a rewrite rule (or a set of them) that makes incremental progress in making an expression "simpler" in some sense, you can keep applying it (or them) repeatedly until you have finished the calculation.

If a function directly returns the result of calling another function, we say that the call is in "tail position". We can view this as a simple rewrite rule. Consider the fact-mul function which computes the factorial of a number n and multiplies it by another number m:

fact-mul (n, m) = n! * m

We can define two rewrite rules for this function. If n is 0, then the factorial is 1, so we just rewrite this as m. If n is not 0, we can rewrite this as fact-mul (n-1, n*m). This translates directly into Lisp:

(defun fact-mul (n m)
  (if (zerop n)
      m
      (fact-mul (- n 1) (* n m))))

The rewrite rule doesn't have to rewrite into the same function. For instance, we can rewrite odd? (x) to even? (x-1) and even? (x) to odd? (x-1):

(defun odd? (x)
  (if (zerop x)
      nil
      (even? (1- x))))

(defun even? (x)
  (if (zerop x)
      t  
      (odd? (1- x))))

Yes, this is bog-simple tail recursion, but somehow it seems even simpler to me if I think of it as a rewrite rule.

Even if you don't have tail recursion, or if you have disabled it, it still works, but the number of rewrites is limited by the stack depth.

Joe Marshall: If only ...

· 66 days ago

I have a problem with Lisp. It doesn’t do what I want. Whenever I write a Lisp program, I always find myself wanting new special forms that aren’t in the language. The object system almost never does exactly what I want. The control structures never cover all my use cases.

If only I could add new syntactic forms to the language. If only I could tweak the internals of the object system. If only I could add new kinds of control structure.

Oh, wait. I can.

Joe Marshall: Maybe machine (part 2)

· 67 days ago

At the end of 6.004, we had a contest for improving the performance of the Maybe machine. There was an 8-bit ALU included in the chip set we were given, so the logical thing to do was to incorporate it into the machine in place of the shift register. This would naturally improve the performance of arithmetic operations, but there were other reasons that the Maybe machine was slow.

One of the rules of the contest was that we were not allowed to change the clock speed. But the Maybe machine had a four phase clock. The first phase would address the device driving the bus, the second phase would connect it to the bus, the third phase would clock the receiving device, and the fourth phase would keep the data asserted to obey the hold time on the logic.

The four phase clock was technically unnecessary. We had built it in this manner to illustrate set-up and hold times for the bus, but the TTL chips we were using were specifically designed to be used on an open collector bus with no hold time, so you could actually clock the bus on every other phase rather than divide by four and still be within valid hardware specs. The fundamental clock speed was not changed, but the machine would run twice as fast.

The microcode word on the Maybe machine had four parts. The first part loaded the shift register from memory. The second part operated on the shift register. The third part wrote the shift register back to memory, and the fourth part was the next microcode address. But much of a typical microinstruction was wasted time. More often than not, the shift register was loaded from the last location it was written to, so it already contained the data it needed. The load cycle was often unnecessary. If you just left the data in the shift register, it would already be there for the next instruction. A memory to memory move involved a no-op on the shift register, which took clock cycles. Advancing to the next instruction took more clock cycles.

Instead of a four part microinstruction, I made a one-part microinstruction. True, this meant that I would occasionally need more microinstructions, but each one was smaller, and I could omit the no-ops, so I could accomplish the same amount in fewer clock cycles. I eliminated the next-instruction field by replacing the microcode address register with a counter.

The original Maybe machine had a two-address microcode — each instruction would specify a separate load and store address. Most people retained the wide microcode and could specify each input to the ALU independently. But I wanted to squeeze each microinstruction into 16 bits, so I turned the machine into a one-address machine with an accumulator. This made my microcode one quarter of the size, but it made it difficult to find enough bits in the microcode to accomplish all the tasks it needed to do. I didn’t have enough bits to directly address the accumulator, so I attached the accumulator to the output of the ALU. One ALU input came from the bus, the other was fed back from the accumulator. To load the accumulator, you had to pass the data through the ALU.

There weren’t enough bits for jump instructions, but I did a hack: I incremented the instruction address half way through the instruction cycle. This way, you could read the bits of the next instruction while you were still executing the current instruction. So you could do an unconditional jump by placing the target in the next instruction. There still weren’t enough bits for a conditional jump, so the microcode address space ended up being partitioned into sixteen segments and you could only conditionally jump within the same segment. To jump between segments you had to use an unconditional jump.

I obviously had to do a complete rewrite of the microcode. My friend Bob Baldwin helped me out by writing an assembler for the new microinstruction format.

The new microcode ran substantially faster than the original. When the contest came, my machine was clearly faster than the competition. Unfortunately, it crashed on the third benchmark and I ended up disqualified. I think it was a bug involving switching between the segments of microcode, but I never had the opportunity to debug it fully. I learned a hard lesson about keeping things simple.

Joe Marshall: Maybe machine

· 67 days ago

When I took 6.004, Computation Structures, we built the “Maybe” machine, a pedagogical microcomputer made from discrete TTL. It had no CPU. It had no ALU. It had a shift register. The microcode could load the shift register from memory, test the low bit, shift the register, and store the register to memory. That was it. It was universal (Turing complete), but it was primitive. If I recall correctly, Steve Ward, Chris Terman, and Dan Nussbaum designed it.

We each built our own Maybe machine on a “nerd kit” — a large briefcase/small suitcase with a power supply, some solderless breadboards, some switches and LEDs, and a bunch of TTL chips. We had to wire up the chips ourselves. I was maybe a bit anal about wiring it up. I color coded the wires, made neat right angle bends, and ran the wires in parallel in the troughs between the breadboards. But when it came to testing the gates, this made it easier to find the wires that had ended up in the wrong place.

When we were building it, we had a test PROM with microcode that would just exercise the logic gates. It had no functional program in it. If the gates were wired correctly, certain patterns would appear on the LEDs.

At last I had it all wired up. Each gate responded appropriately to the test PROM. I double-checked to make sure I ran each and every test. I was ready to try the real microcode now. I put my EEPROM under the UV light to erase it, and then programmed it with the microcode. I verified that the PROM contained the correct bits. I popped the PROM into the socket, and powered up the machine. Nothing. The machine did not display the expected pattern of LEDs. It didn’t display anything. Crap.

I erased the EEPROM and reprogrammed it as a test PROM again. I ran all the tests to find which one was failing. All passed. I ran them again. Everything checked out. I started to think maybe a loose wire? I ran the tests again and wiggled the wires. The tests still passed.

Confused, I erased the EEPROM and reprogrammed it with the microcode. I put it back in the Maybe machine and powered it up. Nothing. I was stumped. I had no idea what was wrong. I tried wiggling the wires. I tried wiggling the chips. Nothing.

I reprogrammed the test PROM. Each reprogramming of the test PROM took at least 15 minutes as I waited for the UV light to erase the old program and then wrote in the new one. I verified one last time that all the tests ran correctly and went to consult the TA.

The TA had heard it all before. Some freshman can’t follow instructions, wires it up wrong, and is sloppy about running the tests. He was going to demonstrate the right way to thoroughly test the machine. He had a checklist of each and every test. He was going to exhaustively go through the checklist and tally each one as he went. He was going to show me how to do it right.

He was a little surprised that he didn’t find the problem right away. As each test passed, he became more and more puzzled. Finally, he got to the end of his checklist and every test had passed. So he programmed the microcode into the EEPROM and powered up the machine. Nothing.

Ok, it was a challenge. The machine had made him look foolish in front of this lowerclassman. He was going to find the problem. He reprogrammed the test PROM and ran the tests again. As he ran each test, he’d wiggle the wiring and the chips involved in the functional unit being tested. Everything was in order. But when he put in the microcode ROM, everything was dead.

We went through a couple of cycles of this, but it was getting to be dinner time so we decided to call it a night. I could only think that I was going to have to rewire the whole thing from scratch. I was dreading it. I headed back to my dorm room.

But...

It was odd that when you tried to run the microcode, although you didn’t get the expected pattern of LEDs, they did briefly flicker when you powered up the machine. The instruction after loading the LEDs was a conditional branch to itself. The LEDs would be loaded and then stay on until the conditional became true when the user pressed a button. It was almost as if the conditional was taken immediately rather than waiting for the button press. Maybe the conditional was backwards? If the conditional were backwards, the LEDs would flicker because they’d be loaded and the machine would immediately branch.

When I got to my dorm room, I opened the nerd kit and checked. Sure enough, the conditional was wired to the inverted output of a flip-flop. Every conditional branch would be taken the wrong way. The non-inverted output was the next hole over. The test PROM checked that the wire was connected, but didn’t check that it was connected to the positive sense of the conditional bit.

The next day, I went back to the lab with the fix. I reprogrammed the EEPROM with the microcode and powered up the machine. The LEDs lit up in the expected pattern and the machine ran as expected.

Tim BradshawMonochrome

· 67 days ago

Or, why limitations matter.

After we released Štar, my friend Zyni had a rather distressing and inconclusive exchange with someone on reddit. Apart from making me feel even better about walking away from reddit a decade and more ago, I realised an interesting thing when talking to her about her experience.

Limitations

Here are some curious things that people do.

  • Why, in 2024, would anyone use a digital camera which can only make monochrome pictures?
  • Why, in 2024, would anyone use a film camera?
  • Why, in 2024, would anyone record music in a studio which uses tape?
  • Why, in 2024, would any guitarist use a collection of flaky old FX pedals connected by noisy and unreliable cables, and a valve amplifier which occasionally catches fire1?
  • Why, in 2024, would anyone shoot a movie on film?

Yet people do all these things: there are at least two manufacturers of dedicated monochrome cameras, and you can pay to have your colour digital camera converted to monochrome by removing the filter array; many people use film; studios which use tape still exist and probably are becoming more common; too many guitarists to count use elderly FX pedals, a nest of cables and a valve amplifier; many very famous recent movies have been made on film.

There is, of course, a lot of myth and lore about the special magic properties of these things: people go on endlessly about how monochrome digital sensors have more effective resolution than Bayer sensors, as if anyone needs more resolution today, or how the effective sensitivity is higher, as if anyone needs more sensitivity today. The same for film (no, it’s not magic, no, it’s not somehow better than digital in any objective way), analogue recording, old FX pedals and amplifiers, and movies made on film. If you really think movies made on film are somehow better or ‘more natural’, watch The Holdovers, which was shot digitally but is a really beautiful simulacrum of what movies looked like in about 1972.

The awful truth is that none of these technologies get you anything objectively better, even if what you want to do is reproduce what people did when those technologies were all there was. You can simulate film so well it is impossible to tell the difference, you can simulate analogue tape just as well. Modern digital FX/amplification systems for guitar are a wonder. You can produce really beautiful monochrome images from colour files.

So why, really, do people do these things?

The clue, for me, is in the first one: why would anyone use a dedicated monochrome digital camera? Why would I use one if I could afford to? The answer, for me and others, is because it restricts what you can do. If your camera will only do monochrome, then that’s all you can do, and somehow that matters. I can’t afford a monochrome-only digital camera, but I do use a digital camera which has a monochrome-only workflow (this is why I bought it, in fact): the camera sets a ‘monochrome’ flag in its files which the raw-conversion tool understands, and unless you work hard you never see a colour version of the photographs you’ve taken. And this matters to me: it needs to be difficult to overcome the limitation, so I can think in monochrome2.

And of course I use film as well, and mostly black and white film with some recent excursions into colour reversal (slide) film. Partly I do this because I enjoy working in the darkroom and the tactile quality of film cameras from the 1980s and before, but more important, I think, is how limiting film is. Once you’ve taken a picture, it’s there: you can’t chimp and decide to do it again. You have 36 (or 10, or 1) exposures before you have to change film: every frame matters. Film can’t see in the dark. Mostly it can’t even see in colour (reversal film is now absurdly expensive). 35mm film is always going to have grain. Making photographs using film is a festival of limitations.

It turns out that, for many people, limitations matter. If you take them away, then suddenly you can do anything. So you spend your time fiddling with the parameters of what you can do, and doing nothing as a result, rather than being forced to pull your finger out and do something.

That’s why many people make photographs with monochrome cameras, or on film. That’s why people still make movies on film. That’s why people paint or draw, rather than using some vast digital image-creation system. That’s why people write books on typewriters, or with a pen. That’s why Colin Chapman designed cars the way he did, and why more than 160 companies have made (and still do make) replicas of one of his designs.

Not everyone: not even most people. But some people. Enough people.

And it’s not enough to simply say ‘I won’t use all the features I do not want’: those features need not to be there. Using a camera which can only make monochrome images is not like using a camera which can make colour images but choosing not to. Choosing not to chimp is not like being unable to chimp. Using a film camera is not like using a digital camera and film simulation. Drawing with a pencil on paper is not like using a drawing program but choosing only to use the pencil tool, however good that tool is. Limitations have to be, well, limitations. I don’t know why this is, but they do.

Štar

And that’s why Štar exists: because it’s limited, because it does one thing. Because there is as little syntax as we could make there be. In Štar everything is an iterator3 so any expression like

(for ((<var/s> <iterator>))
  ...)

can be turned into

(multiple-value-bind (v c) <iterator>
  (for ((<var/s> (values v c)))
    ...))

And it will mean exactly the same thing, although often it will be slower4. And all clauses are like that. There is only one case: the form on the right-hand side of a clause is a perfectly general expression evaluated, once, in the way you think it will be. There is no magic syntax, at all. And because Štar does only one thing — iterate — it forces you to use other tools to do other things, and to make sure all these tools compose well with each other. There was once a famous operating system whose designers played the same trick.

Richard Gabriel’s famous essay, usually known as Worse is better, is celebrated for describing the difference between right thing systems and worse is better systems5. It is less celebrated for another distinction it made amongst the right thing systems. That distinction is between big complex systems and diamond-like jewel systems:

The big complex system scenario goes like this:

First, the right thing needs to be designed. Then its implementation needs to be designed. Finally it is implemented. Because it is the right thing, it has nearly 100% of desired functionality, and implementation simplicity was never a concern so it takes a long time to implement. It is large and complex. It requires complex tools to use properly. The last 20% takes 80% of the effort, and so the right thing takes a long time to get out, and it only runs satisfactorily on the most sophisticated hardware.

The diamond-like jewel scenario goes like this:

The right thing takes forever to design, but it is quite small at every point along the way. To implement it to run fast is either impossible or beyond the capabilities of most implementors.

The two scenarios correspond to Common Lisp and Scheme.

Štar aspires to be the diamond-like jewel of iteration frameworks. Its interface is tiny: org.tfeb.* exports six symbols, four of which could be removed with no loss of functionality6. But it aims to be general and to be able to turn this clean, minimal syntax into fast code: apart from the catalogue of built-in iterators, almost all the rest of the interface is to the tools that let you do this for your own iterators.

It may seem odd that Štar is written in Common Lisp, not Scheme, but CL is what we actually use and so producing tools which turn CL into the system we’d like to have is what we care about. In my case, I also remember the movement to reexpress CL as a small core language combined with libraries.

Štar is not for everybody. If you like loop you probably will hate it (but you have no taste, so we don’t care). If you are an adherent of the big complex system school you probably won’t like it either (and we respect your opinion). That’s not who Štar is for: Štar is for the people who appreciate the diamond-like jewel, who believe that limitations matter, in a deep way.

If that’s not for you, that is completely fine: not everybody is the same: not everyone wants a monochrome camera, and not everyone knew, or liked if they did know, that famous operating system7. Some people even like loop, apparently. But understand that it is for some people, and try to avoid sneering at them, if you can.


  1. Well-maintained valve amplifiers hardly ever catch fire. Often they are also electrically safe. 

  2. Why a photographer or a movie maker would want to work in monochrome is another, related question: why, in 2024, intentionally throw away all the colour information in a scene? 

  3. In this way, Štar is like some Scheme compilers which turn everything into \(\lambda\) expressions and then try to make those expressions quick. 

  4. One of the things Štar’s tests do is exactly this transformation as a way of testing that iterator optimizers do not change the semantics of the program. 

  5. Entertainingly, that famous operating system is often disparaged as an example of worse is better. In terms of its implementation that may be justified: its ideas are certainly not examples of worse is better. 

  6. We have argued, at length, about this. We decided that, just as Scheme includes let and, even more extravagantly, let*, Štar should include final, for* and final*. In the language of the Scheme reports, they are library syntax (for*) and library procedures (final, final*, next* which serves no purpose without for*). 

  7. There’s a book about that. 

Joe MarshallReadability at Google

· 71 days ago

Over the years I was a developer at Google I never once tried to pass “readability” for any language. If you didn’t have “readability”, then any code you checked in would need an additional review by someone who did have “readability”. The theory was that the code base would turn into an unreadable patchwork of different styles if everyone didn’t adhere to the same set of rules. By having a “readability” requirement, the code base would be uniform and understandable. In practice, however, it was a social signal that indicated whether you were a card-carrying member of the “in” group or not. Having “readability” meant you were a nobleman. Not having it meant you were a commoner.

To get “readability” you had to endure a struggle session with a current holder of “readability”. You would submit a not insubstantial amount of code for review and the reviewer would go over it with a fine-tooth comb pointing out every little syntactic rule you broke. It was a game of gotcha designed to make the reviewer feel smarter than you. You’d then return to your office or cubicle, make changes and iterate. If you were lucky, you would only need two or three iterations before you were bestowed with “readability”. The point was not to demonstrate that you knew how to write readable code, but to demonstrate that you were willing to submit to the authority of the reviewer and follow a set of arbitrary rules, no matter how absurd. It was a hazing ritual, a way to show that you were willing to be a team player.

A problem with having “readability” was that you were expected to uphold the “readability” guidelines. Not only did you have to play the game, you had to perpetuate it. I disagreed with a lot of the guidelines because they were poorly thought-out rules for the sake of having rules. There were many situations in which they would reduce the comprehensibility of the code. By not having “readability”, I was free to write code that was easier to understand and maintain, even if it bent or broke the codified rules.

This isn’t to say that I ignored Google’s style. It is hard to understand code that switches styles every few lines. I tried to make my code fit into the surrounding code and look like it was part of it. I wrote new files with the same general layout, curly-brace conventions, and indentation that a typical Google file has. I didn’t worry about following the “readability” guidelines at all, but I didn’t flout them either. I was more concerned about my code being comprehensible to the next developer than about following a set of ad hoc rules to the letter.

So I got used to submitting my code for “readability” review, in addition to the normal code review. I off-loaded the entire responsibility of “readability” to the reviewer and let them tell me what to do. I didn’t argue with them. I simply followed the directions of the reviewer and made whatever changes were suggested. Often, though, changes to the code to more strictly follow the guidelines would result in noticeably worse code and the reviewers would soon withdraw their suggestions.

Another problem with being a nobleman with “readability” was that you had the obligation to review the code written by the commoners. The people with “readability” were a small enough group that they soon became the bottleneck in the review process. They were constantly overworked with code reviews. We commoners, on the other hand, weren’t expected to review code. Nor were we to blame if someone else was holding up a code review.

Frankly, I didn’t see any upside of having “readability” and many downsides. I was able to get my code checked in without too much trouble and I just didn’t have the time or inclination to get involved in the high-school politics of “readability”. Plenty of other companies do not have a “readability” ritual. They seem to get along just fine by hiring people who can write code that is easy to understand and maintain.

Tim BradshawŠtar: an iteration construct for Common Lisp

· 72 days ago

Štar is a concise and extensible iteration construct for Common Lisp which aims to be pleasant to use, easy to understand, fast if needed, general, and not to look like Fortran.

Common Lisp has multiple iteration constructs: mapping functions such as mapcar, special-purpose constructs such as dotimes and dolist, the general but somewhat clumsy construct which is do and do*, and finally the extended loop macro which aims to embed a ‘more friendly’ iteration language into CL and succeeds in being so complex that it is often hard to know whether a given form is legal or not without poring over loop’s grammar.

None of these constructs manage to be all three of pleasant to use, easy to understand and general. loop somehow fails to be any of these things in many cases. None are extensible1.

But Common Lisp is a Lisp, and Lisp’s huge advantage is that it is a programming language in which it is easy to write programming languages, or parts of them, like iteration constructs. That is, after all, how most or all of the existing constructs started life.

Lots of these have been written, of course. Štar tries to distinguish itself by being as simple as possible: it has as little special syntax as I could work out how to give it - there is no special little language you need to learn. It also has no inherent knowledge about how to iterate over any particular structure: it doesn’t know how to iterate over lists, or ranges of numbers. Rather it knows that iterating has to answer two questions:

  • is there more?
  • what is the next thing?

In addition it knows how to ask another question:

  • is there any information I can use to make asking the first two questions faster?

What it looks like

(for ((e (in-list l)))
  (print e))

(for (((k v) (in-hash-table h)))
  ...)

(for* ((entry (in-list entries))
       (element (in-list entry)))
  ...)

(defun in-alist (alist)
  (values
    (lambda () (not (null alist)))
    (lambda ()
      (destructuring-bind ((k . v) . more) alist
        (setf alist more)
        (values k v)))))

(for (((k v) (in-alist ...)))
  ...)

These are some simple examples: the last shows how easy it is to teach Štar to iterate over new things, or over existing things in new ways. Not shown here is that it’s also pretty easy to teach it how to optimize new iterators and to make various declarations about things.

What Štar is not

Štar is an iteration construct: what it does is to iterate. It has nothing to do with collecting values. For Štar, iteration and value accumulation are orthogonal problems which should be solved by orthogonal constructs. In particular, if you wanted to make a list of the even numbers from a list, you might do this by using Štar together with a value-collection macro:

(collecting
  (for ((e (in-list ...)))
    (when (evenp e)
      (collect e))))

This is yet another way in which Štar differs from loop and many other constructs. In particular Štar’s point of view is that mixing together iteration and value accumulation results in a system that is not very good at either.

Similarly, Štar doesn’t contain a mass of syntax letting you select only certain values, or allowing you to terminate iteration early: you don’t write

(loop for x in l while (numberp x) do ...)

Instead you write

(for ((x (in-list l)))
  (unless (numberp x) (final))
  ...)

The body of an iteration is exactly like the body of defun, except there are some local functions which you can call to skip to the next iteration or finish the iteration.

Štar, of course, doesn’t bundle some destructuring system which will inevitably be subtly incompatible with other destructuring systems while also not being usable independently. If you want destructuring, use a full-fat system of your choice.

You probably get the idea: Štar is a tool whose job is to iterate: not some leaking bag of broken abstractions.

Multiple values

One thing that Štar does do is to take multiple values seriously. A clause which specifies a list of variables will bind them to the multiple values returned by the iterator. Multiple values, unlike destructuring, are something you really have to have in the iteration construct itself.

The thing that doesn’t really matter but everyone cares about

So, OK, Štar is an extensible, general, iteration construct. Obviously it will have traded performance for all this. I mean, it’s the old Lisp story, the one Gabriel told us long ago. Right?

* Running benchmarks
** lists of length 2000, nesting 3
what                                                  seconds      ratio
star                                                   30.312      1.000
loop                                                   30.071      0.992
dolist                                                 21.594      0.712
** range 100000, nesting 2
what                                                  seconds      ratio
star/with-step                                          9.414      1.000
star/no-step                                            9.406      0.999
loop                                                   18.412      1.956
dotimes                                                 9.469      1.006

A sketch of Štar

Štar has three parts, four if you count the iterators:

  • the iteration constructs themselves and bindings they make;
  • a protocol for defining new iterators;
  • a protocol for defining optimizers for iterators;
  • a collection of predefined iterators and optimizers for them.

The first three parts are much more finished than the fourth: most of the existing iterators were written as proofs of concept and may well change, get better, or go away.

Iterators

These are forms (usually, named function calls) which return two values, both functions of no arguments:

  • the first value is called and should return true if there is more to do;
  • the second value is called to return the next value or values of the iterator.

These functions answer the first two questions Štar needs to ask: is there more, and if there is, what is it? These two functions obviously share state in general.
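
As an illustration of this protocol, here is a sketch of a hypothetical in-lines iterator (not one of Štar’s provided iterators) that walks the lines of a stream. The first closure answers “is there more?” by reading one line ahead; the second returns that line and refills the lookahead; the two closures share this state:

(defun in-lines (stream)
  ;; Illustration only: the shared state is the one-line lookahead NEXT.
  (let ((next (read-line stream nil nil)))
    (values
     (lambda () (not (null next)))            ;is there more?
     (lambda ()                               ;what is the next thing?
       (prog1 next
         (setf next (read-line stream nil nil)))))))

(for ((line (in-lines *standard-input*)))
  (print line))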

To answer the third question - how do I make things faster? - named iterator functions can have optimizers: these are functions called by Štar at macroexpansion time which tell it how to make things faster. It’s up to an optimizer to ensure the semantics are the same for the optimized and unoptimized versions. Optimizers can specify a set of bindings to make, declarations for them, how to iterate, and some other things.

It’s possible to install and remove optimizers, and to dynamically bind sets of them. This might be useful, for instance, to compile a file where some particular assumptions (‘all vectors are vectors of floats’) are true.

Iteration constructs

There are two:

  • for iterates in parallel;
  • for* defines nested loops: (for* ((...) (...)) ...) is like (for ((...)) (for ((...)) ...)).

Note that because of the way iterators work, sequential binding within one loop makes no sense.

The first argument of for or for* specifies a number of clauses: each clause is of the form (<var/s> <iterator>). Multiple values are supported, and it is possible to make various declarations about variables: this matters for for*, where there is no room for declarations for any clause other than the last (innermost) one. Variables whose name is "_" are ignored by default.

Within the body of an iteration there are four local functions:

  • next skips to the next iteration;
  • next* skips to the next outer-level iteration for for* and is the same as next for for;
  • final returns zero or more values from the current iteration;
  • final* returns zero or more values for the outer-level iteration for for* and is the same as final for for.
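
For example, a small made-up fragment using these local functions with the in-list iterator: non-numbers are skipped with next, and the loop stops at the first negative number with final.

(for ((x (in-list '(1 :skip 2 -3 4))))
  (unless (numberp x) (next))
  (when (minusp x) (final))
  (print x))              ;prints 1 and 2, then stops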

Provided iterators

There are iterators which work for lists, vectors, general sequences, hash tables, ranges of reals, as well as some more interesting ones. Not all iterators currently have optimizers. You only need to care about writing optimizers if you need to make iterators very fast: they can be significantly fiddly to write.

The iterator over ranges of reals has an optimizer which tries hard to make things pretty fast.

As well as this there are some more interesting iterators. An example is sequentially:

(for ((a (sequentially 1 2)))
  ...)

will bind a sequentially to 1 and 2. But this iterator is a macro which has the behaviour of a FEXPR, so it evaluates its arguments only when needed (and as many times as needed). A variant of sequentially is sequentially* which ‘sticks’ on its last argument. So for instance:

> (let ((v 0))
    (for ((a (sequentially* (incf v)))
          (_ (in-range 3)))
      (print a)))

1 
2 
3 

Another way to write this is:

> (for ((a (let ((v 0)) (sequentially* (incf v))))
        (_ (in-range 4)))
    (print a))

1 
2 
3 
4 

And of course there is a meta-iterator (also implemented as a macro), in-iterators:

> (for ((a (in-iterators
            (in-list '(1 2))
            (in-list '(3 4)))))
    (print a))

1 
2 
3 
4 

It’s possible to construct very general iterators with tools like this.

As I said above, Štar’s current iterators are in a fairly rough state: a lot of this might change.


Notes

Declarations

A form like

(for* ((a (in-range 10))
       (b (in-range 10)))
  ...)

corresponds roughly to

(for ((a (in-range 10)))
  (for ((b (in-range 10)))
    ...))

The problem is now how to make declarations which apply to a. This is hard because CL doesn’t provide the tools you need to know whether a declaration refers to a variable or not: to know whether (declare (foo a)) refers to a or not you need to know, at least, whether foo names a type, which you can’t do. You often can guess, but not always.

So, rather than trying to solve an intractable problem, Štar lets you specify some properties of a variable in the clause that binds it: you can say

(for* (((a :type fixnum) (in-range 10))
       (b (in-range 10)))
  ...)

for instance. This is a bit ugly, but it solves the problem. It is only useful for for*.

Binding

Štar binds, rather than assigns:

(collecting
  (for ((a (in-list '(1 2))))
    (collect
      (lambda (&optional (v nil vp))
        (if vp (setf a v) a)))))

will return two closures over independent bindings.
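
To make the point concrete, here is a sketch of what one would expect at the REPL, assuming (as described above) that the closures come back in iteration order and each captures its own binding of a:

(let ((fns (collecting
             (for ((a (in-list '(1 2))))
               (collect (lambda (&optional (v nil vp))
                          (if vp (setf a v) a)))))))
  (funcall (first fns) 10)          ;set the a captured by the first closure
  (list (funcall (first fns))
        (funcall (second fns))))    ;=> (10 2): the bindings are independent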

Epilogue

Štar’s source code is here. The manual is included with the source code but also here.

Štar is pronounced roughly ‘shtar’.

Much of the inspiration for Štar came from my friend Zyni: thanks to her for the inspiration behind it, for actually making me write it, and for many other things.

Štar is dedicated to her, and to Ian Anderson.


  1. Some implementations have mechanisms for extending loop. 

Paolo AmorosoInsphex, a hex dump tool in Medley Common Lisp

· 76 days ago

I'm developing the new program Insphex (inspect hex), a hex dump tool created with, and running on, the Medley Interlisp environment.

Similarly to the Linux command hexdump, it shows the contents of files as hexadecimal values and the corresponding ASCII characters. An early version of the program prints the hex dump to the standard output like this.

Output of the Insphex hex dump tool for Medley Interlisp.
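
To give an idea of the output format, here is a minimal sketch of such a dump loop in portable Common Lisp. This is only an illustration, not Insphex's code; the function name and the exact layout are made up.

(defun dump-hex (pathname &optional (stream *standard-output*))
  ;; Illustration only: 16 bytes per line, offset and bytes in hex,
  ;; then the printable-ASCII column, with "." for other bytes.
  (with-open-file (in pathname :element-type '(unsigned-byte 8))
    (loop with buffer = (make-array 16 :element-type '(unsigned-byte 8))
          for offset from 0 by 16
          for count = (read-sequence buffer in)
          until (zerop count)
          do (format stream "~8,'0X  " offset)
             (dotimes (i 16)
               (if (< i count)
                   (format stream "~2,'0X " (aref buffer i))
                   (write-string "   " stream)))
             (write-string " " stream)
             (dotimes (i count)
               (let ((byte (aref buffer i)))
                 (write-char (if (<= 32 byte 126) (code-char byte) #\.)
                             stream)))
             (terpri stream))))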

I plan to enhance Insphex to optionally display the dump in a separate window one page at a time. An attached menu will have options for showing the next page and exiting. I'll also provide an Exec command for running the program.

The code is in Common Lisp but will include some Interlisp to access the required system functionality.

Although Insphex is useful in itself, I have three main goals for it. First, I want a real project to practice the process for writing Common Lisp with the residential environment of Medley. This is the native way of coding on Medley and takes full advantage of its development environment and features such as the File Manager and the SEdit editor.

Most Medley tools and facilities are written in Interlisp or expose Interlisp APIs through which the functionality can be invoked. So another goal is to interface with Interlisp from Common Lisp to access the functionality I need like windows and menus.

My third goal is to experiment with displaying textual output in TEdit, the Medley word processor where the hex dump will optionally go.

Although the Interlisp API of TEdit supports advanced editing and formatting, Insphex does only basic text output. The primary feature I want is TEdit's ability to automatically handle repainting the window after it's resized or a hidden portion is exposed. This is handy because, by default, Interlisp windows mostly don't handle repainting themselves.

Now that the basic functionality of Insphex is in place I will implement displaying the hex dump in a TEdit window.

#insphex #CommonLisp #Interlisp #Lisp


Joe MarshallThe Way of Lisp or The Right Thing

· 77 days ago
Interpreting Richard Gabriel with a nod to Tim Peters

Keep it simple.
A simple interface is better than a simple implementation.
Complete is better than simple.
Consistent is better than complete.
Correctness trumps all. Incorrectness is simply not allowed.
If the interface is hard to use, it is probably a bad idea.
If the interface is easy to use, it may be a good idea.
A weak language makes it hard to write good code, but a powerful language makes it easy to write bad code that looks good.
The first thing that comes to mind is not always the best thing.
When performance matters, get your declarations right.
There is always another way to do it, and maybe a better one.
Lists aren’t the only data type.
The right thing and 2 shillings will get you a cup of tea.


For older items, see the Planet Lisp Archives.


Last updated: 2024-07-16 05:11