Lissp gotchas

gilch edited this page Dec 29, 2024 · 59 revisions

docstrings

  • Lambda forms don't have docstrings (because Python lambdas don't either). You'd need to set the .__doc__ attribute. Use defun (or fun) instead.
  • deftypeonce has no special case for docstrings. Set the .__doc__ attribute if you want one. (define Foo.__doc__ "a docstring") works, or use something like X#(attach X : __doc__ "a docstring") as a decorator.
  • Module-level docstrings work the same as Python: if the first top-level statement is a string literal, it's the module's docstring. Anything that Hissp would compile to that works. Setting the __doc__ global also works and may be more useful for metaprogramming because it doesn't have to come first in the compiled output, although it's best if the docstring is easy to find in the source code as well.
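At the Python level, the three cases above behave roughly as follows. This is a minimal sketch in plain Python, not Hissp compiler output; `square`, `Foo`, and the two module source strings are made-up names for illustration.

```python
# Plain-Python illustration; every name here is hypothetical.

# 1. A lambda has no docstring until you set .__doc__ yourself.
square = lambda x: x * x
doc_before = square.__doc__
square.__doc__ = "Return x squared."

# 2. A class created without a docstring can be given one afterwards.
Foo = type("Foo", (), {})
Foo.__doc__ = "a docstring"

# 3. Module docstrings: a leading string literal vs. assigning __doc__.
literal_first = '"""Found by position."""\nx = 1\n'
assigned_later = 'x = 1\n__doc__ = "Set by assignment."\n'

ns_literal, ns_assigned = {}, {}
# optimize=0 keeps docstrings even if the interpreter was started with -OO.
exec(compile(literal_first, "<literal_first>", "exec", optimize=0), ns_literal)
exec(compile(assigned_later, "<assigned_later>", "exec", optimize=0), ns_assigned)
```

Both module namespaces end up with a `__doc__` entry; only the first depends on the string coming first in the source.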

Hebigo's class and function macros are designed to work more like Python's statements and may include docstrings.

munging

Targets are not evaluated in the case macro, similar to a quote form. If you need a str target, you can use a symbol if it's an identifier. If it's not, you can use a || token or inject (.#) any read-time string expression.

It's idiomatic to use a symbol in place of a lambda's parameter tuple when only single-character positional parameters are desired. This works because symbols happen to compile to strs which are also iterables of (character) strs. But if anything munges, you'll get more parameters than you asked for. Using || tokens instead wouldn't help, since without munging, special characters are not valid identifiers in Python.
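The parameter surprise can be seen from the Python side alone. The sketch below only iterates over strs; it does not call the Hissp compiler. The munged spelling `xQzPLUS_` is an assumption for a symbol like `x+`, based on the `QzPLUS_` spelling of `+` that appears elsewhere on this page.

```python
# A str is an iterable of single-character strs.
single_chars = tuple("xyz")

# If the symbol munges, the str is longer than it looked in the source.
# Assumed munged form of the symbol x+ (for illustration only).
munged = "xQzPLUS_"
surprise = tuple(munged)
```

Here `single_chars` has the three names you asked for, while `surprise` has one name per character of the munged spelling.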

tuple and str are special

Most types in Hissp represent themselves, but the tuple and str atom types have special evaluation rules. You can't just use them literally, but they can be quoted as data. Tuples are invocation forms and strs are raw Python code (with some preprocessing).

there is only str

There are no separate string/symbol/keyword types like in other Lisps, because Python only uses strings for these concepts. Although Lissp has a notation for each of these, Hissp proper only has str atoms, and all three token types get read as those, though with differing rules. (But the meaning of a str atom in Hissp depends on its contents and context.) The distinctions among control (:-), symbol, and Unicode ("-") tokens are reader-level concepts. They exist in Lissp, but not Hissp proper. A quoted Lissp string literal (like '"foo") gives you a str atom containing the Python code for a Python string literal (like "('foo')").

This is surprising if you expect Lissp Unicode tokens to represent themselves like the other atoms, as their equivalents typically would in other Lisps. Remember, fragment tokens are spelled with |-| in Lissp, not "-", which is (approximately) a shorthand for |('-')|. You have to use inject (.#) on a Lissp Unicode token to make it behave like a fragment token. If it happens to be a valid identifier, you could use a symbol token instead; otherwise it would munge. This also applies to recursively quoted tuples containing Lissp string literals. E.g. '("spam" "ham" "eggs") compiles to (the equivalent of) ("('spam')", "('ham')", "('eggs')"), not ('spam', 'ham', 'eggs') as one might expect. But `(,"spam" .#"ham" |eggs|) or '(spam \h\a\m |eggs|) would.
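The distinction between a str atom that contains Python code for a string literal and a str atom that is the text itself can be checked in plain Python. This sketch uses the values quoted in the paragraph above; the variable names are hypothetical, and `ast.literal_eval` merely stands in for the evaluation step.

```python
import ast

atom_from_unicode_token = "('foo')"   # what '"foo" reads as, per the text
atom_from_fragment_token = "foo"      # what '|foo| reads as

# Evaluating the code inside the first atom yields the second.
value = ast.literal_eval(atom_from_unicode_token)

# The quoted tuple holds code strings, not the strings they denote.
quoted = ("('spam')", "('ham')", "('eggs')")
evaluated = tuple(ast.literal_eval(code) for code in quoted)
```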

Control tokens (:-) read as str atoms that happen to begin with :, which can be meaningful at the Hissp level in certain contexts. If you're used to keywords in Common Lisp, you might expect something like :|foo bar| to work, since the | characters just \-escape characters in a range between them. But in Lissp, the | characters are delimiters, like " characters are, so the whole fragment must begin and end with them. Therefore, correct equivalent spellings in Lissp would be |:foo bar| or :foo\ bar, which would each read as a ':foo bar' str atom.

subscription tag

The [## tag is optimized for numeric indexing and slicing. It works well. Although it's possible to write any Python code that begins with that prefix, it has to demunge a single str atom to do it. This makes it a poor choice for expressions containing Python string literals or munged symbols.

E.g., [##'foo'] seems natural, but tokenizes as the tagging tokens [##, and ', and an atom foo']. Correct spellings include [##\'foo'] or [##|'foo']|, which tokenize properly. While acceptable style, a !##"foo" may be more natural in this case.

Similarly, most Python identifiers work, but something like [##+] would result in invalid Python, even if + is defined in scope. You'd instead have to use the munged name, like [##QzPLUS_], but even that doesn't work because there's a demunging step, which takes us back to where we started. Instead, a correct spelling double-munges, like [##QzPLUSQzLOWxLINE_]. The QzPLUS isn't recognized as a munged name, the QzLOWxLINE_ demunges to _, and demunging is not recursive, so this results in the desired QzPLUS_ identifier.
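The single-pass behavior can be modeled with a toy demunger. This is not Hissp's actual implementation: `TOY_TABLE`, `TOY_PATTERN`, and `toy_demunge` are hypothetical names, and the table covers only the two munged spellings mentioned above.

```python
import re

# Toy two-entry table; the real munger covers far more characters.
TOY_TABLE = {"QzPLUS_": "+", "QzLOWxLINE_": "_"}
TOY_PATTERN = re.compile("|".join(map(re.escape, TOY_TABLE)))


def toy_demunge(name):
    """Replace each munged chunk once, left to right, without rescanning."""
    return TOY_PATTERN.sub(lambda match: TOY_TABLE[match.group()], name)


once = toy_demunge("QzPLUS_")              # the munged name is undone
double = toy_demunge("QzPLUSQzLOWxLINE_")  # only the trailing chunk is undone
```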

Don't do this. You almost never have to write munged names yourself, and certainly writing a double-munged name is a bad sign. They're meant to be human-readable, but aren't intended to be human-written. Use something like !##+ instead. You can also use the slice builtin and -> if you need chained lookups.
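For the slice alternative, the Python-level equivalence is sketched below, assuming item lookup ultimately goes through Python's ordinary subscription. The data and variable names are made up, and the loop only imitates a left-to-right chain of lookups; it is not what -> expands to.

```python
import operator

data = "abcdefgh"
by_notation = data[1::2]
by_builtin = operator.getitem(data, slice(1, None, 2))

# Chained lookups, written as a left-to-right sequence of steps.
nested = {"rows": [[1, 2, 3], [4, 5, 6]]}
chained = nested["rows"][1][::-1]
step_by_step = nested
for key in ("rows", 1, slice(None, None, -1)):
    step_by_step = operator.getitem(step_by_step, key)
```

Because `slice` objects are ordinary values, no string of Python code has to survive tokenizing or demunging.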

lookup tags in -<>>

The single-argument forms of !#, @#, and [# were designed to write incomplete code that works in ->, not in -<>>. In a -<>> context, use the normal two-argument form with an explicit :<> argument instead.

Similarly, nesting a -<>> inside -> to move the thread point for one step just works, but to do the reverse, you'd need to use an explicit :<> as the first argument to ->. Although this kind of thing could happen in macro expansions, it usually doesn't make sense to write it this way yourself for only one step, because we could use :<> directly instead.

pickles

Using types literally that have no literal representation in Python will result in a pickle expression. Unpicklable types will fail to compile at all. This doesn't usually happen accidentally. Types with no representation are usually compiled to the code to construct them, rather than the objects themselves. However, they can appear in the Hissp directly in two ways, either as the result of an injection, or as the result of a macro expansion.
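In plain Python terms, a pickle expression amounts to code that unpickles a bytes literal. The sketch below builds one by hand for a `Fraction`, which has no literal notation in Python; the exact text the Hissp compiler emits may differ, and `try_pickle` is a hypothetical helper.

```python
import pickle
from fractions import Fraction

half = Fraction(1, 2)

# Hand-built stand-in for a pickle expression (exact compiler output may differ).
pickle_expression = f"__import__('pickle').loads({pickle.dumps(half)!r})"
rebuilt = eval(pickle_expression)


def try_pickle(obj):
    """Return True if obj can be pickled, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Generators are one example of an unpicklable type.
generator_ok = try_pickle(n for n in range(3))
```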

If your use case allows adding dependencies and you need Hissp to be able to pickle more types, consider adding Dill. You can use

(setattr hissp.compiler. 'pickle dill.)

or (in Python)

import dill, hissp; hissp.compiler.pickle = dill

to enable Dill in the compiler.

comments

Comments are not simply ignored by the reader. They're parsed objects of a type that gets dropped by the reader by default. But they can still be arguments for tagging tokens before that happens. If you put a reader tag on its own line, take care not to accidentally add a line comment after it.

prelude

The prelude will replace your _macro_ object if you had one.

auto-qualification

`foo expands to something like __main__..foo. This usually does what you want when using template quotes to make code. An undesired qualification is a sign that it's a local identifier and you should use a gensym or that it's an intentional anaphor and you should explicitly suppress the qualification.

When using template quotes to make data not intended as code, you don't want qualification at all, and you should just unquote each thing. en#tuple or @ are often viable alternatives.

don't import _macro_ from hissp

It may seem natural to try something like (.update (globals) : _macro_ hissp.._macro_). You probably don't want to do this.

You'll lose the standalone property, meaning Hissp must be installed to run the compiled output, and worse, if you do this in multiple modules, their defmacros will leak through the shared mutable _macro_ object, so you'll also lose modularity. It's like assigning attributes to builtins; while there are legitimate uses, you probably don't need to do it, and it can get you into trouble, especially if everyone is doing it.

Use prelude or alias instead. They don't have these problems. If you take care to copy the contents of _macro_, rather than the object itself, you keep modularity. The prelude does this. If you avoid crashing when hissp isn't installed, then you keep the standalone property. The prelude also does this. So a proper full _macro_ import might look something like,

(hissp.._macro_.when (importlib.util..find_spec 'hissp)
  (.update (globals) : _macro_ (copy..copy hissp.._macro_)))

The above is only for illustration. You wouldn't write it this way yourself (although you might write a custom prelude that expands to something like it). A more idiomatic approach would be to inject, which simplifies things.

.#(.update (globals) : _macro_ (copy..copy hissp.._macro_))

Beware of side effects in injections, but this would at least be idempotent. .update always returns None, so nothing gets compiled in and nothing happens at run time, but the macros do get added as a side effect of the form being read, making them available during compilation.

For sharing macros among your own modules in a larger project, you probably just want alias, but it is possible to use the fully-qualified names to pick out individual macro functions and attach them to your module's _macro_ object.
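The difference between sharing the _macro_ object and copying its contents, as discussed earlier in this section, can be seen with a plain types.SimpleNamespace. This sketch does not import hissp; the namespaces and the placeholder strings standing in for macro functions are hypothetical.

```python
import copy
from types import SimpleNamespace

# Stands in for a library's _macro_ namespace.
shared = SimpleNamespace(when="<library macro>")

# Two modules that take the object itself share one mutable namespace.
module_a_macros = shared
module_b_macros = shared
module_a_macros.my_macro = "<defined in module a>"
leaked = hasattr(module_b_macros, "my_macro")

# A shallow copy has the same contents but is a separate namespace.
library = SimpleNamespace(when="<library macro>")
module_c_macros = copy.copy(library)
module_c_macros.my_macro = "<defined in module c>"
isolated = not hasattr(library, "my_macro")
```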

reader tags end in #

E.g. _# discards in Lissp, but #_ discards in EDN and Clojure.

Typically, in other Lisps, reader macros use a leading dispatch character (e.g. '), with # used to expand the repertoire without wasting another character by dispatching on the next one, which makes # the leading character in the majority of cases. Lissp is tokenized by a backtracking regex and instead has the reader tag tokens end in a #. This is a bit more compatible with generic Lisp editors, which often don't know what to do with reader macro syntax or do the wrong thing. They assume the Lissp tags are normal symbols, which is pretty much the correct behavior, although this can still confuse structural editing or automatic indents when in the function position. Lissp tags process the next parsed object (like EDN tags), not the raw character stream like Common Lisp reader macros. They can also often be applied to their argument without separating whitespace.

the progn optimization

A zero-argument lambda called immediately is replaced with its body. If you're expecting a new stack frame or new local scope in this situation, you'll be confused. Python is able to introspect the call stack, but this isn't often done. Standard Hissp only introduces new locals via the lambda form, and the optimization only kicks in if there aren't any, so it doesn't usually cause problems. However, non-standard assignments could be made with the walrus (:=) operator. This isn't recommended in Hissp code because many macros are defined in terms of lambda, which can make the scope confusing.

It is possible to work around the optimization if you absolutely need a new empty scope, by assigning the lambda and calling by name, or by defining the params in a way the compiler wouldn't recognize, for example,

#> ((lambda () 1 2 3)) ; Optimized.
>>> ((1),
...  (2),
...  (3))  [-1]
3
#> ((lambda | | 1 2 3)) ; Not recognized. Note the space.
>>> (lambda  :
...    ((1),
...     (2),
...     (3))  [-1]
... )()
3
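The scope difference matters once a walrus assignment is involved. The sketch below evaluates two hand-written Python expressions that resemble the optimized and unoptimized shapes above; the variable names are hypothetical, and the strings are not actual compiler output.

```python
# Shape of the optimized form: the body runs in the enclosing scope.
inlined = "((leaked := 1), leaked + 1)[-1]"
# Shape of the unoptimized form: the body runs in the lambda's own scope.
wrapped = "(lambda: ((hidden := 1), hidden + 1)[-1])()"

scope = {}
inlined_result = eval(inlined, scope)
wrapped_result = eval(wrapped, scope)

leaked_into_scope = "leaked" in scope
hidden_in_scope = "hidden" in scope
```

Both expressions produce the same value, but only the inlined one leaves a new name behind in the enclosing namespace.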

useless quirks

These are unlikely to cause problems but should be noted somewhere.

The _macro_ namespace should only contain the callable objects used in macro expansion. If you set an attribute of _macro_ to None and try to expand it as a macro, the compiler will act as if it's not there and compile it like a function call instead. Setting the attribute to any other non-callable will crash when the compiler tries to call it.

Most namespace types you'd make a _macro_ with aren't clean to start with, including the types.SimpleNamespace used by prelude and defmacro. They have dunder names like __doc__ or __class__, etc. If you attempt to call any of these names, even if there's a callable local in scope, macro invocations take precedence, and the compiler will try to call these objects when compiling your code. Fortunately, this means the mistake would be caught early in most cases, because calling a non-callable would immediately crash.

It's rare that you'd ever use a dunder name for a local callable. (Names accessed as attributes are safe.) But if you need to call one anyway, a simple workaround is to keep it out of the invocation position, like ((-> __class__)) instead of (__class__). The former would call your variable, while the latter would call your _macro_ namespace's class (probably resulting in a SimpleNamespace pickle, since classes are actually callable).

It is possible to implement a namespace more cleanly in Python, for example, by using a class with __getattribute__ to reject access to dunder names. However, Hissp's standalone property means it doesn't rely on compiling in any library code, so this isn't an option for Hissp itself. You could still write one yourself if you want.
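A minimal sketch of such a namespace follows, assuming macro lookup uses ordinary attribute access on the _macro_ object. `CleanNamespace` is a hypothetical class, not part of Hissp, and the `when` entry is a placeholder callable.

```python
from types import SimpleNamespace

# A fresh SimpleNamespace already answers to dunder names.
plain = SimpleNamespace()
plain_has_class = hasattr(plain, "__class__")


class CleanNamespace:
    """Hypothetical namespace that hides dunder names from getattr()."""

    def __init__(self, **macros):
        # Bypass our own __getattribute__ to reach the instance dict.
        object.__getattribute__(self, "__dict__").update(macros)

    def __getattribute__(self, name):
        if name.startswith("__") and name.endswith("__"):
            raise AttributeError(name)
        return object.__getattribute__(self, name)


clean = CleanNamespace(when=lambda *args: "expanded")
clean.unless = lambda *args: "also expanded"  # normal assignment still works
```

Hiding dunder names this way also hides them from any tool that reads them through getattr(), so it is a trade-off rather than a free fix.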