#lang scribble/sigplan @nocopyright @preprint
@;-*- Scheme -*-
@(require scribble/base
scriblib/autobib scriblib/footnote
scribble/decode scribble/core scribble/manual-struct scribble/decode-struct
scribble/html-properties scribble/tag
(only-in scribble/core style)
"utils.rkt" "bibliography.scrbl")
@authorinfo["François-René Rideau" "Google" "[email protected]"]
@conferenceinfo["ELS 2014" "May 5--6, Paris, France."]
@copyrightyear{2014}
@(title
(cons @(ASDF3)
(cons ", or Why Lisp is Now an Acceptable Scripting Language"
(if (extended?) (list (linebreak) @smaller{@smaller{(Extended version)}}) '()))))
@;TODO: update abstract and/or introduction to explain who are the many publics
@; and the things to be found in the article
@; motivate changes to ASDF.
@; nix footnotes
@; separate threads
@abstract{
@(ASDF), the @(de_facto) standard build system for @(CL),
has been vastly improved between
@(extended-only "2009") @(short-only "2012") and 2014.
These and other improvements finally bring @(CL) up to par
with "scripting languages"
in terms of ease of writing and @emph{deploying} portable code
that can access and "glue" together
functionality from the underlying system or external programs.
"Scripts" can thus be written in @(CL), and take advantage
of its expressive power, well-defined semantics, and efficient implementations.
We describe the most salient improvements in @(ASDF) @(short-only "3")
and how they enable previously difficult and portably impossible
uses of the programming language.
We discuss past and future challenges
in improving this key piece of software infrastructure,
and what approaches did or didn't work
in bringing change to the @(CL) community.
}
@section[#:style (make-style 'introduction '(unnumbered))]{Introduction}
As of 2013, one can use @(CL) (CL)@extended-only{@note{
@(CL), often abbreviated CL,
is a language defined in the ANSI standard X3.226-1994
by technical committee X3J13.
It's a multi-paradigm, dynamically-typed high-level language.
Though it's known for its decent support for functional programming,
its support for Object-Oriented Programming
is actually what still remains unsurpassed in many ways;
also, few languages even attempt to match either
its syntactic extensibility or its support for interactive development.
It was explicitly designed to allow for high-performance implementations;
some of them, depending on the application,
may rival compiled C programs in terms of speed,
usually far ahead of "scripting" languages and their implementations.
There are over a dozen maintained or unmaintained CL implementations.
No single one is at the same time
the best, shiniest, leanest, fastest, cheapest,
and the one ported to the most platforms.
For instance, SBCL is quite popular for its runtime speed
on Intel-compatible Linux machines;
but it's slower at compiling and loading,
won't run on ARM, and doesn't have the best Windows support;
and so depending on your constraints, you might prefer
Clozure CL, ECL, CLISP or ABCL.
Or you might desire the technical support or additional libraries
from a proprietary implementation.
While it's possible to write useful programs
using only the standardized parts of the language,
fully taking advantage of extant libraries
that harness modern hardware and software techniques
requires the use of various extensions.
Happily, each implementation provides its own extensions
and there exist libraries to abstract over
the discrepancies between these implementations
and provide portable access to threads (@cl{bordeaux-threads}),
Unicode support (@cl{cl-unicode}),
a "foreign function interface" to libraries written in C (@cl{cffi}),
ML-style pattern-matching (@cl{optima}), etc.
A software distribution system, @(Quicklisp),
makes it easy to install hundreds of libraries that use @(ASDF).
The new features in @(ASDF3) were only the last missing pieces in this puzzle.
}}
to @emph{portably} write the programs
for which one traditionally uses so-called "scripting" languages:
one can write small scripts that glue together functionality provided
by the operating system (OS), external programs, C libraries, or network services;
one can scale them into large, maintainable and modular systems;
and one can make those new services available to other programs via the command-line
as well as via network protocols, etc.
The last barrier to making that possible
was the lack of a portable way to build and deploy code
so that the same script can run @emph{unmodified} for many users
on one or many machines using one or many different compilers.
This was solved by @(ASDF3).
@(ASDF) has been the @(de_facto) standard build system
for portable CL software since shortly after its release
by Dan Barlow in 2002 @~cite[ASDF-Manual].
@moneyquote{The purpose of a build system is
to enable division of labor in software development}:
source code is organized in separately-developed components
that depend on other components,
and the build system transforms the transitive closure of these components
into a working program.
@(ASDF3) is the latest rewrite of the system.
Aside from fixing numerous bugs,
it sports a new portability layer@extended-only{ @(UIOP)}.
One can now use @(ASDF) to write Lisp programs
that may be invoked from the command line
or may spawn external programs and capture their output.
@(ASDF) can deliver these programs as standalone executable files;
moreover the companion script @cl{cl-launch} (see @secref{cl-launch})
can create light-weight scripts that can be run unmodified
on many different kinds of machines, each differently configured.
These features make portable scripting possible.
Previously, key parts of a program had to be configured to match
one's specific CL implementation, OS, and software installation paths.
Now, all of one's usual scripting needs can be entirely fulfilled using CL,
benefitting from its efficient implementations, hundreds of software libraries, etc.
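
As a minimal sketch of such a script, assuming that @(ASDF3) and its portability layer are already loaded on a Unix-like machine,
the following captures the output of an external program and inspects the command line.
The function names @cl{kernel-name} and @cl{main} and the choice of the @tt{uname} command
are illustrative assumptions, not taken from any existing program.

@clcode{
;; Hypothetical example: capture the output of an external program.
;; Assumes a Unix-like system where the uname command is available.
(defun kernel-name ()
  "Run uname -s and return its output as a string, trailing newline stripped."
  (uiop:run-program '("uname" "-s") :output '(:string :stripped t)))

(defun main ()
  "Entry point: print the kernel name, then any command-line arguments."
  (format t "Running on ~A~%" (kernel-name))
  (format t "Arguments: ~S~%" (uiop:command-line-arguments)))
}

A function such as @cl{main} could then presumably serve as the entry point
of a standalone executable or of a script started by @cl{cl-launch}.
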
In this article, we discuss how the innovations in @(ASDF3)
enable new kinds of software development in CL.
In @secref{what_it_is}, we explain what @(ASDF) is about;
we compare it to common practice in the C world@;
@extended-only{; this section does not require previous knowledge of CL}.
@;
In @secref{asdf3},
we describe the improvements introduced in @(ASDF3)
and @(ASDF3.1) to solve the problem of software delivery;
this section requires some familiarity with CL@extended-only{
though some of its findings are independent from CL;
for a historical perspective you may want to start with appendices A to F below
before reading this section}.
@;
In @secref{evolving}, we discuss the challenges
of evolving a piece of community software,
concluding with lessons learned from our experience@extended-only{;
these lessons are of general interest to software programmers
though the specifics are related to CL}.
@short-only{
This is the short version of this article.
It sometimes refers to appendices
present only in the extended version@~cite[ASDF3-2014],
that also includes a few additional examples and footnotes.
}
@extended-only{
This is the extended version of this article.
In addition to extra footnotes and examples,
it includes several appendices with historical information
about the evolution of @(ASDF) before @(ASDF3).
Here again, the specifics will only interest CL programmers,
but some of the lessons are of general interest
to all software practitioners.
Roughly in chronological order, we have
the initial successful experiment in @secref{asdf1};
how it became robust and usable in @secref{asdf2};
the abyss of madness it had to bridge in @secref{pathnames};
improvements in expressiveness in @secref{asdf2.26};
various failures in @secref{failures};
and the bug that required rewriting it all over again in @secref{traverse}.
All versions of this article are available at @url{http://fare.tunes.org/files/asdf3/}:
@hyperlink["http://fare.tunes.org/files/asdf3/asdf3-2014.html"]{extended HTML},
@hyperlink["http://fare.tunes.org/files/asdf3/asdf3-2014.pdf"]{extended PDF},
@hyperlink["http://fare.tunes.org/files/asdf3/asdf3-els2014.html"]{short HTML},
@hyperlink["http://fare.tunes.org/files/asdf3/asdf3-els2014.pdf"]{short PDF}
(the latter was submitted to
@hyperlink["http://www.european-lisp-symposium.org/"]{ELS 2014}).
}
@section[#:tag "what_it_is"]{What @(ASDF) is}
@subsection{@(ASDF): Basic Concepts}
@subsubsection{Components}
@(ASDF) is a build system for CL:
it helps developers divide software into a hierarchy of @bydef{component}s
and automatically generates a working program from all the source code.
Top components are called @bydef{system}s in an age-old Lisp tradition,
while the bottom ones are source @bydef{file}s, typically written in CL.
In between, there may be a recursive hierarchy of @bydef{module}s@extended-only{
that may contain files or other modules
and may or may not map to subdirectories}.
Users may then @(operate) on these components with various build @bydef{operation}s,
most prominently compiling the source code (operation @(compile-op)) and
loading the output into the current Lisp image (operation @(load-op)).
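
For illustration, here is how a user might invoke these operations at the Lisp prompt.
This sketch assumes that @(ASDF) is loaded and that the @cl{fare-quasiquote} system,
used as an example later in this article, can be found on the machine.

@clcode{
;; Generic entry point: perform an operation on a component.
(asdf:operate 'asdf:load-op "fare-quasiquote")

;; Convenience wrappers for the most common operations.
(asdf:compile-system "fare-quasiquote") ; compile-op
(asdf:load-system "fare-quasiquote")    ; load-op
}
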
Several related systems may be developed together
in the same source code @bydef{project}.
Each system may depend on code from other systems,
either from the same project or from a different project.
@(ASDF) itself has no notion of projects,
but other tools on top of @(ASDF) do:
@(Quicklisp) @~cite[quicklisp] packages together
systems from a project into a @bydef{release},
and provides hundreds of releases as a @bydef{distribution},
automatically downloading on demand
required systems and all their transitive dependencies.
Further, each component may explicitly declare
a @bydef{dependency} on other components:
whenever compiling or loading a component@;
@extended-only{(as contrasted with running it)}
relies on declarations or definitions
of packages, macros, variables, classes, functions, etc.,
present in another component, the programmer must
declare that the former component @(depends-on) the latter.
@subsubsection{Example System Definition@extended-only{s}}
Below is how the @cl{fare-quasiquote} system is defined (with elisions)
in a file @tt{fare-quasiquote.asd}.
It contains three files, @tt{packages},
@tt{quasiquote} and @tt{pp-quasiquote}
(the @tt{.lisp} suffix is automatically added based on the component class;
see @appref["pathnames"]{Appendix C}).
The latter two files each depend, directly or transitively, on the first file,
because that file defines the CL packages@note{
Packages are namespaces that contain symbols;
they need to be created before the symbols they contain
may even be read as valid syntax.@;
@extended-only{
Each CL process has a global flat two-level namespace:
symbols, named by strings, live in packages, also named by strings;
symbols are read in the current @cl{*package*},
but the package may be overridden with colon-separated prefix, as in
@cl{other-package:some-symbol}.
However, this namespace isn't global across images:
packages can import symbols from other packages,
but a symbol keeps the same name in all packages and knows its "home" package;
different CL processes running different code bases
may thus have a different set of packages,
where symbols have different home packages;
printing symbols on one system and reading them on another may fail
or may lead to subtle bugs.
}}:
@clcode{
(defsystem "fare-quasiquote" ...
:depends-on ("fare-utils")
:components
((:file "packages")
(:file "quasiquote"
:depends-on ("packages"))
(:file "pp-quasiquote"
:depends-on ("quasiquote"))))
}
Among the elided elements were metadata such as @cl{:license "MIT"},
and extra dependency information
@cl{:in-order-to ((test-op (test-op "fare-quasiquote-test")))},
that delegates testing the current system
to running tests on another system.
Notice how the system itself @(depends-on) another system, @cl{fare-utils},
a collection of utility functions and macros from another project,
whereas testing is specified to be done by @cl{fare-quasiquote-test},
a system defined in a different file, @tt{fare-quasiquote-test.asd},
within the same project.
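
The delegation of testing to a separate system can be sketched as follows.
The systems @cl{my-lib} and @cl{my-lib-test}, their files,
and the function @cl{my-lib-test:run-tests} are all hypothetical names
introduced for illustration; they aren't part of @cl{fare-quasiquote}.

@clcode{
;; In my-lib.asd (hypothetical system, for illustration only)
(defsystem "my-lib"
  :license "MIT"
  :components ((:file "packages")
               (:file "main" :depends-on ("packages")))
  ;; Testing my-lib means running test-op on my-lib-test.
  :in-order-to ((test-op (test-op "my-lib-test"))))

;; In my-lib-test.asd (equally hypothetical)
(defsystem "my-lib-test"
  :depends-on ("my-lib")
  :components ((:file "tests"))
  ;; MY-LIB-TEST:RUN-TESTS is an assumed function defined in tests.lisp.
  :perform (test-op (o c)
             (uiop:symbol-call :my-lib-test :run-tests)))
}

With such definitions, evaluating @cl{(asdf:test-system "my-lib")}
would load both systems and then run the tests.
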
@extended-only{
The @tt{fare-utils.asd} file, in its own project,
looks like this (with a lot of elisions):
@clcode{
(defsystem "fare-utils" ...
:components
((:file "package")
(:module "base"
:depends-on ("package")
:components
((:file "utils")
(:file "strings" :depends-on ("utils"))
...))
(:module "filesystem"
:depends-on ("base")
:components
...)
...))
}
This example illustrates the use of modules:
The first component is a file @tt{package.lisp},
that all other components depend on.
Then, there is a module @cl{base};
in absence of contrary declaration,
it corresponds to directory @tt{base/};
and it itself contains files
@tt{utils.lisp}, @tt{strings.lisp}, etc.
As you can see, dependencies name @bydef{sibling} components
under the same @bydef{parent} system or module,
that can themselves be files or modules.
}
@subsubsection{Action Graph}
@; TODO: add a graph!
The process of building software is modeled as
a Directed Acyclic Graph (DAG) of @bydef{action}s,
where @emph{each action is a pair of an operation and a component.}
The DAG defines a partial order, whereby each action must be @bydef{perform}ed,
but only after all the actions it (transitively) @(depends-on) have already been performed.
For instance, in @cl{fare-quasiquote} above,
the @emph{loading} of (the output of compiling) @tt{quasiquote}
@(depends-on) the @emph{compiling} of @tt{quasiquote},
which itself @(depends-on)
the @emph{loading} of (the output of compiling) @tt{packages}, etc.
Importantly, though, this graph is distinct
from the preceding graph of components:
the graph of actions isn't a mere refinement of the graph of components
but a transformation of it that also incorporates
crucial information about the structure of operations.
@short-only{
@(ASDF) extracts from this DAG a @bydef{plan},
which by default is a topologically sorted list of actions,
that it then @bydef{perform}s in order,
in a design inspired by Pitman@~cite[Pitman-Large-Systems].
}
@extended-only{
Unlike its immediate predecessor @(mk-defsystem),
@(ASDF) makes a @bydef{plan} of all actions needed
to obtain an up-to-date version of the build output
before it @bydef{performs} these actions.
In @(ASDF) itself, this plan is a topologically sorted list of actions to be performed sequentially:
a total order that is a linear extension of the partial order of dependencies;
performing the actions in that order ensures that
the actions are always performed after the actions they depend on.
It's of course possible to reify the complete DAG of actions
rather than just extracting from it a single consistent ordered sequence.
Andreas Fuchs did so in 2006, in a small but quite brilliant @(ASDF) extension
called @(POIU), the "Parallel Operator on Independent Units".
@(POIU) compiles files in parallel on Unix multiprocessors using @tt{fork},
while still loading them sequentially into a main image, minimizing latency.
We later rewrote @(POIU), making it
both more portable and simpler by co-developing it with @(ASDF).
Understanding the many clever tricks
by which Andreas Fuchs overcame the issues with the @(ASDF1) model
to compute such a complete DAG led to many aha moments,
instrumental when writing @(ASDF3) (see @appref["traverse"]{Appendix F}).
}
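
To make the notion of a plan concrete, here is a toy model in plain CL.
It isn't the algorithm used by @(ASDF): it ignores timestamps and doesn't detect cycles,
and the names @cl{*dependencies*}, @cl{direct-dependencies} and @cl{make-toy-plan}
are ours, introduced only for this sketch.
Actions are represented as lists of an operation and a component name,
and the dependency table mirrors the @cl{fare-quasiquote} example above.

@clcode{
;; Toy model, not ASDF's actual code: an action is a list (OPERATION COMPONENT),
;; and *DEPENDENCIES* maps each action to the actions it directly depends on.
(defparameter *dependencies*
  '(((load-op "pp-quasiquote")    (compile-op "pp-quasiquote"))
    ((compile-op "pp-quasiquote") (load-op "quasiquote"))
    ((load-op "quasiquote")       (compile-op "quasiquote"))
    ((compile-op "quasiquote")    (load-op "packages"))
    ((load-op "packages")         (compile-op "packages"))
    ((compile-op "packages"))))

(defun direct-dependencies (action)
  "Return the actions that ACTION directly depends on."
  (rest (assoc action *dependencies* :test #'equal)))

(defun make-toy-plan (goal)
  "Return a list of actions ending with GOAL,
in which every action appears after all the actions it depends on."
  (let ((plan '()))
    (labels ((visit (action)
               ;; Depth-first traversal: plan the dependencies first,
               ;; then the action itself, each action only once.
               (unless (member action plan :test #'equal)
                 (mapc #'visit (direct-dependencies action))
                 (push action plan))))
      (visit goal))
    (nreverse plan)))
}

Calling @cl{(make-toy-plan '(load-op "pp-quasiquote"))} yields the six actions
in an order compatible with the dependencies,
starting with the compiling of @tt{packages} and ending with the requested action.
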
Users can extend @(ASDF) by defining
new subclasses of @(operation) and/or @(component)
and the methods that use them,
or by using global, per-system, or per-component hooks.
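
As a sketch of such an extension, the following defines a new operation
that merely prints the name of a system.
The class @cl{describe-name-op} is hypothetical;
the sketch assumes @(ASDF3.1) or later, which provides the class @cl{non-propagating-operation}.

@clcode{
;; Hypothetical operation, for illustration only: print a system's name.
(defclass describe-name-op (asdf:non-propagating-operation) ()
  (:documentation "Toy operation that prints the name of a system."))

(defmethod asdf:operation-done-p ((o describe-name-op) (c asdf:system))
  ;; There are no output files to check, so always perform the action.
  nil)

(defmethod asdf:perform ((o describe-name-op) (c asdf:system))
  (format t "~&System name: ~A~%" (asdf:component-name c)))
}

The new operation is then invoked like any other, e.g. with
@cl{(asdf:operate 'describe-name-op "fare-quasiquote")}.
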
@subsubsection{In-image}
@moneyquote{@(ASDF) is an @q{in-image} build system},
in the Lisp @(defsystem) tradition:
it compiles (if necessary) and loads software into the current CL image,
and can later update the current image by recompiling and reloading the components that have changed.
For better and worse, this notably differs from common practice in most other languages,
where the build system is a completely different piece of software running in a separate process.@note{
Of course, a build system could compile CL code in separate processes,
for the sake of determinism and parallelism:
our XCVB did @~cite[XCVB-2009]; so does the Google build system.
@;@extended-only{As for the wide variety of Lisp dialects beside CL,
@;they have as many different build systems, often integrated with a module system.}
}
On the one hand, it minimizes the overhead of writing build system extensions.@;
@XXX{
Says rpgoldman:
Another reason, of course, is that CL features like macros, etc. rely on global state.
So any out-of-process solution would still have to interact with a running process.
So an out-of-process solution that would cover artifacts like a knowledge base
would likely be MORE complex than the in-process solution...
}
On the other hand, it puts great pressure on @(ASDF) to remain minimal.
Qualitatively, @(ASDF) must be delivered as a single source file
and cannot use any external library,
since it itself defines the code that may load other files and libraries.
Quantitatively, @(ASDF) must minimize its memory footprint,
since it's present in all programs that are built,
and any resource spent is paid by each program.@extended-only{@note{
This arguably mattered more in 2002 when @(ASDF) was first released
and was about a thousand lines long:
by 2014, it had grown to over ten times that size,
but memory sizes have increased even faster.
}}
For all these reasons, @(ASDF) follows the minimalist principle that
@moneyquote{anything that can be provided as an extension
should be provided as an extension and left out of the core}.
Thus it cannot afford to support a persistent cache
indexed by the cryptographic digest of build expressions,
or a distributed network of workers, etc.
However, these could conceivably be implemented as @(ASDF) extensions.
@subsection{Comparison to C programming practice}
Most programmers are familiar with C, but not with CL.
It's therefore worth contrasting @(ASDF) to the tools commonly used by C programmers
to provide similar services.
Note though how these services are factored in very different ways in CL and in C.
To build and load software, C programmers commonly use
@(make) to build the software and @tt{ld.so} to load it.
Additionally, they use a tool like @tt{autoconf}
to locate available libraries and identify their features.@extended-only{@note{
@(ASDF3) also provides functionality which would correspond
to small parts of the @tt{libc} and of the linker @tt{ld}.
}}
In many ways these C solutions are better engineered than @(ASDF).
But in other important ways @(ASDF) demonstrates how
these C systems have much accidental complexity
that CL does away with thanks to better architecture.
@itemlist[
@item{
Lisp makes the full power of runtime available at compile-time,
so it's easy to implement a Domain-Specific Language (DSL):
the programmer only needs to define new functionality,
as an extension that is then seamlessly combined
with the rest of the language, including other extensions.
In C, the many utilities that need a DSL
must grow it onerously from scratch;
since the domain expert is seldom also a language expert
with resources to do it right,
this means plenty of mutually incompatible, misdesigned,
power-starved, misimplemented languages that have to be combined
through an unprincipled chaos of
expensive yet inexpressive means of communication.
@; rpgoldman remarks: Greenspun's 10th law
}
@item{
Lisp provides full introspection at runtime and compile-time alike,
as well as a protocol to declare @bydef{features}
and conditionally include or omit code or data based on them.
Therefore you don't need dark magic at compile-time
to detect available features.
In C, people resort to
horribly unmaintainable configuration scripts
in a hodgepodge of shell script, @tt{m4} macros, C preprocessing and C code,
plus often bits of @tt{python}, @tt{perl}, @tt{sed}, etc.
}
@item{
@(ASDF) possesses a standard and standardly extensible way to configure
where to find the libraries your code depends on, further improved in @(ASDF2).
In C, there are tens of incompatible ways to do it,
between @tt{libtool}, @tt{autoconf}, @tt{kde-config}, @tt{pkg-config},
various manual @tt{./configure} scripts, and countless other protocols,
so that each new piece of software requires the user
to learn a new @(ad_hoc) configuration method,
making it an expensive endeavor to use or distribute libraries.
}
@item{
@(ASDF) uses the very same mechanism
to configure both runtime and compile-time,
so there is only one configuration mechanism to learn and to use,
and minimal discrepancy.@note{
There is still discrepancy @emph{inherent} in these times being distinct:
the installation or indeed the machine may have changed.
}
In C, completely different, incompatible mechanisms are used
at runtime (@tt{ld.so}) and compile-time (unspecified),
which makes it hard to match
source code, compilation headers, static and dynamic libraries,
requiring complex "software distribution" infrastructures
(that admittedly also manage versioning, downloading and precompiling);
this at times causes subtle bugs when discrepancies creep in.
}
]
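
The feature protocol mentioned in the list above can be sketched in a few lines of standard CL.
The function @cl{implementation-family} and the feature @cl{:my-app-verbose}
are made-up names for illustration;
the implementation features tested are those we believe each implementation advertises.

@clcode{
;; Read-time conditionals select code according to the *FEATURES* list.
(defun implementation-family ()
  "Return a keyword naming the current implementation, or :OTHER."
  #+sbcl :sbcl
  #+ccl :ccl
  #+ecl :ecl
  #+clisp :clisp
  #+abcl :abcl
  #-(or sbcl ccl ecl clisp abcl) :other)

;; Programs may also declare features of their own.
;; :MY-APP-VERBOSE is a made-up feature name.
(pushnew :my-app-verbose *features*)
}

No separate configuration step is involved:
the same conditionals are available when compiling a file and when evaluating code at the prompt.
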
Nevertheless, there are also many ways in which @(ASDF) pales in comparison
to other build systems for CL, C, Java, or other systems:
@itemlist[
@item{
@(ASDF) isn't a general-purpose build system.
Its relative simplicity is directly related to it being custom made
to build CL software only.
Seen one way, it's a sign of how little you can get away with
if you have a good basic architecture;
a similarly simple solution isn't available to most other programming languages,
that require much more complex tools to achieve a similar purpose.
Seen another way, it's also the CL community failing to embrace
the outside world and provide solutions with enough generality
to solve more complex problems.@extended-only{@note{
@(ASDF3) could be easily extended to support arbitrary build actions,
if there were a desire for it. But @(ASDF1) and @(ASDF2) couldn't:
their action graph was not general enough,
being simplified and tailored for the common use case
of compiling and loading Lisp code;
and their ability to call arbitrary shell programs
was a misdesigned afterthought (copied over from @(mk-defsystem))
the implementation of which wasn't portable, with too many corner cases.
}}
}
@item{
At the other extreme, a build system for CL could have been made
that is much simpler and more elegant than @(ASDF),
if it could have required software to follow some simple organization constraints,
without much respect for legacy code.
A constructive proof of that is @(quick-build) @~cite[Quick-build],
being a fraction of the size of @(ASDF), itself a fraction of the size of @(ASDF3),
and with a fraction of the bugs — but none of the generality and extensibility
(see @secref{asdf-package-system}).
}
@item{
@(ASDF) isn't geared at all to build large software
in modern adversarial multi-user, multi-processor, distributed environments
where source code comes in many divergent versions and in many configurations.
It is rooted in an age-old model of building software in-image, what's more
in a traditional single-processor, single-machine environment with a friendly single user,
a single coherent view of source code and a single target configuration.
The new @(ASDF3) design is consistent and general enough
that it could conceivably be made to scale, but that would require a lot of work.
}
]
@section[#:tag "asdf3"]{@(ASDF) 3: A Mature Build}
@subsection{A Consistent, Extensible Model}
Surprising as it may be to the CL programmers who used it daily,
there was an essential bug at the heart of @(ASDF):
it didn't even try to propagate timestamps from one action to the next.
And yet it worked, mostly.
The bug was present from the very first day in 2001,
and even before in @(mk-defsystem) since 1990@~cite[MK-DEFSYSTEM],
and it survived till December 2012,
despite all our robustification efforts since 2009@~cite[Evolving-ASDF].
Fixing it required a complete rewrite of @(ASDF)'s core.
As a result, the object model of @(ASDF) became at the same time
more powerful, more robust, and simpler to explain.
The dark magic of its @(traverse) function
is replaced by a well-documented algorithm.
It's easier than before to extend @(ASDF), with fewer limitations and fewer pitfalls:
users may control how their operations do or don't propagate along the component hierarchy.
Thus, @(ASDF) can now express arbitrary action graphs,
and could conceivably be used in the future to build more than just CL programs.
@extended-only{
@XXX{EXAMPLES!}
}
@moneyquote{The proof of a good design is in the ease of extending it}.
And in CL, extension doesn't require privileged access to the code base.
We thus tested our design by
adapting the most elaborate existing @(ASDF) extensions to use it.
The result was indeed cleaner, eliminating the previous need
for overrides that redefined sizable chunks of the infrastructure.
Chronologically, however, we consciously started this porting process
in interaction with developing @(ASDF3), thus ensuring @(ASDF3)
had all the extension hooks required to avoid redefinitions.
See the entire story in @appref["traverse"]{Appendix F}.
@subsection[#:tag "bundle_operations"]{Bundle Operations}
Bundle operations create a single output file
for an entire system or collection of systems.
The most directly user-facing bundle operations are
@(compile-bundle-op) and @(load-bundle-op):
the former bundles into a single compilation file
all the individual outputs from the @(compile-op)
of each source file in a system;
the latter loads the result of the former.
Also @cl{lib-op} links into a library all the object files in a system
and @cl{dll-op} creates a dynamically loadable library out of them.
The above bundle operations also have so-called @emph{monolithic} variants
that bundle all the files in a system @emph{and all its transitive dependencies}.
Bundle operations make delivery of code much easier.
They were initially introduced as @cl{asdf-ecl},
an extension to @(ASDF) specific to the implementation ECL, back in the day of @(ASDF1).@;
@extended-only{@note{
Most CL implementations
maintain their own heap with their own garbage collector, and then
are able to dump an image of the heap on disk,
that can be loaded back in a new process with all the state of the former process.
To build an application, you thus
start a small initial image, load plenty of code, dump an image, and there you are.
ECL, instead, is designed to be easily embeddable in a C program;
it uses the popular C garbage collector by Hans Boehm & al.,
and relies on linking and initializer functions rather than on dumping.
To build an application with ECL (or its variant MKCL),
you thus link all the libraries and object files together,
and call the proper initialization functions in the correct order.
Bundle operations are important to deliver software using ECL
as a library to be embedded in some C program.
Also, because of the overhead of dynamic linking, loading a single object file
is preferable to a lot of smaller object files.
}}
@cl{asdf-ecl} was distributed with @(ASDF2), though in a way
that made upgrading slightly awkward for ECL users,
who had to explicitly reload it after upgrading @(ASDF),
even though it was included by the initial @cl{(require "asdf")}.
In @extended-only{May} 2012, it was generalized to other implementations
as the external system @cl{asdf-bundle}.
It was then merged into @(ASDF)
during the development of @(ASDF3)@extended-only{ (2.26.7, December 2012)}:
not only did it provide useful new operations,
but the way that @(ASDF3) was automatically upgrading itself for safety purposes
(see @appref["Upgradability"]{Appendix B})
would otherwise have broken things badly for ECL users
if the bundle support weren't itself bundled with @(ASDF).
In @(ASDF3.1), using @cl{deliver-asd-op},
you can create both the bundle from @(compile-bundle-op) and an @(asd) file
to use to deliver the system in binary format only.
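
For illustration, the bundle operations might be invoked as follows.
This sketch assumes @(ASDF3.1) or later, with the operation names exported from the @cl{asdf} package,
and that the @cl{fare-quasiquote} system can be found on the machine.

@clcode{
;; Compile the whole system into a single fasl, then load that fasl.
(asdf:operate 'asdf:compile-bundle-op "fare-quasiquote")
(asdf:operate 'asdf:load-bundle-op "fare-quasiquote")

;; Ask where the bundle was written.
(asdf:output-files 'asdf:compile-bundle-op "fare-quasiquote")

;; Monolithic variant: also bundle all transitive dependencies.
(asdf:operate 'asdf:monolithic-compile-bundle-op "fare-quasiquote")

;; Produce a bundle plus an .asd file for binary-only delivery.
(asdf:operate 'asdf:deliver-asd-op "fare-quasiquote")
}
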
@extended-only{
Note that @(compile-bundle-op), @(load-bundle-op) and @cl{deliver-asd-op}
were respectively called @(fasl-op), @(load-fasl-op) and @cl{binary-op}
in the original @cl{asdf-ecl} and its successors up until @(ASDF3.1).
But those were bad names, since every individual @(compile-op) has a fasl
(a fasl, for FASt Loading, is a CL compilation output file),
and since @cl{deliver-asd-op} doesn't generate a binary.
They were eventually renamed,
with backward compatibility stubs left behind under the old name.
}
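
As a minimal sketch of how these operations are invoked,
assuming @(ASDF3.1) or later and a hypothetical system named @cl{my-system}
that @(ASDF) can already find:

@clcode|{
(require "asdf")

;; "my-system" is a hypothetical system name, used for illustration only.
;; Compile the whole system into a single bundle fasl.
(asdf:operate 'asdf:compile-bundle-op "my-system")

;; Ask ASDF where the bundle was written.
(asdf:output-files 'asdf:compile-bundle-op
                   (asdf:find-system "my-system"))

;; Also produce an .asd file for delivering the system in binary form.
(asdf:operate 'asdf:deliver-asd-op "my-system")
}|
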
@subsection{Understandable Internals}
After bundle support was merged into @(ASDF) (see @secref{bundle_operations} above),
it became trivial to implement a new @(concatenate-source-op) operation.
Thus @(ASDF) could be developed as multiple files, which would improve maintainability.
For delivery purposes, the source files would be concatenated in correct dependency order,
into the single file @tt{asdf.lisp} required for bootstrapping.
The division of @(ASDF) into smaller, more intelligible pieces
had been proposed shortly after we took over @(ASDF);
but we had rejected the proposal then
on the basis that @(ASDF) must not depend on external tools
to upgrade itself from source, another strong requirement
(see @appref["Upgradability"]{Appendix B}).
With @(concatenate-source-op),
an external tool wasn't needed for delivery and regular upgrade,
only for bootstrap.
Meanwhile this division had also become more important,
since @(ASDF) had grown so much, having almost tripled in size since those days,
and was promising to grow some more.
It was hard to navigate that one big file, even for the maintainer,
and probably impossible for newcomers to wrap their heads around it.
To bring some discipline to this division@extended-only{ (2.26.62)},
we followed the principle of one file, one package,
as demonstrated by @(faslpath) @~cite[faslpath-page] and @(quick-build) @~cite[Quick-build],
though not yet actively supported by @(ASDF) itself (see @secref{asdf-package-system}).
This programming style ensures that files are indeed providing related functionality,
only have explicit dependencies on other files, and
don't have any forward dependencies without special declarations.
Indeed, this was a great success in making @(ASDF) understandable,
if not by newcomers, at least by the maintainer himself;@extended-only{@note{
On the other hand, a special setup is now required
for the debugger to locate the actual source code in @(ASDF);
but this price is only paid by @(ASDF) maintainers.
}}
this in turn triggered a series of enhancements that would not otherwise
have been obvious or obviously correct,
illustrating the principle that
@moneyquote{good code is code you can understand,
organized in chunks you can each fit in your brain}.
@subsection{Package Upgrade}
Preserving the hot upgradability of @(ASDF) was always a strong requirement
(see @appref["Upgradability"]{Appendix B}).
In the presence of this package refactoring,
this meant the development of a variant of CL's @(defpackage)
that plays nice with hot upgrade: @(define-package).
Whereas the former isn't guaranteed to work and may signal an error
when a package is redefined in incompatible ways,
the latter will update an old package to match the new desired definition
while recycling existing symbols from that and other packages.
Thus, in addition to the regular clauses from @(defpackage),
@(define-package) accepts a clause @cl{:recycle}:
it attempts to recycle each declared symbol
from each of the specified packages in the given order.
For idempotence, the package itself must be the first in the list.
For upgrading from an old @(ASDF), the @cl{:asdf} package is always named last.
The default recycle list consists of the package and its nicknames.
New features also include @cl{:mix} and @cl{:reexport}.
@cl{:mix} mixes imported symbols from several packages:
when multiple packages export symbols with the same name,
the conflict is automatically resolved in favor of the package named earliest,
whereas an error is signaled when using the standard @cl{:use} clause.
@cl{:reexport} reexports the same symbols as imported from given packages,
and/or exports instead the same-named symbols that shadow them.
@(ASDF3.1) adds @cl{:mix-reexport} and @cl{:use-reexport},
which combine @cl{:reexport} with @cl{:mix} or @cl{:use} in a single clause;
this is more maintainable than repeating a list of packages.
@; example of define-package
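
A minimal sketch of these clauses, assuming @(UIOP) is loaded.
All the package names (@cl{demo/a}, @cl{demo/b}, @cl{demo/user}, @cl{demo/old-user})
are hypothetical and only serve the illustration:

@clcode|{
(require "asdf")

;; Two hypothetical packages that both export a symbol named FROB.
(defpackage :demo/a (:use) (:export #:frob #:only-a))
(defpackage :demo/b (:use) (:export #:frob #:only-b))

(uiop:define-package :demo/user
  (:use :common-lisp)
  ;; On upgrade, reuse symbols from this package first, then from a
  ;; hypothetical predecessor package, which need not exist.
  (:recycle :demo/user :demo/old-user)
  ;; FROB is exported by both; :mix picks the one from DEMO/A, named first.
  (:mix :demo/a :demo/b)
  ;; Export again everything imported from DEMO/A.
  (:reexport :demo/a)
  (:export #:main))
}|

Evaluating the @(define-package) form a second time, for instance after adding a clause,
is expected to update the existing package in place rather than signal an error.
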
@subsection{Portability Layer}
Splitting @(ASDF) into many files revealed that a large fraction of it
was already devoted to general purpose utilities.
This fraction only grew under the following pressures:
a lot of opportunities for improvement became obvious after dividing @(ASDF) into many files;
features added or merged in from previous extensions and libraries
required new general-purpose utilities;
as more tests were added for new features, and were run on all supported implementations,
on multiple operating systems, new portability issues cropped up
that required development of robust and portable abstractions.
The portability layer, after it was fully documented,
ended up being slightly bigger than the rest of @(ASDF).
Long before that point, @(ASDF) was thus formally divided in two:
this portability layer, and the @(defsystem) itself.
The portability layer was initially dubbed @cl{asdf-driver},
because of merging in a lot of functionality from @cl{xcvb-driver}.
Because users demanded a shorter name that didn't include @(ASDF),
yet would somehow be reminiscent of @(ASDF), it was eventually renamed @(UIOP):
the Utilities for Implementation- and OS- Portability@note{
U, I, O and P are also the four letters that follow QWERTY
on an anglo-saxon keyboard.
}.
It was made available separately from @(ASDF)
as a portability library to be used on its own;
yet since @(ASDF) still needed to be delivered as a single file @tt{asdf.lisp},
@(UIOP) was @emph{transcluded} inside that file, now built using the
@cl{monolithic-concatenate-source-op} operation.
At Google, the build system actually uses @(UIOP) for portability without the rest of @(ASDF);
this led to @(UIOP) improvements that will be released with @(ASDF "3.1.2").
Most of the utilities deal with providing sane pathname abstractions
(see @appref["pathnames"]{Appendix C}),
filesystem access, sane input/output (including temporary files),
basic operating system interaction —
many things that the CL standard lacks.
There is also an abstraction layer over the less-compatible legacy implementations,
a set of general-purpose utilities, and a common core for the @(ASDF) configuration DSLs.@note{
@(ASDF3.1) notably introduces a @cl{nest} macro
that nests arbitrarily many forms
without indentation drifting ever to the right.
It makes for more readable code without sacrificing good scoping discipline.
}
Importantly for a build system, there are portable abstractions for compiling CL files
while controlling all the warnings and errors that can occur,
and there is support for the life-cycle of a Lisp image:
dumping and restoring images, initialization and finalization hooks,
error handling, backtrace display, etc.
However, the most complex piece turned out to be a portable implementation of @(run-program).
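
To give a flavor of the pathname, file and operating system utilities mentioned above,
here is a small sketch. It assumes a Unix-like system;
the directory @tt{/tmp/} and the variable @tt{HOME} are only examples:

@clcode|{
(require "asdf")

;; Parse a Unix-style relative name portably into a pathname
;; under a base directory ("/tmp/" here is only an example).
(uiop:subpathname #p"/tmp/" "src/foo/bar.lisp")

;; Write to a temporary file, read it back, and let UIOP delete the file.
(uiop:with-temporary-file (:pathname p)
  (with-open-file (out p :direction :output :if-exists :supersede)
    (write-line "hello" out))
  (uiop:read-file-string p))

;; Read an environment variable; returns NIL when it is unset.
(uiop:getenv "HOME")
}|
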
@subsection{@(run-program)}
With @(ASDF3), you can run external commands as follows:
@clcode|{
(run-program `("cp" "-lax" "--parents"
"src/foo" ,destination))
}|
On Unix, this recursively hardlinks files in directory
@tt{src/foo} into a directory named by the string @cl{destination},
preserving the prefix @tt{src/foo}.
You may have to add @cl{:output t :error-output t}
to get error messages on your @cl{*standard-output*} and @cl{*error-output*} streams,
since the default value, @(nil), designates @tt{/dev/null}.
If the invoked program returns an error code,
@(run-program) signals a structured CL @(error),
unless you specified @cl{:ignore-error-status t}.
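
The two ways of dealing with a failing command can be sketched as follows,
assuming a Unix system where the @tt{false} command exits with a nonzero status.
The condition accessors used here are those exported by current versions of @(UIOP):

@clcode|{
(require "asdf")

;; By default, a nonzero exit status is signaled as an error.
(handler-case (uiop:run-program '("false"))
  (uiop:subprocess-error (e)
    (format t "command ~S exited with code ~S~%"
            (uiop:subprocess-error-command e)
            (uiop:subprocess-error-code e))))

;; With :ignore-error-status t, the exit code is returned
;; as the third value instead.
(nth-value 2 (uiop:run-program '("false") :ignore-error-status t))
}|
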
This utility is essential for @(ASDF) extensions and CL code in general
to portably execute arbitrary external programs.
It was a challenge to write:
each implementation provided a different underlying mechanism
with wildly different feature sets and countless corner cases.
The better ones could fork and exec a process
and control its standard-input, standard-output and error-output;
lesser ones could only call the @tt{system(3)} C library function.
Moreover, Windows support differed significantly from Unix.
@(ASDF1) itself actually had a @cl{run-shell-command},
initially copied over from @(mk-defsystem),
but it was more of an attractive nuisance than a solution, despite our many bug fixes:
it was implicitly calling @cl{format}; capturing output was particularly contrived;
and what shell would be used varied between implementations, even more so on Windows.@;
@extended-only{@note{
Actually, our first reflex was to declare the broken @cl{run-shell-command} deprecated,
and move @(run-program) to its own separate system.
However, after our then co-maintainer (and now maintainer) Robert Goldman insisted that
@cl{run-shell-command} was required for backward compatibility and
some similar functionality expected by various @(ASDF) extensions,
we decided to provide the real thing rather than this nuisance,
and moved from @cl{xcvb-driver} the nearest code there was to this real thing,
that we then extended to make it more portable, robust, etc.,
according to the principle:
@bold{Whatever is worth doing at all is worth doing well} (Chesterfield).
}}
@(ASDF3)'s @(run-program) is full-featured,
based on code originally from @(XCVB)'s @cl{xcvb-driver} @~cite[XCVB-2009].
It abstracts away all these discrepancies to provide control over
the program's standard-output, using temporary files underneath if needed.
Since @(ASDF "3.0.3"), it can also control the standard-input and error-output.
It accepts either a list of a program and arguments, or a shell command string.
Thus your previous program could have been:
@clcode{
(run-program
(format nil "cp -lax --parents src/foo ~S"
(native-namestring destination))
:output t :error-output t)
}
where @(UIOP)'s @cl{native-namestring} converts the @cl{pathname} object @cl{destination}
into a name suitable for use by the operating system,
as opposed to a CL @cl{namestring} that might be escaped somehow.
You can also inject input and capture output:
@clcode{
(run-program '("tr" "a-z" "n-za-m")
:input '("uryyb, jbeyq") :output :string)
}
returns the string @cl{"hello, world"}.
It also returns secondary and tertiary values @(nil) and @cl{0} respectively,
for the (non-captured) error-output and the (successful) exit code.
@(run-program) only provides a basic abstraction;
a separate system @(inferior-shell) was written on top of @(UIOP),
and provides a richer interface, handling pipelines, @tt{zsh} style redirections,
splicing of strings and/or lists into the arguments, and
implicit conversion of pathnames into native-namestrings,
of symbols into downcased strings,
of keywords into downcased strings with a @dashdash{} prefix.
Its short-named functions @cl{run}, @cl{run/nil}, @cl{run/s}, @cl{run/ss},
respectively run the external command with outputs to the Lisp standard- and error- output,
with no output, with output to a string, or with output to a stripped string.
Thus you could get the same result as previously with:
@clcode{
(run/ss '(pipe (echo (uryyb ", " jbeyq))
(tr a-z (n-z a-m))))
}
Or, to get the number of processors on a Linux machine, you can use:
@clcode{
(run '(grep -c "^processor.:"
(< /proc/cpuinfo))
:output #'read)
}
@subsection{Configuration Management}
@; XXX too compressed, move some to ext-only
@(ASDF) always had minimal support for configuration management.
@(ASDF3) doesn't introduce radical change,
but provides more usable replacements or improvements for old features.
For instance, @(ASDF1) had always supported version-checking:
each component (usually, a system)
could be given a version string with e.g.
@cl{:version "3.1.0.97"}, and @(ASDF) could be told to check
that dependencies of at least a given version were used, as in
@cl{:depends-on ((:version "inferior-shell" "2.0.0"))}.
This feature can detect a dependency mismatch early,
which saves users from having to figure out the hard way
that they need to upgrade some libraries, and which.
Now, @(ASDF) always required components to use "semantic versioning",
where versions are strings made of dot-separated numbers like @cl{3.1.0.97}.
But it didn't enforce it, leading to bad surprises for the users
when the mechanism was expected to work, but failed.
@(ASDF3) issues a @(warning) when it finds a version that doesn't follow the format.
It would actually have issued an @(error), if that didn't break too many existing systems.
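
The version format can be explored at the REPL with the version utilities
that ship with @(ASDF3) and @(UIOP); this is only a sketch,
and the version strings are arbitrary examples:

@clcode|{
(require "asdf")

;; A well-formed version parses into a list of integers.
(uiop:parse-version "3.1.0.97")

;; A malformed version yields NIL (no error handler is passed here).
(uiop:parse-version "1.0-beta")

;; Versions compare component by component, not as strings.
(uiop:version< "2.9" "2.10")

;; Is 3.1.0.97 acceptable where at least 3.1 is required?
(asdf:version-satisfies "3.1.0.97" "3.1")
}|
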
Another problem with version strings was that they had to be written as literals in the @(asd) file,
unless that file took painful steps to extract them from another source file.
While it was easy for source code to extract the version from the system definition,
some authors legitimately wanted their code to not depend on @(ASDF) itself.
Also, it was a pain to repeat the literal version and/or the extraction code
in every system definition in a project.
@(ASDF3) can thus extract version information from a file in the source tree, with, e.g.
@cl{:version (:read-file-line "version.text")}
to read the version as the first line of file @tt{version.text}.
To read the third line, that would have been
@cl{:version (:read-file-line "version.text" :at 2)}
(mind the off-by-one error in the English language).
Or you could extract the version from source code.
For instance, @tt{poiu.asd} specifies
@cl{:version (:read-file-form "poiu.lisp" :at (1 2 2))}
which is the third subform of the third subform of the second form in the file @tt{poiu.lisp}.
The first form is an @cl{in-package} and must be skipped.
The second form is an @cl{(eval-when (...) body...)} the body of which starts with a
@cl{(defparameter *poiu-version* ...)} form.
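
Both styles can be sketched with a hypothetical system @cl{my-lib};
the file names, the variable name and the version @cl{"1.2.3"} are made up for the illustration,
and the source file is laid out in the way described above:

@clcode|{
;; my-lib.asd (hypothetical system, shown with two alternatives)
(defsystem "my-lib"
  ;; Alternative 1: first line of a plain-text file.
  :version (:read-file-line "version.text")
  :components ((:file "my-lib")))

(defsystem "my-lib"
  ;; Alternative 2: form 1, subform 2, subform 2 of my-lib.lisp
  ;; (all indices count from 0).
  :version (:read-file-form "my-lib.lisp" :at (1 2 2))
  :components ((:file "my-lib")))

;; my-lib.lisp, laid out so that :at (1 2 2) finds the version string.
;; The package MY-LIB is assumed to be defined elsewhere.
(in-package :my-lib)                          ; form 0, skipped
(eval-when (:compile-toplevel :load-toplevel :execute) ; form 1
  (defparameter *my-lib-version* "1.2.3"))    ; subform 2 of form 1;
                                              ; "1.2.3" is its subform 2
}|
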
@(ASDF3) thus solves this version extraction problem for all software —
except itself, since its own version has to be readable by @(ASDF2)
as well as by whoever views the single delivery file;
thus its version information is maintained by a management script using regexps,
of course written in CL.
Another painful configuration management issue with @(ASDF1) and 2
was lack of a good way to conditionally include files
depending on which implementation is used and what features it supports.
One could always use CL reader conditionals such as @(cl "#+(or sbcl clozure)")
but that means that @(ASDF) could not even see the components being excluded,
should some operation be invoked that involves printing or packaging the code
rather than compiling it — or worse, should it involve cross-compilation
for another implementation with a different feature set.
There was an obscure way for a component to declare a dependency on a @cl{:feature},
and annotate its enclosing module with @cl{:if-component-dep-fails :try-next}
to catch the failure and keep trying.
But the implementation was a kluge in @(traverse)
that short-circuited the usual dependency propagation
and had exponential worst case performance behavior
when nesting such pseudo-dependencies to painfully emulate feature expressions.
@(ASDF3) gets rid of @cl{:if-component-dep-fails}:
it didn't fit the fixed dependency model at all.
A limited compatibility mode without nesting was preserved
to keep processing old versions of SBCL.
As a replacement, @(ASDF3) introduces a new option @cl{:if-feature}
in component declarations, such that a component is only included
in a build plan if the given feature expression is true during the planning phase.
Thus a component annotated with @cl{:if-feature (:and :sbcl (:not :sb-unicode))}
(and its children, if any) is only included on an SBCL without Unicode support.
This is more expressive than what preceded,
without requiring inconsistencies in the dependency model,
and without pathological performance behavior.
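
A system definition using this option might look like the following sketch,
in which the system and file names are hypothetical:

@clcode|{
;; my-lib.asd: a hypothetical system with implementation-specific files.
(defsystem "my-lib"
  :components
  ((:file "package")
   ;; Included only on SBCL or Clozure CL.
   (:file "native-threads" :if-feature (:or :sbcl :clozure)
                           :depends-on ("package"))
   ;; Included only on an SBCL built without Unicode support.
   (:file "ascii-only" :if-feature (:and :sbcl (:not :sb-unicode))
                       :depends-on ("package"))
   (:file "main" :depends-on ("package"))))
}|

Unlike with reader conditionals, all four components remain visible to @(ASDF) in the system definition,
whichever implementation reads it.
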
@subsection{Standalone Executables}
One of the bundle operations contributed by the ECL team was @(program-op),
which creates a standalone executable.
As this was now part of @(ASDF3), it was only natural
to bring other @(ASDF)-supported implementations up to par:
CLISP, Clozure CL, CMUCL, LispWorks, SBCL, SCL.
Thus @(UIOP) features a @cl{dump-image} function to dump the current heap image,
except for ECL and its successors that follow a linking model and use a @cl{create-image} function.
These functions were based on code from @cl{xcvb-driver}, which had taken them from @(cl-launch).
@(ASDF3) also introduces a @(defsystem) option to specify an entry point as e.g.
@cl{:entry-point "my-package:entry-point"}.
The specified function (designated as a string to be read after the package is created)
is called without arguments after the program image is initialized;
after doing its own initializations,
it can explicitly consult @cl{*command-line-arguments*}@note{
In CL, most variables are lexically visible and statically bound,
but @emph{special} variables are globally visible and dynamically bound.
To avoid subtle mistakes, the latter are conventionally named with enclosing asterisks,
also known in recent years as @emph{earmuffs}.
}
or pass it as an argument to some main function.
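
Putting these pieces together, a minimal sketch of a standalone program could look as follows.
The system @cl{my-app}, its package and its function @cl{main} are hypothetical names:

@clcode|{
;; my-app.asd (hypothetical)
(defsystem "my-app"
  :components ((:file "main"))
  :entry-point "my-app:main")

;; main.lisp
(defpackage :my-app
  (:use :common-lisp)
  (:export #:main))
(in-package :my-app)

(defun main ()
  ;; The entry point takes no arguments; it consults the special variable.
  (format t "Arguments: ~S~%" uiop:*command-line-arguments*)
  t)

;; At the REPL, once ASDF can find the system, build the executable:
(asdf:operate 'asdf:program-op "my-app")
}|
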
Our experience with a large application server at ITA Software
showed the importance of hooks so that various software components may modularly register
finalization functions to be called before dumping the image,
and initialization functions to be called before calling the entry point.
Therefore, we added support for image life-cycle to @(UIOP).
We also added basic support for running programs non-interactively as well as interactively@;
@extended-only{ based on a variable @cl{*lisp-interaction*}}:
non-interactive programs exit with a backtrace
and an error message repeated above and below the backtrace,
instead of inflicting a debugger on end-users;
any non-@(nil) return value from the entry-point function is considered success
and @(nil) failure, with an appropriate program exit status.
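
The hooks can be sketched as follows, with a hypothetical component
that holds a resource which must not be saved in the image.
The variable and function names are made up for the illustration:

@clcode|{
(require "asdf")

;; A hypothetical component with state that must not be saved in the image.
(defvar *connection* nil)

(defun disconnect ()
  (setf *connection* nil))

(defun connect ()
  ;; Stands in for opening a real resource at startup.
  (setf *connection* (list :opened-at (get-universal-time))))

;; Run DISCONNECT before the image is dumped.
(uiop:register-image-dump-hook 'disconnect)

;; Run CONNECT when the image is restored; the second argument NIL
;; means: do not also call it right now.
(uiop:register-image-restore-hook 'connect nil)
}|

Each component registers its own functions where it is defined,
so no central file needs to know about all of them.
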
Starting with @(ASDF3.1), implementations that don't support standalone executables
may still dump a heap image using the @(image-op) operation,
and a wrapper script, e.g. created by @(cl-launch), can invoke the program;
delivery is then in two files instead of one.
@(image-op) can also be used by all implementations
to create intermediate images in a staged build,
or to provide ready-to-debug images for otherwise non-interactive applications.
@subsection{@(cl-launch)}
Running Lisp code to portably create executable commands from Lisp is great,
but there is a bootstrapping problem:
when all you can assume is the Unix shell,
how are you going to portably invoke the Lisp code
that creates the initial executable to begin with?
We solved this problem some years ago with @(cl-launch).
This bilingual program, both a portable shell script and a portable CL program,
provides a nice colloquial shell command interface to
building shell commands from Lisp code,
and supports delivery as either portable shell scripts or
self-contained precompiled executable files.@;
@extended-only{@note{
@(cl-launch) and the scripts it produces are bilingual:
the very same file is accepted by both language processors.
This is in contrast to self-extracting programs,
where pieces written in multiple languages have to be extracted first
before they may be used, which incurs a setup cost and is prone to race conditions.
}}