===============================================
Release notes for the Genode OS Framework 10.02
===============================================
Genode Labs
After the release of the feature-packed version 9.11, we turned our attention
to improving the platform support of the framework. The current release 10.02
bears fruit of these efforts on several levels.
First, we are proud to announce the support for two new base platforms, namely
the NOVA hypervisor and the Codezero microkernel. These new kernels complement
the already supported base platforms Linux, L4/Fiasco, L4ka::Pistachio, and
OKL4. So why do we address so many different kernels instead of focusing our
efforts on one selected platform? Our observation is that different applications
pose different requirements on the kernel. Most kernels have a specific profile
with regard to security, hardware support, complexity, scheduling, resource
management, and licensing that may make them fit well for one application area
but not perfectly suited for a different use case. There is no single perfect
kernel and there doesn't need to be one. By using Genode, applications
developed for one kernel can be ported to all the other supported platforms with
a simple recompile. We believe that making Genode available on a new kernel is
beneficial for the kernel developers, application developers, and users alike.
For kernel developers, Genode brings additional workloads to stress-test their
kernel, and it extends the application area of the kernel. Application
developers can address several kernel platforms at once instead of tying their
programs to one particular platform. Finally, users and system integrators can
pick their kernel of choice for the problem at hand. Broadening the platform
support for Genode helps to make the framework more relevant.
Second, we introduced a new way of managing real-time priorities, which fits
perfectly with the recursive system structure of Genode. This clears the way to
multi-media and other real-time workloads that we target with our upcoming
work. We implemented the concept for the L4ka::Pistachio and OKL4 platforms.
With real-time priorities on OKL4, it is possible to run multiple instances of
the OKLinux kernel at the same time, each instance at a different priority.
Third, we vastly improved the existing framework, extended the ARM architecture
support to cover dynamic loading and the C runtime, introduced a new
thread-context management, added a plugin-concept to our C runtime, and
improved several device drivers.
Even though platform support is the main focus of this release, we introduced a
number of new features, in particular the initial port of the Python 2.6 script
interpreter.
NOVA hypervisor as new base platform
####################################
When we started the development of Genode in 2006 at the OS Group of the
Technische Universität Dresden, it was originally designated to be the user
land of a next-generation and to-be-developed new kernel called NOVA. Because
the kernel was not ready at that time, we had to rely on intermediate solutions
as kernel platform such as L4/Fiasco and Linux during development. These
circumstances led us to the extremely portable design that Genode has today and
motivated us to make Genode available on the whole family of L4 microkernels.
In December 2009, the day we had long waited for finally came. The first version
of NOVA was publicly released:
:Official website of the NOVA hypervisor:
[http://hypervisor.org]
Besides the novel and modern kernel interface, NOVA has a list of features that
sets it apart from most other microkernels, in particular support for
virtualization hardware, multi-processor support, and capability-based
security.
Why bring Genode to NOVA?
=========================
NOVA is an acronym for NOVA OS Virtualization Architecture. It stands for a
radically new approach of combining full x86 virtualization with microkernel
design principles. Because NOVA is a microkernelized hypervisor, the term
microhypervisor was coined. In its current form, it successfully addresses
three main challenges. First, how to consolidate a microkernel system-call API
with a hypercall API in such a way that the API remains orthogonal? The answer
to this question lies in NOVA's unique IPC interface. Second, how to implement
a virtual machine monitor outside the hypervisor without spoiling
performance? The Vancouver virtual machine monitor that runs on top of NOVA proves
that a decomposition at this system level is not only feasible but can yield
high performance. Third, being a modern microkernel, NOVA set out to pursue a
capability-based security model, which is a challenge on its own.
Up to now, the NOVA developers were most concerned about optimizing and
evaluating NOVA for the execution of virtual machines, not so much about
running a fine-grained decomposed multi-server operating system. This is where
Genode comes into play. With our port of Genode to NOVA, we contribute the
workload to evaluate NOVA's kernel API against this use case. We are happy to
report that the results so far are very positive.
At this point, we want to thank the main developers of NOVA, Udo Steinberg and
Bernhard Kauer for making their exceptional work and documentation publicly
available, and for being so responsive to our questions. We also greatly
enjoyed the technical discussions we had and look forward to the future
evolution of NOVA.
Challenges
==========
Of all currently supported base platforms of Genode, the port to NOVA was the
most venturesome effort. It is the first platform with kernel support for
capabilities and local names. That means no process except the kernel has
global knowledge. This raises a number of questions that seem extremely hard
to solve at first sight. For example: There are no global IDs for threads
and other kernel objects. So how to address the destination for an IPC message?
Or another example: A thread does not know its own identity per se and there is
no system call similar to 'getpid' or 'l4_myself', not even a way to get a
pointer to a thread's own user-level thread-control block (UTCB). The UTCB,
however, is needed to invoke system calls. So how can a thread obtain its UTCB
in order to use system calls? The answers to these questions must be provided by
user-level concepts. Fortunately, Genode was designed for a capability kernel
right from the beginning so that we already had solutions to most of these
questions. In the following, we give a brief summary of the specifics of Genode
on NOVA:
* We maintain our own system-call bindings for NOVA ('base-nova/include/nova/')
derived from the NOVA specification. We put the bindings under MIT license
to encourage their use outside of Genode.
* Core runs directly as roottask on the NOVA hypervisor. On startup, core
maps the complete I/O port range to itself and implements debug output via
comport 0.
* Because NOVA does not allow roottask to have a BSS segment, we need a slightly
modified linker script for core (see 'src/platform/roottask.ld').
All other Genode programs use Genode's generic linker script.
* The Genode 'Capability' type consists of a portal selector expressing the
destination of a capability invocation and a global object ID expressing
the identity of the object when the capability is specified as an invocation
argument. In the latter case, the global ID is needed because of a limitation
of the current system-call interface. In the future, we are going to entirely
remove the global ID.
* Thread-local data such as the UTCB pointer is provided by the new thread
context management introduced with the Genode release 10.02. It enables
each thread to determine its thread-local data using the current stack
pointer.
* NOVA provides threads without time called local execution contexts (EC).
Local ECs are intended as server-side RPC handlers. The processing time
needed to perform RPC requests is provided by the client during the RPC call.
This way, RPC semantics becomes very similar to function call semantics with
regard to the accounting of CPU time. Genode already distinguishes normal
threads (with CPU time) and server-side RPC handlers ('Server_activation')
and, therefore, can fully utilize this elegant mechanism without changing the
Genode API.
* On NOVA, there are no IPC send or IPC receive operations. Hence, this part
of Genode's IPC framework cannot be implemented on NOVA. However, the
corresponding classes 'Ipc_istream' and 'Ipc_ostream' are never used directly
but only as building blocks for the actually used 'Ipc_client' and
'Ipc_server' classes. Compared with the other Genode base platforms, Genode's
API for synchronous IPC communication maps more directly onto the NOVA
system-call interface.
* The Lock implementation utilizes NOVA's semaphore as a utility to let a
thread block in the attempt to get a contended lock. In contrast to the
intuitive way of using one kernel semaphore for each user lock, we use only
one kernel semaphore per thread and the peer-to-peer wake-up mechanism we
introduced in the release 9.08. This has two advantages: First, a lock does
not consume a kernel resource, and second, the full semantics of the Genode
lock including the 'cancel-blocking' semantics are preserved.
* NOVA does not support server-side out-of-order processing of RPC requests.
This is particularly problematic in three cases: Page-fault handling, signal
delivery, and the timer service.
A page-fault handler can receive a page fault request only if the previous
page fault has been answered. However, if there is no answer for a
page-fault, the page-fault handler has to decide whether to reply with a
dummy answer (in this case, the faulter will immediately raise the same page
fault again) or block until the page-fault can be resolved. But in the latter
case, the page-fault handler cannot handle any other page faults. This is
unfeasible if there is only one page-fault handler in the system. Therefore,
we instantiate one pager per user thread. This way, we can block and unblock
individual threads when faulting.
Another classical use case for out-of-order RPC processing is signal
delivery. Each process has a signal-receiver thread that blocks at core's
signal service using an RPC call. This way, core can selectively deliver
signals by replying to one of these in-flight RPCs with a zero-timeout
response (preserving the fire-and-forget signal semantics). On NOVA however,
a server cannot have multiple RPCs in flight. Hence, we use a NOVA semaphore
shared between core and the signal-receiver thread to wake up the
signal-receiver on the occurrence of a signal. Because a semaphore-up
operation does not carry payload, the signal receiver has to perform a non-blocking
RPC call to core to pick up the details about the signal. Thanks to Genode's
RPC framework, the use of the NOVA semaphore is hidden in NOVA-specific stub
code for the signal interface and remains completely transparent at API
level.
For the timer service, we currently use one thread per client to avoid the need
for out-of-order RPC processing.
* Because NOVA provides no time source, we use the x86 PIT as user-level time
source, as on OKL4.
* On the current version of NOVA, kernel capabilities are delegated using IPC.
Genode supports this scheme by being able to marshal 'Capability' objects as
RPC message payload. In contrast to all other Genode base platforms where
the 'Capability' object is just plain data, the NOVA version must marshal
'Capability' objects such that the kernel translates the sender-local name to
the receiver-local name. This special treatment is achieved by overloading
the marshalling and unmarshalling operators of Genode's RPC framework. The
transfer of capabilities is completely transparent at API level and no
modification of existing RPC stub code was needed.
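The bullet on thread-local data in the list above can be illustrated with a
small sketch. It assumes that each thread context occupies a fixed-size slot
whose base address is aligned to the slot size, so that masking any address
within the stack yields the slot base. The constants 'CONTEXT_AREA_BASE' and
'CONTEXT_SIZE' and the function names are hypothetical and chosen for
illustration only. They are not taken from the Genode sources.

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical layout parameters, chosen for illustration only */
constexpr std::uintptr_t CONTEXT_AREA_BASE = 0x40000000UL;
constexpr std::uintptr_t CONTEXT_SIZE      = 0x00100000UL; /* power of two */

/* Base address of the context slot that contains the stack pointer 'sp' */
inline std::uintptr_t context_base(std::uintptr_t sp)
{
    assert(sp >= CONTEXT_AREA_BASE);

    /* clearing the low bits rounds 'sp' down to the slot boundary */
    return sp & ~(CONTEXT_SIZE - 1);
}

/* Number of the slot within the context area, counted from zero */
inline std::uintptr_t context_index(std::uintptr_t sp)
{
    return (context_base(sp) - CONTEXT_AREA_BASE) / CONTEXT_SIZE;
}
```

Under these assumptions, a thread can find data stored at a known offset within
its slot, such as a UTCB pointer, without any system call and without a lookup
in a shared data structure.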
How to explore Genode on NOVA?
==============================
The Genode release 10.02 supports the NOVA pre-release version 0.1. You can
download the archive here:
:Download NOVA version 0.1:
[http://os.inf.tu-dresden.de/~us15/nova/nova-hypervisor-0.1.tar.bz2]
For building NOVA, please refer to the 'README' file contained in the archive.
Normally, a simple 'make' in the 'build/' subdirectory is all you need to
get a freshly baked 'hypervisor' binary.
The NOVA platform support for Genode resides in the 'base-nova/' repository.
To create a build directory prepared for compiling Genode for NOVA, you can use
the 'create_builddir' tool. From the top-level Genode directory, issue the
following command:
! ./tool/builddir/create_builddir nova_x86 GENODE_DIR=. BUILD_DIR=<dir>
This tool will create a fresh build directory at the location specified
as 'BUILD_DIR'. Provided that you have installed the
[http://genode.org/download/tool-chain - Genode tool chain], you can now build
Genode by using 'make' from within the new build directory.
Note that in contrast to most other kernels, the Genode build process does not
need to know about the source code of the kernel. This is because Genode
maintains its own system-call bindings for this kernel. The bindings reside in
'base-nova/include/nova/'.
NOVA supports multi-boot boot loaders such as GRUB, Pulsar, or gPXE. For
example, a GRUB configuration entry for booting the Genode demo scenario
with NOVA looks as follows, where 'genode/' is a symbolic link to the
'bin/' subdirectory of the Genode build directory and the 'config' file
is a copy of 'os/config/demo'.
! title Genode demo scenario
! kernel /hypervisor noapic
! module /genode/core
! module /genode/init
! module /config/demo/config
! module /genode/timer
! module /genode/ps2_drv
! module /genode/pci_drv
! module /genode/vesa_drv
! module /genode/launchpad
! module /genode/nitpicker
! module /genode/liquid_fb
! module /genode/nitlog
! module /genode/testnit
! module /genode/scout
Please note the 'noapic' argument for the NOVA hypervisor. This argument
enables the use of ordinary PIC IRQ numbers, as relied on by our current
PIT-based timer driver.
Limitations
===========
The current NOVA version of Genode is able to run the complete Genode demo
scenario including several device drivers (PIT, PS/2, VESA, PCI) and the GUI.
At version 0.1, however, NOVA is not yet complete and misses some features
needed to make Genode fully functional. The current limitations are:
* No real-time priority support: NOVA supports priority-based scheduling
but, in the current version, it allows each thread to create scheduling
contexts with arbitrary scheduling parameters. This makes it impossible
to enforce priority assignment from a central point as facilitated with
Genode's priority concept.
* No multi-processor support: NOVA supports multi-processor CPUs through
binding each execution context (EC) to a particular CPU. Because everyone
can create ECs, every process could use multiple CPUs. However, Genode's API
devises a more restrictive way of allocating and assigning resources. In
short, physical resource usage should be arbitrated by core and the creation
of physical ECs should be performed by core only. However, remote EC creation
is not yet supported by NOVA. Even though multiple CPUs can be used with
Genode on NOVA right now by using NOVA system calls directly, there is no
support at the Genode API level.
* Missing revoke syscall: NOVA is not able to revoke memory mappings or
destroy kernel objects such as ECs and protection domains. In practice, this
means that programs and complete Genode subsystems can be started but not
killed. Because virtual addresses cannot be reused, code that relies on
'unmap' will produce errors. This is the case for the dynamic loader or
programs that destroy threads at runtime.
Please note that these issues are known and worked on by the NOVA developers.
So we expect Genode to become more complete on NOVA soon.
Codezero kernel as new base platform
####################################
Codezero is a microkernel primarily targeted to ARM-based embedded systems.
It is developed as an open-source project by a British company called B-Labs.
:B-Labs website:
[http://b-labs.com]
The Codezero kernel was first made publicly available in summer 2009. The
latest version, documentation, and community resources are available at the
project website:
:Codezero project website:
[http://l4dev.org]
As highlighted by the name of the project website, the design of the kernel is
closely related to the family of L4 microkernels. In short, the kernel provides
a minimalistic set of functionality for managing address spaces, threads, and
communication between threads, but leaves complicated policy and device access
to user-level components.
To put Codezero in relation to other L4 kernels, here is a quick summary of the
most important design aspects as implemented with the version 0.2, and how
our port of Genode relates to them:
* In the line of the original L4 interface, the kernel uses global name spaces
for kernel objects such as threads and address spaces.
* For the interaction between a user thread and the kernel, the concept of
user-level thread-control blocks (UTCB) is used. A UTCB is a small
thread-specific region in the thread's virtual address space, which is
always mapped. The access to the UTCB can never raise a page fault,
which makes it perfect for the kernel to access system-call arguments,
in particular IPC payload copied from/to user threads. In contrast to other
L4 kernels, the location of UTCBs within the virtual address space is managed
by the user land.
On Genode, core keeps track of the UTCB locations for all user threads.
This way, the physical backing store for the UTCB can be properly accounted
to the corresponding protection domain.
* The kernel provides three kinds of synchronous inter-process communication
(IPC): Short IPC carries payload in CPU registers only. Full IPC copies
message payload via the UTCBs of the communicating parties. Extended IPC
transfers a variable-sized message from/to arbitrary locations of the
sender/receiver address spaces. During an extended IPC, page faults may
occur.
Genode solely relies on extended IPC, leaving the other IPC mechanisms to
future optimizations.
* The scheduling of threads is based on hard priorities. Threads with the
same priority are executed in a round-robin fashion. The kernel supports
time-slice-based preemption.
Genode does not support Codezero priorities yet.
* The original L4 interface leaves the question of how to manage and account
kernel resources such as the memory used for page tables unanswered.
Codezero makes the accounting of such resources explicit, enables the
user land to manage them in a responsible way, and prevents kernel-resource
denial-of-service problems.
* In contrast to the original L4.v2 and L4.x0 interfaces, the kernel provides
no time source in the form of IPC timeouts to the user land. A time source
must be provided by a user-space timer driver. Genode employs such timer
services on all platforms so that it is not constricted by this limitation.
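The scheduling policy named in the list above (hard priorities, round robin
among threads of equal priority) can be sketched as a toy model. This is not
Codezero code. The class name 'Toy_scheduler', the number of priority levels,
and the convention that a larger number means a higher priority are assumptions
made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <deque>

/* Toy model: NUM_PRIOS levels, larger index means higher priority (assumption) */
class Toy_scheduler
{
    public:

        static constexpr std::size_t NUM_PRIOS = 4;

        /* Make a thread, identified by a plain number, runnable */
        void add(int thread_id, std::size_t prio)
        {
            assert(prio < NUM_PRIOS);
            _queues[prio].push_back(thread_id);
        }

        /*
         * Pick the thread for the next time slice: the head of the highest
         * non-empty queue, which is then moved to the tail of that queue.
         * Returns -1 if no thread is runnable.
         */
        int next()
        {
            for (std::size_t p = NUM_PRIOS; p-- > 0; ) {
                std::deque<int> &q = _queues[p];
                if (q.empty())
                    continue;

                int const id = q.front();
                q.pop_front();
                q.push_back(id);
                return id;
            }
            return -1;
        }

    private:

        std::array<std::deque<int>, NUM_PRIOS> _queues;
};
```

With hard priorities, a runnable thread of lower priority is never selected as
long as any thread of higher priority remains runnable. In the model, threads
added at level 2 alternate while a thread at level 0 starves.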
In several ways, Codezero goes beyond the known L4 interfaces. The most
noticeable addition is the support of so-called containers. A container is
similar to a virtual machine. It is an execution environment that holds a set
of physical resources such as RAM and devices. The number of containers and the
physical resources assigned to them are static and have to be defined at build
time. The code executed inside a container can roughly be classified into two
categories. First, there are static programs that require strong isolation from the
rest of the system but no classical operating-system infrastructure, for
example special-purpose telecommunication stacks or cryptographic functionality
of an embedded device. Second, there are kernel-like workloads, which use the L4
interface to substructure the container into address spaces, for example a
paravirtualized Linux kernel that uses Codezero address spaces to protect Linux
processes. Genode runs inside a container and uses Codezero's L4
interface to implement its multi-server architecture.
The second major addition is the use of a quite interesting flavor of a
capability concept to manage the authorization of processes to access system
resources and system calls. In contrast to most current approaches, Codezero
does not attempt to localize the naming of physical objects such as
address-space IDs and thread IDs. So a capability is not referred to via a local
name but a global name. However, for delegating authorization throughout the
system, the capability approach is employed. A process that possesses a capability
to an object can deal with the object. It can further delegate this access
right to another party (to which it holds a capability). In a way, this
approach keeps the kernel interface true to the original L4 interface but
provides a much stronger concept for access control. However, it is important
to point out that the problem of ambient authority is not (yet) addressed by
this concept. If a capability is not used directly but specified as an argument
to a remote service, this argument is passed as a plain value not
protected by the kernel. Because the identity of the referenced object can be
faked by the client, the server has to check the plausibility of the argument.
For the server, however, this check is difficult or even impossible because it has no
way to know whether the client actually possesses the capability it is talking
about.
The current port of Genode to Codezero does not yet make use of the capability
concept for fine-grained communication control. As with the other L4
kernels, each object is identified by a unique ID allocated by a core service.
There is no mechanism in place to prevent faked object IDs.
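The consequence of passing object identities as plain values can be made
concrete with a toy server. The sketch assumes that objects are named by
globally valid integer IDs that travel as ordinary message payload. The class
'Toy_server' and its functions are hypothetical and model neither Genode's core
services nor the Codezero API.

```cpp
#include <cassert>
#include <map>
#include <string>

/* Toy server that names its objects by globally valid plain integer IDs */
class Toy_server
{
    public:

        /* Create an object and hand out its ID to the legitimate client */
        unsigned long create(std::string const &content)
        {
            unsigned long const id = _next_id++;
            assert(_objects.count(id) == 0);
            _objects[id] = content;
            return id;
        }

        /*
         * Look up an object by ID. The server can check that the ID is
         * plausible (an object with this ID exists) but it cannot tell
         * whether the caller was ever given this ID.
         */
        bool read(unsigned long id, std::string &out) const
        {
            auto const it = _objects.find(id);
            if (it == _objects.end())
                return false;

            out = it->second;
            return true;
        }

    private:

        unsigned long                        _next_id = 1;
        std::map<unsigned long, std::string> _objects;
};
```

A second client that merely guesses the value 1 obtains the same result from
'read' as the client that received this ID from 'create'. The plausibility
check rejects only IDs that name no object at all.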
:Thanks:
We want to thank the main developer of Codezero, Bahadir Balban, for his great
responsiveness to our feature requests and questions. Without his help, the
port would have taken much more effort. We hope that our framework will be of
value to the Codezero community.
Using Genode with Codezero
==========================
The port of Genode is known to work with the devel branch of Codezero version
0.2 as of 2010-02-19.
To download the Codezero source code from the official source-code repository,
you can use the following commands:
!git clone git://git.l4dev.org/codezero.git
!git checkout -b devel --track origin/devel
In addition to downloading the source code, you will need to apply the small
patch 'base-codezero/lcd.patch' to the Codezero kernel to enable the device
support for the LCD display. Go to the 'codezero.git/' directory and issue:
!patch -p1 < <genode-dir>/base-codezero/lcd.patch
For a quick start with Codezero, please follow the "Getting Started with the
Codezero Development" guide, in particular the installation of the tool chain:
:Getting started with Codezero:
[http://www.l4dev.org/getting_started]
The following steps guide you through building and starting Genode on Codezero
using the Versatilepb platform as emulated by Qemu.
# Create a Genode build directory for the Codezero/Versatilepb platform.
Go to the Genode directory and use the following command, where
'<genode-build-dir>' is the designated location of the new Genode build
directory and '<codezero-src-dir>' is the 'codezero.git/' directory with the
Codezero source tree, both specified as absolute paths.
! ./tool/builddir/create_builddir codezero_versatilepb \
! GENODE_DIR=. \
! BUILD_DIR=<genode-build-dir> \
! L4_DIR=<codezero-src-dir>
With the build directory created, Genode targets can immediately be
compiled for Codezero. For a quick test, go to the new build directory and
issue:
! make init
In addition to being a Genode build directory, the directory is already
prepared to be used as Codezero container. In particular, it holds a
'SConstruct' file that will be called by the Codezero build system. In this
file, you will find the list of Genode targets to be automatically built when
executing the Codezero build process. Depending on your work flow, you may
need to adapt this file.
# To import the Genode container into the Codezero configuration system,
go to the 'codezero.git/' directory and use the following command:
! ./scripts/baremetal/baremetal_add_container.py \
! -a -i Genode -s <genode-build-dir>
# Now, we can add and configure a new instance of this container via the
Codezero configuration system:
! ./configure.py
Using the interactive configuration tool, select to use a single container
and set up the following values for this bare-metal container. Choose a
sensible 'Container Name' (e.g., 'genode0') and select the 'Genode' entry in
the 'Baremetal Project' menu.
:Default pager parameters:
! 0x40000 Pager LMA
! 0x100000 Pager VMA
These values are important because they are currently hard-wired in the
linker script used by Genode. If you need to adapt these values, make
sure to also update the Genode linker script located at
'base-codezero/src/platform/genode.ld'.
:Physical Memory Regions:
! 1 Number of Physical Regions
! 0x40000 Physical Region 0 Start Address
! 0x4000000 Physical Region 0 End Address
We only use 64MB of memory. The physical memory between 0 and 0x40000 is
used by the kernel.
:Virtual Memory Regions:
! 1 Number of Virtual Regions
! 0x0 Virtual Region 0 Start Address
! 0x50000000 Virtual Region 0 End Address
It is important to choose the end address such that the virtual memory
covers the thread context area. The context area is defined at
'base/include/base/thread.h'.
:Container Devices (Capabilities):
Enable the LCD display in the 'CLCD Menu'.
The configuration system will copy the Genode container template to
'codezero.git/conts/genode0'. Hence, if you need to adjust the container's
'SConscript' file, you need to edit 'codezero.git/conts/genode0/SConscript'.
The original Genode build directory is only used as template when creating
a new Codezero container but it will never be looked at by the Codezero build
system.
# After completing the configuration, it is time to build both Codezero and
Genode. Thanks to the 'SConscript' file in the Genode container, the Genode
build process is executed automatically:
! ./build.py
You will find the end result of the build process at
! ./build/final.elf
# Now you can try out Genode on Qemu:
! qemu-system-arm -s -kernel build/final.elf \
! -serial stdio -m 128 -M versatilepb &
The default configuration starts the nitpicker GUI server and the launchpad
application. The versatilepb platform driver is quite limited. It does
support the LCD display as emulated by Qemu but no user input, yet.
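The memory values entered during the configuration step above can be
cross-checked with a little arithmetic. The sketch below only restates the
numbers given in that step. The constant names are made up for illustration,
and reading 'LMA' as the physical load address and 'VMA' as the virtual address
of the pager is an assumption based on the usual linker terminology. All checks
are compile-time assertions.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t KiB = 1024, MiB = 1024*KiB;

/* Values as entered in the configuration step */
constexpr std::uint64_t PAGER_LMA  = 0x40000;
constexpr std::uint64_t PAGER_VMA  = 0x100000;
constexpr std::uint64_t PHYS_START = 0x40000;
constexpr std::uint64_t PHYS_END   = 0x4000000;
constexpr std::uint64_t VIRT_START = 0x0;
constexpr std::uint64_t VIRT_END   = 0x50000000;
constexpr std::uint64_t QEMU_RAM   = 128*MiB; /* Qemu argument '-m 128' */

/* Size of a region given by its start and end address */
constexpr std::uint64_t region_size(std::uint64_t start, std::uint64_t end)
{
    return end - start;
}

static_assert(PHYS_END == 64*MiB,
              "the physical region ends at 64 MiB");
static_assert(PHYS_START == 256*KiB,
              "the lowest 256 KiB are left to the kernel");
static_assert(region_size(PHYS_START, PHYS_END) == 64*MiB - 256*KiB,
              "RAM available to the container");
static_assert(PAGER_LMA >= PHYS_START && PAGER_LMA < PHYS_END,
              "the pager is loaded within the physical region");
static_assert(PAGER_VMA >= VIRT_START && PAGER_VMA < VIRT_END,
              "the pager is linked within the virtual region");
static_assert(PHYS_END <= QEMU_RAM,
              "the physical region fits into the emulated RAM");
```

The container thus receives 64 MiB minus 256 KiB of physical memory, and the
virtual region spans 1280 MiB.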
Limitations
===========
At the current stage, the Genode version for Codezero is primarily geared
towards the developers of Codezero as a workload to stress their kernel. It
still has a number of limitations that would affect the real-world use:
* Because the only platform supported out of the box by the official Codezero
source tree is the ARM-based Versatilepb board, Genode is currently tied to
this hardware platform. When Codezero moves beyond this particular platform,
we will add a modular concept for platform support packages to Genode.
* The current timer driver at 'os/src/drivers/timer/codezero/' is a dummy
driver that just yields the CPU time instead of blocking. It is not
suitable as time source.
* The versatilepb platform driver at 'os/src/drivers/platform/versatilepb/'
supports only the LCD display as provided by Qemu and was not tested on
real hardware. Because Codezero does not yet allow the assignment of the
Versatilepb PS/2 controller to a container, the current user-input driver is
just a dummy.
* The lock implementation is based on a simple spinlock using an atomic
compare-exchange operation, which is implemented via Codezero's kernel mutex.
The lock works and is safe but it has a number of drawbacks with regard to
fairness, efficiency, and its interaction with scheduling.
* Core's IRQ service is not yet implemented because the IRQ-handling interface
of Codezero is still in flux.
* Because we compile Genode with the same tool chain (Codesourcery ARM tool
chain) as used for Codezero, there are still subtle differences in the
linker scripts, making Genode's dynamic linker not yet functional on
Codezero.
* Even though Codezero provides priority-based scheduling, Genode does not
allow assigning priorities to Codezero processes, yet.
* Currently, all Genode boot modules are linked as binary data against core,
which is then loaded as a single image into a container. For this reason, core
must be built after all binaries. This solution is far from being convenient
because changing the list of boot modules requires changes in core's
'platform.cc' and 'target.mk' file.
New thread-context management
#############################
With the current release, we introduced a new stack management concept that is
now consistently used on all Genode base platforms. Because the new concept
does not only cover the stack allocation but also other thread-specific context
information, we speak of thread-context management. The stack of a Genode
thread used to be a member of the 'Thread' object with its size specified as
template argument. This stack-allocation scheme was chosen because it was easy
to implement on all base platforms and is straightforward to use. But there
are two problems with this approach.
First, the implementation of thread-local storage (TLS) is either platform
dependent or costly. There are kernels with support for TLS, mostly by the
means of a special register that holds a pointer to a thread-local data
structure (e.g., the UTCB pointer). But using such a facility implies
platform-specific code on Genode's side. For kernels with no TLS support, we
introduced a unified TLS concept that registers stacks alongside
thread-local data at a thread registry. To access the TLS of a thread, this
thread registry can be queried with the current stack pointer of a caller.
This query, however, is costly because it traverses a data structure. Up to
now, we accepted these costs because native Genode code did not use TLS. TLS
was only needed for code ported from the Linux kernel. However, with NOVA,
there is now a kernel that requires the user land to provide a fast TLS
mechanism to look up the current thread's UTCB in order to perform system
calls. On this kernel, a fast TLS mechanism is important.
The second disadvantage of the original stack allocation scheme is critical
to all base platforms: Stack overflows could not be detected. For each stack,
the developer had to specify a stack size. A good estimation for this value
is hard, in particular when calling functions of library code with unknown
stack usage patterns. If chosen too small, the stack could overflow, corrupting
the data surrounding the 'Thread' object. Such errors are extremely cumbersome
to detect. If chosen too large, memory gets wasted.
For storing thread-specific data (called thread context) such as the stack and
thread-local data, we have now introduced a dedicated portion of the virtual address
space. This portion is called thread-context area. Within the thread-context
area, each thread has a fixed-sized slot, a thread context. The layout of each
thread context looks as follows:
[image thread_context]
; lower address
; ...
; ============================ <- aligned at 'CONTEXT_VIRTUAL_SIZE'
;
; empty
;
; ----------------------------
;
; stack
; (top) <- initial stack pointer
; ---------------------------- <- address of 'Context' object
; additional context members
; ----------------------------
; UTCB
; ============================ <- aligned at 'CONTEXT_VIRTUAL_SIZE'
; ...
; higher address
On some platforms, a user-level thread-control block (UTCB) area contains
data shared between the user-level thread and the kernel. It is typically
used for transferring IPC message payload or for system-call arguments.
The additional context members are a reference to the corresponding
'Thread_base' object and the name of the thread.
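Because each thread context is aligned at 'CONTEXT_VIRTUAL_SIZE', the slot that
belongs to the current thread can be derived from any address within its stack
by plain address arithmetic. The following Python sketch models this
arithmetic only. The base address of the area, the slot size, the size of the
'Context' object, and all function names are assumptions made for
illustration. They are not taken from the Genode sources.

```python
# Hypothetical layout constants, chosen for illustration only
CONTEXT_AREA_BASE    = 0x40000000  # start of the thread-context area
CONTEXT_VIRTUAL_SIZE = 0x00100000  # size of one slot, a power of two
CONTEXT_OBJECT_SIZE  = 0x00001000  # assumed size of members plus UTCB


def context_slot_base(stack_pointer):
    """Return the lower boundary of the slot that contains 'stack_pointer'."""
    # Slots are aligned at CONTEXT_VIRTUAL_SIZE, so clearing the low bits
    # of any address within a slot yields the start of that slot.
    return stack_pointer & ~(CONTEXT_VIRTUAL_SIZE - 1)


def context_slot_index(stack_pointer):
    """Return the index of the slot within the thread-context area."""
    offset = context_slot_base(stack_pointer) - CONTEXT_AREA_BASE
    return offset // CONTEXT_VIRTUAL_SIZE


def context_object_addr(stack_pointer):
    """Return the assumed address of the 'Context' object of the slot."""
    # The object sits at the upper end of the slot, right above the stack.
    upper_boundary = context_slot_base(stack_pointer) + CONTEXT_VIRTUAL_SIZE
    return upper_boundary - CONTEXT_OBJECT_SIZE


# A stack pointer somewhere inside the stack of the third slot
sp = CONTEXT_AREA_BASE + 2 * CONTEXT_VIRTUAL_SIZE + 0xFE000
print(hex(context_slot_base(sp)), context_slot_index(sp))
print(hex(context_object_addr(sp)))
```

No data structure is traversed in this lookup, which is the property that
makes the approach attractive compared to querying a thread registry.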
The thread context is a virtual memory area, initially not backed by real
memory. When a new thread is created, an empty thread context gets assigned
to the new thread and populated with memory pages for the stack and the
additional context members. Note that this memory is allocated from the RAM
session of the process environment and is not accounted for when using the
'sizeof()' operator on a 'Thread_base' object.
This way, stack overflows are immediately detected because the corresponding
thread produces a page fault within the thread-context area. Data corruption
can never occur.
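The effect can be illustrated with a toy model in Python. The sketch is not
Genode code: the page size, the slot size, and all names are assumptions, and
the context members and the UTCB at the upper end of the slot are left out for
brevity. Only the topmost pages of a slot are backed by memory, so an access
below the populated stack hits an unbacked page and faults instead of silently
overwriting neighbouring data.

```python
PAGE_SIZE = 4096                  # assumed page size
CONTEXT_VIRTUAL_SIZE = 0x100000   # assumed size of one thread-context slot


class PageFault(Exception):
    """Raised on access to an address that is not backed by memory."""


class ThreadContext:
    def __init__(self, base, stack_pages):
        self.base = base
        self.top = base + CONTEXT_VIRTUAL_SIZE
        # Only the pages at the upper end of the slot are populated
        self.backed_from = self.top - stack_pages * PAGE_SIZE

    def access(self, addr):
        """Model a memory access within the slot."""
        if not (self.base <= addr < self.top):
            raise ValueError("address outside of this thread context")
        if addr < self.backed_from:
            raise PageFault(hex(addr))
        return True


ctx = ThreadContext(base=0x40000000, stack_pages=4)
ctx.access(ctx.top - 8)              # within the populated stack
try:
    ctx.access(ctx.backed_from - 8)  # stack grew beyond its backing store
except PageFault as fault:
    print("stack overflow detected at", fault)
```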
We implemented this concept for all base platforms and thereby made the
stack-overflow protection and the fast TLS feature available to all platforms.
On L4ka::Pistachio, OKL4, L4/Fiasco, Codezero, and NOVA, the thread-context
area is implemented as a managed dataspace. This ensures that the unused
virtual memory of the sparsely populated thread-context area is never selected
for attaching regular dataspaces into the process' address space. On Linux, the
thread-context area is implemented via a fixed offset added to the local
address for the 'mmap' system call. So on this platform, there is no protection
in place to prevent regular dataspaces from being attached within the
thread-context area.
Please note that in contrast to the original 'Thread' object, which contained
the stack, the new version does not account for the memory consumed by the
stack when using the 'sizeof()' operator. This has to be considered for
multi-threaded servers that want to account client-specific threads to the
memory donated by the corresponding client.
Real-time priorities
####################
There are two application areas generally regarded as predestined for
microkernels: high security and real time. Whereas the development of Genode
was primarily focused on the former application area so far, we observe growing
interest in using the framework for soft real-time applications, in particular
multi-media workload. Most of Genode's supported base platforms already provide
some way of real-time scheduling support, hard priorities with round-robin
scheduling of threads with the same priority being the most widely used
scheduling scheme. What has been missing until now was a way to access these
facilities through Genode's API or configuration interfaces. We deferred the
introduction of such interfaces for a very good reason: It is hard to get
right. Even though priority-based scheduling is generally well understood, the
combination with dynamic workload where differently prioritized processes are
started and deleted at runtime and interact with each other is extremely hard
to manage. At least, this had been our experience with building complex
scenarios with the Dresden real-time operating system (DROPS). Combined with
optimizations such as time-slice donating IPC calls, the behaviour of complex
scenarios tended to become nondeterministic and hard to capture.
Genode imposes an additional requirement onto all its interfaces. They have to
support the recursive structure of the system. Only if each subsystem of
processes is consistent on its own can it be replicated at an arbitrary
location within Genode's process tree. Assigning global priorities to single
processes, however, would break this condition. For example, non-related
subsystems could interfere with each other if both used the same range of
priorities for priority-based synchronization within the respective subsystem.
If executed alone, each of those subsystems would run perfectly but integrated
into one setup, they would interfere with each other, yielding unpredictable
results. We have now found a way to manage real-time priorities such that the
recursive nature of Genode is not only preserved but actually put to good use.
Harmonic priority-range subdivision
===================================
We call Genode's priority management concept harmonic priority-range
subdivision. Priorities are not assigned to activities as global values but
they can be virtualized at each node in Genode's process tree. At startup time,
core assigns the right to use the complete range of priorities to the init
process. Init is free to assign those priorities to any of the CPU sessions it
creates at core, in particular to the CPU sessions it creates on behalf of its
children and their grandchildren. Init, however, neither knows nor is it
interested in the structure of its child subsystems. It only wants to make sure
that one subsystem is prioritized over another. For this reason, it uses the
most significant bits of the priority range to express its policy but leaves
the less significant bits to be defined by the respective subsystems. For
example, if init wants to enforce that one subsystem has a higher priority than
all others, it would need to distinguish two priorities. For each CPU-session
request originating from one of its clients, it would diminish the supplied
priority argument by shifting the argument by one bit to the right and
replacing the most significant bit with its own policy. Effectively, init
divides its own range of priorities into two subranges. Both subranges, in
turn, can be managed the same way by the respective child. The concept works
recursively.
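The shift-and-replace step can be sketched in a few lines of Python. The
sketch assumes a 16-bit logical priority value where a lower value denotes a
higher priority. The function name and its parameters are hypothetical and
serve illustration only.

```python
PRIO_BITS = 16  # assumed width of the logical priority value


def subdivide(child_prio, policy, level_bits):
    """Squeeze 'child_prio' into the subrange selected by 'policy'.

    child_prio -- logical priority requested by the child (0 is highest)
    policy     -- subrange chosen by the parent, 0 being the highest
    level_bits -- number of most significant bits claimed by the parent
    """
    assert 0 <= child_prio < (1 << PRIO_BITS)
    assert 0 <= policy < (1 << level_bits)
    # Shift the child's value to the right to free the top bits ...
    lowered = child_prio >> level_bits
    # ... and place the parent's policy into the freed bits.
    return (policy << (PRIO_BITS - level_bits)) | lowered


# A parent that distinguishes two priorities claims one bit
favoured = subdivide(0x0000, policy=0, level_bits=1)
others   = subdivide(0x0000, policy=1, level_bits=1)
print(hex(favoured), hex(others))

# The scheme nests: a child splits its share four ways (two bits) before
# its parent applies its own one-bit policy
inner  = subdivide(0x0000, policy=2, level_bits=2)
nested = subdivide(inner, policy=0, level_bits=1)
print(hex(nested))
```

Whatever value a child requests, the result stays within the subrange that the
parent selected, so a subsystem cannot escape the policy of its parent.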
Implementation
==============
The implementation consists of two parts. First, there is the actual management
implemented as part of the parent protocol. For each CPU session request,
the parent evaluates the priority argument and supplements its own policy.
At this management level, a logical priority range of 0...2^16 is used to pass
the policy arguments from child to parent. A lower value represents a higher
priority. The second part is the platform-specific code in core that translates
priority arguments into kernel priorities and assigns them to physical
threads. Because the typical resolution for priority values is lower than 2^16,
this quantization can lead to the loss of the less significant priority bits.
In this case, differently prioritized CPU sessions can end up using the same
physical priority. For this reason, we recommend not using priorities for
synchronization purposes.
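The loss of resolution can be made concrete with a small Python sketch. The
mapping shown here is an assumption chosen for illustration (a kernel with 128
priority levels where 128 is the highest). The actual translation in core is
platform specific.

```python
LOGICAL_BITS = 16    # logical priorities span 0 .. 2^16 - 1, 0 is highest
KERNEL_LEVELS = 128  # assumed number of kernel priorities, 128 is highest


def kernel_priority(logical):
    """Translate a logical priority into a kernel priority (1..128)."""
    # Scale the logical value down to the kernel's resolution. The
    # right shift discards the less significant bits.
    lowered_by = (logical * KERNEL_LEVELS) >> LOGICAL_BITS
    return KERNEL_LEVELS - lowered_by


print(kernel_priority(0x0000))  # highest logical priority
print(kernel_priority(0x8000))  # half-way through the logical range
# Two distinct logical priorities that differ only in their low bits
print(kernel_priority(0x2000), kernel_priority(0x21ff))
```

The last line prints the same kernel priority twice, which is why code that
relies on two sessions having strictly different priorities is fragile.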
Usage
=====
The assignment of priorities to subsystems is done via two additional tags in
init's 'config' file. The '<priolevels>' tag specifies how many priority levels
are distinguished by the init instance. The value must be a power of two. Each
'<start>' node can contain an optional '<priority>' declaration, which holds a
value between -priolevels + 1 and 0. This way, priorities can only be lowered,
never elevated above init's priority. If no '<priority>' tag is specified,
the default value of 0 (init's own priority) is used. As an example, here is a
'config' file starting several nested instances of the init process using
different priority subranges.
! <config>
! <!--
! divides priority range 1..128 into
! 65..128 (prio 0)
! 1..64 (prio -1)
! -->
! <priolevels>2</priolevels>
! <start>
! <filename>init</filename>
! <priority>0</priority>
! <ram_quota>5M</ram_quota>
! <config>
! <!--
! divides priority range 65..128 into
! 113..128 (prio 0)
! 97..112 (prio -1)
! 81..96 (prio -2)
! 65..80 (prio -3)
! -->
! <priolevels>4</priolevels>
! <start>
! <filename>init</filename>
! <!-- results in platform priority 112 -->
! <priority>-1</priority>
! <ram_quota>512K</ram_quota>
! </start>
! <start>
! <filename>init</filename>
! <!-- results in platform priority 96 -->
! <priority>-2</priority>
! <ram_quota>2M</ram_quota>
! <config>
! <start>
! <filename>init</filename>
! <ram_quota>768K</ram_quota>
! </start>
! </config>
! </start>
! </config>
! </start>
! <start>
! <filename>init</filename>
! <!-- results in platform priority 64 -->
! <priority>-1</priority>
! <ram_quota>6M</ram_quota>
! <config></config>
! </start>
! </config>
On kernels that support priorities and where priority 128 is used as priority
limit (this is the case for OKL4 and Pistachio), this configuration should
result in the following assignments of physical priorities to process-tree
nodes:
[image priorities]
The red marker shows the resulting priority of the corresponding process.
; 128 : core
; 128 : core->init
; 128 : core->init->init
; 112 : core->init->init->init
; 96 : core->init->init->init.2
; 96 : core->init->init->init.2->init
; 64 : core->init->init.2
With Genode 10.02, we implemented the described concept for the OKL4 and
L4ka::Pistachio base platforms first. On both platforms, a priority range of 0
to 128 is used.
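The range arithmetic stated in the comments of the example configuration can
be reproduced with a short Python sketch. The helper 'split_range' is
hypothetical and not part of init. It merely divides an inclusive priority
range into equally sized subranges, with priority 0 receiving the uppermost
one.

```python
def split_range(low, high, priolevels):
    """Divide the inclusive range low..high into equally sized subranges.

    Returns a dict that maps each priority value (0, -1, ...) to the
    (low, high) bounds of its subrange.
    """
    size = high - low + 1
    # 'priolevels' must be a power of two that evenly divides the range
    assert priolevels > 0 and priolevels & (priolevels - 1) == 0
    assert size % priolevels == 0
    step = size // priolevels
    return {-i: (high - (i + 1) * step + 1, high - i * step)
            for i in range(priolevels)}


top = split_range(1, 128, 2)       # outer init, <priolevels> 2
print(top)
nested = split_range(*top[0], 4)   # nested init, <priolevels> 4
print(nested)

# In the example, the platform priority of a subsystem corresponds to
# the upper bound of its subrange
print(nested[-1][1], nested[-2][1], top[-1][1])
```

The last line yields 112, 96, and 64, matching the platform priorities
annotated in the configuration.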
On L4/Fiasco, we were not yet able to apply this concept because on this
kernel, the used lock implementation is based on a yielding spinlock.
If a thread at a high priority would attempt to acquire a contended lock,
it would infinitely yield the CPU to itself, letting all other threads in
the system starve. In order to make real-time priorities usable on L4/Fiasco
we would need to change the lock first.
Base framework
##############
Read-only dataspaces
====================
Until now, we have not handled ROM dataspaces any differently from RAM dataspaces
in core except for their predefined content. With the Genode workload becoming
more complex, ROM files tend to get shared between different processes and need
protection. Now, dataspaces of ROM modules are always mapped read-only.
Enabled the use of super pages by default
=========================================
Since release 9.08, we support super pages as an experimental feature. Now,
this feature is enabled by default on L4/Fiasco, L4ka::Pistachio, and NOVA.
Enabled managed dataspaces by default
=====================================
We originally introduced managed dataspaces with the release 8.11. However,
because we had no pressing use cases, it remained an experimental feature
until now. The new thread-context management introduced with this release
prompted us to promote managed dataspaces to become a regular feature.
Originally there was one problem holding us back from this decision, which
was the handling of cyclic references between nested dataspaces. However,
we now simply limit the number of nesting levels to a fixed value.
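How a fixed nesting limit sidesteps cyclic references can be sketched as
follows. The Python model is purely illustrative: the class names, the
'resolve' function, and the limit of four levels are assumptions and do not
reflect the implementation in core.

```python
MAX_NESTING_LEVELS = 4  # hypothetical limit, the actual value may differ


class NestingTooDeep(Exception):
    pass


class Dataspace:
    """Plain dataspace backed by memory."""


class ManagedDataspace(Dataspace):
    """Dataspace whose content is composed of other dataspaces."""

    def __init__(self):
        self.regions = {}  # offset -> (size, attached dataspace)

    def attach(self, offset, size, dataspace):
        self.regions[offset] = (size, dataspace)


def resolve(dataspace, offset, level=0):
    """Find the plain dataspace and local offset that back 'offset'."""
    if level > MAX_NESTING_LEVELS:
        # Also terminates the lookup if dataspaces reference each other
        raise NestingTooDeep()
    if not isinstance(dataspace, ManagedDataspace):
        return dataspace, offset
    for start, (size, nested) in dataspace.regions.items():
        if start <= offset < start + size:
            return resolve(nested, offset - start, level + 1)
    return None, offset  # nothing attached at this offset


ram = Dataspace()
outer = ManagedDataspace()
outer.attach(0x1000, 0x1000, ram)
print(resolve(outer, 0x1800) == (ram, 0x800))

# A cycle no longer results in an endless lookup
loop = ManagedDataspace()
loop.attach(0, 0x1000, loop)
try:
    resolve(loop, 0)
except NestingTooDeep:
    print("nesting limit reached")
```

The limit trades generality for simplicity: no cycle detection is needed
because every lookup is bounded regardless of how dataspaces are attached.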
Streamlined server framework
============================
We removed the 'add_activation()' functionality from the server and pager
libraries because on all platforms server activations and entry points have
a one-to-one relationship. This API was originally intended to support
platforms that are able to trigger one of many worker threads via a single
entry point. This was envisioned by an early design of NOVA. However, no
kernel (including NOVA) supports such a feature as of today.
Furthermore, we added a dedicated 'Pager_capability' type. On most
platforms, a pager is simply a thread. So using a 'Thread_capability' as type
for the 'Pager_capability' was sufficient. On NOVA, however, a pager is not
necessarily a thread. So we need to reflect this difference in the types.
PD session interface
====================
To support capability kernels with support for local names, it is not
sufficient to provide the parent capability to a new child by passing a plain
data argument to the new child during ELF loading anymore. We also need to tell
the kernel about the delegated right of the child to talk to its parent. This is
achieved using the new 'assign_parent' function of the PD session interface.
This function allows the creator of a new process to register the parent
capability.
Singleton services
==================
There are services, in particular device drivers, that support only one session
at a time. This characteristic was not easy to express in the framework.
Consequently, such services tended to handle the case of a second session
request inconsistently. We have now enhanced the 'Root_component' template with
a policy parameter that allows the specification of a session-creation policy.
The most important policy is whether a service can have a single or multiple
clients.
[http://genode.org/documentation/api/inline?code/base/include/root/component.h - See the improved template...]
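The idea of a session-creation policy can be modelled independently of the
C++ template. The following Python sketch is a simplified analogy with
hypothetical class and method names. For the actual interface, please refer to
the header linked above.

```python
class Unavailable(Exception):
    """Signals that the service cannot hand out another session."""


class MultipleClients:
    """Policy that permits any number of concurrent sessions."""

    def acquire(self):
        pass

    def release(self):
        pass


class SingleClient:
    """Policy that permits only one session at a time."""

    def __init__(self):
        self._used = False

    def acquire(self):
        if self._used:
            raise Unavailable()
        self._used = True

    def release(self):
        self._used = False


class RootComponent:
    """Session factory parameterized with a session-creation policy."""

    def __init__(self, policy):
        self._policy = policy
        self._sessions = []

    def session(self, args):
        self._policy.acquire()       # may refuse the request
        self._sessions.append(args)
        return args

    def close(self, session):
        self._sessions.remove(session)
        self._policy.release()


driver_root = RootComponent(SingleClient())
first = driver_root.session("label=first")
try:
    driver_root.session("label=second")
except Unavailable:
    print("second session refused")
driver_root.close(first)
driver_root.session("label=third")   # succeeds after the first was closed
```

With the policy factored out, every single-session service handles a second
request in the same way instead of each driver inventing its own behaviour.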
Out-of-order RPC replies
========================
In the previous release, we introduced a transitional API for supporting
out-of-order RPC replies. This API is currently used by the timer and
signal services but is declared deprecated. The original implementation
used a blocking send operation to deliver replies, which is not desired
and can cause infinite blocking times in the presence of misbehaving clients.
Therefore, we changed the implementation to send explicit replies without
blocking (zero send timeout). Thanks to Frank Kaiser for pointing out this issue.
Operating-system services and libraries
#######################################
Python scripting
================
We have ported a minimal Python 2.6.4 interpreter to Genode. The port is
provided with the 'libports' repository. It is based on the official
Python code available from the website:
:Python website:
[http://www.python.org]
To fetch the upstream Python source code, call 'make prepare' from within the
'libports' directory. To include Python in your build process, add 'libports'
to your 'build.conf' file.
A test program for the script interpreter is provided at
'libports/src/test/python'. When building this test program, a shared library
'python.lib.so' will be generated. A sample Genode configuration
('config_sample') file that starts a Python script can be found within this
directory. If you are not using Linux as a Genode base platform, do not forget
to add 'python.lib.so' to your boot module list.
We regard this initial port as the first step to make a complete Python
runtime. At the current stage, there is support for 'Rom_session' Python
scripts to serve basic scripting needs, currently geared towards automated
testing. Modules and standard modules are not yet supported.
Plugin-interface for the C library
==================================
The recent addition of the lwIP stack to Genode stimulated our need to make the
C runtime extensible by providing multiple back ends, lwIP being one of them.
Therefore, we introduced a libc-internal plugin interface, which is able to
dispatch libc calls to one of potentially many plugins. The plugin interface
covers the most used file operations and a few selected networking functions.
By default, if no plugin is used, those functions point to dummy
implementations. If, however, a plugin is linked against a libc-using program,
calls to 'open' or 'socket' are directed to the registered plugins, resulting
in plugin-specific file handles. File operations on such a file handle are then
dispatched by the corresponding plugin.
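The dispatch scheme can be pictured with the following Python model. It is a
conceptual sketch only: the class names, the 'supports_socket' query, and the
numbering of file descriptors are assumptions and do not mirror the
libc-internal interface. Only the 'socket' path is shown; 'open' would work
along the same lines.

```python
class Plugin:
    """Base class of a back end; by default it supports nothing."""

    def supports_socket(self):
        return False


class SocketPlugin(Plugin):
    """Stand-in for a networking back end such as the lwIP plugin."""

    def supports_socket(self):
        return True

    def socket(self):
        return "socket state"

    def write(self, context, data):
        return len(data)          # pretend that all bytes were sent


class Libc:
    def __init__(self):
        self._plugins = []
        self._handles = {}        # file descriptor -> (plugin, context)
        self._next_fd = 3         # 0..2 are reserved for stdio

    def register(self, plugin):
        self._plugins.append(plugin)

    def socket(self):
        for plugin in self._plugins:
            if plugin.supports_socket():
                fd, self._next_fd = self._next_fd, self._next_fd + 1
                # Remember which plugin owns the new file handle
                self._handles[fd] = (plugin, plugin.socket())
                return fd
        return -1                 # dummy implementation: no back end

    def write(self, fd, data):
        if fd not in self._handles:
            return -1
        plugin, context = self._handles[fd]
        return plugin.write(context, data)


libc = Libc()
print(libc.socket())              # no plugin registered yet
libc.register(SocketPlugin())
fd = libc.socket()
print(fd, libc.write(fd, b"hello"))
```

Because each file handle remembers its owning plugin, several back ends can
coexist in one program without the application code being aware of them.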
The first functional plugin is the support for lwIP. This makes it possible to
compile BSD-socket based network code to use lwIP on Genode. Just add the
following declaration in your 'target.mk':
! LIBS += libc libc_lwip lwip
The 'libc' library is the generic C runtime, 'lwip' is the raw lwIP stack, and
'libc_lwip' is the lwip plugin for the C runtime - the glue between 'lwip' and
'libc'. The initialization of lwip is not yet part of the 'lwip' plugin.
:Limitations:
We expand the libc-plugin interface on a case-by-case basis. Please refer to
'libc/include/libc-plugin/plugin.h' to obtain the list of currently supported
functions. Please note that 'select' is not yet supported.
ARM architecture support for the C library
==========================================
We enhanced our port of the FreeBSD libc with support for the ARM
architecture. In the ARM version, the following files are excluded:
:libm: 'e_acosl.c', 'e_asinl.c', 'e_atan2l.c', 'e_hypotl.c', 's_atanl.c',
's_cosl.c', 's_frexpl.c', 's_nextafterl.c', 's_nexttoward.c',
's_rintl.c', 's_scalbnl.c', 's_sinl.c', 's_tanl.c', 's_fmal.c'
:libc-gen: 'setjmp.S'
Atomic operations on ARM are not supported. Although these operations are
defined in 'machine/atomic.h', their original FreeBSD implementations are
not functional because we do not emulate the required FreeBSD environment
(see: 'sysarch.h'). However, these functions are not a regular part of
the libc anyway and are not referenced from any other libc code.