Releases: open-power/skiboot
skiboot-5.5.0-rc2
skiboot-5.5.0-rc2 was released on Monday April 3rd 2017. It is the second
release candidate of skiboot 5.5, which will become the new stable release
of skiboot following the 5.4 release, first released November 11th 2016.
skiboot-5.5.0-rc2 contains all bug fixes as of :ref:`skiboot-5.4.3`
and :ref:`skiboot-5.1.19` (the currently maintained stable releases).
For how the skiboot stable releases work, see :ref:`stable-rules` for details.
The current plan is to cut the final 5.5.0 by April 8th, with skiboot 5.5.0
intended for all POWER8 and POWER9 platforms in op-build v1.16 (due April 12th).
This is a short cycle, as this release is mainly targeted at POWER9
bringup efforts.
Following skiboot-5.5.0, we will move to a regular six week release cycle,
similar to op-build, but slightly offset to allow for a short stabilisation
period. Expected release dates and contents are tracked using GitHub milestones
and issues: https://github.com/open-power/skiboot/milestones
Over :ref:`skiboot-5.5.0-rc1`, we have the following changes:
NVLINK2
-
Introduce NPU2 support
NVLink2 is a new feature introduced on POWER9 systems. It is an
evolution of the NVLink1 feature included in POWER8+ systems but
adds several new features including support for GPU address
translation using the Nest MMU and cache coherence.
Similar to NVLink1, the functionality is exposed to the OS as a series
of virtual PCIe devices. However, the actual hardware interfaces are
significantly different, which limits the amount of common code that
can be shared between implementations in the firmware.
This patch adds basic hardware initialisation and exposure of the
virtual NVLink2 PCIe devices to the running OS.
-
npu2: Add OPAL calls for nvlink2 address translation services (see :ref:`OPAL_NPU2`)
Adds three OPAL calls for interacting with NPU2 devices:
:ref:`OPAL_NPU_INIT_CONTEXT`, :ref:`OPAL_NPU_DESTROY_CONTEXT` and
:ref:`OPAL_NPU_MAP_LPAR`.
These are used to set up and configure address translation services
(ATS) for a process/partition on a given NVLink2 device.
POWER9
-
hdata/memory: ignore homer and occ reserved ranges
We populate these from the HOMER BARs in the PBA directly. There's no
need to take the hostboot supplied values so just ignore the
corresponding reserved ranges.
-
hdata/vpd: Parse the OpenPOWER OPFR record
Parse the OpenPOWER FRU VPD (OPFR) record on OpenPOWER instead
of the VINI records.
-
hdata/vpd: Parse additional VINI records
These records provide hardware version details, CCIN extension information,
card type details and hardware characteristics of the FRU.
-
hdata/cpu: account for p9 shared caches
On P9 the L2 and L3 caches are shared between pairs of SMT=4 cores.
Currently this is not accounted for when creating cache nodes in
the device tree. This patch adds additional checking so that a
cache node is only created for the first core in the pair and
the second core will reference the cache correctly.
-
hdata: print backtraces on HDAT errors
-
hdat: ignore zero length reserves
Hostboot can export reserved regions with a length of zero and these
should be ignored rather than being turned into a reserved range. While
we're here, fix a memory leak by moving the "too large" region check
to before we allocate space for the label.
-
SLW: Add init for power9 power management
This patch adds a new function to init cores for power9 power management.
SPECIAL_WKUP_* SCOM registers, if set, can prevent the cores from entering
idle states. Hence, clear the PPM_SPECIAL_WKUP_HYP_REG SCOM register for each
core during init. (This init is not required for MAMBO.)
PCI
-
hw/phb3: Adjust ECRC on root port dynamically
The Samsung NVMe adapter is lost when it's connected to a PMC 8546 PCIe
switch, until ECRC is disabled on the root port. We found a similar issue
previously when a Broadcom adapter was connected to the same PCIe switch
part, and it was fixed by commit 60ce59c ("hw/phb3: Disable ECRC on Broadcom
adapter behind PMC switch"). Unfortunately, that commit doesn't fix
the lost Samsung NVMe adapter issue.
This fixes the issue by disabling ECRC generation/check on the root port
when PMC 8546 PCIe switch ports are found. This can be extended for
other PCIe switches or endpoints in future: each PHB maintains a
count of the PCI devices (PMC 8546 PCIe switch ports currently) which
require ECRC to be disabled on the root port. ECRC is
disabled when the first PMC 8546 switch port is probed and re-enabled when
the last PMC 8546 switch port is destroyed (in the PCI hot remove scenario).
Except for the PHB's reinitialization after a complete reset, the ECRC
setting on the root port is otherwise untouched.
-
core/pci: Fix lost NVMe adapter behind PMC 8546 switch
The NVMe adapter in the PCI topology below is lost. The root cause is
that the presence bit on its PCI slot is not set, even though the PCIe link
is up. The PCI core doesn't probe the adapter behind the slot,
so the NVMe adapter is lost in this particular case.
- PHB3 root port
- PLX switch 8748 (10b5:8748)
- PLX switch 9733 (10b5:9733)
- PMC 8546 switch (11f8:8546)
- NVMe adapter (1c58:0023)
This fixes the issue by overriding the PCI slot presence bit with the
PCIe link state bit.
-
hw/phb4: Locate AER capability position if necessary
-
core/pci: Disable surprise hotplug on root port
-
core/pci: Ignore PCI slot capability on root port
We are creating a PCI slot on the root port even where the PCI slot isn't
supported by the hardware. In this case, we shouldn't read the PCI
slot capability from hardware. When bogus data is returned from the
hardware, we will attempt to control the PCI slot's power state or enable
surprise hotplug functionality, neither of which can be accomplished
without hardware support.
This leaves the PCI slot's capability list as 0 if PCICAP_EXP_CAP_SLOT
isn't set in hardware (pcie_cap + 0x2). Otherwise, the PCI slot's
capability list is retrieved from hardware (pcie_cap + 0x14).
-
phb4: Default to PCIe GEN2 on DD1
Default to PCIe GEN2 link speeds on DD1 for stability.
Can be overridden using nvram pcie-max-link-speed=4 parameter.
-
phb3/4: Set max link speed via nvram
This adds an nvram parameter pcie-max-link-speed to configure the max
speed of the PCIe link. This can be set from the petitboot prompt
using: ::
nvram -p ibm,skiboot --update-config pcie-max-link-speed=4
This takes precedence over anything set in the device tree and is
global to all PHBs.
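The slot capability rule from the "core/pci: Ignore PCI slot capability on root port" entry above can be sketched as follows. This is a small Python model rather than skiboot code: the config space is a plain dict, `slot_capability` and the two offset names are hypothetical, and the value 0x0100 for PCICAP_EXP_CAP_SLOT is an assumption based on the "Slot Implemented" bit of the PCIe capabilities register. Only the offsets 0x2 and 0x14 come from the entry itself.

```python
# Offsets relative to the PCIe capability, as given in the release note.
CAP_REG_OFFSET = 0x02        # PCIe capabilities register (hypothetical name)
SLOT_CAP_OFFSET = 0x14       # slot capabilities register (hypothetical name)
PCICAP_EXP_CAP_SLOT = 0x0100  # ASSUMPTION: "slot implemented" bit value


def slot_capability(cfg, pcie_cap):
    """Return the slot capability, or 0 when no slot is implemented.

    cfg is a hypothetical dict mapping config space offsets to values;
    pcie_cap is the offset of the PCIe capability in that space.
    """
    caps = cfg.get(pcie_cap + CAP_REG_OFFSET, 0)
    if not (caps & PCICAP_EXP_CAP_SLOT):
        # Hardware doesn't implement a slot: never trust the slot register.
        return 0
    return cfg.get(pcie_cap + SLOT_CAP_OFFSET, 0)
```

With this shape, bogus data in the slot register is never read unless the hardware advertises a slot, so later code has no reason to touch slot power or surprise hotplug.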
Tests
-
Mambo/Qemu boot tests: expect (and fail) on checkstop
This allows us to fail a lot faster if we checkstop.
skiboot-5.5.0-rc1
skiboot-5.5.0-rc1 was released on Tuesday March 28th 2017. It is the first
release candidate of skiboot 5.5, which will become the new stable release
of skiboot following the 5.4 release, first released November 11th 2016.
skiboot-5.5.0-rc1 contains all bug fixes as of :ref:`skiboot-5.4.3`
and :ref:`skiboot-5.1.19` (the currently maintained stable releases).
For how the skiboot stable releases work, see :ref:`stable-rules` for details.
The current plan is to cut the final 5.5.0 by April 8th, with skiboot 5.5.0
intended for all POWER8 and POWER9 platforms in op-build v1.16 (due April 12th).
This is a short cycle, as this release is mainly targeted at POWER9
bringup efforts.
Following skiboot-5.5.0, we will move to a regular six week release cycle,
similar to op-build, but slightly offset to allow for a short stabilisation
period. Expected release dates and contents are tracked using GitHub milestones
and issues: https://github.com/open-power/skiboot/milestones
Over skiboot-5.4, we have the following changes:
New Platforms
-
SuperMicro's (SMC) P8DNU: An astbmc based POWER8 platform
-
Add a generic platform to help with bringup of new systems.
-
Four POWER9 based systems (NOTE: All POWER9 systems should be considered
for bringup use only at this point):
- Romulus
- Witherspoon (a POWER9 system with NVLink2 attached GPUs)
- Zaius (OpenCompute platform, also known as "Barreleye 2")
- ZZ (FSP based system)
New features
-
System reset IPI facility and Mambo implementation
Add an OPAL call :ref:`OPAL_SIGNAL_SYSTEM_RESET` which allows system reset
exceptions to be raised on other CPUs and act as an NMI IPI. There
is an initial simple Mambo implementation, but allowances are made
for a more complex hardware implementation.
The Mambo implementation is based on the RFC implementation for POWER8
hardware (see https://patchwork.ozlabs.org/patch/694794/) which we hope
makes it into a future release.
This implements an in-band NMI equivalent.
-
Add CONTRIBUTING.md, ensuring that people new to the project have a one-stop
place to find out how to get started.
-
interrupts: Add optional name for OPAL interrupts
This adds the infrastructure for an interrupt source to provide
a name for an interrupt directed toward OPAL. Those names will
be put into an "opal-interrupts-names" property which is a
standard DT string list corresponding 1:1 with the "opal-interrupts"
property. PSI interrupts get names, and this is visible in Linux
through /proc/interrupts.
-
platform: add OPAL_REBOOT_FULL_IPL reboot type
There may be circumstances in which a user wants to force a full IPL reboot
rather than using fast reboot. Add a new reboot type, OPAL_REBOOT_FULL_IPL,
that disables fast reboot. On platforms which don't support fast reboot,
this will be equivalent to a normal reboot.
-
phb3: Trick to allow control of the PCIe link width and speed
This implements a hook inside OPAL that catches 16 and 32 bit writes
to the link status register of the PHB.
It allows you to write a new speed or a new width, and OPAL will then
cause the PHB to renegotiate.
Example:
First read the link status on PHB4: ::
setpci -s 0004:00:00.0 0x5a.w
a103
It's at x16 Gen3 speed (8GT/s).
Bits 0x0ff0 are the width and 0x000f the speed. The width can be
1 to 16 and the speed 1 to 3 (2.5, 5 and 8GT/s).
Then try to bring it down to x1 Gen1: ::
setpci -s 0004:00:00.0 0x5a.w=0xa011
Observe the result in the PHB: ::
/ # lspci -s 0004:00:00.0 -vv
0004:00:00.0 PCI bridge: IBM Device 03dc (prog-if 00 [Normal decode])
.../...
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk- DLActive+ BWMgmt- ABWMgmt+
And in the device: ::
/ # lspci -s 0004:01:00.0 -vv
.../...
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
-
core/init: Add hdat-map property to OPAL node.
Exports the HDAT heap to the OS. This allows the OS to view the HDAT area
directly, without having to use getmemproc.
-
Add a generic platform: If /bmc in device tree, attempt to init one
For the most part, this gets us somewhere on some OpenPOWER systems
before there's a platform file for that machine.
Useful in bringup only, and marked as such with scary looking log
messages.
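The link status encoding described in the "phb3: Trick to allow control of the PCIe link width and speed" entry above can be sketched as follows. This is an illustrative Python helper, not part of skiboot: the function names are hypothetical, and only the two masks, the value ranges and the example values a103 and 0xa011 come from the entry.

```python
WIDTH_MASK = 0x0ff0  # link width field, per the release note
SPEED_MASK = 0x000f  # link speed field, per the release note
SPEED_GTS = {1: 2.5, 2: 5.0, 3: 8.0}  # speed code -> GT/s


def decode_link_status(value):
    """Split a 16-bit link status value into (width, speed code)."""
    width = (value & WIDTH_MASK) >> 4
    speed = value & SPEED_MASK
    return width, speed


def encode_link_status(current, width, speed):
    """Return `current` with new width/speed fields, other bits kept."""
    if not 1 <= width <= 16:
        raise ValueError("width must be 1 to 16")
    if speed not in SPEED_GTS:
        raise ValueError("speed must be 1 to 3")
    keep = current & 0xffff & ~(WIDTH_MASK | SPEED_MASK)
    return keep | (width << 4) | speed


width, speed = decode_link_status(0xa103)
print(f"x{width} at {SPEED_GTS[speed]}GT/s")
print(hex(encode_link_status(0xa103, 1, 1)))
```

Decoding a103 gives width 16 and speed code 3, and re-encoding it for x1 Gen1 gives 0xa011, the value written by the setpci example.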
Core
-
asm: Don't try to set LPCR:LPES1 on P8 and P9, the bit doesn't exist.
-
pci: Add a framework for quirks
In future we may want to be able to do fixups for specific PCI devices in
skiboot, so add a small framework for doing this.
This is not intended for the same purposes as quirks in the Linux kernel,
as the PCI devices that quirks can match for in skiboot are not properly
configured. This is intended to enable having a custom path to make
changes that don't directly interact with the PCI device, for example
adding device tree entries.
-
hw/slw: fix possible NULL dereference
-
slw: Print enabled stop states on boot
-
uart: Fix Linux pass-through policy, provide NVRAM override option
-
libc/stdio/vsnprintf.c: add explicit fallthrough to silence a recent
(GCC 7.x) warning.
-
init: print the FDT blob size in decimal
-
init: Print some more info before booting linux
The kernel command line from nvram and the stdout-path are
useful to know when debugging console related problems.
-
Makefile: Disable stack protector due to gcc problems
Depending on how it was built, gcc will use the canary from a global
(works for us) or from the TLS (doesn't work for us and accesses
random stuff instead).
Fixing that would be tricky. There are talks of adding a gcc option
to force use of globals, but in the meantime, disable the stack
protector.
-
Stop using 3-operand cmp[l][i] for latest binutils
Since a5721ba270, binutils does not support 3-operand cmp[l][i].
This adds the (previously optional) parameter L.
-
buddy: Add a simple generic buddy allocator
-
stack: Don't recurse into __stack_chk_fail
-
Makefile: Use -ffixed-r13
We use r13 for our own stuff, make sure it's properly fixed.
-
Always set ibm,occ-functional-state correctly
-
psi: fix the xive registers initialization on P8, which seems to be fine
for real HW but causes a lot of pain under qemu
-
slw: Set PSSCR value for idle states
-
Limit number of "Poller recursion detected" errors to display
In some error conditions, we could spiral out of control on this
and spend all of our time printing the exact same backtrace.
Limit it to 16 times, because 16 is a nice number.
-
slw: do SLW timer testing while holding xscom lock
We add some routines that let a caller get the xscom lock once and
then do a bunch of xscoms while holding it.
In some situations without this, it could take long enough to get
the xscom lock that the 1ms timeout would expire and we'd falsely
think the SLW timer didn't work when in fact it did.
-
wait_for_resource_loaded: don't needlessly sleep for 5ms
-
run pollers in cpu_process_local_jobs() if running job synchronously
-
fsp: Don't recurse pollers in ibm_fsp_terminate
-
chiptod: More hardening against -1 chip ID
-
interrupts: Rewrite/correct doc for opal_set/get_xive
-
cpu: Don't enable nap mode/PM mode on non-P8
-
platform: Call generic platform probe and init UART there
-
psi: Don't register more interrupts than the HW supports
-
psi: Add DT option to disable LPC interrupts
I2C and TPM
-
p8i2c: Use calculated poll_interval when booting OPAL
Otherwise we'd default to 2 seconds (TIMER_POLL) during boot on
chips with a functional i2c interrupt, leading to slow i2c
during boot (or hitting timeouts instead).
-
i2c: Add i2c_run_req() to crank the state machine for a request
-
tpm_i2c_nuvoton: work out the polling time using mftb()
-
tpm_i2c_nuvoton: handle errors after reading the tpm fifo
-
tpm_i2c_nuvoton: cleanup variables in tpm_read_fifo()
-
tpm_i2c_nuvoton: handle errors after writing the tpm fifo
-
tpm_i2c_nuvoton: cleanup variables in tpm_write_fifo()
-
tpm_i2c_nuvoton: handle errors after writing sts.commandReady in step 5
-
tpm_i2c_nuvoton: handle errors after writing sts.go
-
tpm_i2c_nuvoton: handle errors after checking the tpm fifo status
-
tpm_i2c_nuvoton: return burst_count in tpm_read_burst_count()
-
tpm_i2c_nuvoton: isolate the code that handles the TPM_TIMEOUT_D timeout
-
tpm_i2c_nuvoton: handle errors after reading sts.commandReady
-
tpm_i2c_nuvoton: add tpm_status_read_byte()
-
tpm_i2c_nuvoton: add tpm_check_status()
-
tpm_i2c_nuvoton: rename defines to shorter names
-
tpm_i2c_interface: decouple rc from being done with i2c request
-
tpm_i2c_interface: set timeout before each request
-
i2c: Add nuvoton quirk, disallowing i2cdetect as it locks TPM
-
p8-i2c: reset things manually in some error conditions
-
stb: create-container and wrap skiboot in Secure/Trusted Boot container
We produce UNSIGNED skiboot.lid.stb and skiboot.lid.xz.stb as build
artifacts.
These are suitable blobs for flashing onto Trusted Boot enabled op-build
builds WITH the secure boot jumpers ON (i.e. NOT in secure mode).
It's just enough of the Secure and Trusted Boot container format to
make Hostboot behave.
PCI
-
core/pci: Support SRIOV VFs
Currently, skiboot can't see SRIOV VFs, which causes some trouble:
the device initialization logic (phb->ops->device_init())
isn't applied to VFs, meaning we have to maintain a duplicate
mechanism in the kernel for VFs only. This makes the code harder to
maintain and prone to falling out of synchronization.
This was motivated by a bug reported by Carol: The VF's Max Payload
Size (MPS) isn't matched with PF's...
skiboot-5.4.3
skiboot-5.4.3 was released on Monday January 16th, 2017. It replaces
:ref:`skiboot-5.4.2` as the current stable release.
Over :ref:`skiboot-5.4.2`, we have a small number of bug fixes:
- Makefile: Disable stack protector due to gcc problems
- Makefile: Use -ffixed-r13.
We use r13 for our own stuff, make sure it's properly fixed.
- phb3: Lock the PHB on set_xive callbacks
- arch_flash_arm: Don't assume mtd labels are short
- Stop using 3-operand cmp[l][i] for latest binutils
- hw/phb3: fix error handling in complete reset
skiboot-5.1.19
skiboot-5.1.19 was released on Monday 16th January 2017.
skiboot-5.1.19 is the 20th stable release of 5.1. It follows skiboot-5.1.18
(which was released 26th August 2016).
This release contains a few minor bug fixes.
Changes are:
Generic:
- Makefile: Disable stack protector due to gcc problems
- stack: Don't recurse into __stack_chk_fail
- Makefile: Use -ffixed-r13
We did not find evidence of this ever being a problem, but this fix
is good and preventative.
- Limit number of "Poller recursion detected" errors to display
In some error conditions, we could spiral out of control on this
and spend all of our time printing the exact same backtrace.
Limit it to 16 times, because 16 is a nice number.
FSP based Systems:
- fsp: Don't recurse pollers in ibm_fsp_terminate
If we were to terminate in a poller, we'd call op_display() which
called pollers which hit the recursive poller warning, which ended
in not much fun at all.
PCI:
- hw/phb3: set PHB retry state correctly when fresetting during a creset
- phb3: Lock the PHB on set_xive callbacks
Those are called by the interrupts core and thus skip the locking
implicit in the PCI opal calls.
- hw/{phb3, p7ioc}: Return success for freset on empty PHB
OPAL_CLOSED is returned when a fundamental reset is issued on a
PHB that doesn't have subordinate devices (root port excluded).
The kernel raises an error message, which is unnecessary. This
returns OPAL_SUCCESS for this case to avoid the error message.
- hw/phb3: fix error handling in complete reset
During a complete reset, when we get a timeout waiting for pending
transactions in state PHB3_STATE_CRESET_WAIT_CQ, we mark the PHB as broken
and return OPAL_PARAMETER.
Change the return code to OPAL_HARDWARE, which is far more sensible, and set
the state to PHB3_STATE_FENCED so that the kernel can retry the complete
reset.
skiboot-5.4.2
skiboot-5.4.2 was released on Friday December 2nd 2016. It replaces
:ref:`skiboot-5.4.1` as the current stable release.
Over :ref:`skiboot-5.4.1`, we have two bug fixes exclusively aimed at machines
with TPMs:
- i2c: Add nuvoton TPM quirk, disallowing i2cdetect as it can hard lock the TPM
- p8-i2c: improve I2C reset code path, solves getting stuck resetting i2c engine
skiboot 5.4.1
skiboot-5.4.1 was released on Tuesday November 29th 2016. It replaces
skiboot-5.4.0 as the current stable release.
Over skiboot-5.4.0, we have a few changes:
- Nuvoton i2c TPM driver: bug fixes and improvements, especially around
timeouts and error handling.
- Limit number of "Poller recursion detected" errors to display.
In some error conditions, we could spiral out of control on this
and spend all of our time printing the exact same backtrace.
- slw: do SLW timer testing while holding xscom lock.
In some situations without this, it could take long enough to get
the xscom lock that the 1ms timeout would expire and we'd falsely
think the SLW timer didn't work when in fact it did.
- p8i2c: Use calculated poll_interval when booting OPAL.
Otherwise we'd default to 2 seconds (TIMER_POLL) during boot on
chips with a functional i2c interrupt, leading to slow i2c
during boot (or hitting timeouts instead).
- i2c: More efficiently run TPM I2C operations during boot, avoiding hitting
timeouts
- fsp: Don't recurse pollers in ibm_fsp_terminate
skiboot 5.4.0
skiboot-5.4.0 was released on Friday November 11th 2016. It is the new stable
skiboot release, taking over from the 5.3.x series (first released August 2nd,
2016). It comes after four release candidates, which have helped to shake out
a few issues.
skiboot-5.4.0 contains all bug fixes as of :ref:`skiboot-5.3.7`
and :ref:`skiboot-5.1.18` (the currently maintained stable releases).
Skiboot 5.4.x becomes the new stable release. For how the skiboot stable
releases work, see :ref:`stable-rules` for details.
Over :ref:`skiboot-5.4.0-rc4`, we have a few changes:
-
libstb: bump up the byte timeout for tpm i2c requests
This bumps up the byte timeout for tpm i2c requests from 10ms to 30ms.
Some p8dtu systems are getting i2c request timeouts.
-
external/pflash: Perform the correct cleanup when -F is used to operate on
a file.
-
Add SuperMicro p8dtu1u and p8dtu2u platforms
-
Revert "core/ipmi: Set interrupt-parent property".
This reverts commit d997e48 (introduced
in 5.4.0-rc1).
A problem was found with pre-4.2 Linux kernels where a spurious WARNING
would be emitted. This change doesn't matter enough to scare users
so we can just revert it.
Warning was: ::
[ 0.947741] irq: irq-62==>hwirq-0x3e mapping failed: -22
[ 0.947793] ------------[ cut here ]------------
[ 0.947838] WARNING: at kernel/irq/irqdomain.c:485
-
libflash/libffs: Fix possible NULL dereference
Previous Release Candidates
There were four release candidates for skiboot 5.4.0:
- :ref:`skiboot-5.4.0-rc4`
- :ref:`skiboot-5.4.0-rc3`
- :ref:`skiboot-5.4.0-rc2`
- :ref:`skiboot-5.4.0-rc1`
Changes since skiboot 5.3
Over skiboot-5.3, we have the following changes:
New Features
-
Add SuperMicro p8dtu1u and p8dtu2u platforms
-
Initial Trusted Boot support (see :ref:`stb-overview`).
There are several limitations with this initial release:
- Only Nuvoton TPM 2.0 is supported
- Requires hardware rework on late revision Habanero or Firestone boards
in order to install TPM.
- Add i2c Nuvoton TPM 2.0 Driver
- romcode driver for POWER8 secure ROM
- See Device tree docs: :ref:`device-tree/tpm` and :ref:`device-tree/ibm,secureboot`
- See :ref:`stb-overview`
-
Support ibm,skiboot NVRAM partition with skiboot configuration options.
- These should generally only be used if you either completely know what
you are doing or need to work around a skiboot bug. They are not
intended for end users and are explicitly NOT ABI.
- Add support for supplying the kernel boot arguments from the
bootargs configuration string in the ibm,skiboot NVRAM partition.
- Enabling the experimental fast reset feature is done via this method.
-
Add support for nap mode on P8 while in skiboot
- While nap has been exposed to the Operating System since day 1, we have
not utilized low power states when in skiboot itself, leading to higher
power consumption during boot.
We only enable the functionality after the 0x100 vector has been
patched, and we disable it before transferring control to Linux.
-
libflash: add 128MB MX66L1G45G part
-
Pointer validation of OPAL API call arguments.
- If the kernel called an OPAL API with a vmalloc'd address
or any other address range in real mode, we would hit
a problem with aliasing. Since the top 4 bits are ignored
in real mode, pointers from 0xc.. and 0xd.. (and other ranges)
could collide and lead to hard to solve bugs. This patch
adds the infrastructure for pointer validation and a simple
test case for testing the API.
- The checks validate pointers sent in using opal_addr_valid()
-
Fast reboot for P8
This makes reboot take an awful lot less time, somewhere between four
and ten times faster than a full IPL. It is currently experimental and not
enabled by default.
You can enable the experimental support via nvram option: ::
nvram -p ibm,skiboot --update-config experimental-fast-reset=feeling-lucky
WARNING: While we think we've managed to work out or around most of
the kinks with fast-reset, we are not enabling it by default in 5.4.
Notably, fast reset will not happen in the following scenarios:
-
platform error
Most of the time, if we're rebooting due to a platform error, we should
trigger a checkstop. However, if we haven't been told what we should do
to trigger a checkstop (e.g. on an FSP machine), then we should still
fail to fast-reboot.
So, fast-reboot is disabled in the OPAL_CEC_REBOOT2 code path
for the OPAL_REBOOT_PLATFORM_ERROR reboot type.
-
FSP code update
-
Unrecoverable HMI
-
A PHB is in CAPI mode
If a PHB is in CAPI mode, we cannot safely fast reboot - the PHB will be
fenced during the reboot resulting in major problems when we load the new
kernel.
In order to handle this safely, we need to disable CAPI mode before
resetting PHBs during the fast reboot. However, we don't currently support
this.
In the meantime, when fast rebooting, check if there are any PHBs with a
CAPP attached, and if so, abort the fast reboot and revert to a normal
reboot instead.
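The aliasing problem behind the "Pointer validation of OPAL API call arguments" entry above can be illustrated with a short sketch. This is a Python model of the address arithmetic only, under the stated assumption that real mode ignores the top 4 bits of a 64-bit address. The helper names are hypothetical, and this is not the logic of opal_addr_valid(), which the entry does not describe.

```python
# Real mode ignores the top 4 bits of a 64-bit address (per the entry),
# so only the low 60 bits select the memory that is actually accessed.
REAL_MODE_MASK = (1 << 60) - 1


def real_mode_address(addr):
    """Return the address as seen in real mode (hypothetical helper)."""
    return addr & REAL_MODE_MASK


def aliases(a, b):
    """True when two distinct pointers hit the same real mode address."""
    return a != b and real_mode_address(a) == real_mode_address(b)


linear = 0xc000000000001000   # a 0xc.. pointer
vmalloc = 0xd000000000001000  # a 0xd.. pointer
print(hex(real_mode_address(linear)))
print(aliases(linear, vmalloc))
```

Both example pointers reduce to the same real mode address, which is the kind of collision the entry describes.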
Documentation
There have been a number of documentation fixes this release. Most prominent
is the switch to Sphinx (from the Python project) and ReStructured Text (RST)
as the documentation format. RST and Sphinx enable both production of pretty
documentation in HTML and PDF formats while remaining readable in their raw
form to those with no knowledge of RST.
You can build a HTML site by doing the following: ::
cd doc/
make html
As always, documentation patches are very, very welcome as we attempt to
document the OPAL API, the device tree bindings and important parts of
OPAL internals.
We would like the Device Tree documentation to follow the style that can be
included in the Device Tree Specification.
General
-
Make console-log time more readable: seconds rather than timebase
Log format is now [SECONDS.(tb%512000000),LEVEL]
-
Flash (PNOR) code improvements
- flash: Make size 64 bit safe
This makes the size of flash 64 bit safe so that we can have flash
devices greater than 4GB. This is especially useful for mambo disks
passed through to Linux.
- core/flash.c: load actual partition size
We are downloading 0x20000 bytes from PNOR for CAPP, but currently the
CAPP lid is only 40K.
- flash: Rework error paths and messages for multiple flash controllers
Now that we have mambo bogusdisk flash, we can have many flash chips.
This is resulting in some confusing output messages.
-
core/init: Fix "failure of getting node in the free list" warning on boot.
-
slw: improve error message for SLW timer stuck
-
Centaur / XSCOM error handling
- print message on disabling xscoms to centaur due to many errors
- Mark centaur offline after 10 consecutive access errors
-
XSCOM improvements
- xscom: Map all HMER status codes to OPAL errors
- xscom: Initialize the data to a known value in xscom_read
In case of error, don't leave the data random. It helps debugging when
the user fails to check the error code. This happens due to a bug in the
PRD wrapper app.
- chip: Add a quirk for when core direct control XSCOMs are missing
-
p8-i2c: Don't crash if a centaur errored out
-
cpu: Make endian switch message more informative
-
cpu: Display number of started CPUs during boot
-
core/init: ensure that HRMOR is zero at boot
-
asm: Fix backtrace for unexpected exception
-
cpu: Remove pollers calling heuristics from cpu_wait_job
This will be handled by time_wait_ms(). Also remove a useless
smt_medium().
Note that this introduces a difference in behaviour: time_wait
will only call the pollers on the boot CPU while cpu_wait_job()
could call them on any. However, I can't think of a case where
this is a problem.
-
cpu: Remove global job queue
Instead, target a specific CPU for a global job at queuing time.
This will allow us to wake up the target using an interrupt when
implementing nap mode.
The algorithm used is to look for idle primary threads first, then
idle secondaries, and finally the least loaded thread. If nothing can
be found, we fall back to a synchronous call.
-
lpc: Log LPC SYNC errors as unrecoverable ones for manufacturing
-
lpc: Optimize SerIRQ dispatch based on which PSI IRQ fired
-
interrupts: Add new source ->attributes() callback
This allows a given source to provide per-interrupt attributes
such as whether it targets OPAL or Linux and its estimated
frequency.
The former allows us to get rid of the double set of ops used to
decide which interrupts go where on some modules like the PHBs
and the latter will eventually be used to implement smart
caching of the source lookups.
-
opal/hmi: Fix a TOD HMI failure during a race condition.
-
platform: Add BT to Generic platform
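The target selection order from the "cpu: Remove global job queue" entry above can be sketched as follows. This is a minimal Python model, not the skiboot implementation: the `Thread` class, its fields and `pick_target` are hypothetical, and "idle" is assumed to mean a thread with no queued jobs.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    is_primary: bool
    queued_jobs: int        # ASSUMPTION: 0 queued jobs means "idle"
    available: bool = True  # hypothetical flag: can accept jobs at all


def pick_target(threads):
    """Pick a thread for a global job, or None for a synchronous call."""
    candidates = [t for t in threads if t.available]
    # Idle primary threads first, then idle secondaries.
    for want_primary in (True, False):
        for t in candidates:
            if t.queued_jobs == 0 and t.is_primary == want_primary:
                return t
    # Nothing idle: take the least loaded thread.
    if candidates:
        return min(candidates, key=lambda t: t.queued_jobs)
    # Nothing found: the caller falls back to a synchronous call.
    return None
```

Choosing the target at queuing time is what makes it possible to wake exactly one thread with an interrupt, as the entry notes for nap mode.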
NVRAM
- Support ibm,skiboot partition for skiboot specific configuration options
- flash: Size NVRAM based on ECC for OpenPOWER platforms
If NVRAM has ECC (as per the ffs header in the PNOR) then the actual size
of the partition is less than reported by the ffs header.
NVLink/NPU
- Fix reserved PE#
- NPU bdfn allocation bugfix
- Fix bad PE number check
NPUs have 4 PEs which are zero indexed, so {0, 1, 2, 3}. A bad PE number
check in npu_err_inject checks if the PE number is greater than 4 as a
fail case, so it would wrongly perform operations on a non-existent PE 4.
- Use PCI virtual device
- assert the NPU irq min is aligned.
- program NPU BUID reg properly
- npu: reword "error" to indicate it's actually a warning
Incorrect FWTS annotation.
Without this patch, you get spurious FirmWare Test Suite (FWTS) warnings
about NVLink not working on machines that aren't fully populated with
GPUs.
- external: NP...
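The off-by-one in the "Fix bad PE number check" entry above can be shown in a few lines. This is an illustrative Python sketch, not the code of npu_err_inject: the constant and function names are hypothetical, and only the count of 4 zero-indexed PEs and the "greater than 4" check come from the entry.

```python
NUM_PES = 4  # PEs are zero indexed: 0, 1, 2, 3 (hypothetical constant name)


def pe_is_valid_buggy(pe_number):
    # Treats only values greater than 4 as a failure,
    # so the non-existent PE 4 is accepted.
    return not (pe_number > NUM_PES)


def pe_is_valid_fixed(pe_number):
    # Valid PE numbers are strictly below the PE count.
    return 0 <= pe_number < NUM_PES


for pe in (3, 4, 5):
    print(pe, pe_is_valid_buggy(pe), pe_is_valid_fixed(pe))
```

The two checks only disagree for PE 4, which is exactly the case the entry describes.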
skiboot-5.4.0-rc4
skiboot-5.4.0-rc4 was released on Tuesday November 8th 2016. It is the
fourth (and hopefully final) release candidate of skiboot 5.4, which will
become the new stable release of skiboot following the 5.3 release, first
released August 2nd 2016.
skiboot-5.4.0-rc4 contains all bug fixes as of skiboot-5.3.7
and skiboot-5.1.18 (the currently maintained stable releases).
For how the skiboot stable releases work, see https://github.com/open-power/skiboot/blob/skiboot-5.4.0-rc4/doc/stable-skiboot-rules.rst for details.
Since this is a release candidate, it should NOT be put into production.
With this release candidate, I'm hoping that it's the last one, and that within
the week we're able to tag a final 5.4.0 release. There is one bit of code I'm
hoping to merge in before the final 5.4.0, and that's the p8dtu platform
definition. The aim is for skiboot-5.4.x to be in op-build v1.13, which is due
by November 23rd 2016.
Over skiboot-5.4.0-rc3, we have a few changes:
-
Add BMC platform to enable correct OEM IPMI commands
An out of tree platform (p8dtu) uses a different IPMI OEM command
for IPMI_PARTIAL_ADD_ESEL. This exposed some assumptions about the BMC
implementation in our core code.
Now, with platform.bmc, each platform can dictate (or detect) the BMC
that is present. We allow it to be set at runtime rather than purely
statically in struct platform as it's possible to have differing BMC
implementations on the one machine (e.g. AMI BMC or OpenBMC).
-
hw/ipmi-sensor: Fix setting of firmware progress sensor properly.
On FSP systems, OPAL was incorrectly setting firmware status
on a sensor id "00" which doesn't exist.
-
pflash: remove stray d in from info message
-
libflash/pflash: support whole chip erase on mtd access
-
boot_test: fix typo in console message
-
core/pci: Fix criteria in pci_cfg_reg_filter(), i.e. NVLink didn't work.
-
Remove KERNEL_COMMAND_LINE mention from config.h
We removed the functionality but not the define.
skiboot-5.4.0-rc3
skiboot-5.4.0-rc3 was released on Wednesday November 2nd 2016. It is the
third release candidate of skiboot 5.4, which will become the new stable
release of skiboot following the 5.3 release, first released August 2nd 2016.
skiboot-5.4.0-rc3 contains all bug fixes as of :ref:`skiboot-5.3.7`
and :ref:`skiboot-5.1.18` (the currently maintained stable releases).
For how the skiboot stable releases work, see :ref:`stable-rules` for details.
Since this is a release candidate, it should NOT be put into production.
The current plan is to release a new release candidate every week until we
feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which
is due by November 23rd 2016.
Over :ref:skiboot-5.4.0-rc2, we have a few changes:
- pflash: Fail when file is larger than partition
  You can still shoot yourself in the foot by passing --force.
- core/flash: Don't do anything clever for OPAL_FLASH_{READ, WRITE, ERASE}
  This fixes a bug where opal-prd and opal-gard could fail.
  Fixes: https://github.com/open-power/skiboot/issues/44
- boot-tests: force BMC to boot from non-golden side
- fast-reset: Send special reset sequence to operational CPUs only.
  Fixes fast-reset for cases where there are garded CPUs
- Secure/Trusted boot: be much clearer about what is being measured where.
- Secure/Trusted boot: be more resilient to disabled TPM(s).
- Secure/Trusted boot: The force-secure-mode NVRAM setting introduced
  temporarily in :ref:skiboot-5.4.0-rc2 has changed behaviour. Now, by
  default, the secure-mode flag in the device tree is obeyed. As always,
  any skiboot NVRAM options are in no way ABI, API or supported and may cause
  unfinished verbose analogies to appear in release notes relating to the
  dangers of using developer only options.
- gard: Fix compiler warning on modern GCC targetting ARM 32-bit
- opal-prd: systemd scripts improvements, only run on supported systems
skiboot 5.4.0 Release Candidate 2
skiboot-5.4.0-rc2
skiboot-5.4.0-rc2 was released on Wednesday October 26th 2016. It is the
second release candidate of skiboot 5.4, which will become the new stable
release of skiboot following the 5.3 release, first released August 2nd 2016.
skiboot-5.4.0-rc2 contains all bug fixes as of :ref:skiboot-5.3.7
and :ref:skiboot-5.1.18
(the currently maintained stable releases).
For how the skiboot stable releases work, see :ref:stable-rules
for details.
Since this is a release candidate, it should NOT be put into production.
The current plan is to release a new release candidate every week until we
feel good about it. The aim is for skiboot-5.4.x to be in op-build v1.13, which
is due by November 23rd 2016.
Over :ref:skiboot-5.4.0-rc1, we have a few changes:
Secure and Trusted Boot
skiboot 5.4.0-rc2 improves upon the progress towards Secure and Trusted Boot
in rc1. It is important to note that this is not a complete, end-to-end
secure/trusted boot implementation.
With the current code, it is now possible to verify and measure resources
loaded from PNOR by skiboot (namely the CAPP and BOOTKERNEL partitions).
Note that this functionality is currently only available on systems that
use the libflash backend. It is NOT enabled on IBM FSP based systems.
There is some support for some simulators though.
-
libstb/stb.c: ignore the secure mode flag unless forced in NVRAM
At this stage of Trusted Boot development, we do not want to
force Secure Mode through the whole firmware boot process, but we
do want to be able to test it (classic chicken and egg problem with
build infrastructure).

We disable secure mode if the secure-enabled devtree property is
read from the device tree IF we aren't overriding it through NVRAM.
Seeing as we can only increase (not decrease) what we're checking through
the NVRAM variable, it is safe.

The NVRAM setting is force-secure-mode=true in the ibm,skiboot partition.
However, if you want to force secure mode even if Hostboot has not set
the secure-enabled property in the device tree, set force-secure-mode
to "always".

There is also a force-trusted-mode NVRAM setting to force trusted mode
even if Hostboot has not enabled it in the device tree.

To indicate to Linux that we haven't gone through the whole firmware
process in secure mode, we replace the 'secure-enabled' property with
'partial-secure-enabled', to indicate that only part of the firmware
boot process has gone through secure mode.
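The NVRAM override described above can be modelled as a small decision function. This is a minimal sketch in Python, not the skiboot implementation: the function name `secure_mode_enabled` and the plain dictionary standing in for the NVRAM settings are assumptions made for illustration. Only the `force-secure-mode` key and its `true` and `always` values come from the notes above.

```python
def secure_mode_enabled(dt_secure_enabled, nvram):
    """Return True if this stage of boot should run in secure mode.

    dt_secure_enabled: whether Hostboot set secure-enabled in the device tree.
    nvram: hypothetical dict standing in for the ibm,skiboot NVRAM partition.
    """
    force = nvram.get("force-secure-mode")
    if force == "always":
        # Forced on regardless of what Hostboot put in the device tree.
        return True
    if force == "true":
        # Obey the device tree flag only when explicitly asked to.
        return dt_secure_enabled
    # Default at this stage of development: ignore the device tree flag.
    return False
```

In this model the default is the least strict behaviour, so an NVRAM setting can only add checking, which is the safety argument made above.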
Command line arguments to BOOTKERNEL
-
core/init.c: Fix bootargs parsing
Currently the bootargs are unconditionally deleted, which causes
a bug where the bootargs passed in by the device tree are lost.

This patch deletes bootargs only if they need to be replaced by the NVRAM
entry.

This patch also removes the KERNEL_COMMAND_LINE config option in favour of
using the NVRAM or a device tree.
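The corrected behaviour can be illustrated with a short sketch. It is written in Python for brevity and is not the code in core/init.c: the function name `apply_nvram_bootargs` and the dictionary standing in for the device tree node that holds bootargs are hypothetical.

```python
def apply_nvram_bootargs(chosen, nvram_bootargs):
    """Replace chosen['bootargs'] only when NVRAM provides a value.

    chosen: hypothetical dict standing in for the device tree node
            that carries the bootargs property.
    nvram_bootargs: string read from NVRAM, or None if no entry is set.
    """
    if nvram_bootargs is None:
        # No NVRAM entry: keep whatever the device tree passed in.
        return chosen
    # NVRAM entry present: it replaces any existing bootargs.
    updated = dict(chosen)
    updated["bootargs"] = nvram_bootargs
    return updated
```

The bug being fixed corresponds to deleting `bootargs` before checking whether a replacement exists.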
pflash utility
-
external/pflash: Make MTD accesses the default
Now that BMC and host kernel mtd drivers exist and have matured we
should use them by default.

This is especially important since we seem to be telling everyone to use
pflash (pflash world domination plans are continuing on schedule).
-
external/pflash: Catch incompatible combination of flags
-
external/common: arm: Don't error trying to wrprotect with MTD access
-
libflash/libffs: Use blocklevel_smart_write() when updating partitions
Other changes
-
extract-gcov: build with -m64 if compiler supports it.
Fixes build break on 32bit ppc64 (e.g. PowerMac G5, where user space
is mostly 32bit).
Fast Reset
-
fast-reset: disable fast reboot in event of platform error
Most of the time, if we're rebooting due to a platform error, we should
trigger a checkstop. However, if we haven't been told what we should do
to trigger a checkstop (e.g. on an FSP machine), then we should still
fail to fast-reboot.

So, disable fast-reboot in the OPAL_CEC_REBOOT2 code path
for OPAL_REBOOT_PLATFORM_ERROR reboot type.
-
fast-reboot: disable on FSP code update or unrecoverable HMI
-
fast-reboot: abort fast reboot if CAPP attached
If a PHB is in CAPI mode, we cannot safely fast reboot - the PHB will be
fenced during the reboot resulting in major problems when we load the new
kernel.

In order to handle this safely, we need to disable CAPI mode before
resetting PHBs during the fast reboot. However, we don't currently support
this.

In the meantime, when fast rebooting, check if there are any PHBs with a
CAPP attached, and if so, abort the fast reboot and revert to a normal
reboot instead.
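The conditions above amount to a guard that runs before a fast reboot is attempted. The following is a sketch under stated assumptions, in Python rather than skiboot's C: the `Phb` class, its `capp_attached` field and the `fast_reboot_allowed` function are hypothetical names used only to show the shape of the check.

```python
from dataclasses import dataclass


@dataclass
class Phb:
    """Hypothetical stand-in for a PCI host bridge."""
    name: str
    capp_attached: bool = False


def fast_reboot_allowed(phbs, platform_error=False):
    """Return True only if none of the conditions above rule out fast reboot."""
    if platform_error:
        # Reboots caused by a platform error never take the fast path.
        return False
    if any(phb.capp_attached for phb in phbs):
        # A PHB in CAPI mode would be fenced during the reboot,
        # so fall back to a normal reboot.
        return False
    return True
```

A caller would fall back to the normal reboot path whenever this returns False.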
OpenPOWER Platforms
For all hardware platforms that aren't IBM FSP machines:
-
Revert "flash: Move flash node under ibm,opal/flash/"
This reverts commit e1e6d00.
Breaks DT enough that it makes people cranky, reverting for now.
This could break access to flash with existing kernels in POWER9 simulators.
-
flash: rework flash_load_resource to correctly read FFS/STB
This fixes the previous reverts of loading the CAPP partition with
STB headers (which broke CAPP partitions without STB headers).

The new logic both fixes CAPP partition loading with STB headers and
addresses a long standing bug due to differing interpretations of FFS.

The f_part utility that constructs PNOR files just sets actualSize=totalSize
no matter what the size of the partition is. Prior to this patch,
skiboot would always load actualSize, leading to longer than needed IPL.

The pflash utility updates actualSize, so no developer has really ever
noticed this, apart from maybe an inkling that it's odd that a freshly
baked PNOR from op-build takes ever so slightly longer to boot than one
that has had individual partitions pflashed in.

With this patch, we now compute actualSize. For partitions with a STB
header, we take the payload size from the STB header. For partitions
that don't have a STB header, we compute the size either by parsing
the ELF header or by looking at the subpartition header and computing it.

We now need to read the entire partition for partitions with subpartitions
so that we pass consistent values to be measured as part of Trusted Boot.

As of this patch, the actualSize field in FFS is not relied on for
partition size, we determine it from the content of the partition.

However, this patch will break loading of partitions that are not ELF
and do not contain subpartitions. Luckily, nothing in-tree makes use of
that.
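The order in which the size is determined can be sketched as follows. This is an illustration in Python, not the flash_load_resource code. The STB and subpartition sizes are passed in as already-parsed values because their header layouts are not described here, and the ELF calculation assumes a 64-bit image whose section header table is the last thing in the file. All function and parameter names are hypothetical.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def elf64_image_size(header):
    """Estimate image size from an ELF64 header.

    Assumes the section header table sits at the end of the file,
    so size = e_shoff + e_shentsize * e_shnum.
    """
    if len(header) < 64 or header[:4] != ELF_MAGIC or header[4] != 2:
        raise ValueError("not an ELF64 header")
    # e_ident[5] is EI_DATA: 1 means little-endian; treat anything else as big.
    endian = "<" if header[5] == 1 else ">"
    (e_shoff,) = struct.unpack_from(endian + "Q", header, 40)
    e_shentsize, e_shnum = struct.unpack_from(endian + "HH", header, 58)
    return e_shoff + e_shentsize * e_shnum


def content_size(data, stb_payload_size=None, subpartition_size=None):
    """Pick the partition's size from its content, not from FFS actualSize."""
    if stb_payload_size is not None:
        # A STB header is present: trust its payload size.
        return stb_payload_size
    if data[:4] == ELF_MAGIC:
        return elf64_image_size(data)
    if subpartition_size is not None:
        return subpartition_size
    # Matches the caveat above: neither ELF nor subpartitioned content loads.
    raise ValueError("cannot determine size: not ELF and no subpartitions")
```

The final `ValueError` corresponds to the case the notes call out as no longer supported.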
PCI
-
pci: Check power state before powering off slot
Prevents the erroneous "Error -1 powering off slot" error message.
Contributors
Since :ref:skiboot-5.4.0-rc1, we have 23 csets from 8 developers.
A total of 876 lines added, 621 removed (delta 255)
Developers with the most changesets
============================ = =======
Developer                    # %
============================ = =======
Stewart Smith                7 (30.4%)
Cyril Bur                    5 (21.7%)
Mukesh Ojha                  3 (13.0%)
Gavin Shan                   3 (13.0%)
Claudio Carvalho             2 (8.7%)
Chris Smart                  1 (4.3%)
Andrew Donnellan             1 (4.3%)
Nageswara R Sastry           1 (4.3%)
============================ = =======
Developers with the most changed lines
========================== === =======
Developer                  #   %
========================== === =======
Stewart Smith              424 (45.7%)
Mukesh Ojha                204 (22.0%)
Gavin Shan                 173 (18.6%)
Cyril Bur                  69  (7.4%)
Claudio Carvalho           35  (3.8%)
Andrew Donnellan           13  (1.4%)
Chris Smart                8   (0.9%)
Nageswara R Sastry         2   (0.2%)
========================== === =======
Developers with the most lines removed
============================ = =======
Developer                    # %
============================ = =======
Gavin Shan                   9 (1.4%)
Chris Smart                  4 (0.6%)
============================ = =======
Developers with the most signoffs (total 16)
============================ = =======
Developer                    # %
============================ = =======
Stewart Smith                16 (100.0%)
============================ = =======
Developers with the most reviews (total 4)
============================ = =======
Developer                    # %
============================ = =======
Vasant Hegde                 2 (50.0%)
Andrew Donnellan             2 (50.0%)
============================ = =======
Developers with the most test credits (total 1)
============================ = =======
Developer                    # %
============================ = =======
Pridhiviraj Paidipeddi       1 (100.0%)
============================ = =======
Developers who gave the most tested-by credits (total 1)
============================ = =======
Developer                    # %
============================ = =======
Gavin Shan                   1 (100.0%)
============================ = =======
Developers with the most report credits (total 3)
============================ = =======
Developer                    # %
============================ = =======
Pridhiviraj Paidipeddi       1 (33.3%)
Andrei Warkenti              1 (33.3%)
Michael Neuling              1 (33.3%)
============================ = =======
Developers who gave the most report credits (total 3)
============================ = =======
Developer                    # %
============================ = =======
Stewart Smith                2 (66.7%)
Gavin Shan                   1 (33.3%)
============================ = =======