Home
Welcome to the XCSF wiki! Please feel free to help improve these pages.
XCSF is rule-based and maintains a population of classifiers where each classifier cl consists of:
- a condition component cl.C that determines whether the rule matches input x
- an action component cl.A that selects an action a to be performed for a given x
- a prediction component cl.P that computes the expected payoff for performing a upon receipt of x
XCSF thus generates rules of the general form:
IF matches ← cl.C(x) THEN perform action a ← cl.A(x) and EXPECT payoff p ← cl.P(x).
For each step within a learning trial, XCSF constructs a match set [M] composed of classifiers in the population set [P] whose cl.C matches x. If [M] contains fewer than θmna actions, a covering mechanism generates classifiers with matching cl.C and random cl.A.
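The matching and covering steps can be sketched minimally as follows; the `Classifier` class, its set-based condition, and the `build_match_set` helper are illustrative assumptions rather than the project's actual data structures.

```python
import random

class Classifier:
    """Illustrative rule: a condition (here, a set of matching inputs) and an action."""
    def __init__(self, condition, action):
        self.condition = condition
        self.action = action

    def matches(self, x):
        return x in self.condition

def build_match_set(population, x, n_actions, theta_mna):
    """Construct [M] from [P] for input x, covering if too few actions are present."""
    match_set = [cl for cl in population if cl.matches(x)]
    # Covering: while [M] advocates fewer than theta_mna distinct actions,
    # generate classifiers that match x with a randomly chosen missing action.
    while len({cl.action for cl in match_set}) < theta_mna:
        missing = set(range(n_actions)) - {cl.action for cl in match_set}
        cl = Classifier(condition={x}, action=random.choice(sorted(missing)))
        population.append(cl)
        match_set.append(cl)
    return match_set
```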
For each possible action ak in [M], XCSF estimates the expected payoff by computing the fitness-weighted average prediction P(ak). That is, for each action ak and classifier prediction pj in [M], the system prediction Pk = ∑j Fjpj / ∑jFj.
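For illustration, the prediction array can be computed as below, assuming classifiers in [M] are reduced to hypothetical (action, prediction, fitness) tuples:

```python
def system_predictions(match_set):
    """Fitness-weighted average prediction P(a_k) for each action in [M]."""
    weighted = {}  # per action: sum of F_j * p_j
    fitness = {}   # per action: sum of F_j
    for action, p, f in match_set:
        weighted[action] = weighted.get(action, 0.0) + f * p
        fitness[action] = fitness.get(action, 0.0) + f
    return {a: weighted[a] / fitness[a] for a in weighted}
```

For example, two classifiers advocating action 0 with predictions 10 and 20 and equal fitness yield P(0) = 15.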
A system action is then randomly or probabilistically selected during exploration, and the highest payoff action Pk used during exploitation. Classifiers in [M] advocating the chosen action are subsequently used to construct an action set [A]. The action is then performed and a scalar reward r ∈ ℜ received, along with the next sensory input.
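Action selection over that prediction array might then look like the following sketch; pure random exploration is assumed here, though probabilistic schemes such as softmax are equally possible:

```python
import random

def select_action(predictions, explore):
    """Choose randomly when exploring, else the action with the highest P_k."""
    if explore:
        return random.choice(list(predictions))
    return max(predictions, key=predictions.get)
```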
Upon reaching a terminal state within the environment (as is always the case in single-step problems), each classifier clj ∈ [A] has its experience incremented, and its error, fitness, and set size estimate as updated using the Widrow-Hoff delta rule with learning rate β ∈ [0,1] as follows.
- Error: εj ← εj + β (| r − pj | − εj)
- Accuracy: κj = 1 if εj < ε0, otherwise κj = α (εj / ε0)^−ν, with target error threshold ε0, accuracy offset α ∈ [0,1], and slope ν.
- Relative accuracy: κj' = (κj · numj) / ∑j κj · numj, where classifiers are initialised with numerosity num = 1.
- Fitness: Fj ← Fj + β(κj' − Fj)
- Set size estimate: asj ← asj + β(|[A]| − asj)
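Taken together, the updates above can be sketched as follows, with classifiers as plain dicts; |[A]| is counted here in micro-classifiers (summing num), and all key names are assumptions for illustration:

```python
def update_action_set(action_set, r, beta, eps0, alpha, nu):
    """Widrow-Hoff updates for each classifier in [A] given reward r."""
    set_size = sum(cl["num"] for cl in action_set)  # |[A]| in micro-classifiers
    for cl in action_set:
        cl["exp"] += 1
        cl["err"] += beta * (abs(r - cl["pred"]) - cl["err"])
        cl["as"] += beta * (set_size - cl["as"])
    # Accuracy, then relative accuracy, then fitness.
    kappa = [1.0 if cl["err"] < eps0 else alpha * (cl["err"] / eps0) ** -nu
             for cl in action_set]
    total = sum(k * cl["num"] for k, cl in zip(kappa, action_set))
    for k, cl in zip(kappa, action_set):
        rel = k * cl["num"] / total  # relative accuracy
        cl["fit"] += beta * (rel - cl["fit"])
```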
Thereafter, cl.C, cl.A, and cl.P are updated according to the representation adopted.
The evolutionary algorithm (EA) is applied to classifiers within [A] if the average time since its previous execution exceeds θEA. Upon invocation, the time stamp ts of each classifier is updated. Two parents are chosen based on their fitness via roulette wheel (or tournament) selection, and λ offspring are created via crossover with probability χ and mutation with probability μ.
Offspring parameters are initialised by setting the error and fitness to the parental average, discounted by reduction parameters for error εR and fitness FR. Offspring exp and num are set to one. If subsumption is enabled and an offspring is subsumed by either parent with sufficient accuracy (εj < ε0) and experience (expj > θsub), it is not added to [P]; instead, the parent's num is incremented.
The resulting offspring are added to [P] and the maximum (micro-classifier) population size N is enforced by removing classifiers selected via roulette with the deletion vote.
The deletion vote is set proportionally to the set size estimate as. However, the vote is increased by a factor F̅ / Fj for classifiers that are sufficiently experienced (expj > θdel) and with small fitness (Fj < δF̅) where F̅ is the [P] mean fitness, and typically δ = 0.1. Classifiers selected for deletion have their (micro-classifier) num decremented, and in the event that num < 1, are removed from [P].
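A minimal sketch of this deletion scheme, again with classifiers as hypothetical dicts:

```python
import random

def deletion_vote(cl, mean_fit, theta_del, delta=0.1):
    """Vote proportional to the set size estimate, boosted for weak classifiers."""
    vote = cl["as"] * cl["num"]
    # Experienced classifiers with small fitness receive a larger vote.
    if cl["exp"] > theta_del and cl["fit"] < delta * mean_fit:
        vote *= mean_fit / cl["fit"]
    return vote

def delete_one(population, theta_del, delta=0.1):
    """Roulette-wheel deletion of one micro-classifier from [P]."""
    mean_fit = sum(cl["fit"] for cl in population) / sum(cl["num"] for cl in population)
    votes = [deletion_vote(cl, mean_fit, theta_del, delta) for cl in population]
    point = random.uniform(0.0, sum(votes))
    total = 0.0
    for cl, vote in zip(population, votes):
        total += vote
        if total >= point:
            cl["num"] -= 1
            if cl["num"] < 1:
                population.remove(cl)
            return
```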
In multi-step problems, the previous action set [A]₋₁ is updated after each step using a γ ∈ [0,1] discounted reward (similar to Q-learning), and the EA may be run therein.
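The discounted payoff target used to update the previous action set can be written as a one-line sketch, taking the maximum entry of the current prediction array:

```python
def payoff_target(prev_reward, predictions, gamma):
    """Q-learning-style target: r_{-1} + gamma * max_k P_k."""
    return prev_reward + gamma * max(predictions.values())
```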
A schematic illustration of XCSF is shown above. For supervised learning, a single (dummy) action can be used such that [A] = [M] and the system prediction made directly accessible to the environment.
A number of interacting pressures have been identified. A set pressure provides more frequent reproduction opportunities for more general rules. In opposition is a fitness pressure which represses the reproduction of inaccurate and over-general rules. Many forms of cl.C, cl.A, and cl.P have been used for classifier knowledge representation since the original ternary conditions, integer actions, and scalar predictions. Some of these are implemented here.
Related Literature:
- S. W. Wilson (1995) "Classifier fitness based on accuracy" https://doi.org/10.1162/evco.1995.3.2.149
- S. W. Wilson (2001) "Function approximation with a classifier system"
- S. W. Wilson (2002) "Classifiers that approximate functions" https://doi.org/10.1023/A:1016535925043
- M. V. Butz (2006) "Rule-based evolutionary online learning systems" https://doi.org/10.1007/b104669
- R. J. Urbanowicz and J. H. Moore (2009) "Learning classifier systems: A complete introduction, review, and roadmap" https://doi.org/10.1155/2009/736398
- L. Bull (2015) "A brief history of learning classifier systems: From CS-1 to XCS and its variants" https://doi.org/10.1007/s12065-015-0125-y
- R. J. Urbanowicz and W. N. Browne (2017) "Introduction to learning classifier systems" https://doi.org/10.1007/978-3-662-55007-6
This project implements both supervised learning, via updating match set predictions directly, and reinforcement learning, via updating action set predictions with an environment reward. All mutation rates are self-adapted.
See Python Library Usage.
Conditions:
- Always matching dummy conditions
- Hyperrectangles
- Hyperellipsoids
- Neural networks
- GP trees
- Dynamical GP graphs
- Ternary bitstrings (binarises inputs)
- Both conditions and actions in single dynamical GP graphs
- Both conditions and actions in single neural networks
Actions:
- Integers
- Neural networks
Predictions:
- Piece-wise constant
- Linear least squares
- Quadratic least squares
- Linear recursive least squares
- Quadratic recursive least squares
- Stochastic gradient descent neural networks
This project is released under the terms of the GNU General Public License v3.0 (GPLv3).