Abstract: this paper presents an extended specification for the Ultra evolutionary framework. Architectural choices are detailed and their conceptual significance is discussed. Key implementation aspects are also covered.


Logical view

Ultra implements an object-oriented layered architecture, built on the base of five classes:

  • basic_search and evolution;
  • problem;
  • layered_population;
  • parameters.
classDiagram note "High level logical view" namespace ultra { class basic_search { +run() } class evolution class layered_population class parameters class problem { +params : parameters +sset : symbol_set } } evolution "1" <-- "1" basic_search layered_population "1" <-- "1" evolution problem "1" <-- "1" basic_search problem "1" *-- parameters

The purpose of the layered architecture is to clearly separate foundational concepts, such as the internal structure of individuals, from higher-level abstractions more directly related to the evolutionary process.

This is the minimal skeleton of a program using the Ultra framework:

ultra::basic_search<a_given_evolutionary_strategy> s(a_problem_instance, a_fitness_function);
s.run();

The basic_search class allows users to select the evolutionary strategy to employ and provides unified interface for different types of search. In most cases, users can simply rely on ultra::search, a convenient instantiation of ultra::basic_search that implements a general-purpose evolutionary strategy:

ultra::search s(a_problem_instance, a_fitness_function);
s.run();

There are various specialisations of the ultra::basic_search for different tasks (de::search for Differential Evolution, ga::search for Genetic Algorithms, hga::search for Heterogeneous Genetic Algorithms and src::search for Symbolic Regression and Classification).

All strategies, regardless of the specific search class, rely on ultra::problem or one of its specialisations, to access problem parameters / constraints (via the params data member), and the building blocks for individuals (via the sset data member).

classDiagram note "Problem logical view" namespace ultra { class problem { +params : parameters +sset : symbol_set } class symbol class symbol_set { +insert(symbol, weight) } class w_symbol { +sym : symbol +weight : unsigned } } symbol_set "1" *-- "*" w_symbol problem "1" *-- "1" symbol_set w_symbol "1" *-- "1" symbol

Being an evolutionary framework, Ultra performs its work with the help of the ultra::evolution and ultra::population classes.

Problem

classDiagram note "Problem in-depth view" class problem { +params : parameters +sset : symbol_set } class symbol class symbol_set { +insert(symbol, weight) } class w_symbol { +sym : symbol +weight : unsigned } class `de::problem` { +problem(dimensions, interval) +problem(intervals) +insert(interval, category)* real } class `ga::problem` { problem(dimensions, interval) +problem(intervals) +insert(interval, category)* integer } class `hga::problem` { +insert(...)* terminal } class `src::problem` { +problem(dataframe) +problem(filepath) +problem(input_stream) +setup_symbols() +setup_terminals() } symbol_set "1" *-- "*" w_symbol problem "1" *-- "1" symbol_set w_symbol "1" *-- "1" symbol problem <|-- `de::problem` problem <|-- `hga::problem` problem <|-- `ga::problem` problem <|-- `src::problem`

Specialisations of the problem class simplify the setup of the evolutionary environment and the evaluation of individuals. For instance:

ultra::de::problem prob(5, {-5.12, 5.12});
ultra::de::search search(prob, function_to_be_optimized);
const auto result(search.run());

defines a five-dimensional search space where each variable ranges over the interval \([-5.12;5.12]\), and then launches an optimisation based on differential evolution.

Population

Default population: layered structure

The default population is organized in multiple subgroups or layers. In the picture, each subgroup is represented as a torus to mark the fact that many evolutionary strategies may use a Trivial Geography scheme, where individuals are viewed as having a 1-dimensional spatial structure, essentially a circle, where the first and last positions are considered adjacent. The production of an individual for location i is permitted to involve only parents from i's local neighborhood.

A subgroup/layer with trivial geography scheme

This behaviour can be disabled by setting parameters::population::mate_zone to a sufficiently large value. Each layer is implemented in the linear_population class.

The global population is implemented in the layered_population class. The first and last layers are highlighted in distinct colours to reflect their special handling in some algorithms. Notably, the ALPS algorithm segregates individuals based on their ages:

  • the first layer contains the youngest individuals;
  • upper layers contain progressively older individuals;
  • the last layer contains the oldest individual (without an age limit).

In contrast, the standard evolutionary strategy treats each layer in the same way.

Regardless of the chosen strategy, all layers are are evolved in parallel. Possible interactions among layers depend on the strategy. For example, using the standard evolutionary strategy and the differential evolutionary strategy, there is no direct interaction; with ALPS the i-th layer may sometimes access the i-1-th layer for selection / recombination and upper layers for replacement.

The linear_population class provides a mutex via the method ([[nodiscard]] auto &mutex() const) which must be used to coordinate access in multi-threaded environments.

Individuals (core evolutionary units)

classDiagram note "Individuals" class individual { +age() age_t +inc_age() +signature() hash_t +load(stream) bool +save(stream) bool -load_impl(stream)* bool -save_impl(stream)* bool -hash()* hash_t #signature_ : hash_t -age_ : age_t } class `gp::individual` { +genome : matrix~gene~ +mutation() +crossover() +exons() +is_valid() bool } class `de::individual` { +genome : vector~double~ +crossover() +apply_each() +is_valid() bool } class `ga::individual` { +genome : vector~int~ +mutation() +crossover() +apply_each() +is_valid() bool } class `hga::individual` { +genome : vector~value_t~ +modify() +mutation() +crossover() +is_valid() bool } individual <|-- `gp::individual` : Inheritance individual <|-- `de::individual` : Inheritance individual <|-- `ga::individual` : Inheritance individual <|-- `hga::individual` : Inheritance

In ULTRA, an individual denotes a single candidate solution within an evolutionary process. Although different search strategies (GP, GA, DE, HGA) encode solutions differently, they all share a common architectural treatment that reflects ULTRA's layered design: a semantic base class expresses shared capabilities, while specialised derived classes encode representation-specific details.

ultra::individual (base class)

At the core of the individual hierarchy sits a minimal, abstract class ultra::individual. This base class defines the identity and persistence semantics common to all individuals:

  • a structural signature that uniquely represents the inherent (genomic) form of the individual. Logically equivalent yet syntactically distinct individuals map to the same signature, enabling efficient comparison, hashing, and diversity metrics;
classDiagram note "Logically equivalent individuals share the same signature" class individual { +signature() hash_t #signature_ : hash_t -hash()* hash_t } class derived_individual { -genome +is_valid() -hash() hash_t -- final } individual <|-- derived_individual
  • an age attribute tracking how long the individual has persisted in a population;
  • serialization support via a non-virtual interface (NVI) pattern (load/save forwarding to load_impl/save_impl), keeping derived implementations focused on representation specifics.
classDiagram note "Base class enforces invariants and postconditions" class individual { +load(stream) bool +save(stream) bool -load_impl(stream)* bool -save_impl(stream)* bool } class concrete_individual { -load_impl(stream) bool -save_impl(stream) bool } individual <|-- concrete_individual

The base type deliberately does not define a genome container or variation operators. Those are the responsibility of the derived classes, consistent with the separation between foundational concepts and evolutionary strategy layers in the overall architecture.

Derived individuals

Each evolutionary domain within ULTRA implements its own individual type, inheriting from ultra::individual and enriching it with representation and operator semantics appropriate to the underlying algorithm:

  • GP individual (ultra::gp::individual)

    Represents a straight-line program whose genome is a sequence of gene elements arranged in a matrix-like structure. It provides program-centric iterators (exon views), specialised crossover variants (e.g. tree crossover), and validity checking tailored to symbolic structures;

  • GA individual (ultra::ga::individual)

    Encodes a combinatorial solution as a vector of integers. Standard genetic operators such as crossover and mutation are provided, and iteration and element access follow the usual container model.

  • DE individual (ultra::de::individual)

    Uses a vector of real numbers to represent a point in a continuous search space. Differential evolution-specific crossover and parameterised operations are supported, with convenience methods for applying transforms across genes.

  • HGA individual (ultra::hga::individual)

    Offers a heterogeneous genome, typically a vector of a generic value type. It uses a modify proxy to control mutation, enabling fine-grained and representation-aware modification while preserving encapsulation.

Despite their differing genomes and operators, all concrete individual types share:

  • a common signature interface (from the base) for structural equivalence and hashing;
  • consistent serialization entry points via base class hooks;
  • value semantics: individuals are copyable and inexpensive to reason about, with thread safety delegated to the caller at the algorithm or population level rather than enforced internally.

Rationale and design benefits

This architecture balances generalisation and specialisation.

The base class encapsulates identity and persistence without imposing a genomic format. Derived classes implement all representation-specific logic, keeping variation and container semantics local to the individual type.

Generic algorithms and populations can reason about identities and lifecycles uniformly, while still taking advantage of the full expressive power of each representation.

By separating foundational concerns from representation details, ULTRA achieves a modular and extensible model for individuals across evolutionary strategies, enabling new representations to be added without perturbing the core framework.

gp::individual and straight-line programs

In the symbolic regression / GP (genetic programming) domain within Ultra, individuals are encoded as Straight Line Program (SLP) genomes, a linear, tabular representation of a program without loops or branches. Unlike traditional tree-based GP, where programs are expressed as hierarchical trees, an SLP consists of a sequence of instructions that compute values in terms of previously computed results, terminals and functions. This structure enables efficient reuse of intermediate computations and lends itself naturally to genetic operators defined over fixed-length linear genomes.

gp::individual → Straight Line Program (SLP) representation

In Ultra's gp::individual:

  • the genome is stored in a matrix<gene>, indexing both program position and category. Each row corresponds to a locus (instruction) and each column corresponds to a symbolic category type;
  • each gene contains a function symbol plus arguments. Arguments may be terminals, constants or backward addresses into previously defined rows. This enforces a directed acyclic evaluation order: each instruction can only refer to earlier results.
  • the SLP encoding allows the individual to describe potentially complex expressions as a sequence of simple, ordered computations. By evaluating from the last locus back to the start, the full expression is reconstructed and executed.

The class maintains strong semantic invariants:

  • only valid function symbols and categories are used;
  • all backward references point to already computed loci;
  • the evaluation order is implicit in the linear genome;
  • a canonical hash is computed via pack() to uniquely represent the semantic content of the individual for comparison and caching purposes.
classDiagram note "Straight Line Program individual" class `gp::individual` { -genome_ : matrix~gene~ -active_crossover_type_ : crossover_t +parameters() unsigned +categories() unsigned +empty() bool +clear() +pack(sink) hash_t +is_valid() bool +exons() exon_view +cexons() const_exon_view +mutate(random_engine) +crossover() } class gene { +func : function +args : vector~value_t~ +category() unsigned } class value_t { terminal constant address } class exon_view { +begin() +end() } class hash_sink { +write(bytes) +final() hash_t } class matrix~T~ { +rows() +cols() +operator()(row,col) } matrix~gene~ --> gene : contains gene --> value_t : uses `gp::individual` --> exon_view : exposes `gp::individual` --> hash_sink : UID via pack() `gp::individual` --> matrix~gene~ : stores

To manipulate these individuals in an evolutionary context, the implementation provides:

  • a suite of recombination operators (one-point, two-points, uniform, tree-based crossover) tailored to the SLP structure;
  • a mutation operator that selectively perturbs active expressions while preserving SLP validity;
  • exon views to iterate over active parts of an individual without exposing introns or structural details.

This representation blends the compactness and simplicity of linear genomes with the expressive power needed for symbolic regression and in practice can outperform traditional GP tree representations on a variety of regression tasks.

Evaluators and evaluator_proxy

In ULTRA, an evaluator is the component responsible for assigning a fitness value to an individual. Conceptually, it represents the problem-specific objective function of an evolutionary run.

Rather than prescribing a concrete base class, ULTRA models evaluators through the Evaluator concept: any callable object that, given an individual, returns a value satisfying the Fitness concept. This approach keeps evaluators lightweight, composable and easy to customise, while allowing the library to enforce the required semantic constraints at compile time.

flowchart LR Individual -->|evaluated by| Evaluator Evaluator -->|wrapped by| EvaluatorProxy EvaluatorProxy --> Cache EvaluatorProxy -->|delegates to| Evaluator classDef core fill:#eef,stroke:#333,stroke-width:1px classDef infra fill:#f8f8f8,stroke:#666,stroke-width:1px class Evaluator core class Individual core class EvaluatorProxy infra class Cache infra

Evaluators are required to be const-invocable, reflecting the fact that fitness evaluation is typically a read-only operation on the individual and that evaluators are often accessed through logically-const interfaces or shared across threads. When internal state is needed (for statistics, counters, or heuristics), it must be made explicit through logical constness or externalised into auxiliary components.

Optional evaluator capabilities

ULTRA supports optional evaluator capabilities without imposing additional interfaces:

  • Persistence. Evaluators may optionally provide load(std::istream &) and save(std::ostream &) const member functions. When present, these are automatically detected and invoked; otherwise, persistence is treated as a no-op.
  • Fast evaluation. Evaluators may optionally provide a fast(const individual &) member function that computes an approximate or heuristic fitness. This is useful for pre-filtering, speculative evaluation or guiding search without incurring the full evaluation cost.

These capabilities are detected at compile time and integrated transparently, allowing simple evaluators to remain minimal while enabling more advanced use cases when needed.

evaluator_proxy

Evaluation of an individual and evaluator_proxy

While evaluators define how fitness is computed, evaluator_proxy defines how evaluators are used within the evolutionary engine.

sequenceDiagram participant EA as Evolutionary algorithm participant Proxy as evaluator_proxy participant Cache participant Eva as evaluator EA->>Proxy: operator()(individual) Proxy->>Cache: lookup(signature) alt cache hit Cache-->>Proxy: cached fitness Proxy-->>EA: fitness else cache miss Proxy->>Eva: operator()(individual) Eva-->>Proxy: fitness Proxy->>Cache: insert(signature, fitness) Proxy-->>EA: fitness end

evaluator_proxy acts as a surrogate for an evaluator, augmenting it with infrastructure-level concerns that should not be embedded in the evaluator itself:

  • memoisation of fitness values via an internal cache;
  • transparent reuse of fitness values for semantically equivalent individuals;
  • optional persistence of both evaluator state and cached results;
  • access to fast (approximate) evaluation when supported.

Fitness values are cached using the individual's signature as a key. When caching is enabled, repeated evaluation of identical or equivalent individuals is avoided, significantly improving performance for expensive evaluators. Cache mutation is treated as logically const, ensuring that the observable semantics of the evaluator remain unchanged.

Importantly, evaluator_proxy preserves the original evaluator interface: from the perspective of the evolutionary algorithm, it behaves exactly like an evaluator, while silently providing caching, persistence and fast-evaluation support when available.

flowchart TD StreamIn["std::istream"] --> ProxyLoad["evaluator_proxy::load"] ProxyLoad -->|optional| EvaLoad["evaluator::load"] ProxyLoad --> CacheLoad["cache::load"] ProxySave["evaluator_proxy::save"] --> StreamOut["std::ostream"] EvaSave["evaluator::save"] --> ProxySave CacheSave["cache::save"] --> ProxySave EvaLoad -.->|"only if provided"| EvaSave

This separation of responsibilities keeps evaluators focused on problem semantics and allows ULTRA to manage performance, scalability, and reproducibility concerns in a centralised and consistent way.

src::evaluator

src::evaluator → symbolic regression and classification

These evaluators are dataset-aware, policy-based and designed to support both regression and classification problems with minimal duplication.

At a high level, an evaluator:

  • receives an individual;
  • applies it to a dataset through an oracle;
  • measures the discrepancy between predicted and target values.
  • aggregates this discrepancy into a fitness score where greater is better and the maximum value is typically 0.
flowchart TD A[Individual P] --> B[Evaluator] B --> C[Create oracle] C --> D[Iterate over dataset] D --> E[Apply oracle to example] E --> F[Error functor] F --> G[Incremental average] G --> H[Negate error] H --> I[Fitness value]

Dataset awareness

All evaluators derive (directly or indirectly) from the src::evaluator<D> base class. This class abstracts access to the underlying dataset and transparently supports both:

  • plain datasets;
  • multi_dataset, where only the currently selected dataset is exposed.

This design allows evaluators to be written once and reused across single-task and multi-task scenarios without conditional logic in user code.

Sum-of-errors evaluators

classDiagram note "Evaluators for regression and classification" namespace src { class evaluator { #data()* } class sum_of_errors_evaluator { +operator()(P) +fast(P) +oracle(P) } class mae_evaluator { } class rmae_evaluator { } class mse_evaluator { } class count_evaluator { } class gaussian_evaluator { } class binary_evaluator { } } evaluator <|-- sum_of_errors_evaluator sum_of_errors_evaluator <|-- mae_evaluator sum_of_errors_evaluator <|-- rmae_evaluator sum_of_errors_evaluator <|-- mse_evaluator sum_of_errors_evaluator <|-- count_evaluator evaluator <|-- gaussian_evaluator evaluator <|-- binary_evaluator

Most regression and classification evaluators in ULTRA are specialisations of sum_of_errors_evaluator<P, F, D>

where:

  • P is the individual type;
  • F is an error functor applied to a single training example;
  • D is the dataset type.

The evaluator iterates over the dataset, applies the error functor to each example and computes the average error using a numerically stable incremental scheme. The returned fitness is the negated average error, ensuring that:

  • higher fitness values are better;
  • the theoretical maximum fitness is 0;
  • results from full and approximate evaluations are directly comparable.

A faster approximation mode is also provided, evaluating only one example every few steps while preserving the same scale.

Error functors

Error functors encapsulate how an error is computed for a single training example. They own an oracle initialised from the individual and are responsible for handling illegal or undefined outputs.

ULTRA provides several standard error functors, including:

  • Mean absolute error (MAE). Robust to outliers.
  • Relative mean absolute error (RMAE). Scale-invariant and bounded.
  • Mean squared error (MSE). Sensitive to large deviations.
  • Count error. Exact-match classification metric.

This separation allows new metrics to be introduced without modifying evaluator logic.

Classification evaluators

For classification tasks, ULTRA provides specialised evaluators that do not follow the simple "sum of errors" pattern:

  • Gaussian evaluator. Uses class-conditional Gaussian models to derive probabilistic predictions. Correct predictions are rewarded proportionally to confidence, while incorrect ones are strongly penalised;
  • Binary evaluator. Optimised for two-class problems. Incorrect predictions are penalised based on both misclassification and confidence.

Both evaluators still follow the same overarching contract: greater fitness is better and values are comparable across individuals.

Fitness

Fitness is a scalar/vector value assigned to an individual which reflects how well the individual solves the task.

The literature identifies (at least) four distinct measures of fitness:

  • raw fitness
  • standardized fitness
  • adjusted fitness
  • normalized fitness

but we are mostly interested in the first two.

The raw fitness is the measurement of fitness that is stated in the natural terminology of the problem itself, so the better value may be either smaller or larger.

For example in an optimal control problem, one may be trying to minimize some cost measure, so a lesser value of raw fitness is better.

Since raw fitness is defined in problem-specific terms, its interpretation may vary significantly across domains. Therefore, Ultra adopts standardized fitness as the primary measure, ensuring consistency across tasks. The only requirement for standardized fitness is that bigger values represent better individuals (this may differ in other frameworks).

In Ultra many of the fitness functions (see class src::evaluator and its specialisations) define the optimal fitness as the scalar value 0, or the vector (0, ... 0), and use negative values for sub-optimal solutions. However, this is not mandatory.

Often, excluding the simplest cases, fitness alone is not enough to understand goodness and flaws of the candidate solution.

Thread safety

ULTRA is designed around value semantics and explicit ownership. Thread safety is therefore addressed at the algorithm and population level, not by making individual core types internally synchronised.

Concurrency model

  • Parallelism in ULTRA is achieved by operating on distinct objects (e.g. different individuals or populations).
  • Core types such as individuals, genomes, and operators are not internally thread-safe.
  • Sharing mutable objects across threads is considered an advanced use case and must be handled explicitly by the caller.

Value types and ownership

Most ULTRA core types are designed as value types:

  • copyable and movable;
  • inexpensive to reason about;
  • free of hidden shared state.

To preserve these properties, internal synchronisation primitives (such as mutexes) are intentionally avoided.

Const functions and object invariants

All observable state (including structural signatures and hashes) is maintained eagerly and updated as part of well-defined mutation operations (modify, apply_each, load, recombination operators).

As a consequence:

  • const member functions do not perform lazy computation;
  • const functions do not modify internal state;
  • concurrent calls to const member functions on the same instance are safe provided no concurrent mutation occurs;
  • external synchronisation is required only when an object may be mutated concurrently with reads.

This design enforces clear object invariants and predictable behaviour, while avoiding the complexity and overhead of internal lazy caches.

External synchronisation

When concurrent access to the same object is required (for example during migration, statistics collection, or logging), synchronisation must be applied outside the object:

  • at the population level;
  • at the algorithm or scheduler level;
  • via an explicit user-provided lock.

ULTRA does not attempt to make individual objects safe for arbitrary concurrent use. This approach avoids hidden performance costs.