XMSS Library
Hash optimization

Problem definition

The library provides a software implementation of XMSS. XMSS performs a large number of hash computations, using SHA-256 and/or SHAKE256/256 depending on the parameter set. The library is portable; it supports 32-bit and 64-bit hardware platforms, both little endian and big endian. Many hardware platforms offer features that can accelerate these hash functions. Using such hardware acceleration not only makes the library faster but also reduces its footprint, since the default software implementation is no longer needed. Therefore, the library allows for overriding its default implementations of SHA-256 and/or SHAKE256/256.

This document provides guidance on when and how to optimize the hash algorithms.

Optimization options

The library consists of two parts: signing and verification. The verification library is written to minimize binary code size. The signing library is written for robustness and versatility. Signing and verification often run on different hardware platforms, so the rationale for optimization also differs between the two parts of the library.

The library allows for the following optimizations:

  • disabling either SHA-256 or SHAKE256/256, if that specific hash algorithm is not used
  • overriding either (or both) algorithms, if hardware acceleration is available on the target platform

For the signing library, disabling an algorithm is not very relevant. The signing library is not optimized for size anyway, and disabling an algorithm improves performance only marginally. Performance can be improved by utilizing hardware acceleration, but this mostly affects key generation, which is a one-time initialization.

For the verification library, disabling an algorithm improves both performance and footprint. If the parameter sets for signatures are chosen such that only one of the hash algorithms is used in practice, it makes sense to disable the unused algorithm. If hardware acceleration is available, providing an override also improves both performance and footprint.

There are two ways of overriding a default hash implementation. The preferred way is to use the internal block format of the library, but this requires that your alternative implementation is compatible with it. If it is not, you can override using a generic digest interface instead, albeit with fewer optimization benefits.

An override with an alternative software implementation, optimized for the specific bitness and endianness of the target platform, may improve performance. However, the provided default implementations are already optimized, and the marginal improvement obtained by providing an alternative software implementation may not be worth the trouble. An override with an alternative software implementation using the generic interface will almost certainly be bigger and slower than the default implementation. In general, it will only be worthwhile to invest in an alternative hash implementation if it utilizes some form of hardware acceleration.

When hardware acceleration is available, it is certainly worthwhile to override the default implementation. The software code that implements a hash algorithm is a substantial portion of the library, and replacing it with either a few dedicated assembly instructions or a call to an embedded crypto engine will reduce both the binary code size and the execution time.

Per target platform and per algorithm, consider the following decision tree for optimization:

flowchart TD
    optimization_required{Is optimization\nrequired?}
        optimization_required -->|no| done((Done))

        optimization_required -->|yes| algorithm_required{Is the hash\nalgorithm\nrequired?}

        algorithm_required -->|no| disable_algorithm[[Compile with\nXMSS_XXX=Disabled]]
        disable_algorithm --> done

        algorithm_required -->|yes| support_internal{Does the optimized\nimplementation support\nthe internal interface?}
            support_internal -->|yes| implement_internal[[Implement the\ninternal interface]]
            implement_internal --> configure_internal[[Compile with\nXMSS_XXX=OverrideInternal]]
            configure_internal --> done

            support_internal -->|no| implement_generic[[Implement the\ngeneric interface]]
            implement_generic --> configure_generic[[Compile with\nXMSS_XXX=OverrideGeneric]]
            configure_generic --> done

To override the SHA-256 algorithm, use the CMake variable XMSS_SHA256; to override the SHAKE256/256 algorithm, use the CMake variable XMSS_SHAKE256_256.

Disabling a hash algorithm

Do not select this option if the verification library must be able to verify signatures generated by third parties and it cannot be guaranteed which hash algorithm the third-party signer will use.

If, on the other hand, a selection for a single hash algorithm can definitively be made, then it is actually recommended to disable the unused algorithm. This not only removes the unused algorithm from the binary, but it also removes all code to select between the algorithms, reducing the binary size even further as well as improving the execution speed slightly.
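As a minimal sketch, a build that only ever handles SHAKE256/256 signatures could drop SHA-256 as shown here. Only the XMSS_SHA256 variable and the Disabled value are taken from this document; running from the library's source root and using an out-of-tree build directory named build are assumptions made for illustration.

```shell
# Configure with SHA-256 removed; SHAKE256/256 keeps its default implementation.
# "build" is an assumed (hypothetical) build directory name.
cmake -S . -B build -DXMSS_SHA256=Disabled

# Build the library with the reduced configuration.
cmake --build build
```

To drop SHAKE256/256 instead, the same pattern would apply with XMSS_SHAKE256_256=Disabled.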

Overriding a hash algorithm

Warning
Overriding a default hash implementation has implications for evaluation/certification of your product.

The default hash implementations were subject to the same development security as the development of the other parts of the XMSS library. When you override a hash algorithm, it is your responsibility to provide sufficient assurance to any auditors/evaluators/certifiers that the relevant development standards are met.

The library uses an optimized internal format for intermediate hash values and message blocks. If your hardware platform is compatible with this internal format, it is preferred to use this internal override option. For example, this format is compatible with the SHA extensions in modern Intel processors (SHA256RNDS2, SHA256MSG1, SHA256MSG2) and with the ARMv8 SHA-256 extensions (sha256h, sha256h2, sha256su0, sha256su1). For hardware platforms that have cryptographic engines that do not provide access to their internal state, the generic override interface can be used instead.

To provide an override, compile the library with XMSS_SHA256 and/or XMSS_SHAKE256_256 defined to either OverrideInternal (preferred) or OverrideGeneric.

Your override implementation should include one of the following headers that provide the prototypes for the functions to implement:

Additionally, to override either or both algorithms, you will have to provide additional source files to compile and/or additional libraries to link with. To provide additional source files, use the CMake variable XMSS_HASH_OVERRIDE_SOURCES; to provide additional libraries to link with, use the CMake variable XMSS_HASH_OVERRIDE_LINK_LIBRARIES. The provided hash override(s) will be tested by make test against the NIST Known Answer Tests (KATs).
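Putting these variables together, a configuration that overrides SHA-256 through the generic interface might look like the sketch below. The file name my_sha256_override.c, the library name my_crypto_engine, and the build directory name are hypothetical placeholders, not names provided by the library. An absolute source path is used as a precaution, since this document does not state how relative paths are resolved, and a Makefile-based CMake generator is assumed so that make test is available.

```shell
# Override SHA-256 via the generic interface; SHAKE256/256 stays on its default.
# my_sha256_override.c and my_crypto_engine are hypothetical placeholders for
# your own override source file and your platform's crypto library.
cmake -S . -B build \
    -DXMSS_SHA256=OverrideGeneric \
    -DXMSS_HASH_OVERRIDE_SOURCES="$PWD/my_sha256_override.c" \
    -DXMSS_HASH_OVERRIDE_LINK_LIBRARIES=my_crypto_engine

# Build the library together with the override.
cmake --build build

# Run the tests, which check the overridden hash against the KATs.
# Assumes a Makefile-based generator was used for the build directory.
make -C build test
```

If the override is implemented entirely in your own source files, the XMSS_HASH_OVERRIDE_LINK_LIBRARIES setting would presumably be omitted; for the internal interface, OverrideInternal would take the place of OverrideGeneric.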