Encoding Spaces

EncodingSpace describes how the last dimension of a PyTorch amplitude tensor maps into Merlin’s canonical Fock basis. Use it after you have decided to pass an amplitude input, but before constructing the StateVector that goes into QuantumLayer.forward.

This is different from choosing angle encoding versus amplitude encoding. If your input is an ordinary real-valued feature matrix such as (batch_size, n_features), start with Angle Encoding and Amplitude Encoding. This page is for tensors whose values are intended to be state amplitudes and whose final dimension indexes a meaningful basis.

This page covers the input encoding API.

Choosing an encoding for PyTorch amplitudes

Leading tensor dimensions are treated as batch dimensions. EncodingSpace only decides what the final dimension means.

| PyTorch input intent | Encoding | Use it when |
|---|---|---|
| (..., fock_size) amplitudes already in full Fock order | EncodingSpace.FOCK | The tensor comes from another photonic simulator, a previous Merlin layer, or code that already emits Merlin’s full Fock basis order. |
| (..., n_unbunched_states) amplitudes over collision-free states | EncodingSpace.UNBUNCHED | The model should place at most one photon in any mode, but the circuit or output measurement still uses full Fock ordering internally. |
| (..., 2**n_qubits) amplitudes over binary logical states | EncodingSpace.DUAL_RAIL | Your latent state is qubit-like and each logical bit should be encoded by one photon shared over two modes. |
| (..., product(widths)) amplitudes over discrete feature blocks | EncodingSpace(modes_per_photon=widths) | The last dimension indexes joint choices from fixed-size blocks, such as one three-choice block and one two-choice block. |
| (..., 2**sum(qubit_groups)) grouped-qubit amplitudes | EncodingSpace.qloq(qubit_groups) | You want QLOQ-style grouping, where each group of logical qubits becomes one photon over a higher-dimensional mode block. |

Do not use EncodingSpace directly for integer class labels, token IDs, or ordinary real feature vectors. Convert those to amplitudes intentionally, or use angle encoding / a classical PyTorch preprocessing module first.
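The expected size of the last dimension for each encoding follows from simple counting. The sketch below uses only the Python standard library and does not call Merlin; the helper names are hypothetical and exist only for illustration.

```python
from math import comb, prod


def fock_size(n_modes: int, n_photons: int) -> int:
    """Count occupation tuples of n_photons identical photons over n_modes."""
    return comb(n_modes + n_photons - 1, n_photons)


def unbunched_size(n_modes: int, n_photons: int) -> int:
    """Count collision-free states: choose which modes hold a single photon."""
    return comb(n_modes, n_photons)


def dual_rail_size(n_qubits: int) -> int:
    """One binary choice per logical qubit."""
    return 2 ** n_qubits


def partitioned_size(widths: list[int]) -> int:
    """One choice per block, so the sizes multiply."""
    return prod(widths)


def qloq_size(qubit_groups: list[int]) -> int:
    """Each group of k qubits contributes 2**k choices."""
    return 2 ** sum(qubit_groups)


sizes = {
    "FOCK (4 modes, 2 photons)": fock_size(4, 2),
    "UNBUNCHED (4 modes, 2 photons)": unbunched_size(4, 2),
    "DUAL_RAIL (2 qubits)": dual_rail_size(2),
    "modes_per_photon=[3, 2]": partitioned_size([3, 2]),
    "qloq([2, 2])": qloq_size([2, 2]),
}
```

Comparing `tensor.shape[-1]` against the relevant count is a cheap sanity check before you construct a StateVector.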

Logical basis and Fock basis

Merlin simulates photonic states in the Fock basis. A Fock basis state is an occupation tuple such as (1, 0, 1, 0), meaning one photon in mode 0 and one photon in mode 2.

Many amplitude inputs are easier to express in a logical basis:

  • A collision-free two-photon input over four modes has only six unbunched states, not the full ten-state Fock basis.

  • A dual-rail two-qubit state has four logical states, embedded into four modes with two photons.

  • An amplitude tensor over two discrete feature blocks, one with three choices and one with two choices, has six logical states. Those states can be embedded as one photon in each feature block.

EncodingSpace stores that mapping. StateVector.from_tensor(..., encoding=...) validates the logical tensor, embeds it into full Fock order, and returns a StateVector that can be passed to QuantumLayer.forward.

StateVector.from_tensor stores the amplitudes as you provide them. StateVector normalizes lazily, only when a normalized dense view or layer execution needs it. The examples below therefore pass raw, unnormalized tensors directly to from_tensor instead of normalizing before construction.
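For reference, normalizing an amplitude vector means dividing it by its L2 norm so that the squared magnitudes sum to one. The sketch below shows that arithmetic in plain Python; it is not Merlin's implementation, and the `normalize` helper is a hypothetical name.

```python
import math


def normalize(amplitudes: list[complex]) -> list[complex]:
    """Divide by the L2 norm so that sum(|a|**2) equals 1."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero amplitude vector")
    return [a / norm for a in amplitudes]


raw = [1.0, 0.0, 0.0, 1.0]          # same raw values as the dual-rail examples
normalized = normalize(raw)          # each nonzero entry becomes 1/sqrt(2)
probabilities = [abs(a) ** 2 for a in normalized]
```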

The usual workflow is:

import torch
from merlin.core import EncodingSpace, StateVector

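# Dual-rail: each logical qubit is one photon shared over two modes.
encoding = EncodingSpace.DUAL_RAIL
# Raw (unnormalized) two-qubit amplitudes on the first and last logical states.
logical = torch.tensor([1.0, 0.0, 0.0, 1.0])

# Validate the logical tensor and embed it into full Fock order.
state = StateVector.from_tensor(logical, encoding=encoding)
# Logical state -> index of that state in the stored Fock tensor.
mapping = state.logical_to_fock_map()

assert state.n_modes == 4
assert state.n_photons == 2
assert mapping == {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 6}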

Use EncodingSpace.logical_to_fock_map() when you want occupation tuples, and use StateVector.logical_to_fock_map() when you want the indices of those states in the stored Fock tensor.
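To see how occupation tuples relate to indices, the sketch below enumerates the four-mode, two-photon Fock basis in plain Python and looks up each dual-rail state. It assumes the basis is listed in descending lexicographic order; that ordering is an assumption chosen because it reproduces the indices asserted above, not a statement of how Merlin builds its basis. The helper names are hypothetical.

```python
from itertools import product


def fock_basis(n_modes: int, n_photons: int) -> list[tuple[int, ...]]:
    """All occupation tuples with the given photon number.

    Assumption: descending lexicographic order.
    """
    states = [
        s for s in product(range(n_photons + 1), repeat=n_modes)
        if sum(s) == n_photons
    ]
    return sorted(states, reverse=True)


def dual_rail_state(bits: tuple[int, ...]) -> tuple[int, ...]:
    """Bit 0 puts the photon in the first mode of its pair, bit 1 in the second."""
    occupation: list[int] = []
    for bit in bits:
        occupation.extend((1, 0) if bit == 0 else (0, 1))
    return tuple(occupation)


basis = fock_basis(4, 2)
# Logical bits -> occupation tuple, then occupation tuple -> position in the basis.
tuple_map = {bits: dual_rail_state(bits) for bits in product((0, 1), repeat=2)}
index_map = {bits: basis.index(state) for bits, state in tuple_map.items()}
```

Here `tuple_map` plays the role of the occupation-tuple view and `index_map` the role of the index view.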

Built-in encodings

FOCK

EncodingSpace.FOCK means the tensor is already in Merlin’s full Fock ordering. You must provide n_modes and n_photons because the same tensor width can correspond to different physical systems.

import torch
from merlin.core import EncodingSpace, StateVector

amplitudes = torch.arange(1, 11, dtype=torch.float32)
state = StateVector.from_tensor(
    amplitudes,
    n_modes=4,
    n_photons=2,
    encoding=EncodingSpace.FOCK,
)

assert state.tensor.shape[-1] == 10
assert state.encoding is EncodingSpace.FOCK

Use this when you already have a full Fock-sized tensor, such as simulator output from another tool.
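The reason n_modes and n_photons are required can be checked with a short count. The sketch below, which is independent of Merlin and uses a hypothetical helper name, searches for every small system whose full Fock basis has a given width.

```python
from math import comb


def systems_with_width(
    width: int, max_modes: int = 12, max_photons: int = 12
) -> list[tuple[int, int]]:
    """Return (n_modes, n_photons) pairs whose Fock basis has `width` states."""
    return [
        (n_modes, n_photons)
        for n_modes in range(1, max_modes + 1)
        for n_photons in range(1, max_photons + 1)
        if comb(n_modes + n_photons - 1, n_photons) == width
    ]


# Several different physical systems share a 10-component Fock tensor.
candidates = systems_with_width(10)
```

A width of 10 is produced by four modes with two photons, but also by three modes with three photons and by ten modes with one photon, so the tensor shape alone cannot identify the system.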

UNBUNCHED

EncodingSpace.UNBUNCHED accepts only collision-free logical states and embeds them into the full Fock tensor. For four modes and two photons, the logical tensor has six components instead of the full ten.

import torch
from merlin.core import EncodingSpace, StateVector

n_modes, n_photons = 4, 2
logical_size = EncodingSpace.UNBUNCHED.logical_basis_size(
    n_modes=n_modes,
    n_photons=n_photons,
)
logical = torch.arange(1, logical_size + 1, dtype=torch.float32)

state = StateVector.from_tensor(
    logical,
    n_modes=n_modes,
    n_photons=n_photons,
    encoding=EncodingSpace.UNBUNCHED,
)

mapping = EncodingSpace.UNBUNCHED.logical_to_fock_map(
    n_modes=n_modes,
    n_photons=n_photons,
)
assert all(max(fock_state) <= 1 for fock_state in mapping.values())
assert state.tensor.shape[-1] == 10

Use this when your model should place at most one photon in any mode, but the downstream circuit or measurement still expects full Fock ordering.
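Conceptually, the embedding scatters the compact amplitudes into the collision-free slots of the full Fock tensor and leaves every bunched slot at zero. The plain-Python sketch below illustrates this under two assumptions: the Fock basis is in descending lexicographic order, and the logical order is that same order restricted to collision-free states. Merlin's actual ordering should be read from logical_to_fock_map(); the helper names here are hypothetical.

```python
from itertools import product


def fock_basis(n_modes: int, n_photons: int) -> list[tuple[int, ...]]:
    """All occupation tuples, assumed to be in descending lexicographic order."""
    states = [
        s for s in product(range(n_photons + 1), repeat=n_modes)
        if sum(s) == n_photons
    ]
    return sorted(states, reverse=True)


def embed_unbunched(
    logical: list[float], n_modes: int, n_photons: int
) -> list[float]:
    """Place compact amplitudes into the collision-free slots of the full basis."""
    basis = fock_basis(n_modes, n_photons)
    slots = [i for i, state in enumerate(basis) if max(state) <= 1]
    if len(logical) != len(slots):
        raise ValueError(f"expected {len(slots)} amplitudes, got {len(logical)}")
    full = [0.0] * len(basis)
    for amplitude, index in zip(logical, slots):
        full[index] = amplitude
    return full


full = embed_unbunched([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], n_modes=4, n_photons=2)
```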

DUAL_RAIL

EncodingSpace.DUAL_RAIL represents each logical qubit with one photon shared over two modes. A two-qubit logical tensor therefore has four components, embedded into four modes with two photons.

import torch
from merlin.core import EncodingSpace, StateVector

logical = torch.tensor([1.0, 0.0, 0.0, 1.0])
state = StateVector.from_tensor(logical, encoding=EncodingSpace.DUAL_RAIL)

assert state.n_modes == 4
assert state.n_photons == 2
assert EncodingSpace.DUAL_RAIL.logical_to_fock_map(
    n_modes=4,
    n_photons=2,
)[(1, 1)] == (0, 1, 0, 1)

This is the most direct choice for qubit-like binary features or logical two-level systems.

Partitioned encodings

Use EncodingSpace(modes_per_photon=[...]) when each photon has its own local set of modes. In ML terms, this is useful when the last tensor dimension indexes amplitudes over discrete feature blocks with different numbers of choices. For example, one block might represent a three-choice feature and the second block might represent a binary feature, giving 3 * 2 = 6 joint logical states.

This is not a replacement for a PyTorch embedding layer. If your raw data is an integer category or a table of real features, first decide how to turn it into amplitudes. The partitioned encoding only describes what those amplitudes mean once you have them.

import torch
from merlin.core import EncodingSpace, StateVector

encoding = EncodingSpace(modes_per_photon=[3, 2])
logical = torch.zeros(encoding.logical_basis_size())
logical[encoding.logical_basis_states().index((2, 1))] = 1.0

state = StateVector.from_tensor(logical, encoding=encoding)

assert state.n_modes == 5
assert state.n_photons == 2
assert encoding.logical_to_fock_map()[(2, 1)] == (0, 0, 1, 0, 1)

Each entry in modes_per_photon reserves a block of modes for one photon. The product of the block widths is the logical tensor length.
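The block layout can be reproduced by hand. The sketch below builds the occupation tuple for each joint choice in plain Python, without calling Merlin; `partitioned_fock_state` is a hypothetical helper, and the order of `logical_states` (last block varying fastest) is an assumption rather than a documented guarantee.

```python
from itertools import product


def partitioned_fock_state(
    choices: tuple[int, ...], widths: list[int]
) -> tuple[int, ...]:
    """Put one photon in each block, at the position selected by `choices`."""
    if len(choices) != len(widths):
        raise ValueError("need exactly one choice per block")
    occupation: list[int] = []
    for choice, width in zip(choices, widths):
        if not 0 <= choice < width:
            raise ValueError(f"choice {choice} is outside a block of width {width}")
        block = [0] * width
        block[choice] = 1
        occupation.extend(block)
    return tuple(occupation)


widths = [3, 2]
# Joint choices over the blocks; the ordering here is illustrative only.
logical_states = list(product(*(range(w) for w in widths)))
mapping = {c: partitioned_fock_state(c, widths) for c in logical_states}
```

With widths of [2, 2] the same helper reproduces the dual-rail layout, which is the special case where every block has two modes.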

QLOQ encodings

EncodingSpace.qloq(qubit_groups=[...]) is a convenience constructor for Qubit Logic on Qudits (QLOQ). A group of k logical qubits becomes one photon delocalized over 2**k modes. For example, qubit_groups=[2, 2] creates modes_per_photon=(4, 4) and a 16-component logical tensor.

QLOQ was introduced for quantum circuit compression in Lysaght et al., "Quantum circuit compression using qubit logic on qudits". The example below uses the same grouping idea for an ML latent state rather than a chemistry VQE: a compact 16-component latent vector is embedded into an 8-mode, 2-photon photonic state and passed through a QuantumLayer.

import torch
from merlin import CircuitBuilder, MeasurementStrategy, QuantumLayer
from merlin.core import EncodingSpace, StateVector
from merlin.core.computation_space import ComputationSpace

encoding = EncodingSpace.qloq(qubit_groups=[2, 2])
latent = torch.zeros(encoding.logical_basis_size(), dtype=torch.complex64)
latent[0] = 1.0
latent[-1] = 1.0

state = StateVector.from_tensor(latent, encoding=encoding)
assert encoding.modes_per_photon == (4, 4)
assert state.n_modes == 8
assert state.n_photons == 2

builder = CircuitBuilder(n_modes=state.n_modes)
builder.add_entangling_layer(trainable=False)
layer = QuantumLayer(
    input_size=0,
    builder=builder,
    n_photons=state.n_photons,
    measurement_strategy=MeasurementStrategy.probs(
        computation_space=ComputationSpace.FOCK,
    ),
)

probabilities = layer(state)
assert probabilities.shape[-1] == layer.output_size
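The grouping itself is plain arithmetic: each group of k bits selects one of 2**k modes in its block. The sketch below shows one way to compute that selection without Merlin. Reading each group with its first bit as the most significant one is an assumption made for illustration, and both helper names are hypothetical.

```python
def qloq_widths(qubit_groups: list[int]) -> tuple[int, ...]:
    """A group of k logical qubits needs a block of 2**k modes."""
    return tuple(2 ** k for k in qubit_groups)


def qloq_choices(
    bits: tuple[int, ...], qubit_groups: list[int]
) -> tuple[int, ...]:
    """Split the bit string into groups and read each group as an integer.

    Assumption: the first bit of each group is the most significant.
    """
    if len(bits) != sum(qubit_groups):
        raise ValueError("bit string length must equal the total number of qubits")
    choices: list[int] = []
    start = 0
    for k in qubit_groups:
        value = 0
        for bit in bits[start:start + k]:
            value = value * 2 + bit
        choices.append(value)
        start += k
    return tuple(choices)


groups = [2, 2]
widths = qloq_widths(groups)                   # block sizes for the two photons
choices = qloq_choices((1, 0, 1, 1), groups)   # mode chosen inside each block
```

Four logical qubits therefore need 8 modes and 2 photons here, compared with 8 modes and 4 photons in a dual-rail layout.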

Migration from manual Fock tensors

Older code sometimes created a full Fock-sized tensor manually, filled a small subset of entries, and passed that full tensor to StateVector.from_tensor. That still works with EncodingSpace.FOCK, but it is no longer necessary when the data is naturally logical.

Manual full-Fock construction:

import torch
from merlin.core import EncodingSpace, StateVector

full = torch.zeros(10, dtype=torch.complex64)
full[2] = 1.0
full[6] = 1.0

state = StateVector.from_tensor(
    full,
    n_modes=4,
    n_photons=2,
    encoding=EncodingSpace.FOCK,
)

Equivalent logical construction:

import torch
from merlin.core import EncodingSpace, StateVector

logical = torch.tensor([1.0, 0.0, 0.0, 1.0], dtype=torch.complex64)

state = StateVector.from_tensor(
    logical,
    encoding=EncodingSpace.DUAL_RAIL,
)

Use the logical form when possible. It records the intended encoding in the StateVector, validates the compact tensor shape, and keeps the mapping available through logical_to_fock_map().
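When migrating, it can help to confirm that an old full-Fock tensor really lives inside the logical subspace before switching to the compact form. The sketch below does this with plain Python lists and the dual-rail index map shown earlier on this page. `full_to_logical` is a hypothetical helper, not part of Merlin, and it assumes the logical order is the sorted order of the logical tuples.

```python
# Index map for two dual-rail qubits, as asserted earlier on this page.
DUAL_RAIL_INDEX = {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 6}


def full_to_logical(
    full: list[complex],
    index_map: dict[tuple[int, ...], int],
    tol: float = 1e-12,
) -> list[complex]:
    """Extract logical amplitudes, refusing tensors with weight outside the map."""
    used = set(index_map.values())
    leaked = [i for i, a in enumerate(full) if i not in used and abs(a) > tol]
    if leaked:
        raise ValueError(f"amplitude outside the logical subspace at indices {leaked}")
    # Assumption: logical order is the sorted order of the logical tuples.
    return [full[index_map[state]] for state in sorted(index_map)]


full = [0.0] * 10
full[2] = 1.0
full[6] = 1.0
logical = full_to_logical(full, DUAL_RAIL_INDEX)
```

If the helper raises, the old tensor uses states that the compact encoding cannot represent, and EncodingSpace.FOCK remains the appropriate choice.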

Testing status

The examples on this page are mirrored by tests/core/test_encoding_examples.py so that the documented workflows stay aligned with Merlin’s runtime behavior.