Encoding Spaces
EncodingSpace describes how the last dimension of a PyTorch amplitude
tensor maps into Merlin’s canonical Fock basis. Use it after you have decided
to pass an amplitude input, but before constructing the
StateVector that goes into
QuantumLayer.forward.
This is different from choosing angle encoding versus amplitude encoding. If
your input is an ordinary real-valued feature matrix such as
(batch_size, n_features), start with Angle Encoding and Amplitude Encoding.
This page is for tensors whose values are intended to be state amplitudes and
whose final dimension indexes a meaningful basis.
This page covers the input encoding API:
Choosing an encoding for PyTorch amplitudes
Leading tensor dimensions are treated as batch dimensions. EncodingSpace
only decides what the final dimension means.
PyTorch input intent |
Encoding |
Use it when |
|---|---|---|
|
|
The tensor comes from another photonic simulator, a previous Merlin layer, or code that already emits Merlin’s full Fock basis order. |
|
|
The model should only place one photon in any mode, but the circuit or output measurement still uses full Fock ordering internally. |
|
|
Your latent state is qubit-like and each logical bit should be encoded by one photon shared over two modes. |
|
|
The last dimension indexes joint choices from fixed-size blocks, such as one three-choice block and one two-choice block. |
|
|
You want QLOQ-style grouping, where each group of logical qubits becomes one photon over a higher-dimensional mode block. |
Do not use EncodingSpace directly for integer class labels, token IDs, or
ordinary real feature vectors. Convert those to amplitudes intentionally, or
use angle encoding / a classical PyTorch preprocessing module first.
Logical basis and Fock basis
Merlin simulates photonic states in the Fock basis. A Fock basis state is an
occupation tuple such as (1, 0, 1, 0), meaning one photon in mode 0 and one
photon in mode 2.
Many amplitude inputs are easier to express in a logical basis:
A collision-free two-photon input over four modes has only six unbunched states, not the full ten-state Fock basis.
A dual-rail two-qubit state has four logical states, embedded into four modes with two photons.
An amplitude tensor over two discrete feature blocks, one with three choices and one with two choices, has six logical states. Those states can be embedded as one photon in each feature block.
EncodingSpace stores that mapping. StateVector.from_tensor(...,
encoding=...) validates the logical tensor, embeds it into full Fock order,
and returns a StateVector that can be passed to
forward.
StateVector.from_tensor stores the amplitudes you provide and lets
StateVector normalize lazily when a normalized dense view or layer
execution needs it. The examples below therefore pass raw illustrative tensors
directly to from_tensor instead of normalizing before construction.
The usual workflow is:
import torch
from merlin.core import EncodingSpace, StateVector
encoding = EncodingSpace.DUAL_RAIL
logical = torch.tensor([1.0, 0.0, 0.0, 1.0])
state = StateVector.from_tensor(logical, encoding=encoding)
mapping = state.logical_to_fock_map()
assert state.n_modes == 4
assert state.n_photons == 2
assert mapping == {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 6}
Use EncodingSpace.logical_to_fock_map() when you want
occupation tuples, and use StateVector.logical_to_fock_map() when you want the
indices of those states in the stored Fock tensor.
Built-in encodings
FOCK
EncodingSpace.FOCK means the tensor is already in Merlin’s full Fock
ordering. You must provide n_modes and n_photons because the same
tensor width can correspond to different physical systems.
import torch
from merlin.core import EncodingSpace, StateVector
amplitudes = torch.arange(1, 11, dtype=torch.float32)
state = StateVector.from_tensor(
amplitudes,
n_modes=4,
n_photons=2,
encoding=EncodingSpace.FOCK,
)
assert state.tensor.shape[-1] == 10
assert state.encoding is EncodingSpace.FOCK
Use this when you already have a full Fock-sized tensor, such as simulator output from another tool.
UNBUNCHED
EncodingSpace.UNBUNCHED accepts only collision-free logical states and
embeds them into the full Fock tensor. For four modes and two photons, the
logical tensor has six components instead of the full ten.
import torch
from merlin.core import EncodingSpace, StateVector
n_modes, n_photons = 4, 2
logical_size = EncodingSpace.UNBUNCHED.logical_basis_size(
n_modes=n_modes,
n_photons=n_photons,
)
logical = torch.arange(1, logical_size + 1, dtype=torch.float32)
state = StateVector.from_tensor(
logical,
n_modes=n_modes,
n_photons=n_photons,
encoding=EncodingSpace.UNBUNCHED,
)
mapping = EncodingSpace.UNBUNCHED.logical_to_fock_map(
n_modes=n_modes,
n_photons=n_photons,
)
assert all(max(fock_state) <= 1 for fock_state in mapping.values())
assert state.tensor.shape[-1] == 10
Use this when your model should only put one photon in any mode, but the downstream circuit or measurement still expects full Fock ordering.
DUAL_RAIL
EncodingSpace.DUAL_RAIL represents each logical qubit with one photon
shared over two modes. A two-qubit logical tensor therefore has four
components, embedded into four modes with two photons.
import torch
from merlin.core import EncodingSpace, StateVector
logical = torch.tensor([1.0, 0.0, 0.0, 1.0])
state = StateVector.from_tensor(logical, encoding=EncodingSpace.DUAL_RAIL)
assert state.n_modes == 4
assert state.n_photons == 2
assert EncodingSpace.DUAL_RAIL.logical_to_fock_map(
n_modes=4,
n_photons=2,
)[(1, 1)] == (0, 1, 0, 1)
This is the most direct choice for qubit-like binary features or logical two-level systems.
Partitioned encodings
Use EncodingSpace(modes_per_photon=[...]) when each photon has its own
local set of modes. In ML terms, this is useful when the last tensor dimension
indexes amplitudes over discrete feature blocks with different numbers of
choices. For example, one block might represent a three-choice feature and the
second block might represent a binary feature, giving 3 * 2 = 6 joint
logical states.
This is not a replacement for a PyTorch embedding layer. If your raw data is an integer category or a table of real features, first decide how to turn it into amplitudes. The partitioned encoding only describes what those amplitudes mean once you have them.
import torch
from merlin.core import EncodingSpace, StateVector
encoding = EncodingSpace(modes_per_photon=[3, 2])
logical = torch.zeros(encoding.logical_basis_size())
logical[encoding.logical_basis_states().index((2, 1))] = 1.0
state = StateVector.from_tensor(logical, encoding=encoding)
assert state.n_modes == 5
assert state.n_photons == 2
assert encoding.logical_to_fock_map()[(2, 1)] == (0, 0, 1, 0, 1)
Each entry in modes_per_photon reserves a block of modes for one photon.
The product of the block widths is the logical tensor length.
QLOQ encodings
EncodingSpace.qloq(qubit_groups=[...]) is a convenience constructor for
Qubit Logic on Qudits (QLOQ). A group of k logical qubits becomes one
photon delocalized over 2**k modes. For example,
qubit_groups=[2, 2] creates modes_per_photon=(4, 4) and a
16-component logical tensor.
QLOQ was introduced for quantum circuit compression in Lysaght et al.,
Quantum circuit compression using qubit logic on qudits.
The example below uses the same grouping idea for an ML latent state rather
than a chemistry VQE: a compact 16-component latent vector is embedded into an
8-mode, 2-photon photonic state and passed through a QuantumLayer.
import torch
from merlin import CircuitBuilder, MeasurementStrategy, QuantumLayer
from merlin.core import EncodingSpace, StateVector
from merlin.core.computation_space import ComputationSpace
encoding = EncodingSpace.qloq(qubit_groups=[2, 2])
latent = torch.zeros(encoding.logical_basis_size(), dtype=torch.complex64)
latent[0] = 1.0
latent[-1] = 1.0
state = StateVector.from_tensor(latent, encoding=encoding)
assert encoding.modes_per_photon == (4, 4)
assert state.n_modes == 8
assert state.n_photons == 2
builder = CircuitBuilder(n_modes=state.n_modes)
builder.add_entangling_layer(trainable=False)
layer = QuantumLayer(
input_size=0,
builder=builder,
n_photons=state.n_photons,
measurement_strategy=MeasurementStrategy.probs(
computation_space=ComputationSpace.FOCK,
),
)
probabilities = layer(state)
assert probabilities.shape[-1] == layer.output_size
Migration from manual Fock tensors
Older code sometimes created a full Fock-sized tensor manually, filled a small
subset of entries, and passed that full tensor to StateVector.from_tensor.
That still works with EncodingSpace.FOCK, but it is no longer necessary
when the data is naturally logical.
Manual full-Fock construction:
import torch
from merlin.core import EncodingSpace, StateVector
full = torch.zeros(10, dtype=torch.complex64)
full[2] = 1.0
full[6] = 1.0
state = StateVector.from_tensor(
full,
n_modes=4,
n_photons=2,
encoding=EncodingSpace.FOCK,
)
Equivalent logical construction:
import torch
from merlin.core import EncodingSpace, StateVector
logical = torch.tensor([1.0, 0.0, 0.0, 1.0], dtype=torch.complex64)
state = StateVector.from_tensor(
logical,
encoding=EncodingSpace.DUAL_RAIL,
)
Use the logical form when possible. It records the intended encoding in the
StateVector, validates the compact tensor shape, and keeps the mapping
available through logical_to_fock_map().
Testing status
The examples on this page are mirrored by
tests/core/test_encoding_examples.py so that the documented workflows stay
aligned with Merlin’s runtime behavior.