.. _encoding_space:

===============
Encoding Spaces
===============

``EncodingSpace`` describes how the last dimension of a PyTorch amplitude
tensor maps into Merlin's canonical Fock basis. Use it after you have decided
to pass an amplitude input, but before constructing the
:class:`~merlin.core.state_vector.StateVector` that goes into
``QuantumLayer.forward``.

Choosing an ``EncodingSpace`` is a separate decision from choosing between
angle encoding and amplitude encoding. If your input is an ordinary real-valued
feature matrix such as ``(batch_size, n_features)``, start with
:doc:`angle_amplitude_encoding`. This page is for tensors whose values are
intended to be state amplitudes and whose final dimension indexes a meaningful
basis.

This page covers the input encoding API:

- :class:`~merlin.core.encoding_space.EncodingSpace`
- :meth:`~merlin.core.state_vector.StateVector.from_tensor`
- :meth:`~merlin.core.state_vector.StateVector.logical_to_fock_map`

Choosing an encoding for PyTorch amplitudes
-------------------------------------------

Leading tensor dimensions are treated as batch dimensions. ``EncodingSpace``
only decides what the final dimension means.

.. list-table::
   :header-rows: 1
   :widths: 30 25 45

   * - PyTorch input intent
     - Encoding
     - Use it when
   * - ``(..., fock_size)`` amplitudes already in full Fock order
     - ``EncodingSpace.FOCK``
     - The tensor comes from another photonic simulator, a previous Merlin
       layer, or code that already emits Merlin's full Fock basis order.
   * - ``(..., n_unbunched_states)`` amplitudes over collision-free states
     - ``EncodingSpace.UNBUNCHED``
     - The model should place at most one photon in any mode, but the circuit
       or output measurement still uses full Fock ordering internally.
   * - ``(..., 2**n_qubits)`` amplitudes over binary logical states
     - ``EncodingSpace.DUAL_RAIL``
     - Your latent state is qubit-like and each logical bit should be encoded
       by one photon shared over two modes.
   * - ``(..., product(widths))`` amplitudes over discrete feature blocks
     - ``EncodingSpace(modes_per_photon=widths)``
     - The last dimension indexes joint choices from fixed-size blocks, such
       as one three-choice block and one two-choice block.
   * - ``(..., 2**sum(qubit_groups))`` grouped-qubit amplitudes
     - ``EncodingSpace.qloq(qubit_groups)``
     - You want QLOQ-style grouping, where each group of logical qubits
       becomes one photon over a higher-dimensional mode block.

Do not use ``EncodingSpace`` directly for integer class labels, token IDs, or
ordinary real feature vectors. Convert those to amplitudes intentionally, or
use angle encoding or a classical PyTorch preprocessing module first.

Logical basis and Fock basis
----------------------------

Merlin simulates photonic states in the Fock basis. A Fock basis state is an
occupation tuple such as ``(1, 0, 1, 0)``, meaning one photon in mode 0 and one
photon in mode 2.

Many amplitude inputs are easier to express in a logical basis:

- A collision-free two-photon input over four modes has only six unbunched
  states, not the full ten-state Fock basis.
- A dual-rail two-qubit state has four logical states, embedded into four
  modes with two photons.
- An amplitude tensor over two discrete feature blocks, one with three choices
  and one with two choices, has six logical states. Those states can be
  embedded as one photon in each feature block.

``EncodingSpace`` stores that mapping.
``StateVector.from_tensor(..., encoding=...)`` validates the logical tensor,
embeds it into full Fock order, and returns a ``StateVector`` that can be
passed to :meth:`~merlin.algorithms.layer.QuantumLayer.forward`.

``StateVector.from_tensor`` stores the amplitudes exactly as you provide them.
``StateVector`` normalizes lazily, when a normalized dense view or a layer
execution needs it. The examples below therefore pass raw illustrative tensors
directly to ``from_tensor`` instead of normalizing before construction.

The usual workflow is:

.. code-block:: python

    import torch

    from merlin.core import EncodingSpace, StateVector

    encoding = EncodingSpace.DUAL_RAIL
    # Two logical qubits: unnormalized amplitudes on (0, 0) and (1, 1).
    logical = torch.tensor([1.0, 0.0, 0.0, 1.0])

    state = StateVector.from_tensor(logical, encoding=encoding)
    mapping = state.logical_to_fock_map()

    assert state.n_modes == 4
    assert state.n_photons == 2
    # Each logical state maps to its index in the stored Fock tensor.
    assert mapping == {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 6}

Use
:meth:`EncodingSpace.logical_to_fock_map() <merlin.core.encoding_space.EncodingSpace.logical_to_fock_map>`
when you want occupation tuples, and use
:meth:`StateVector.logical_to_fock_map() <merlin.core.state_vector.StateVector.logical_to_fock_map>`
when you want the indices of those states in the stored Fock tensor.

Built-in encodings
------------------

FOCK
^^^^

``EncodingSpace.FOCK`` means the tensor is already in Merlin's full Fock
ordering. You must provide ``n_modes`` and ``n_photons`` because the same
tensor width can correspond to different physical systems.

.. code-block:: python

    import torch

    from merlin.core import EncodingSpace, StateVector

    # Ten amplitudes: the full Fock basis size for 4 modes and 2 photons.
    amplitudes = torch.arange(1, 11, dtype=torch.float32)

    state = StateVector.from_tensor(
        amplitudes,
        n_modes=4,
        n_photons=2,
        encoding=EncodingSpace.FOCK,
    )

    assert state.tensor.shape[-1] == 10
    assert state.encoding is EncodingSpace.FOCK

Use this when you already have a full Fock-sized tensor, such as simulator
output from another tool.

UNBUNCHED
^^^^^^^^^

``EncodingSpace.UNBUNCHED`` accepts only collision-free logical states and
embeds them into the full Fock tensor. For four modes and two photons, the
logical tensor has six components instead of the full ten.

.. code-block:: python

    import torch

    from merlin.core import EncodingSpace, StateVector

    n_modes, n_photons = 4, 2
    # Six collision-free states for 4 modes and 2 photons.
    logical_size = EncodingSpace.UNBUNCHED.logical_basis_size(
        n_modes=n_modes,
        n_photons=n_photons,
    )
    logical = torch.arange(1, logical_size + 1, dtype=torch.float32)

    state = StateVector.from_tensor(
        logical,
        n_modes=n_modes,
        n_photons=n_photons,
        encoding=EncodingSpace.UNBUNCHED,
    )
    mapping = EncodingSpace.UNBUNCHED.logical_to_fock_map(
        n_modes=n_modes,
        n_photons=n_photons,
    )

    # No mapped occupation tuple holds more than one photon per mode.
    assert all(max(fock_state) <= 1 for fock_state in mapping.values())
    # The stored tensor is still full Fock-sized.
    assert state.tensor.shape[-1] == 10

Use this when your model should place at most one photon in any mode, but the
downstream circuit or measurement still expects full Fock ordering.

DUAL_RAIL
^^^^^^^^^

``EncodingSpace.DUAL_RAIL`` represents each logical qubit with one photon
shared over two modes. A two-qubit logical tensor therefore has four
components, embedded into four modes with two photons.

.. code-block:: python

    import torch

    from merlin.core import EncodingSpace, StateVector

    logical = torch.tensor([1.0, 0.0, 0.0, 1.0])
    state = StateVector.from_tensor(logical, encoding=EncodingSpace.DUAL_RAIL)

    assert state.n_modes == 4
    assert state.n_photons == 2
    # Logical (1, 1) places each photon in the second mode of its pair.
    assert EncodingSpace.DUAL_RAIL.logical_to_fock_map(
        n_modes=4,
        n_photons=2,
    )[(1, 1)] == (0, 1, 0, 1)

This is the most direct choice for qubit-like binary features or logical
two-level systems.

Partitioned encodings
---------------------

Use ``EncodingSpace(modes_per_photon=[...])`` when each photon has its own
local set of modes. In ML terms, this is useful when the last tensor dimension
indexes amplitudes over discrete feature blocks with different numbers of
choices. For example, one block might represent a three-choice feature and the
second block might represent a binary feature, giving ``3 * 2 = 6`` joint
logical states.

This is not a replacement for a PyTorch embedding layer. If your raw data is an
integer category or a table of real features, first decide how to turn it into
amplitudes. The partitioned encoding only describes what those amplitudes mean
once you have them.

.. code-block:: python

    import torch

    from merlin.core import EncodingSpace, StateVector

    encoding = EncodingSpace(modes_per_photon=[3, 2])
    logical = torch.zeros(encoding.logical_basis_size())
    # Select logical state (2, 1): choice 2 in the first block, choice 1 in the second.
    logical[encoding.logical_basis_states().index((2, 1))] = 1.0

    state = StateVector.from_tensor(logical, encoding=encoding)

    assert state.n_modes == 5
    assert state.n_photons == 2
    assert encoding.logical_to_fock_map()[(2, 1)] == (0, 0, 1, 0, 1)

Each entry in ``modes_per_photon`` reserves a block of modes for one photon.
The product of the block widths is the logical tensor length.

QLOQ encodings
--------------

``EncodingSpace.qloq(qubit_groups=[...])`` is a convenience constructor for
Qubit Logic on Qudits (QLOQ). A group of ``k`` logical qubits becomes one
photon delocalized over ``2**k`` modes. For example, ``qubit_groups=[2, 2]``
creates ``modes_per_photon=(4, 4)`` and a 16-component logical tensor.

QLOQ was introduced for quantum circuit compression in Lysaght et al.,
*Quantum circuit compression using qubit logic on qudits*. The example below
uses the same grouping idea for an ML latent state rather than a chemistry VQE:
a compact 16-component latent vector is embedded into an 8-mode, 2-photon
photonic state and passed through a ``QuantumLayer``.

.. code-block:: python

    import torch

    from merlin import CircuitBuilder, MeasurementStrategy, QuantumLayer
    from merlin.core import EncodingSpace, StateVector
    from merlin.core.computation_space import ComputationSpace

    encoding = EncodingSpace.qloq(qubit_groups=[2, 2])
    latent = torch.zeros(encoding.logical_basis_size(), dtype=torch.complex64)
    # Unnormalized superposition of the first and last logical basis states.
    latent[0] = 1.0
    latent[-1] = 1.0

    state = StateVector.from_tensor(latent, encoding=encoding)

    assert encoding.modes_per_photon == (4, 4)
    assert state.n_modes == 8
    assert state.n_photons == 2

    builder = CircuitBuilder(n_modes=state.n_modes)
    builder.add_entangling_layer(trainable=False)

    layer = QuantumLayer(
        input_size=0,
        builder=builder,
        n_photons=state.n_photons,
        measurement_strategy=MeasurementStrategy.probs(
            computation_space=ComputationSpace.FOCK,
        ),
    )

    probabilities = layer(state)
    assert probabilities.shape[-1] == layer.output_size

Migration from manual Fock tensors
----------------------------------

Older code sometimes created a full Fock-sized tensor manually, filled a small
subset of entries, and passed that full tensor to ``StateVector.from_tensor``.
That still works with ``EncodingSpace.FOCK``, but it is no longer necessary
when the data is naturally logical.

Manual full-Fock construction:

.. code-block:: python

    import torch

    from merlin.core import EncodingSpace, StateVector

    full = torch.zeros(10, dtype=torch.complex64)
    # Indices 2 and 6 hold the dual-rail (0, 0) and (1, 1) states in full Fock order.
    full[2] = 1.0
    full[6] = 1.0

    state = StateVector.from_tensor(
        full,
        n_modes=4,
        n_photons=2,
        encoding=EncodingSpace.FOCK,
    )

Equivalent logical construction:

.. code-block:: python

    import torch

    from merlin.core import EncodingSpace, StateVector

    logical = torch.tensor([1.0, 0.0, 0.0, 1.0], dtype=torch.complex64)

    state = StateVector.from_tensor(
        logical,
        encoding=EncodingSpace.DUAL_RAIL,
    )

Use the logical form when possible. It records the intended encoding in the
``StateVector``, validates the compact tensor shape, and keeps the mapping
available through ``logical_to_fock_map()``.

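
To see where the indices ``2`` and ``6`` in the manual construction come from,
the following plain-Python sketch enumerates the four-mode, two-photon Fock
basis and looks up each dual-rail state. It does not call Merlin. The helper
names ``fock_basis`` and ``dual_rail_occupation`` are hypothetical, and the
descending lexicographic ordering is an assumption chosen because it reproduces
the indices quoted on this page; it is not read from Merlin's implementation.

.. code-block:: python

    from itertools import product


    def fock_basis(n_modes, n_photons):
        """List occupation tuples in descending lexicographic order.

        The ordering is an assumption made for this sketch, not Merlin's API.
        """
        states = [
            occ
            for occ in product(range(n_photons + 1), repeat=n_modes)
            if sum(occ) == n_photons
        ]
        return sorted(states, reverse=True)


    def dual_rail_occupation(bits):
        """Encode each bit as one photon in a mode pair: 0 -> (1, 0), 1 -> (0, 1)."""
        occupation = []
        for bit in bits:
            occupation.extend((1, 0) if bit == 0 else (0, 1))
        return tuple(occupation)


    basis = fock_basis(n_modes=4, n_photons=2)
    index_of = {occ: i for i, occ in enumerate(basis)}

    # Index of every two-qubit dual-rail state inside the full Fock basis.
    dual_rail_indices = {
        bits: index_of[dual_rail_occupation(bits)]
        for bits in product((0, 1), repeat=2)
    }
    # Collision-free states: at most one photon per mode.
    unbunched = [occ for occ in basis if max(occ) <= 1]

    print(len(basis))         # 10
    print(len(unbunched))     # 6
    print(dual_rail_indices)  # {(0, 0): 2, (0, 1): 3, (1, 0): 5, (1, 1): 6}

Under that assumed ordering, the four dual-rail states occupy indices 2, 3, 5,
and 6 of the ten-component Fock tensor, which is why filling ``full[2]`` and
``full[6]`` by hand matches the logical tensor ``[1, 0, 0, 1]``. In real code,
prefer ``logical_to_fock_map()`` over a hand-written enumeration.
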

Testing status
--------------

The examples on this page are mirrored by
``tests/core/test_encoding_examples.py`` so that the documented workflows stay
aligned with Merlin's runtime behavior.