QuantumLayer Essentials

The QuantumLayer is MerLin’s core building block for integrating quantum computation into a machine learning pipeline. It combines a Perceval photonic circuit (or experiment), optional classical parameters, and detector logic into a single differentiable module.

Overview

  • Autograd ready – QuantumLayer exposes a PyTorch Module interface, supports batching and differentiable forward passes, and plays nicely with optimisers or higher-level architectures.

  • Input encoding strategies – Pick a data encoding method: angle or amplitude encoding. See Angle Encoding and Amplitude Encoding for more information.

  • Output measurement strategies – Select between probabilities, per-mode expectations, or raw amplitudes through MeasurementStrategy. The layer rejects incompatible combinations (e.g. detectors with amplitude read-out). For more information on this and all of the possible output configurations, visit Measurement Strategy Guide.

  • Grouping strategy – The grouping strategy used to format the output of the QuantumLayer to the desired size can be defined directly in the measurement_strategy parameter. See Grouping Guide for more information.

  • Multiple construction paths – Build layers from the convenience simple() factory, a CircuitBuilder, a custom perceval.Circuit or a fully specified perceval.Experiment.

  • Detector awareness – Layers automatically derive detector transforms from the experiment, enabling threshold, PNR, or hybrid detection schemes.

  • Photon-loss aware – Experiments carrying a perceval.NoiseModel trigger an automatic photon-loss transform so survival and loss outcomes share a single, normalised output distribution.

Initialisation recipes

QuantumLayer.simple()

The simple() helper generates a trainable interferometer with angle encoding that has as many modes as the input size. It is convenient for quick experiments, for baselines, or for machine learning experts with no prior knowledge of quantum machine learning.

import torch
import merlin as ML

layer = ML.QuantumLayer.simple(
    input_size=4,
    measurement_strategy=ML.MeasurementStrategy.probs(),
)

x = torch.rand(16, 4)
probs = layer(x)

CircuitBuilder

Use MerLin’s CircuitBuilder utilities to describe a circuit at a higher level. The builder keeps track of which parameters are trainable and which are used as layer inputs, and a prefix-based naming scheme separates the two groups. This is an ideal tool for quantum machine learning experts who have no experience with Perceval. More information is available in the CircuitBuilder API reference: CircuitBuilder

import torch
import merlin as ML

builder = ML.CircuitBuilder(n_modes=4)
builder.add_superpositions(depth=1)
builder.add_angle_encoding(modes=[0, 1], name="x")
builder.add_rotations(trainable=True, name="theta")

layer = ML.QuantumLayer(
    input_size=2,
    builder=builder,
    measurement_strategy=ML.MeasurementStrategy.probs(computation_space=ML.ComputationSpace.UNBUNCHED),
)

x = torch.rand(4, 2)
probs = layer(x)
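
The prefix-based separation described above can be pictured as a plain name filter. The sketch below is illustrative only: the parameter names (x_0, theta_0, ...) and the helper split_by_prefix are hypothetical and are not taken from MerLin's implementation.

```python
def split_by_prefix(names, input_prefixes, trainable_prefixes):
    """Partition parameter names into layer inputs and trainables by prefix."""
    inputs = [n for n in names if n.startswith(tuple(input_prefixes))]
    trainables = [n for n in names if n.startswith(tuple(trainable_prefixes))]
    return inputs, trainables


# Hypothetical parameter names, chosen only for illustration.
names = ["x_0", "x_1", "theta_0", "theta_1", "theta_2"]
inputs, trainables = split_by_prefix(names, ["x"], ["theta"])

print(inputs)      # names fed from the classical input tensor
print(trainables)  # names that would be optimised during training
```

In this picture, the number of input names is what input_size has to agree with.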

Custom circuit

When you already have a perceval.Circuit, provide the classical input layout and the trainable parameter prefixes explicitly. This initialization requires a good understanding of Perceval.

import perceval as pcvl
import torch
import merlin as ML

circuit = pcvl.Circuit(3)
circuit.add((0, 1), pcvl.BS())
circuit.add(0, pcvl.PS(pcvl.P("phi")))

layer = ML.QuantumLayer(
    input_size=1,
    circuit=circuit,
    input_parameters=["phi"],
    trainable_parameters=["theta"],
    input_state=[1, 0, 0],
    measurement_strategy=ML.MeasurementStrategy.probs(),
)

x = torch.linspace(0.0, 1.0, steps=8).unsqueeze(1)
probs = layer(x)
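
For a single photon, the way an encoded phase reaches the output probabilities can be checked by hand. The sketch below uses plain Python rather than Perceval or MerLin, and assumes a symmetric 50:50 beam splitter convention; the helper names are hypothetical. It shows that a phase shifter only changes detection probabilities once the two modes interfere again, for example through a second beam splitter.

```python
import cmath
import math

# Symmetric 50:50 beam splitter on two modes (assumed convention).
S = 1 / math.sqrt(2)
BS = [[S, 1j * S], [1j * S, S]]


def apply(u, amps):
    """Apply a 2x2 unitary to single-photon mode amplitudes."""
    return [
        u[0][0] * amps[0] + u[0][1] * amps[1],
        u[1][0] * amps[0] + u[1][1] * amps[1],
    ]


def phase_shift(phi, amps):
    """Phase shifter of angle phi on mode 0."""
    return [cmath.exp(1j * phi) * amps[0], amps[1]]


def probabilities(amps):
    return [abs(a) ** 2 for a in amps]


phi = 0.7
start = [1 + 0j, 0j]  # one photon in mode 0

bs_then_ps = phase_shift(phi, apply(BS, start))
mzi = apply(BS, bs_then_ps)  # close the interferometer with a second BS

p_open = probabilities(bs_then_ps)  # stays at [0.5, 0.5] for any phi
p_mzi = probabilities(mzi)          # [sin^2(phi/2), cos^2(phi/2)]
print(p_open, p_mzi)
```

This is why circuits meant to learn from x usually place further mixing elements after the encoding phase shifters.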

Note

input_state=[...] is accepted as a convenience input, but the layer stores it as a perceval.BasicState (access the occupation vector via list(layer.input_state)).

Experiment-driven

If you want to simulate a noise model or specify detector characteristics, configure a perceval.Experiment and pass it directly. The QuantumLayer inherits the circuit, detectors, and any photon-loss noise model you attached. Of all the construction paths, this one gives the most control over the QuantumLayer.

import perceval as pcvl
import torch
import merlin as ML

circuit = pcvl.Circuit(2)
circuit.add((0, 1), pcvl.BS())

experiment = pcvl.Experiment(circuit)
experiment.detectors[0] = pcvl.Detector.threshold()
experiment.detectors[1] = pcvl.Detector.pnr()
experiment.noise = pcvl.NoiseModel(brightness=0.95, transmittance=0.9)

layer = ML.QuantumLayer(
    input_size=0,
    experiment=experiment,
    input_state=[1, 1],
    measurement_strategy=ML.MeasurementStrategy.probs(),
)

probs = layer()
detector_keys = layer.output_keys
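
Conceptually, a threshold detector only reports whether a mode clicked, so several photon-number outcomes collapse onto the same key. The following is a minimal sketch in plain Python, not MerLin's DetectorTransform: the helper apply_threshold is hypothetical and the input distribution is made up for illustration.

```python
from collections import defaultdict


def apply_threshold(fock_probs, threshold_modes):
    """Merge photon-number outcomes that a threshold detector cannot tell apart.

    fock_probs maps occupation tuples to probabilities; modes listed in
    threshold_modes report min(n, 1), the others keep the photon number.
    """
    out = defaultdict(float)
    for state, p in fock_probs.items():
        seen = tuple(
            min(n, 1) if mode in threshold_modes else n
            for mode, n in enumerate(state)
        )
        out[seen] += p
    return dict(out)


# Made-up photon-number-resolved distribution over two modes.
pnr_probs = {(2, 0): 0.3, (1, 0): 0.2, (0, 1): 0.5}

# Threshold detection on mode 0, photon-number resolution on mode 1.
detected = apply_threshold(pnr_probs, threshold_modes={0})
print(detected)  # (2, 0) and (1, 0) are merged into the key (1, 0)
```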

Photon loss and detectors

  • Without an experiment, the layer defaults to ideal PNR detection on every mode, mirroring Perceval’s default behaviour.

  • experiment.noise = pcvl.NoiseModel(...) adds photon-loss sampling ahead of detector transforms. The resulting output_keys and output_size cover every survival/loss configuration implied by the noise model.

  • MeasurementStrategy.amplitudes() requires access to raw complex amplitudes and is therefore incompatible with custom detectors or photon-loss noise models. Attempting this combination raises a RuntimeError. To emulate a detector pipeline while still inspecting amplitudes, run the layer without detectors and apply DetectorTransform manually to the resulting amplitudes.

  • Inspect layer.output_keys (a property, not a method) to see the classical outcomes produced by the detector transform.
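
To see why photon loss enlarges the set of output keys, consider a simplified model in which every photon survives independently with the same probability eta. This is an assumption made for the sketch; how Perceval combines brightness and transmittance is not modelled here, and apply_uniform_loss is a hypothetical helper.

```python
from itertools import product
from math import comb


def apply_uniform_loss(fock_probs, eta):
    """Spread each outcome over all survival patterns (binomial per mode)."""
    out = {}
    for state, p in fock_probs.items():
        # For a mode holding n photons, k survive with binomial probability.
        per_mode = [
            [(k, comb(n, k) * eta**k * (1 - eta) ** (n - k)) for k in range(n + 1)]
            for n in state
        ]
        for combo in product(*per_mode):
            survived = tuple(k for k, _ in combo)
            weight = p
            for _, w in combo:
                weight *= w
            out[survived] = out.get(survived, 0.0) + weight
    return out


lossy = apply_uniform_loss({(1, 1): 1.0}, eta=0.9)
for key in sorted(lossy, reverse=True):
    print(key, round(lossy[key], 4))
```

The survival outcome (1, 1) and the loss outcomes (1, 0), (0, 1) and (0, 0) share one distribution that still sums to 1.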

Notes

  • input_state must match the number of circuit modes. When unspecified, the photons (denoted by n_photons) are evenly distributed across the modes (for instance, for dual-rail it defaults to [1,0,1,0,...]).

  • Both strong simulation (SLOS, which computes exact probabilities) and weak simulation (sampling) are supported. Sampling is enabled through the shots and sampling_method arguments of forward(). See SLOS: Strong Linear Optical Simulator for more information about strong and weak simulations.

  • The layer.parameters() method provides access to the trainable parameters (if any), just like any standard PyTorch layer.

  • Inspect layer.has_custom_noise_model and layer.output_keys to confirm whether photon loss is active and how it alters the output distribution.
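
The difference between strong and weak simulation can be illustrated without any quantum library: strong simulation yields the exact distribution, while weak simulation estimates it from a finite number of shots. The sketch below uses only the standard library and is not MerLin's sampler; sample_frequencies is a hypothetical helper, and the exact distribution is the ideal two-photon outcome of a 50:50 beam splitter with indistinguishable photons.

```python
import random
from collections import Counter


def sample_frequencies(probs, shots, seed=None):
    """Draw `shots` outcomes from an exact distribution and return frequencies."""
    rng = random.Random(seed)
    outcomes = list(probs)
    weights = [probs[o] for o in outcomes]
    draws = rng.choices(outcomes, weights=weights, k=shots)
    counts = Counter(draws)
    return {o: counts[o] / shots for o in outcomes}


# Exact (strong-simulation style) distribution: ideal two-photon bunching.
exact = {(2, 0): 0.5, (1, 1): 0.0, (0, 2): 0.5}

# Weak-simulation style estimate from a finite number of shots.
estimate = sample_frequencies(exact, shots=10_000, seed=0)
print(estimate)
```

The estimate fluctuates around the exact values, with an error that shrinks roughly as the inverse square root of the number of shots.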

Warning

Deprecated since version 0.3: The no_bunching flag is deprecated and has been removed as of version 0.3.0. Use the computation_space flag inside measurement_strategy instead. See Migration guide.

API Reference

class merlin.algorithms.layer.QuantumLayer(input_size=None, builder=None, circuit=None, experiment=None, input_state=None, n_photons=None, trainable_parameters=None, input_parameters=None, amplitude_encoding=False, computation_space=None, measurement_strategy=None, return_object=False, device=None, dtype=None)

Bases: MerlinModule

Quantum Neural Network Layer with factory-based architecture.

This layer can be created from a CircuitBuilder instance, a pre-compiled pcvl.Circuit, or an Experiment.

export_config()

Export a standalone configuration for remote execution.

Return type:

dict

forward(*input_parameters, shots=None, sampling_method=None, simultaneous_processes=None)

Forward pass through the quantum layer.

Encoding is inferred from the input type:

  • torch.Tensor (float): angle encoding (compatible with nn.Sequential)

  • torch.Tensor (complex): amplitude encoding

  • StateVector: amplitude encoding (preferred for quantum state injection)

Parameters

*input_parameters : torch.Tensor | StateVector

Input data. For angle encoding, pass float tensors. For amplitude encoding, pass a single StateVector or complex tensor.

shots : int | None, optional

Number of samples; if 0 or None, return exact amplitudes/probabilities.

sampling_method : str | None, optional

Sampling method, e.g. "multinomial".

simultaneous_processes : int | None, optional

Batch size hint for parallel computation.

Returns

torch.Tensor | PartialMeasurement | StateVector | ProbabilityDistribution

Output after measurement mapping. Depending on the return_object argument and measurement strategy defined in the input, the output type will be different. Check the constructor for more details.

Raises

TypeError

If inputs mix torch.Tensor and StateVector, or if an unsupported input type is provided.

ValueError

If multiple StateVector inputs are provided.

property has_custom_detectors: bool

property output_keys

Return the Fock basis associated with the layer outputs.

property output_size: int

prepare_parameters(input_parameters)

Prepare parameter list for circuit evaluation.

Return type:

list[Tensor]

set_input_state(input_state)

set_sampling_config(shots=None, sampling_method=None)

Deprecated: sampling configuration must be provided at call time in forward.

classmethod simple(cls, input_size, output_size=None, device=None, dtype=None, computation_space=ComputationSpace.UNBUNCHED)

Create a ready-to-train layer with an input_size-mode, (input_size//2)-photon architecture.

The circuit is assembled via CircuitBuilder with the following layout:

  1. A fully trainable entangling layer acting on all modes;

  2. A full input encoding layer spanning all encoded features;

  3. A fully trainable entangling layer acting on all modes.

Args:

input_size: Size of the classical input vector. Must be 20 or lower.

output_size: Optional classical output width.

device: Optional target device for tensors.

dtype: Optional tensor dtype.

computation_space: Logical computation subspace; one of {"fock", "unbunched", "dual_rail"}.

Returns:

QuantumLayer configured with the described architecture.
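
The cap on input_size is easier to appreciate by counting basis states. The sketch below uses standard combinatorics for m modes and n photons: the full Fock space has C(m+n-1, n) states, the unbunched space has C(m, n), and dual-rail encoding (two modes per photon) has 2^n. The helper space_sizes is hypothetical; the actual width of a layer depends on its measurement and grouping strategy and should be read from layer.output_size.

```python
from math import comb


def space_sizes(n_modes, n_photons):
    """Count basis states of each computation space for m modes and n photons."""
    sizes = {
        "fock": comb(n_modes + n_photons - 1, n_photons),
        "unbunched": comb(n_modes, n_photons),
    }
    # Dual rail pairs the modes, so it only applies when m == 2 * n.
    if n_modes == 2 * n_photons:
        sizes["dual_rail"] = 2**n_photons
    return sizes


# simple() is described as using input_size modes and input_size // 2 photons.
for input_size in (4, 8, 20):
    print(input_size, space_sizes(input_size, input_size // 2))
```

For input_size=4 this gives 10, 6 and 4 states respectively; for input_size=20 the unbunched space already holds 184756 states.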

to(*args, **kwargs)

Move and/or cast the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)
to(dtype, non_blocking=False)
to(tensor, non_blocking=False)
to(memory_format=torch.channels_last)

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Args:

device (torch.device): the desired device of the parameters and buffers in this module

dtype (torch.dtype): the desired floating point or complex dtype of the parameters and buffers in this module

tensor (torch.Tensor): Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

memory_format (torch.memory_format): the desired memory format for 4D parameters and buffers in this module (keyword only argument)

Returns:

Module: self

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)