merlin.algorithms.layer module

Main QuantumLayer implementation

class merlin.algorithms.layer.QuantumLayer(input_size=None, builder=None, circuit=None, experiment=None, input_state=None, n_photons=None, trainable_parameters=None, input_parameters=None, amplitude_encoding=False, measurement_strategy=None, return_object=False, noise=None, n_phase_error_samples=1, device=None, dtype=None)

Bases: MerlinModule

Quantum neural network layer with factory-based architecture.

This layer can be created from a CircuitBuilder instance, a pre-compiled pcvl.Circuit, or a pcvl.Experiment.

detach_memristive_state(*, clear_history=False)

Detach the current memristive state without resetting its value.

This method is intended for manual truncated backpropagation through time. It cuts the autograd graph carried by the live recurrent memristive state, so future forward passes keep using the same numerical state values without backpropagating through earlier recurrence updates.

Parameters:: clear_history (bool) – Whether to replace each memristive history with only the detached current state. If False, the history length is preserved but stored tensors are detached. Default value is False.
Returns:: The layer is updated in place.
Return type:: None

export_config()

Export a standalone configuration for remote execution.

Returns:: Serializable layer configuration containing the resolved circuit, parameters, and input metadata.
Return type:: dict

forward(*input_parameters, shots=None, sampling_method=None, simultaneous_processes=None)

Forward pass through the quantum layer.

Encoding is inferred from the input type:

torch.Tensor (float): angle encoding (compatible with nn.Sequential)
torch.Tensor (complex): amplitude encoding
StateVector: amplitude encoding (preferred for quantum state injection)

Memristive State Updates

For layers with memristive elements, the state is updated after each forward pass according to the registered update rule. Gradient flow through the memristive recurrence is controlled by the detach_at_each_forward flag:

detach_at_each_forward=True (default): New states are detached, blocking gradients through the state recurrence. Earlier inputs receive zero gradients from memristive state chains.
detach_at_each_forward=False: New states stay attached to the graph, so gradients can flow through the entire accumulated state history.

Parameters:

input_parameters (torch.Tensor | merlin.core.state_vector.StateVector) – Input data. For angle encoding, pass float tensors. For amplitude encoding, pass a single StateVector or complex tensor.
shots (int | None) – Number of samples; if 0 or None, return exact amplitudes/probabilities.
sampling_method (str | None) – Sampling method, e.g. “multinomial”.
simultaneous_processes (int | None) – Batch size hint for parallel computation.

Returns:

Output after measurement mapping. The output type depends on the return_object argument and on the measurement strategy the layer was constructed with. Check the constructor for more details.

Return type:

torch.Tensor | PartialMeasurement | merlin.core.state_vector.StateVector | ProbabilityDistribution

Raises:

TypeError – If inputs mix torch.Tensor and StateVector, or if an unsupported input type is provided.
ValueError – If multiple StateVector inputs are provided.
RuntimeError – If batch size is inconsistent with memristive state (call reset(batch_size=N) to fix).

property has_custom_detectors: bool

Whether the wrapped experiment defines non-default detectors.

Type:: bool

memristive_history: list[list[torch.Tensor]]: Full history of memristive phase-shifter states since the last reset(), indexed by the memristive phase-shifters.

memristive_state: list[torch.Tensor]: Current state of each memristive phase-shifter.

property output_keys

Return the Fock basis associated with the layer outputs.

For g2 noise cases with photon loss/detectors, returns flattened keys matching the tensor output order. For other cases, returns keys with original structure.

property output_size: int

Number of values produced after measurement mapping.

Type:: int

prepare_parameters(input_parameters)

Prepare parameter list for circuit evaluation.

Return type:: list[Tensor]

reset(batch_size=1)

Resets the memristors to their initial state while clearing the history.

This also defines the batch size allowed in each forward pass for circuits with memristive phase shifters.

Parameters:: batch_size (int) – Batch size that will be used in forward passes. Must be at least 1. Call this before each new batch to ensure memristive states are properly initialized.
Raises:: ValueError – If batch_size < 1.
Return type:: None

set_input_state(input_state)

Set the layer input state for subsequent evaluations.

Parameters:: input_state (merlin.core.state_vector.StateVector | pcvl.StateVector | pcvl.BasicState | tuple | list) – Input state to store on the layer and underlying computation process.
Raises:: ValueError – If torch.Tensor is passed as input_state.
Return type:: None

set_sampling_config(shots=None, sampling_method=None): Deprecated: sampling configuration must be provided at call time in forward.

classmethod simple(cls, input_size, output_size=None, device=None, dtype=None, computation_space=ComputationSpace.UNBUNCHED)

Create a ready-to-train layer with an (input_size+1)-mode, ceil((input_size+1)/2)-photon architecture.

The circuit is assembled via CircuitBuilder with the following layout:

A fully trainable entangling layer acting on all modes;
A full input encoding layer spanning all encoded features;
A fully trainable entangling layer acting on all modes.

Parameters:

input_size (int) – Size of the classical input vector. Must be 19 or lower.
output_size (int | None) – Optional classical output width.
device (torch.device | None) – Optional target device for tensors.
dtype (torch.dtype | None) – Optional tensor dtype.
computation_space (ComputationSpace | str) – Logical computation subspace; one of {"fock", "unbunched", "dual_rail"}.

Returns:

QuantumLayer configured with the described architecture.

Return type:

torch.nn.Module

to(*args, **kwargs)

Move the layer and auxiliary transforms to a new device or dtype.

Parameters:

*args – Positional arguments forwarded to torch.nn.Module.to().
**kwargs – Keyword arguments forwarded to torch.nn.Module.to().

Returns:

The updated layer instance.

Return type:

QuantumLayer

Note

Quantum layers built from a pcvl.Experiment now apply the experiment’s per-mode detector configuration before returning classical outputs. When no detectors are specified, ideal photon-number resolving detectors are used by default.

If the experiment carries a pcvl.NoiseModel (via experiment.noise), MerLin inserts a PhotonLossTransform ahead of any detector transform. The resulting output_keys and output_size therefore include every survival/loss configuration implied by the model, and amplitude read-out is disabled whenever custom detectors or photon loss are present.

Circuit phase noise is applied while MerLin builds the differentiable unitary. phase_imprecision quantizes each phase to the nearest grid point using round(phi / phase_imprecision) * phase_imprecision; it is not truncation. Exact half-step ties follow torch.round behavior, so phi = pi / 8 with phase_imprecision = pi / 4 maps to 0.

phase_error is sampled after any phase_imprecision quantization. With both active, each sampled unitary uses round(phi / phase_imprecision) * phase_imprecision + epsilon where epsilon is drawn from Uniform(-phase_error, phase_error).
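
The order of operations described above (quantize first, then sample the error) can be mimicked outside MerLin. The following is a minimal pure-Python sketch, not MerLin's implementation: noisy_phase is a hypothetical helper, and Python's built-in round() stands in for torch.round, since both round exact half-way values to the nearest even integer.

```python
import math
import random


def noisy_phase(phi, phase_imprecision=0.0, phase_error=0.0, rng=random):
    """Hypothetical helper: quantize phi first, then add a uniform phase error."""
    if phase_imprecision > 0:
        # Nearest grid point; built-in round() is round-half-to-even, like torch.round.
        phi = round(phi / phase_imprecision) * phase_imprecision
    if phase_error > 0:
        # epsilon ~ Uniform(-phase_error, phase_error), drawn after quantization.
        phi = phi + rng.uniform(-phase_error, phase_error)
    return phi


step = math.pi / 4
tie = noisy_phase(math.pi / 8, phase_imprecision=step)    # exact half-step tie
above = noisy_phase(0.6 * step, phase_imprecision=step)   # just past the half step
sample = noisy_phase(0.6 * step, phase_imprecision=step, phase_error=0.01,
                     rng=random.Random(0))
print(tie, above, sample)
```

Here tie lands on the grid point 0 rather than pi / 4, above snaps to pi / 4, and sample stays within 0.01 of pi / 4.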

n_phase_error_samples controls the Monte Carlo sample count used for active phase_error circuit noise. Each phase_error sample is a coherent unitary evolution: tensor input superpositions interfere before that sample is converted to probabilities. MerLin then averages the sampled probability distributions, not amplitudes or unitaries. Source-noise simulations are incoherent mixtures: tensor input components are propagated independently and combined with weights |c_i|^2. Runtime scales roughly linearly with this value when phase_error > 0; when source noise or g2 is also active, each phase-error sample runs the full source-noise mixture, so the worst-case cost is roughly n_phase_error_samples * n_active_input_states * SLOS. The default is 1 sample.
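
To make the averaging rule concrete, here is a toy sketch under stated assumptions: a single photon in a two-mode interferometer with one noisy phase, written in pure Python. The functions mzi_amplitudes and mean_probabilities are illustrative names, not MerLin APIs. Each sample is converted to probabilities before averaging, and the loop makes the linear cost in the sample count visible.

```python
import cmath
import random


def mzi_amplitudes(phi):
    """Output amplitudes of one photon in a toy two-mode interferometer."""
    e = cmath.exp(1j * phi)
    return [(1 + e) / 2, (1 - e) / 2]


def mean_probabilities(phi, phase_error, n_samples, rng):
    """Average per-sample probabilities (not amplitudes) over sampled phases."""
    totals = [0.0, 0.0]
    for _ in range(n_samples):  # one coherent evolution per sample
        eps = rng.uniform(-phase_error, phase_error)
        for k, amp in enumerate(mzi_amplitudes(phi + eps)):
            totals[k] += abs(amp) ** 2  # convert this sample to probabilities first
    return [t / n_samples for t in totals]


ideal = [abs(a) ** 2 for a in mzi_amplitudes(0.0)]
noisy = mean_probabilities(phi=0.0, phase_error=0.5, n_samples=200,
                           rng=random.Random(0))
print(ideal, noisy)
```

Without noise all probability sits in the first output; with phase_error > 0 some of it leaks into the second output while the averaged distribution still sums to 1.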

Example: Quickstart QuantumLayer

import torch.nn as nn
from merlin import QuantumLayer

simple_layer = QuantumLayer.simple(
    input_size=4,
)

model = nn.Sequential(
    simple_layer,
    nn.Linear(simple_layer.output_size, 3),
)
# Train and evaluate as a standard torch.nn.Module

Note

QuantumLayer.simple() returns a thin SimpleSequential wrapper that behaves like a standard PyTorch module while exposing the inner quantum layer as .quantum_layer and any post-processing (ModGrouping or Identity) as .post_processing. The wrapper also forwards .circuit and .output_size so existing code that inspects these attributes continues to work.

A Perceval Circuit built with QuantumLayer.simple

The simple quantum layer above implements a circuit of (input_size+1) modes and ceil((input_size+1)/2) photons. This circuit is made of:

A fully trainable entangling layer acting on all modes;
A full input encoding layer spanning all encoded features;
A fully trainable entangling layer acting on all modes.
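
The mode and photon counts quoted for QuantumLayer.simple can be checked by hand. The helper below is a hypothetical illustration (simple_layout is not part of MerLin); it only encodes the two formulas and the documented limit of 19 input features.

```python
import math


def simple_layout(input_size):
    """Hypothetical helper: (n_modes, n_photons) quoted for QuantumLayer.simple."""
    if input_size > 19:
        raise ValueError("input_size must be 19 or lower")
    n_modes = input_size + 1
    n_photons = math.ceil(n_modes / 2)
    return n_modes, n_photons


print(simple_layout(4))   # (5, 3): the quickstart example uses 5 modes, 3 photons
print(simple_layout(19))  # (20, 10): the largest documented configuration
```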

Example: Declarative builder API

import torch.nn as nn
from merlin import LexGrouping, MeasurementStrategy, QuantumLayer
from merlin.builder import CircuitBuilder
builder = CircuitBuilder(n_modes=6)
builder.add_entangling_layer(trainable=True, name="U1")
builder.add_angle_encoding(modes=list(range(4)), name="input")
builder.add_rotations(trainable=True, name="theta")
builder.add_superpositions(depth=1)

builder_layer = QuantumLayer(
    input_size=4,
    builder=builder,
    n_photons=3,  # is equivalent to input_state=[1,1,1,0,0,0]
    measurement_strategy=MeasurementStrategy.probs(),
)

model = nn.Sequential(
    builder_layer,
    LexGrouping(builder_layer.output_size, 3),
)
# Train and evaluate as a standard torch.nn.Module

A Perceval Circuit built with the CircuitBuilder

The circuit builder allows you to build your circuit layer by layer, with a high-level API. The example above implements a circuit of 6 modes and 3 photons. This circuit is made of:

A first entangling layer (trainable);
Angle encoding on the first 4 modes (for 4 input parameters with the name “input”);
A trainable rotation layer to add more trainable parameters;
An entangling layer to add more expressivity.

Other building blocks in the CircuitBuilder include:

add_rotations: Add single or multiple phase shifters (rotations) to specific modes. Rotations can be fixed, trainable, or data-driven (input-encoded).
add_angle_encoding: Encode classical data as quantum rotation angles, supporting higher-order feature combinations for expressive input encoding.
add_entangling_layer: Insert a multi-mode entangling layer (implemented via a generic interferometer), optionally trainable, and tune its internal template with the model argument ("mzi" or "bell") for different mixing behaviours.
add_superpositions: Add one or more beam splitters (superposition layers) with configurable targets, depth, and trainability.

Example: Manual Perceval circuit (more control)

import torch.nn as nn
import perceval as pcvl
from merlin import LexGrouping, MeasurementStrategy, QuantumLayer
modes = 6
wl = pcvl.GenericInterferometer(
    modes,
    lambda i: pcvl.BS() // pcvl.PS(pcvl.P(f"theta_li{i}")) //
    pcvl.BS() // pcvl.PS(pcvl.P(f"theta_lo{i}")),
    shape=pcvl.InterferometerShape.RECTANGLE,
)
circuit = pcvl.Circuit(modes)
circuit.add(0, wl)
for mode in range(4):
    circuit.add(mode, pcvl.PS(pcvl.P(f"input{mode}")))
wr = pcvl.GenericInterferometer(
    modes,
    lambda i: pcvl.BS() // pcvl.PS(pcvl.P(f"theta_ri{i}")) //
    pcvl.BS() // pcvl.PS(pcvl.P(f"theta_ro{i}")),
    shape=pcvl.InterferometerShape.RECTANGLE,
)
circuit.add(0, wr)

manual_layer = QuantumLayer(
    input_size=4,  # matches the number of phase shifters named "input{mode}"
    circuit=circuit,
    input_state=[1, 0, 1, 0, 1, 0],
    trainable_parameters=["theta"],
    input_parameters=["input"],
    measurement_strategy=MeasurementStrategy.probs(),
)

model = nn.Sequential(
    manual_layer,
    LexGrouping(manual_layer.output_size, 3),
)
# Train and evaluate as a standard torch.nn.Module

A Perceval Circuit built with the Perceval API

Here, the grouping can also be directly added to the MeasurementStrategy object used in the measurement_strategy parameter.

See the User guide and Notebooks for more advanced usage and training routines.

Input states and amplitude encoding

The input state of a photonic circuit specifies how the photons enter the device. Physically this can be a single Fock state (a precise configuration of n_photons over m modes) or a superposed/entangled state within the same computation space (for example Bell pairs or GHZ states). QuantumLayer accepts the following representations:

pcvl.BasicState – a single configuration such as pcvl.BasicState([1, 0, 1, 0]);
StateVector – an arbitrary superposition of basic states with complex amplitudes;
Python lists/tuples, e.g. [1, 0, 1, 0]. These are accepted as convenience inputs and are immediately converted to a Perceval pcvl.BasicState.

Note

For Fock/occupation inputs, QuantumLayer stores .input_state as a Perceval pcvl.BasicState. If you need the raw occupation vector, use list(layer.input_state).

When input_state is passed, the layer always injects that photonic state. In more elaborate pipelines you may want to cascade circuits and let the output amplitudes of the previous layer become the input state of the next. MerLin calls this amplitude encoding: the probability amplitudes themselves carry information and are passed to the next layer as a tensor. Amplitude input handling is activated by passing a StateVector or a complex torch.Tensor to forward(). The removed amplitude_encoding=True constructor flag now raises an error; use from_tensor() when a constructor tensor must become a state object. Passing a torch.Tensor directly as input_state has also been removed.

The snippet below prepares a dual-rail Bell state as the initial condition and evaluates a batch of classical parameters:

import torch
import perceval as pcvl
from merlin.algorithms.layer import QuantumLayer
from merlin.core import ComputationSpace
from merlin.measurement.strategies import MeasurementStrategy

circuit = pcvl.Unitary(pcvl.Matrix.random_unitary(4))  # some haar-random 4-mode circuit

bell = pcvl.StateVector()
bell += pcvl.BasicState([1, 0, 1, 0])
bell += pcvl.BasicState([0, 1, 0, 1])
print(bell) # bell is a state vector of 2 photons in 4 modes

layer = QuantumLayer(
    circuit=circuit,
    n_photons=2,
    input_state=bell,
    measurement_strategy=MeasurementStrategy.probs(computation_space=ComputationSpace.DUAL_RAIL),
)

x = torch.rand(10, circuit.m)  # batch of classical parameters
probs = layer(x)  # MeasurementStrategy.probs() yields probabilities
assert probs.shape == (10, 2**2)

For comparison, a complex tensor supplies the photonic state during the forward pass:

import torch
import perceval as pcvl
from merlin.algorithms.layer import QuantumLayer
from merlin.core import ComputationSpace
from merlin.measurement.strategies import MeasurementStrategy

circuit = pcvl.Circuit(3)

layer = QuantumLayer(
    circuit=circuit,
    n_photons=2,
    measurement_strategy=MeasurementStrategy.probs(computation_space=ComputationSpace.UNBUNCHED),
    dtype=torch.cdouble,
)

prepared_states = torch.tensor(
    [[1.0 + 0.0j, 0.0 + 0.0j, 0.0 + 0.0j],
     [0.0 + 0.0j, 0.0 + 0.0j, 1.0 + 0.0j]],
    dtype=torch.cdouble,
)

out = layer(prepared_states)

In the first example the circuit always starts from bell; in the second, each row of prepared_states represents a different logical photonic state that flows through the layer. This separation allows you to mix classical angle encoding with fully quantum, amplitude-based data pipelines.
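
The tensor widths used in the two examples (2**2 outputs for the dual-rail layer, 3 columns in prepared_states) follow from counting basis states. The sketch below uses the standard combinatorial counts and assumes they match MerLin's FOCK, UNBUNCHED, and DUAL_RAIL computation spaces; n_basis_states is a hypothetical helper, not a MerLin function.

```python
from math import comb


def n_basis_states(n_modes, n_photons, computation_space):
    """Hypothetical helper: number of basis states in each computation space."""
    if computation_space == "fock":
        # Any number of photons per mode.
        return comb(n_modes + n_photons - 1, n_photons)
    if computation_space == "unbunched":
        # At most one photon per mode.
        return comb(n_modes, n_photons)
    if computation_space == "dual_rail":
        # One photon per pair of modes, so n_modes must equal 2 * n_photons.
        if n_modes != 2 * n_photons:
            raise ValueError("dual_rail needs n_modes == 2 * n_photons")
        return 2 ** n_photons
    raise ValueError(f"unknown computation space: {computation_space}")


print(n_basis_states(4, 2, "dual_rail"))  # width asserted in the first example
print(n_basis_states(3, 2, "unbunched"))  # columns of prepared_states
print(n_basis_states(3, 2, "fock"))       # same circuit without the restriction
```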

Chunked amplitude execution

Amplitude inputs can be passed as ordinary dense tensors or as StateVector objects. Internally, QuantumLayer normalizes these inputs into compact active support before propagation: only basis states with non-zero amplitudes are sent to the simulator. Those active components are processed in chunks and accumulated into the final dense output amplitudes.

This reduces peak temporary memory from a whole-support table of roughly num_input_basis_states * num_output_states to approximately chunk_size * num_output_states. The tradeoff is that smaller chunks use less memory but require more simulator calls; larger chunks can improve throughput when memory is available. The chunk size is controlled by the simultaneous_processes argument:

out = layer(prepared_states, simultaneous_processes=32)

Changing simultaneous_processes should not change the numerical result; it only changes how the active support is batched internally.
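
The chunking idea can be illustrated without MerLin. The following is a toy sketch, assuming a stand-in propagate function in place of the simulator; chunked_output and propagate are hypothetical names. It keeps only non-zero input amplitudes, builds a temporary table of at most chunk_size rows, and accumulates into the dense output.

```python
def propagate(basis_index, n_out):
    """Stand-in for the simulator: output amplitudes for one input basis state."""
    return [complex(basis_index + 1, k) for k in range(n_out)]


def chunked_output(amplitudes, n_out, chunk_size):
    # Compact active support: keep only basis states with non-zero amplitude.
    active = [(i, c) for i, c in enumerate(amplitudes) if c != 0]
    out = [0j] * n_out
    for start in range(0, len(active), chunk_size):
        chunk = active[start:start + chunk_size]
        # Temporary table has len(chunk) rows, never len(active) rows.
        table = [propagate(i, n_out) for i, _ in chunk]
        for (_, c), row in zip(chunk, table):
            for k in range(n_out):
                out[k] += c * row[k]
    return out


state = [0.6, 0.0, 0.0, 0.8j, 0.0]
small = chunked_output(state, n_out=3, chunk_size=1)
large = chunked_output(state, n_out=3, chunk_size=32)
print(small == large)
```

Both chunk sizes visit the same two active components in the same order, so the accumulated outputs agree; only the size of the temporary table changes.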

Returning typed objects

When return_object is set to True, the output type of a forward() call depends on the measurement_strategy. return_object is False by default. The table below shows what a forward() call returns in each case.

measurement_strategy	return_object=False	return_object=True
AMPLITUDES	torch.Tensor	StateVector
PROBABILITIES	torch.Tensor	ProbabilityDistribution
PARTIAL_MEASUREMENT	PartialMeasurement	PartialMeasurement
MODE_EXPECTATIONS	torch.Tensor	torch.Tensor

Most of the typed objects expose the underlying torch.Tensor through the .tensor attribute. Only the PartialMeasurement object differs; see its documentation.

These objects are useful for accessing metadata such as the number of photons, the number of modes, and the measurement_strategy behind the output tensors. For example, StateVector and ProbabilityDistribution give access to a specific state by indexing it. The objects are interoperable with Perceval, enabling seamless interaction between the two libraries.

The snippet below prepares a basic quantum layer and returns a ProbabilityDistribution object:

import torch
import perceval as pcvl
from merlin.algorithms.layer import QuantumLayer
from merlin.core import ComputationSpace, ProbabilityDistribution
from merlin.builder import CircuitBuilder
from merlin.measurement.strategies import MeasurementStrategy

circuit = CircuitBuilder(n_modes=4)
circuit.add_entangling_layer()

bell = pcvl.StateVector()
bell += pcvl.BasicState([1, 0, 1, 0])
bell += pcvl.BasicState([0, 1, 0, 1])
print(bell) # bell is a state vector of 2 photons in 4 modes

layer = QuantumLayer(
    builder=circuit,
    n_photons=2,
    input_state=bell,
    measurement_strategy=MeasurementStrategy.probs(computation_space=ComputationSpace.DUAL_RAIL),
    return_object=True,
)

x = torch.rand(10, circuit.m)  # batch of classical parameters
probs = layer(x)
assert isinstance(probs, ProbabilityDistribution)
assert isinstance(probs.tensor, torch.Tensor)

Memristive phase-shifter

Memristive phase-shifters carry state across forward passes. When a QuantumLayer is built from a CircuitBuilder that contains memristive phase-shifters added with add_memristive_ps(), call reset() before processing a new sequence or batch.

reset(batch_size=...) restores each memristor to its initial state, clears memristive_history, and sets the batch size expected by later forward passes. Until reset is called again, all forward passes must use that configured batch size.

import torch
import merlin as ML

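# update_rule must be defined beforehand: a callable that receives the current
# memristive state and the layer output, and returns the next state.
circ = ML.CircuitBuilder(n_modes=3)
circ.add_memristive_ps(mode=1, update_rule=update_rule, initial_state=1.2)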
circ.add_angle_encoding(modes=[0, 2])

ql = ML.QuantumLayer(
    builder=circ,
    n_photons=3,
    measurement_strategy=ML.MeasurementStrategy.probs(
        computation_space=ML.ComputationSpace.FOCK
    ),
)

input_tensor = torch.rand((5, 2))

# This would fail because the default memristive batch size is 1.
# probs = ql(input_tensor)

ql.reset(batch_size=5)
probs = ql(input_tensor)

The current state of each memristive phase-shifter is available through memristive_state. The full state history is available through memristive_history. Both lists follow the order in which the memristive phase-shifters were added to the CircuitBuilder.

Gradient Flow Control

When defining a memristive phase-shifter using add_memristive_ps(), the detach_at_each_forward parameter controls how gradients flow through the memristive state recurrence.

At time step t, the layer uses the current memristive state as the phase value. After the forward pass, the memristor’s update_rule receives the current state and the layer output, then returns the state used at time step t + 1. The numerical state is carried forward in every regime below; only the retained PyTorch autograd history changes.

Use detach_memristive_state() at a TBPTT boundary when the next chunk should keep the current numerical state but stop backpropagating through earlier recurrent updates. Use reset() instead when starting a new independent sequence.

The common gradient-history regimes are:

No recurrent gradient steps: use detach_at_each_forward=True, the default. The memristive state still updates after each forward pass, but each new state is detached from the graph. A later loss does not backpropagate through earlier memristive state updates.
All recurrent gradient steps: use detach_at_each_forward=False and do not manually detach during the sequence. Backpropagation can traverse the full memristive history since the last reset(). This is full backpropagation through time and uses more memory as the sequence grows.
N recurrent gradient steps: use detach_at_each_forward=False and call detach_memristive_state(clear_history=True) every n time steps. This is truncated backpropagation through time (TBPTT): gradients flow inside the current chunk, while the current numerical state is preserved for the next chunk.

The full history of states is maintained in memristive_history until reset() is called or detach_memristive_state() is called with clear_history=True.

The examples below assume torch, ql, optimizer, criterion, inputs, and targets already exist, and that each x_t is one time step with the batch size configured by ql.reset(batch_size=...). Construct the memristive phase-shifter with the detach_at_each_forward setting named by each case.

No recurrent gradient steps:

ql.reset(batch_size=batch_size)

for x_t, target_t in zip(inputs, targets):
    optimizer.zero_grad(set_to_none=True)

    prediction = ql(x_t)
    loss = criterion(prediction, target_t)

    loss.backward()
    optimizer.step()

All recurrent gradient steps:

ql.reset(batch_size=batch_size)
optimizer.zero_grad(set_to_none=True)

loss = torch.zeros((), dtype=ql.dtype, device=ql.device)
for x_t, target_t in zip(inputs, targets):
    prediction = ql(x_t)
    loss = loss + criterion(prediction, target_t)

loss.backward()
optimizer.step()

N recurrent gradient steps with TBPTT:

ql.reset(batch_size=batch_size)

for start in range(0, len(inputs), n):
    input_chunk = inputs[start : start + n]
    target_chunk = targets[start : start + n]

    optimizer.zero_grad(set_to_none=True)

    loss = torch.zeros((), dtype=ql.dtype, device=ql.device)
    for x_t, target_t in zip(input_chunk, target_chunk):
        prediction = ql(x_t)
        loss = loss + criterion(prediction, target_t)

    loss.backward()
    optimizer.step()
    ql.detach_memristive_state(clear_history=True)

Deprecations

Warning

Removed in version 0.4: The no_bunching flag is removed in version 0.4. Use MeasurementStrategy.probs(computation_space=ComputationSpace.UNBUNCHED) or MeasurementStrategy.probs(computation_space=ComputationSpace.FOCK) instead. See Migration guide.

Warning

Deprecated since version 0.4: The computation_space argument of the QuantumLayer constructor is no longer supported as of 0.4.0. Use the computation_space flag inside measurement_strategy instead. See Migration guide.