merlin.algorithms.layer module
Main QuantumLayer implementation
- class merlin.algorithms.layer.QuantumLayer(input_size=None, builder=None, circuit=None, experiment=None, input_state=None, n_photons=None, trainable_parameters=None, input_parameters=None, amplitude_encoding=False, measurement_strategy=None, return_object=False, noise=None, n_phase_error_samples=1, device=None, dtype=None)
Bases: MerlinModule
Quantum neural network layer with factory-based architecture.
This layer can be created either from a CircuitBuilder instance, a pre-compiled pcvl.Circuit, or a pcvl.Experiment.
- detach_memristive_state(*, clear_history=False)
Detach the current memristive state without resetting its value.
This method is intended for manual truncated backpropagation through time. It cuts the autograd graph carried by the live recurrent memristive state, so future forward passes keep using the same numerical state values without backpropagating through earlier recurrence updates.
- Parameters:
clear_history (bool) – Whether to replace each memristive history with only the detached current state. If False, the history length is preserved but stored tensors are detached. Default value is False.
- Returns:
The layer is updated in place.
- Return type:
None
- export_config()
Export a standalone configuration for remote execution.
- Returns:
Serializable layer configuration containing the resolved circuit, parameters, and input metadata.
- Return type:
- forward(*input_parameters, shots=None, sampling_method=None, simultaneous_processes=None)
Forward pass through the quantum layer.
Encoding is inferred from the input type:
- torch.Tensor (float): angle encoding (compatible with nn.Sequential)
- torch.Tensor (complex): amplitude encoding
- StateVector: amplitude encoding (preferred for quantum state injection)
Memristive State Updates
For layers with memristive elements, the state is updated after each forward pass according to the registered update rule. Gradient flow through the memristive recurrence is controlled by the detach_at_each_forward flag:
- detach_at_each_forward=True (default): New states are detached, blocking gradients through the state recurrence. Earlier inputs receive zero gradients from memristive state chains.
- detach_at_each_forward=False: New states stay attached to the autograd graph, so gradients can flow through the entire accumulated state history.
- Parameters:
input_parameters (torch.Tensor | merlin.core.state_vector.StateVector) – Input data. For angle encoding, pass float tensors. For amplitude encoding, pass a single StateVector or complex tensor.
shots (int | None) – Number of samples; if 0 or None, return exact amplitudes/probabilities.
sampling_method (str | None) – Sampling method, e.g. “multinomial”.
simultaneous_processes (int | None) – Batch size hint for parallel computation.
- Returns:
Output after measurement mapping. The output type depends on the return_object argument and on the measurement strategy; see the constructor for details.
- Return type:
torch.Tensor | PartialMeasurement | merlin.core.state_vector.StateVector | ProbabilityDistribution
- Raises:
TypeError – If inputs mix torch.Tensor and StateVector, or if an unsupported input type is provided.
ValueError – If multiple StateVector inputs are provided.
RuntimeError – If batch size is inconsistent with memristive state (call reset(batch_size=N) to fix).
- property has_custom_detectors: bool
Whether the wrapped experiment defines non-default detectors.
- Type:
- memristive_history: list[list[torch.Tensor]]
Full history of memristive phase-shifter states since the last
reset(), indexed by the memristive phase-shifters.
- memristive_state: list[torch.Tensor]
Current state of each memristive phase-shifter.
- property output_keys
Return the Fock basis associated with the layer outputs.
For g2 noise cases with photon loss/detectors, returns flattened keys matching the tensor output order. For other cases, returns keys with original structure.
- prepare_parameters(input_parameters)
Prepare parameter list for circuit evaluation.
- reset(batch_size=1)
Reset the memristors to their initial state and clear the history.
This also sets the batch size allowed per forward pass for circuits with memristive phase shifters.
- Parameters:
batch_size (int) – Batch size that will be used in forward passes. Must be at least 1. Call this before each new batch to ensure memristive states are properly initialized.
- Raises:
ValueError – If batch_size < 1.
- Return type:
- set_input_state(input_state)
Set the layer input state for subsequent evaluations.
- Parameters:
input_state (merlin.core.state_vector.StateVector | pcvl.StateVector | pcvl.BasicState | tuple | list) – Input state to store on the layer and underlying computation process.
- Raises:
ValueError – If torch.Tensor is passed as input_state.
- Return type:
- set_sampling_config(shots=None, sampling_method=None)
Deprecated: sampling configuration must be provided at call time in forward.
- classmethod simple(cls, input_size, output_size=None, device=None, dtype=None, computation_space=ComputationSpace.UNBUNCHED)
Create a ready-to-train layer with an (input_size+1)-mode, ceil((input_size+1)/2)-photon architecture.
The circuit is assembled via CircuitBuilder with the following layout:
- A fully trainable entangling layer acting on all modes;
- A full input encoding layer spanning all encoded features;
- A fully trainable entangling layer acting on all modes.
- Parameters:
input_size (int) – Size of the classical input vector. Must be 19 or lower.
output_size (int | None) – Optional classical output width.
device (torch.device | None) – Optional target device for tensors.
dtype (torch.dtype | None) – Optional tensor dtype.
computation_space (ComputationSpace | str) – Logical computation subspace; one of
{"fock", "unbunched", "dual_rail"}.
- Returns:
QuantumLayer configured with the described architecture.
- Return type:
- to(*args, **kwargs)
Move the layer and auxiliary transforms to a new device or dtype.
- Parameters:
*args – Positional arguments forwarded to torch.nn.Module.to().
**kwargs – Keyword arguments forwarded to torch.nn.Module.to().
- Returns:
The updated layer instance.
- Return type:
Note
Quantum layers built from a pcvl.Experiment now apply the experiment’s per-mode detector configuration before returning classical outputs. When no detectors are specified, ideal photon-number resolving detectors are used by default.
If the experiment carries a pcvl.NoiseModel (via experiment.noise), MerLin inserts a PhotonLossTransform ahead of any detector transform. The resulting output_keys and output_size therefore include every survival/loss configuration implied by the model, and amplitude read-out is disabled whenever custom detectors or photon loss are present.
Circuit phase noise is applied while MerLin builds the differentiable unitary. phase_imprecision quantizes each phase to the nearest grid point using round(phi / phase_imprecision) * phase_imprecision; it is not truncation. Exact half-step ties follow torch.round behavior, so phi = pi / 8 with phase_imprecision = pi / 4 maps to 0.
phase_error is sampled after any phase_imprecision quantization. With both active, each sampled unitary uses round(phi / phase_imprecision) * phase_imprecision + epsilon where epsilon is drawn from Uniform(-phase_error, phase_error).
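The quantize-then-sample order described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not MerLin's implementation: the helper names quantize_phase and sample_noisy_phase are hypothetical, treating a non-positive phase_imprecision as "no quantization" is an assumption, and Python's built-in round is used because it resolves exact half-step ties to the nearest even integer, the same rule torch.round follows.

```python
import math
import random


def quantize_phase(phi, phase_imprecision):
    """Snap phi to the nearest multiple of phase_imprecision (hypothetical helper)."""
    if phase_imprecision <= 0:
        return phi  # assumption: a non-positive step means no quantization
    # Built-in round() sends exact .5 ties to the nearest even integer.
    return round(phi / phase_imprecision) * phase_imprecision


def sample_noisy_phase(phi, phase_imprecision, phase_error, rng):
    """Quantize first, then add epsilon drawn from Uniform(-phase_error, phase_error)."""
    epsilon = rng.uniform(-phase_error, phase_error)
    return quantize_phase(phi, phase_imprecision) + epsilon


step = math.pi / 4
tie = quantize_phase(math.pi / 8, step)    # exact half-step tie
above = quantize_phase(0.6 * step, step)   # closer to one full step
noisy = sample_noisy_phase(0.6 * step, step, 0.05, random.Random(0))
```

Here tie evaluates to 0.0, matching the tie case described above, above lands on one full step, and noisy stays within phase_error of that quantized value.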
n_phase_error_samples controls the Monte Carlo sample count used for active phase_error circuit noise. Each phase_error sample is a coherent unitary evolution: tensor input superpositions interfere before that sample is converted to probabilities. MerLin then averages the sampled probability distributions, not amplitudes or unitaries. Source-noise simulations are incoherent mixtures: tensor input components are propagated independently and combined with weights |c_i|^2. Runtime scales roughly linearly with this value when phase_error > 0; when source noise or g2 is also active, each phase-error sample runs the full source-noise mixture, so the worst-case cost is roughly n_phase_error_samples * n_active_input_states * SLOS. The default is 1 sample.
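To illustrate why averaging probabilities differs from averaging amplitudes, the sketch below uses a single photon in a balanced Mach-Zehnder interferometer as a toy model. It is a plain-Python illustration, not MerLin code: mzi_amplitudes and both averaging helpers are hypothetical names, and the toy circuit stands in for the sampled unitaries.

```python
import cmath
import random


def mzi_amplitudes(phi):
    """Output amplitudes of one photon in a balanced MZI with arm phase phi."""
    phase = cmath.exp(1j * phi)
    return (1 + phase) / 2, (1 - phase) / 2


def average_probabilities(phi, phase_error, n_samples, rng):
    """Monte Carlo average of per-sample probabilities (the rule described above)."""
    totals = [0.0, 0.0]
    for _ in range(n_samples):
        amps = mzi_amplitudes(phi + rng.uniform(-phase_error, phase_error))
        for k, amp in enumerate(amps):
            totals[k] += abs(amp) ** 2
    return [t / n_samples for t in totals]


def average_amplitudes_then_square(phi, phase_error, n_samples, rng):
    """For contrast only: averaging amplitudes first loses probability mass."""
    totals = [0j, 0j]
    for _ in range(n_samples):
        amps = mzi_amplitudes(phi + rng.uniform(-phase_error, phase_error))
        for k, amp in enumerate(amps):
            totals[k] += amp
    return [abs(t / n_samples) ** 2 for t in totals]


probs = average_probabilities(0.3, 0.5, 200, random.Random(1))
contrast = average_amplitudes_then_square(0.3, 0.5, 200, random.Random(1))
```

The averaged probabilities in probs still sum to 1, whereas the entries of contrast sum to less than 1 because the random phases partially cancel when amplitudes are averaged.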
Example: Quickstart QuantumLayer
import torch.nn as nn
from merlin import QuantumLayer

simple_layer = QuantumLayer.simple(
    input_size=4,
)
model = nn.Sequential(
    simple_layer,
    nn.Linear(simple_layer.output_size, 3),
)
# Train and evaluate as a standard torch.nn.Module
Note
QuantumLayer.simple() returns a thin SimpleSequential wrapper that behaves like a standard
PyTorch module while exposing the inner quantum layer as .quantum_layer and any
post-processing (ModGrouping or Identity) as .post_processing.
The wrapper also forwards .circuit and .output_size so existing code that inspects these
attributes continues to work.
The simple quantum layer above implements a circuit of (input_size+1) modes and ceil((input_size+1)/2) photons. This circuit is made of:
- A fully trainable entangling layer acting on all modes;
- A full input encoding layer spanning all encoded features;
- A fully trainable entangling layer acting on all modes.
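The mode and photon counts of this architecture follow directly from input_size. A minimal sketch of the arithmetic, assuming a hypothetical helper name simple_layer_shape (it is not part of MerLin) and a lower bound of 1 on input_size, which the documentation does not state:

```python
import math


def simple_layer_shape(input_size):
    """Return (n_modes, n_photons) for the architecture described above."""
    # The upper bound of 19 is documented; the lower bound of 1 is an assumption.
    if not 1 <= input_size <= 19:
        raise ValueError("input_size must be between 1 and 19")
    n_modes = input_size + 1
    n_photons = math.ceil(n_modes / 2)
    return n_modes, n_photons


quickstart_shape = simple_layer_shape(4)   # the quickstart example: (5, 3)
largest_shape = simple_layer_shape(19)     # the documented maximum: (20, 10)
```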
Example: Declarative builder API
import torch.nn as nn
from merlin import LexGrouping, MeasurementStrategy, QuantumLayer
from merlin.builder import CircuitBuilder

builder = CircuitBuilder(n_modes=6)
builder.add_entangling_layer(trainable=True, name="U1")
builder.add_angle_encoding(modes=list(range(4)), name="input")
builder.add_rotations(trainable=True, name="theta")
builder.add_superpositions(depth=1)
builder_layer = QuantumLayer(
    input_size=4,
    builder=builder,
    n_photons=3,  # equivalent to input_state=[1,1,1,0,0,0]
    measurement_strategy=MeasurementStrategy.probs(),
)
model = nn.Sequential(
    builder_layer,
    LexGrouping(builder_layer.output_size, 3),
)
# Train and evaluate as a standard torch.nn.Module
The circuit builder lets you assemble a circuit layer by layer through a high-level API. The example above implements a circuit of 6 modes and 3 photons. This circuit is made of:
- A first entangling layer (trainable);
- Angle encoding on the first 4 modes (for 4 input parameters with the name "input");
- A trainable rotation layer to add more trainable parameters;
- A superposition layer (beam splitters, depth 1) to add more expressivity.
Other building blocks in the CircuitBuilder include:
add_rotations: Add single or multiple phase shifters (rotations) to specific modes. Rotations can be fixed, trainable, or data-driven (input-encoded).
add_angle_encoding: Encode classical data as quantum rotation angles, supporting higher-order feature combinations for expressive input encoding.
add_entangling_layer: Insert a multi-mode entangling layer (implemented via a generic interferometer), optionally trainable, and tune its internal template with the model argument ("mzi" or "bell") for different mixing behaviours.
add_superpositions: Add one or more beam splitters (superposition layers) with configurable targets, depth, and trainability.
Example: Manual Perceval circuit (more control)
import torch.nn as nn
import perceval as pcvl
from merlin import LexGrouping, MeasurementStrategy, QuantumLayer
modes = 6
wl = pcvl.GenericInterferometer(
    modes,
    lambda i: pcvl.BS() // pcvl.PS(pcvl.P(f"theta_li{i}")) //
              pcvl.BS() // pcvl.PS(pcvl.P(f"theta_lo{i}")),
    shape=pcvl.InterferometerShape.RECTANGLE,
)
circuit = pcvl.Circuit(modes)
circuit.add(0, wl)
for mode in range(4):
    circuit.add(mode, pcvl.PS(pcvl.P(f"input{mode}")))
wr = pcvl.GenericInterferometer(
    modes,
    lambda i: pcvl.BS() // pcvl.PS(pcvl.P(f"theta_ri{i}")) //
              pcvl.BS() // pcvl.PS(pcvl.P(f"theta_ro{i}")),
    shape=pcvl.InterferometerShape.RECTANGLE,
)
circuit.add(0, wr)
manual_layer = QuantumLayer(
    input_size=4,  # matches the number of phase shifters named "input{mode}"
    circuit=circuit,
    input_state=[1, 0, 1, 0, 1, 0],
    trainable_parameters=["theta"],
    input_parameters=["input"],
    measurement_strategy=MeasurementStrategy.probs(),
)
model = nn.Sequential(
    manual_layer,
    LexGrouping(manual_layer.output_size, 3),
)
# Train and evaluate as a standard torch.nn.Module
Here, the grouping can also be directly added to the MeasurementStrategy object used in the measurement_strategy parameter.
See the User guide and Notebooks for more advanced usage and training routines.
Input states and amplitude encoding
The input state of a photonic circuit specifies how the photons enter the device. Physically this can be a single
Fock state (a precise configuration of n_photons over m modes) or a superposed/entangled state within the same
computation space (for example Bell pairs or GHZ states). QuantumLayer accepts the
following representations:
- pcvl.BasicState – a single configuration such as pcvl.BasicState([1, 0, 1, 0]);
- StateVector – an arbitrary superposition of basic states with complex amplitudes;
- Python lists/tuples, e.g. [1, 0, 1, 0]. These are accepted as convenience inputs and are immediately converted to a perceval.BasicState.
Note
For Fock/occupation inputs, QuantumLayer stores .input_state as a Perceval
pcvl.BasicState. If you need the raw occupation vector, use list(layer.input_state).
When input_state is passed, the layer always injects that photonic state. In more elaborate pipelines you may want
to cascade circuits and let the output amplitudes of the previous layer become the input state of the next. MerLin
calls this amplitude encoding: the probability amplitudes themselves carry information and are passed to the next
layer as a tensor. Amplitude input handling is activated by passing a
StateVector or a complex torch.Tensor to
forward(). The removed amplitude_encoding=True constructor flag now
raises an error; use from_tensor()
when a constructor tensor must become a state object. Passing
torch.Tensor directly as input_state is also removed.
The snippet below prepares a dual-rail Bell state as the initial condition and evaluates a batch of classical parameters:
import torch
import perceval as pcvl
from merlin.algorithms.layer import QuantumLayer
from merlin.core import ComputationSpace
from merlin.measurement.strategies import MeasurementStrategy

circuit = pcvl.Unitary(pcvl.Matrix.random_unitary(4))  # some Haar-random 4-mode circuit
bell = pcvl.StateVector()
bell += pcvl.BasicState([1, 0, 1, 0])
bell += pcvl.BasicState([0, 1, 0, 1])
print(bell)  # bell is a state vector of 2 photons in 4 modes
layer = QuantumLayer(
    circuit=circuit,
    n_photons=2,
    input_state=bell,
    measurement_strategy=MeasurementStrategy.probs(computation_space=ComputationSpace.DUAL_RAIL),
)
x = torch.rand(10, circuit.m)  # batch of classical parameters
probs = layer(x)  # probabilities over the 2**2 dual-rail states
assert probs.shape == (10, 2**2)
For comparison, a complex tensor supplies the photonic state during the forward pass:
import torch
import perceval as pcvl
from merlin.algorithms.layer import QuantumLayer
from merlin.core import ComputationSpace
from merlin.measurement.strategies import MeasurementStrategy

circuit = pcvl.Circuit(3)
layer = QuantumLayer(
    circuit=circuit,
    n_photons=2,
    measurement_strategy=MeasurementStrategy.probs(computation_space=ComputationSpace.UNBUNCHED),
    dtype=torch.cdouble,
)
# One row per prepared state, one column per unbunched basis state (3 for 2 photons in 3 modes)
prepared_states = torch.tensor(
    [[1.0 + 0.0j, 0.0 + 0.0j, 0.0 + 0.0j],
     [0.0 + 0.0j, 0.0 + 0.0j, 1.0 + 0.0j]],
    dtype=torch.cdouble,
)
out = layer(prepared_states)
In the first example the circuit always starts from bell; in the second, each row of prepared_states represents a
different logical photonic state that flows through the layer. This separation allows you to mix classical angle
encoding with fully quantum, amplitude-based data pipelines.
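The tensor widths in the two examples (2**2 columns for the dual-rail Bell state, 3 columns for the unbunched 3-mode layer) follow from the size of each computation space. The sketch below counts basis states with standard combinatorics; basis_size is a hypothetical helper, not a MerLin function, and the string names mirror the computation_space values listed for simple().

```python
from math import comb


def basis_size(n_modes, n_photons, computation_space):
    """Number of basis states for n_photons in n_modes (hypothetical helper)."""
    if computation_space == "fock":
        # Any occupation pattern: multisets of n_photons over n_modes.
        return comb(n_modes + n_photons - 1, n_photons)
    if computation_space == "unbunched":
        # At most one photon per mode.
        return comb(n_modes, n_photons)
    if computation_space == "dual_rail":
        # One photon per pair of modes, two choices per pair.
        if n_modes != 2 * n_photons:
            raise ValueError("dual rail needs exactly two modes per photon")
        return 2 ** n_photons
    raise ValueError(f"unknown computation space: {computation_space!r}")


bell_width = basis_size(4, 2, "dual_rail")       # first example: 4 columns
prepared_width = basis_size(3, 2, "unbunched")   # second example: 3 columns
```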
Chunked amplitude execution
Amplitude inputs can be passed as ordinary dense tensors or as
StateVector objects. Internally,
QuantumLayer normalizes these inputs into compact active support before
propagation: only basis states with non-zero amplitudes are sent to the
simulator. Those active components are processed in chunks and accumulated into
the final dense output amplitudes.
This reduces peak temporary memory from a whole-support table of roughly
num_input_basis_states * num_output_states to approximately
chunk_size * num_output_states. The tradeoff is that smaller chunks use less
memory but require more simulator calls; larger chunks can improve throughput
when memory is available. The chunk size is controlled by the
simultaneous_processes argument:
out = layer(prepared_states, simultaneous_processes=32)
Changing simultaneous_processes should not change the numerical result; it
only changes how the active support is batched internally.
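The chunked accumulation described above can be sketched in plain Python. This is a toy model under stated assumptions, not MerLin's implementation: propagate_chunked is a hypothetical name, and the transfer table stands in for the simulator by giving the output amplitudes produced by each input basis state.

```python
def propagate_chunked(amplitudes, transfer, chunk_size):
    """Accumulate output amplitudes from the non-zero inputs, chunk by chunk.

    amplitudes: dense list of complex input amplitudes.
    transfer:   transfer[i][j] is output amplitude j for input basis state i
                (hypothetical stand-in for the simulator).
    Returns the dense output and the largest temporary table height used.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    n_out = len(transfer[0])
    # Compact active support: keep only non-zero input components.
    active = [(i, a) for i, a in enumerate(amplitudes) if a != 0]
    output = [0j] * n_out
    peak_rows = 0
    for start in range(0, len(active), chunk_size):
        chunk = active[start:start + chunk_size]
        # Temporary table of len(chunk) x n_out entries.
        table = [[a * transfer[i][j] for j in range(n_out)] for i, a in chunk]
        peak_rows = max(peak_rows, len(table))
        for row in table:
            for j, value in enumerate(row):
                output[j] += value
    return output, peak_rows


amplitudes = [0.6, 0.0, 0.8j, 0.0]
transfer = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [0.5j, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
out_small, rows_small = propagate_chunked(amplitudes, transfer, 1)
out_large, rows_large = propagate_chunked(amplitudes, transfer, 32)
```

Both calls produce the same output amplitudes; only the height of the temporary table changes, which is the memory and throughput tradeoff that simultaneous_processes controls.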
Returning typed objects
When return_object is set to True, the output of a forward() call depends on the measurement_strategy. It is set to False
by default. The following output matrix shows what a forward call returns.
| measurement_strategy | return_object=False | return_object=True |
|---|---|---|
| AMPLITUDES | torch.Tensor | StateVector |
| PROBABILITIES | torch.Tensor | ProbabilityDistribution |
| PARTIAL_MEASUREMENT | PartialMeasurement | PartialMeasurement |
| MODE_EXPECTATIONS | torch.Tensor | torch.Tensor |
Most of the typed objects expose the underlying torch.Tensor through the .tensor attribute. Only the
PartialMeasurement object differs; see its documentation.
These objects are useful for accessing metadata such as the number of photons, the number of modes, and the measurement_strategy behind the output tensors. For example,
StateVector and ProbabilityDistribution give direct access to a specific state by indexing it. The objects are interoperable with Perceval, enabling seamless interaction between the two libraries.
The snippet below prepares a basic quantum layer and returns a ProbabilityDistribution object:
import torch
import perceval as pcvl
from merlin.algorithms.layer import QuantumLayer
from merlin.builder import CircuitBuilder
from merlin.core import ComputationSpace, ProbabilityDistribution
from merlin.measurement.strategies import MeasurementStrategy

builder = CircuitBuilder(n_modes=4)
builder.add_entangling_layer()
bell = pcvl.StateVector()
bell += pcvl.BasicState([1, 0, 1, 0])
bell += pcvl.BasicState([0, 1, 0, 1])
print(bell)  # bell is a state vector of 2 photons in 4 modes
layer = QuantumLayer(
    builder=builder,
    n_photons=2,
    input_state=bell,
    measurement_strategy=MeasurementStrategy.probs(computation_space=ComputationSpace.DUAL_RAIL),
    return_object=True,
)
x = torch.rand(10, 4)  # batch of classical parameters (4 = number of modes)
probs = layer(x)
assert isinstance(probs, ProbabilityDistribution)
assert isinstance(probs.tensor, torch.Tensor)
Memristive phase-shifter
Memristive phase-shifters carry state across forward passes. When a QuantumLayer is built from a CircuitBuilder that contains memristive phase-shifters added with add_memristive_ps(), call reset() before processing a new sequence or batch.
reset(batch_size=...) restores each memristor to its initial state, clears memristive_history, and sets the batch size expected by later forward passes. Until reset is called again, all forward passes must use that configured batch size.
import torch
import merlin as ML


def update_rule(state, output):
    # Illustrative rule only. The two-argument signature is assumed from the
    # description in this section: current state and layer output in, next state out.
    return 0.9 * state


circ = ML.CircuitBuilder(n_modes=3)
circ.add_memristive_ps(mode=1, update_rule=update_rule, initial_state=1.2)
circ.add_angle_encoding(modes=[0, 2])
ql = ML.QuantumLayer(
    builder=circ,
    n_photons=3,
    measurement_strategy=ML.MeasurementStrategy.probs(
        computation_space=ML.ComputationSpace.FOCK
    ),
)
input_tensor = torch.rand((5, 2))
# This would fail because the default memristive batch size is 1.
# probs = ql(input_tensor)
ql.reset(batch_size=5)
probs = ql(input_tensor)
The current state of each memristive phase-shifter is available through memristive_state. The full state history is available through memristive_history. Both lists follow the order in which the memristive phase-shifters were added to the CircuitBuilder.
Gradient Flow Control
When defining a memristive phase-shifter using add_memristive_ps(), the detach_at_each_forward parameter controls how gradients flow through the memristive state recurrence.
At time step t, the layer uses the current memristive state as the phase value. After the forward pass, the memristor’s update_rule receives the current state and the layer output, then returns the state used at time step t + 1. The numerical state is carried forward in every regime below; only the retained PyTorch autograd history changes.
Use detach_memristive_state() at a TBPTT boundary when the next chunk should keep the current numerical state but stop backpropagating through earlier recurrent updates. Use reset instead when starting a new independent sequence.
The common gradient-history regimes are:
- No recurrent gradient steps: use detach_at_each_forward=True, the default. The memristive state still updates after each forward pass, but each new state is detached from the graph. A later loss does not backpropagate through earlier memristive state updates.
- All recurrent gradient steps: use detach_at_each_forward=False and do not manually detach during the sequence. Backpropagation can traverse the full memristive history since the last reset(). This is full backpropagation through time and uses more memory as the sequence grows.
- N recurrent gradient steps: use detach_at_each_forward=False and call detach_memristive_state(clear_history=True) every n time steps. This is truncated backpropagation through time (TBPTT): gradients flow inside the current chunk, while the current numerical state is preserved for the next chunk.
The full history of states is maintained in memristive_history until reset() is called or detach_memristive_state() is called with clear_history=True.
The examples below assume torch, ql, optimizer, criterion, inputs, and targets already exist, and that each x_t is one time step with the batch size configured by ql.reset(batch_size=...). Construct the memristive phase-shifter with the detach_at_each_forward setting named by each case.
No recurrent gradient steps:
ql.reset(batch_size=batch_size)
for x_t, target_t in zip(inputs, targets):
    optimizer.zero_grad(set_to_none=True)
    prediction = ql(x_t)
    loss = criterion(prediction, target_t)
    loss.backward()
    optimizer.step()
All recurrent gradient steps:
ql.reset(batch_size=batch_size)
optimizer.zero_grad(set_to_none=True)
loss = torch.zeros((), dtype=ql.dtype, device=ql.device)
for x_t, target_t in zip(inputs, targets):
    prediction = ql(x_t)
    loss = loss + criterion(prediction, target_t)
# Backpropagate once through the whole sequence
loss.backward()
optimizer.step()
N recurrent gradient steps with TBPTT:
ql.reset(batch_size=batch_size)
for start in range(0, len(inputs), n):
    input_chunk = inputs[start : start + n]
    target_chunk = targets[start : start + n]
    optimizer.zero_grad(set_to_none=True)
    loss = torch.zeros((), dtype=ql.dtype, device=ql.device)
    for x_t, target_t in zip(input_chunk, target_chunk):
        prediction = ql(x_t)
        loss = loss + criterion(prediction, target_t)
    loss.backward()
    optimizer.step()
    # TBPTT boundary: keep the numerical state, drop the autograd history
    ql.detach_memristive_state(clear_history=True)
Deprecations
Warning
Removed in version 0.4:
The no_bunching flag is removed in version 0.4. Use
MeasurementStrategy.probs(computation_space=ComputationSpace.UNBUNCHED)
or MeasurementStrategy.probs(computation_space=ComputationSpace.FOCK)
instead. See Migration guide.
Warning
Deprecated since version 0.4: The computation_space argument in the QuantumLayer constructor is no longer supported as of 0.4.0.
Use the computation_space flag inside measurement_strategy instead. See Migration guide.