MerlinProcessor API Reference

Overview

MerlinProcessor is an RPC-style bridge that offloads quantum leaves (e.g., layers exposing export_config()) to a Perceval backend, while keeping classical layers local. It supports these backend entry points:

Perceval perceval.runtime.AProcessor — the primary constructor argument. Local, non-remote processors execute through synchronous Perceval local jobs.
Perceval RemoteProcessor — the original Quandela Cloud path. Passing a RemoteProcessor through processor= routes to this same path.
Perceval pcvl.runtime.session.ISession — the preferred path for Scaleway-hosted platforms (and any future session-based providers).

All execution paths support batched execution, per-call/global timeouts, cooperative cancellation, and a Torch-friendly async interface returning torch.futures.Future. Remote backends additionally support chunking with limited intra-leaf concurrency.

Key Capabilities

Automatic traversal of a PyTorch module; offloads only quantum leaves.
Remote batch chunking (microbatch_size) and parallel submission per leaf (chunk_concurrency).
Synchronous (forward) and asynchronous (forward_async) APIs.
Cancellation of a single call or all calls in flight.
Timeouts that cancel in-flight cloud jobs for remote backends and check local jobs before and after synchronous execution.
Isolated backend objects: local processors are rebuilt from copied non-circuit experiment state, while RemoteProcessor objects are cloned from the original (RemoteProcessor path) or built from the session (ISession path).
Stable, descriptive cloud job names (capped to 50 chars) for remote jobs.

Note

Execution is supported both with exact probabilities (if the backend exposes the "probs" command) and with sampling ("sample_count" or "samples"). Shots are user-controlled via nsample; there is no hidden auto-shot selection.

Key current limitation

No gradient can flow through the MerLin processor. When calling backwards directly on a MerlinProcessor’s forward pass, this error will be shown:
```
element 0 of tensors does not require grad and does not have a grad_fn
```
If a MerLin processor is part of a bigger Pytorch module, the quantum layer’s parameter gradient will be None. This also means that every upstream models’ (models called before the QuantumLayer in the model’s forward) gradient will be None.
The gradient can not pass through right now as pytorch can not access directly the computation on the QPU.
Differentiation through the MerLin processor will be implemented in v0.5.

Class Reference

BackendCapabilities

class merlin.core.merlin_processor.BackendCapabilities(name, available_commands)

Immutable dataclass encapsulating capabilities extracted from a Perceval backend.

Attributes

name: str — Backend platform name (e.g., "sim:slos", "perceval-qpu:scaleway").

available_commands: list[str] — Commands supported by the backend (e.g., "probs", "sample_count", "samples").

Example

proc = MerlinProcessor(processor=rp)
caps = proc.backend_capabilities
print(f"Platform: {caps.name}")
print(f"Supports: {caps.available_commands}")

MerlinProcessor

class merlin.core.merlin_processor.MerlinProcessor(remote_processor=None, session=None, microbatch_size=32, timeout=3600.0, max_shots_per_call=None, chunk_concurrency=1, token=None, *, processor=None)

Create a processor that offloads quantum leaves to a Perceval backend. Exactly one of processor, remote_processor, or session must be provided.

Warning

Deprecated since version 0.4: The remote_processor constructor argument is deprecated and will be removed in a future release. Pass the same RemoteProcessor through processor= instead.

Parameters:

remote_processor – Deprecated authenticated Perceval RemoteProcessor entry point. Pass the same object through processor= instead. Merlin clones it per chunk so concurrent jobs have independent state. Type: RemoteProcessor | None.
session –
A Perceval pcvl.runtime.session.ISession object — e.g. from pcvl.providers.scaleway.Session. Merlin calls session.build_remote_processor() per chunk, giving each chunk an independent RP. Type: ISession | None.
microbatch_size (int) – Maximum rows per remote backend chunk. Ignored for local processors.
timeout (float) – Default wall-time limit (seconds) per call. Per-call override via timeout=... on API methods.
max_shots_per_call – Hard cap on shots per backend call. If None, a safe default is used internally. If nsample exceeds this cap, Merlin clamps the submitted sample count to this value. Type: int | None.
chunk_concurrency (int) – Max number of remote chunk jobs in flight per quantum leaf during a single call. Ignored for local processors. >=1 (default: 1, i.e., serial).
token – Optional authentication token forwarded to cloned remote processors. If omitted, Merlin extracts the token from the RemoteProcessor’s RPC handler. Ignored for local and session paths. Type: str | None.
processor – Keyword-only Perceval perceval.runtime.AProcessor. Local processors execute through synchronous local Perceval jobs and do not require remote token extraction. RemoteProcessor instances passed here are normalized to the existing remote backend. Type: AProcessor | None.

Raises:

TypeError – If exactly one backend is not provided, if the provided argument is not the expected type, or if processor is an unsupported remote AProcessor subclass.
ValueError – If no authentication token can be resolved on the RemoteProcessor path (neither provided explicitly nor extractable from the processor’s RPC handler).

Attributes

backend_capabilities: BackendCapabilities — Encapsulates backend name and available commands extracted at initialization time. This is the primary way to access backend capabilities.

backend_kind: str — Active backend route: "local_processor", "remote_processor", or "session".

backend_name: str — Backward-compatibility property. Equivalent to backend_capabilities.name. Best-effort backend name from the remote processor or session (e.g., "sim:slos").

available_commands: list[str] — Backward-compatibility property. Equivalent to backend_capabilities.available_commands. Commands exposed by the backend (e.g., "probs", "sample_count", "samples").

processor: AProcessor | None — set when constructed with a local, non-remote processor; None for remote and session paths.

remote_processor: RemoteProcessor | None — set when constructed with remote_processor or a RemoteProcessor passed through processor; None for local and session paths.

session: ISession | None — set when constructed with session; None for local and RemoteProcessor paths.

microbatch_size
default_timeout
max_shots_per_call
chunk_concurrency: Constructor options reflected on the instance.

DEFAULT_MAX_SHOTS
DEFAULT_SHOTS_PER_CALL: Library constants used when computing defaults for sampling paths.

Context Management

merlin.core.merlin_processor.__enter__()

merlin.core.merlin_processor.__exit__(exc_type, exc, tb): Entering returns the processor. Exiting triggers a best-effort cancel_all() to ensure no stray jobs remain.

Execution APIs

merlin.core.merlin_processor.forward(module, input, *, nsample=None, timeout=None) → torch.Tensor

Synchronous convenience around forward_async().

Parameters:

module (torch.nn.Module) – A Torch module/tree. Leaves exposing export_config() (and not force_local=True) are offloaded.
input (torch.Tensor) – 2D batch [B, D] or shape required by the first leaf. Tensors are moved to CPU for backend execution if needed; the result is moved back to the input’s original device/dtype.
nsample (int | None) – Shots per input when sampling. Ignored if the backend supports exact probabilities ("probs").
timeout (float | None) – Per-call override. None/0 == unlimited.

Returns:

Output tensor with batch dimension B and leaf-determined distribution dimension.

Return type:

torch.Tensor

Raises:

RuntimeError – If module is in training mode.
TimeoutError – On global per-call timeout. Remote jobs are cancelled best-effort; local synchronous jobs are checked before and after execution.
concurrent.futures.CancelledError – If the call is cooperatively cancelled via the async API.

merlin.core.merlin_processor.forward_async(module, input, *, nsample=None, timeout=None) → torch.futures.Future

Asynchronous execution. Returns a torch.futures.Future with extra helpers attached:

Future extensions

future.job_ids: list[str] — accumulates remote job IDs across chunk jobs.
future.status() -> dict — current state/progress/message plus chunk counters: {"chunks_total", "chunks_done", "active_chunks"}.
future.cancel_remote() -> None — cooperative cancel; in-flight jobs are best-effort cancelled and future.wait() raises CancelledError.

Parameters:

module – See forward().
input – See forward().
nsample – See forward().
timeout – See forward().

Returns:

Future that resolves to the same tensor as forward().

Job & Lifecycle Utilities

merlin.core.merlin_processor.cancel_all() → None: Best-effort cancellation of all active jobs across outstanding calls.

merlin.core.merlin_processor.get_job_history()

Returns:: List of remote job handles.
Return type:: list[perceval.runtime.RemoteJob]

Returns a list of all jobs observed/submitted by this instance during the process lifetime (useful for diagnostics).

merlin.core.merlin_processor.clear_job_history() → None: Clears the internal job history list.

Shot Estimation (No Submission)

merlin.core.merlin_processor.estimate_required_shots_per_input(layer, input, desired_samples_per_input) → list[int]

Ask the platform estimator how many shots are required per input row to reach a target number of useful samples. This helper is available only for remote processor and session backends.

Parameters:

layer (torch.nn.Module) – A quantum leaf (must implement export_config()).
input (torch.Tensor) – [B, D] or a single vector [D]. Values are mapped to the circuit parameters as they would be during execution.
desired_samples_per_input (int) – Target useful samples per input.

Returns:

list[int] of length B (0 indicates “not viable” under current settings).

Return type:

list[int]

Raises:

RuntimeError – If called on a local processor= backend.
TypeError – If layer does not expose export_config().
ValueError – If input is not 1D or 2D.

Execution Semantics

Traversal & Offload

Leaves with export_config() are treated as quantum leaves and are offloaded unless they expose a should_offload() method that returns False, or they set force_local=True.
Non-quantum leaves run locally under torch.no_grad().

Batching & Chunking

Local processors keep the full PyTorch input as one Merlin-level batch, then represent its rows as Perceval sampler iterations. Perceval does not execute a native vectorized batch. microbatch_size and chunk_concurrency do not affect local processor execution; users control local batch size through the input batch they pass to PyTorch.
RemoteProcessor and ISession backends split batches larger than microbatch_size into chunks of size <= microbatch_size. Up to chunk_concurrency chunk jobs per quantum leaf are submitted in parallel.
Remote chunks are retried up to 3 times with exponential backoff. Local synchronous execution propagates errors directly. Cancellation and timeout errors propagate immediately without retry.

Backends & Commands

Backend capabilities (name and available commands) are extracted once at initialization and stored in backend_capabilities. Local processors use their own name and available_commands directly. The RemoteProcessor path uses the original RP, and the ISession path uses the first processor built from the session.
If the backend exposes "probs" and nsample is None or 0, the processor queries exact probabilities.
Otherwise it uses "sample_count" or "samples" with nsample or min(DEFAULT_SHOTS_PER_CALL, max_shots_per_call).
The backward-compatibility properties backend_name and available_commands provide direct access to the capabilities.

Timeouts & Cancellation

Per-call timeouts are enforced as global deadlines. On expiry, in-flight remote jobs are cancelled and a TimeoutError is raised. Local synchronous jobs cannot be interrupted mid-call; the deadline is checked before submission and again before results are returned.
future.cancel_remote() performs cooperative cancellation; awaiting the future raises concurrent.futures.CancelledError.

Job Naming & Traceability

Each chunk job receives a descriptive name of the form "mer:{layer}:{call_id}:{idx}/{total}:{cmd}", sanitized and truncated to 50 characters with a stable hash suffix when necessary. Local jobs do not use cloud job names.

Threading & Isolated Backends

For each chunk attempt, the processor uses an isolated backend object:
- Local processor path: rebuilds a fresh perceval.runtime.Processor from copied non-circuit experiment state and a fresh local backend. The caller processor’s previous circuit, input, and mode count are cleared before the layer circuit is set.
- RemoteProcessor path: clones the original RP (independent RPC handler).
- ISession path: calls session.build_remote_processor() (independent RP per chunk). The session’s lifecycle is managed by the context manager (__enter__ and __exit__ delegate to the session if provided).
This prevents concurrent calls, chunks, and retries from sharing mutable backend state, and session resources are properly initialized and cleaned up.

Return Shapes & Mapping

Distribution size is inferred from the leaf graph or from (n_modes, n_photons) and the computation space chosen (UNBUNCHED or FOCK). Probability vectors are normalized if needed.

Examples

Synchronous execution (local AProcessor)

import perceval as pcvl

local_processor = pcvl.Processor("SLOS")
proc = MerlinProcessor(processor=local_processor)
y = proc.forward(model, X)

Synchronous execution (RemoteProcessor)

proc = MerlinProcessor(processor=pcvl.RemoteProcessor("sim:slos"))
y = proc.forward(model, X, nsample=20_000)

Synchronous execution (ISession)

import perceval.providers.scaleway as scw

with scw.Session("sim:ascella", project_id=..., token=...) as session:
    proc = MerlinProcessor(session=session, timeout=300.0)
    y = proc.forward(model, X, nsample=5_000)

Asynchronous with status and cancellation

fut = proc.forward_async(model, X, nsample=5_000, timeout=None)
print(fut.status())        # {'state': ..., 'progress': ..., ...}
# If needed:
fut.cancel_remote()        # cooperative cancel
try:
    y = fut.wait()
except Exception as e:
    print("Cancelled:", type(e).__name__)

High-throughput chunking

proc = MerlinProcessor(rp, microbatch_size=8, chunk_concurrency=2)
y = proc.forward(q_layer, X, nsample=3_000)

Version Notes

Only remote_processor and session paths use chunking and chunk_concurrency. The local processor path keeps the full PyTorch input together at the Merlin level, maps rows to Perceval sampler iterations, and uses a rebuilt processor.
Default chunk_concurrency is 1 (serial).
The constructor timeout must be a float; use per-call timeout=None for an unlimited call.
max_shots_per_call caps the submitted sample count when needed.
Shots are user-controlled (no auto-shot chooser); use the estimator helper to plan values ahead of time.