MerlinProcessor API Reference

Overview

MerlinProcessor is an RPC-style bridge that offloads quantum leaves (e.g., layers exposing export_config()) to a remote backend, while keeping classical layers local. It supports two backend paths:

  • Perceval RemoteProcessor — the original Quandela Cloud path.

  • Perceval ISession — the preferred path for Scaleway-hosted platforms (and any future session-based providers).

Both paths support batched execution with chunking, limited intra-leaf concurrency, per-call/global timeouts, cooperative cancellation, and a Torch-friendly async interface returning torch.futures.Future.

Key Capabilities

  • Automatic traversal of a PyTorch module; offloads only quantum leaves.

  • Batch chunking (microbatch_size) and parallel submission per leaf (chunk_concurrency). Works identically for both backend paths.

  • Synchronous (forward) and asynchronous (forward_async) APIs.

  • Cancellation of a single call or all calls in flight.

  • Timeouts that cancel in-flight cloud jobs.

  • Per-chunk fresh RemoteProcessor objects — cloned from the original (RemoteProcessor path) or built from the session (ISession path) — to avoid cross-thread handler sharing.

  • Stable, descriptive cloud job names (capped to 50 chars).

Note

Execution is supported both with exact probabilities (if the backend exposes the "probs" command) and with sampling ("sample_count" or "samples"). Shots are user-controlled via nsample; there is no hidden auto-shot selection.

Class Reference

MerlinProcessor

class merlin.core.merlin_processor.MerlinProcessor(remote_processor=None, session=None, microbatch_size=32, timeout=3600.0, max_shots_per_call=None, chunk_concurrency=1)

Create a processor that offloads quantum leaves to a remote backend. Exactly one of remote_processor or session must be provided.

Parameters:
  • remote_processor – Authenticated Perceval RemoteProcessor (simulator or QPU-backed). Merlin clones it per chunk so concurrent jobs have independent state. Type: RemoteProcessor | None.

  • session – A Perceval ISession object — e.g. from perceval.providers.scaleway.Session. Merlin calls session.build_remote_processor() per chunk, giving each chunk an independent RP. Type: ISession | None.

  • microbatch_size (int) – Maximum rows per cloud job (chunk size).

  • timeout (float) – Default wall-time limit (seconds) per call. Per-call override via timeout=... on API methods.

  • max_shots_per_call – Hard cap on shots per cloud call. If None, a safe default is used internally. If nsample exceeds this cap, Merlin automatically raises it to match. Type: int | None.

  • chunk_concurrency (int) – Max number of chunk jobs in flight per quantum leaf during a single call. >=1 (default: 1, i.e., serial).

Raises:

TypeError – If both or neither of remote_processor and session are provided, or if the provided argument is not the expected type.

Attributes

remote_processor

RemoteProcessor | None — set when constructed with remote_processor; None for the session path.

session

ISession | None — set when constructed with session; None for the RemoteProcessor path.

backend_name

str — best-effort backend name from the remote processor or session.

available_commands

list[str] — commands exposed by the backend (e.g., "probs", "sample_count", "samples"). Always empty for the ISession path.

microbatch_size
default_timeout
max_shots_per_call
chunk_concurrency

Constructor options reflected on the instance.

DEFAULT_MAX_SHOTS
DEFAULT_SHOTS_PER_CALL

Library constants used when computing defaults for sampling paths.

Context Management

merlin.core.merlin_processor.__enter__()
merlin.core.merlin_processor.__exit__(exc_type, exc, tb)

Entering returns the processor. Exiting triggers a best-effort cancel_all() to ensure no stray jobs remain.

Execution APIs

merlin.core.merlin_processor.forward(module, input, *, nsample=None, timeout=None) → torch.Tensor

Synchronous convenience around forward_async().

Parameters:
  • module (torch.nn.Module) – A Torch module/tree. Leaves exposing export_config() (and not force_local=True) are offloaded.

  • input (torch.Tensor) – 2D batch [B, D] or shape required by the first leaf. Tensors are moved to CPU for remote execution if needed; the result is moved back to the input’s original device/dtype.

  • nsample (int | None) – Shots per input when sampling. Ignored if the backend supports exact probabilities ("probs").

  • timeout (float | None) – Per-call override; None or 0 means unlimited.

Returns:

Output tensor with batch dimension B and leaf-determined distribution dimension.

Return type:

torch.Tensor

Raises:
  • RuntimeError – If module is in training mode.

  • TimeoutError – On global per-call timeout (remote cancel is issued).

  • concurrent.futures.CancelledError – If the call is cooperatively cancelled via the async API.

merlin.core.merlin_processor.forward_async(module, input, *, nsample=None, timeout=None) → torch.futures.Future

Asynchronous execution. Returns a torch.futures.Future with extra helpers attached:

Future extensions

  • future.job_ids: list[str] — accumulates job IDs across all chunk jobs.

  • future.status() -> dict — current state/progress/message plus chunk counters: {"chunks_total", "chunks_done", "active_chunks"}.

  • future.cancel_remote() -> None — cooperative cancel; in-flight jobs are best-effort cancelled and future.wait() raises CancelledError.

Parameters:

Same as forward() (module, input, nsample, timeout).

Returns:

Future that resolves to the same tensor as forward().

Job & Lifecycle Utilities

merlin.core.merlin_processor.cancel_all() → None

Best-effort cancellation of all active jobs across outstanding calls.

merlin.core.merlin_processor.get_job_history() → list[perceval.runtime.RemoteJob]

Returns a list of all jobs observed/submitted by this instance during the process lifetime (useful for diagnostics).

merlin.core.merlin_processor.clear_job_history() → None

Clears the internal job history list.

Shot Estimation (No Submission)

merlin.core.merlin_processor.estimate_required_shots_per_input(layer, input, desired_samples_per_input) → list[int]

Ask the platform estimator how many shots are required per input row to reach a target number of useful samples.

Parameters:
  • layer (torch.nn.Module) – A quantum leaf (must implement export_config()).

  • input (torch.Tensor) – [B, D] or a single vector [D]. Values are mapped to the circuit parameters as they would be during execution.

  • desired_samples_per_input (int) – Target useful samples per input.

Returns:

list[int] of length B (0 indicates “not viable” under current settings).

Return type:

list[int]

Raises:
  • TypeError – If layer does not expose export_config().

  • ValueError – If input is not 1D or 2D.

Execution Semantics

Traversal & Offload

  • Leaves with export_config() are treated as quantum leaves and are offloaded unless they expose a should_offload() method that returns False, or they set force_local=True.

  • Non-quantum leaves run locally under torch.no_grad().
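The offload decision above can be expressed as a small predicate (a hypothetical helper, not part of the public API):

```python
def is_offloaded(leaf) -> bool:
    """Sketch of the traversal rule: offload a leaf only if it exposes
    export_config(), does not force local execution, and does not opt out
    via should_offload() returning False."""
    if not hasattr(leaf, "export_config"):
        return False  # classical leaf: runs locally under no_grad
    if getattr(leaf, "force_local", False):
        return False
    should = getattr(leaf, "should_offload", None)
    if callable(should) and not should():
        return False
    return True
```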

Batching & Chunking

  • If B > microbatch_size, the batch is split into chunks of size <= microbatch_size. Up to chunk_concurrency chunk jobs per quantum leaf are submitted in parallel. This applies to both the RemoteProcessor and ISession paths.

  • Failed chunks are retried up to 3 times with exponential backoff. Cancellation and timeout errors propagate immediately without retry.
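The chunking rule can be sketched as a pure function over row indices (illustrative; the actual splitter operates on tensors):

```python
def chunk_bounds(batch_size: int, microbatch_size: int) -> list[tuple[int, int]]:
    """Split a batch of B rows into [start, end) chunks of at most
    microbatch_size rows; the last chunk may be smaller."""
    return [(i, min(i + microbatch_size, batch_size))
            for i in range(0, batch_size, microbatch_size)]
```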

Backends & Commands

  • If the backend exposes "probs", the processor queries exact probabilities and ignores nsample.

  • Otherwise it uses "sample_count" or "samples" with nsample or DEFAULT_SHOTS_PER_CALL.

  • Command detection is only available on the RemoteProcessor path; the ISession path always uses sampling.

Timeouts & Cancellation

  • Per-call timeouts are enforced as global deadlines. On expiry, in-flight jobs are cancelled and a TimeoutError is raised.

  • future.cancel_remote() performs cooperative cancellation; awaiting the future raises concurrent.futures.CancelledError.

Job Naming & Traceability

  • Each chunk job receives a descriptive name of the form "mer:{layer}:{call_id}:{idx}/{total}:{cmd}", sanitized and truncated to 50 characters with a stable hash suffix when necessary.
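A sketch of such a naming scheme, assuming a sanitize-then-truncate rule with a short stable hash suffix (the exact sanitization and hash used internally may differ):

```python
import hashlib
import re

MAX_NAME_LEN = 50

def job_name(layer: str, call_id: str, idx: int, total: int, cmd: str) -> str:
    """Build "mer:{layer}:{call_id}:{idx}/{total}:{cmd}", replace unsafe
    characters, and truncate to 50 chars with a deterministic hash suffix."""
    raw = f"mer:{layer}:{call_id}:{idx}/{total}:{cmd}"
    name = re.sub(r"[^A-Za-z0-9:/_.-]", "_", raw)
    if len(name) > MAX_NAME_LEN:
        suffix = hashlib.sha1(name.encode()).hexdigest()[:6]
        name = name[:MAX_NAME_LEN - 7] + "-" + suffix
    return name
```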

Threading & Fresh RPs

  • For each chunk attempt, the processor builds a fresh RemoteProcessor:

    • RemoteProcessor path: clones the original RP (independent RPC handler).

    • ISession path: calls session.build_remote_processor() (independent RP per chunk).

    This ensures concurrent chunks and retries never share mutable RP state.

Return Shapes & Mapping

  • Distribution size is inferred from the leaf graph or from (n_modes, n_photons) and the computation space chosen (UNBUNCHED or FOCK). Probability vectors are normalized if needed.
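When inferred from (n_modes, n_photons), the distribution sizes follow standard photonic state counting, which can be sketched as (illustrative helper, not the library's internal function):

```python
from math import comb

def distribution_size(n_modes: int, n_photons: int, space: str) -> int:
    """UNBUNCHED counts states with at most one photon per mode, C(m, n);
    FOCK counts all Fock states, C(m + n - 1, n)."""
    if space == "UNBUNCHED":
        return comb(n_modes, n_photons)
    if space == "FOCK":
        return comb(n_modes + n_photons - 1, n_photons)
    raise ValueError(f"unknown computation space: {space}")
```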

Examples

Synchronous execution (RemoteProcessor)

import perceval as pcvl

proc = MerlinProcessor(pcvl.RemoteProcessor("sim:slos"))
y = proc.forward(model, X, nsample=20_000)

Synchronous execution (ISession)

import perceval.providers.scaleway as scw

with scw.Session("sim:ascella", project_id=..., token=...) as session:
    proc = MerlinProcessor(session=session, timeout=300.0)
    y = proc.forward(model, X, nsample=5_000)

Asynchronous with status and cancellation

fut = proc.forward_async(model, X, nsample=5_000, timeout=None)
print(fut.status())        # {'state': ..., 'progress': ..., ...}
# If needed:
fut.cancel_remote()        # cooperative cancel
try:
    y = fut.wait()
except Exception as e:
    print("Cancelled:", type(e).__name__)

High-throughput chunking

proc = MerlinProcessor(rp, microbatch_size=8, chunk_concurrency=2)
y = proc.forward(q_layer, X, nsample=3_000)

Version Notes

  • Both remote_processor and session paths now support chunking and chunk_concurrency. Each chunk gets an independent RemoteProcessor.

  • Default chunk_concurrency is 1 (serial).

  • The constructor timeout must be a float; use per-call timeout=None for an unlimited call.

  • max_shots_per_call is automatically raised to match nsample when needed.

  • Shots are user-controlled (no auto-shot chooser); use the estimator helper to plan values ahead of time.