MerlinProcessor API Reference
Overview
MerlinProcessor is an RPC-style bridge that offloads quantum leaves
(e.g., layers exposing export_config()) to a remote backend, while keeping
classical layers local. It supports two backend paths:
- Perceval RemoteProcessor — the original Quandela Cloud path.
- Perceval ISession — the preferred path for Scaleway-hosted platforms (and any future session-based providers).
Both paths support batched execution with chunking, limited intra-leaf
concurrency, per-call/global timeouts, cooperative cancellation, and a
Torch-friendly async interface returning torch.futures.Future.
Key Capabilities
- Automatic traversal of a PyTorch module; offloads only quantum leaves.
- Batch chunking (microbatch_size) and parallel submission per leaf (chunk_concurrency). Works identically for both backend paths.
- Synchronous (forward) and asynchronous (forward_async) APIs.
- Cancellation of a single call or all calls in flight.
- Timeouts that cancel in-flight cloud jobs.
- Per-chunk fresh RemoteProcessor objects — cloned from the original (RemoteProcessor path) or built from the session (ISession path) — to avoid cross-thread handler sharing.
- Stable, descriptive cloud job names (capped to 50 chars).
Note
Execution is supported both with exact probabilities (if the backend exposes
the "probs" command) and with sampling ("sample_count" or
"samples"). Shots are user-controlled via nsample; there is no
hidden auto-shot selection.
Class Reference
MerlinProcessor
- class merlin.core.merlin_processor.MerlinProcessor(remote_processor=None, session=None, microbatch_size=32, timeout=3600.0, max_shots_per_call=None, chunk_concurrency=1)
Create a processor that offloads quantum leaves to a remote backend. Exactly one of remote_processor or session must be provided.
- Parameters:
remote_processor – Authenticated Perceval RemoteProcessor (simulator or QPU-backed). Merlin clones it per chunk so concurrent jobs have independent state. Type: RemoteProcessor | None.
session – A Perceval ISession object — e.g. from perceval.providers.scaleway.Session. Merlin calls session.build_remote_processor() per chunk, giving each chunk an independent RP. Type: ISession | None.
microbatch_size (int) – Maximum rows per cloud job (chunk size).
timeout (float) – Default wall-time limit (seconds) per call. Per-call override via timeout=... on API methods.
max_shots_per_call – Hard cap on shots per cloud call. If None, a safe default is used internally. If nsample exceeds this cap, Merlin automatically raises it to match. Type: int | None.
chunk_concurrency (int) – Max number of chunk jobs in flight per quantum leaf during a single call. >= 1 (default: 1, i.e., serial).
- Raises:
TypeError – If both or neither of remote_processor and session are provided, or if the provided argument is not the expected type.
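The interaction between max_shots_per_call and nsample described above can be sketched as a plain helper (a hedged illustration, not part of the public API; the default cap value is an assumption, since the internal default is not documented here):

```python
def effective_shot_cap(max_shots_per_call, nsample, default_cap=100_000):
    # If the user did not set a cap, fall back to a safe internal default
    # (100_000 here is an assumed placeholder value).
    cap = max_shots_per_call if max_shots_per_call is not None else default_cap
    # If the requested shots exceed the cap, the cap is raised to match.
    if nsample is not None and nsample > cap:
        cap = nsample
    return cap
```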
Attributes
- remote_processor
RemoteProcessor | None — set when constructed with remote_processor; None for the session path.
- session
ISession | None — set when constructed with session; None for the RemoteProcessor path.
- backend_name
str — best-effort backend name from the remote processor or session.
- available_commands
list[str] — commands exposed by the backend (e.g., "probs", "sample_count", "samples"). Always empty for the ISession path.
Context Management
- merlin.core.merlin_processor.__enter__()
- merlin.core.merlin_processor.__exit__(exc_type, exc, tb)
Entering returns the processor. Exiting triggers a best-effort cancel_all() to ensure no stray jobs remain.
Execution APIs
- merlin.core.merlin_processor.forward(module, input, *, nsample=None, timeout=None) torch.Tensor
Synchronous convenience wrapper around forward_async().
- Parameters:
module (torch.nn.Module) – A Torch module/tree. Leaves exposing export_config() (and not force_local=True) are offloaded.
input (torch.Tensor) – 2D batch [B, D] or the shape required by the first leaf. Tensors are moved to CPU for remote execution if needed; the result is moved back to the input’s original device/dtype.
nsample (int | None) – Shots per input when sampling. Ignored if the backend supports exact probabilities ("probs").
timeout (float | None) – Per-call override. None/0 == unlimited.
- Returns:
Output tensor with batch dimension B and a leaf-determined distribution dimension.
- Return type:
torch.Tensor
- Raises:
RuntimeError – If module is in training mode.
TimeoutError – On global per-call timeout (a remote cancel is issued).
concurrent.futures.CancelledError – If the call is cooperatively cancelled via the async API.
- merlin.core.merlin_processor.forward_async(module, input, *, nsample=None, timeout=None) torch.futures.Future
Asynchronous execution. Returns a torch.futures.Future with extra helpers attached:
Future extensions
- future.job_ids: list[str] — accumulates job IDs across all chunk jobs.
- future.status() -> dict — current state/progress/message plus chunk counters: {"chunks_total", "chunks_done", "active_chunks"}.
- future.cancel_remote() -> None — cooperative cancel; in-flight jobs are best-effort cancelled and future.wait() raises CancelledError.
Job & Lifecycle Utilities
- merlin.core.merlin_processor.cancel_all() None
Best-effort cancellation of all active jobs across outstanding calls.
- merlin.core.merlin_processor.get_job_history() list[perceval.runtime.RemoteJob]
Returns a list of all jobs observed/submitted by this instance during the process lifetime (useful for diagnostics).
- merlin.core.merlin_processor.clear_job_history() None
Clears the internal job history list.
Shot Estimation (No Submission)
- merlin.core.merlin_processor.estimate_required_shots_per_input(layer, input, desired_samples_per_input) list[int]
Ask the platform estimator how many shots are required per input row to reach a target number of useful samples.
- Parameters:
layer (torch.nn.Module) – A quantum leaf (must implement export_config()).
input (torch.Tensor) – [B, D] or a single vector [D]. Values are mapped to the circuit parameters as they would be during execution.
desired_samples_per_input (int) – Target useful samples per input.
- Returns:
list[int] of length B (0 indicates “not viable” under current settings).
- Return type:
list[int]
- Raises:
TypeError – If layer does not expose export_config().
ValueError – If input is not 1D or 2D.
Execution Semantics
Traversal & Offload
Leaves with export_config() are treated as quantum leaves and are offloaded unless they expose a should_offload() method that returns False, or they set force_local=True. Non-quantum leaves run locally under torch.no_grad().
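The offload rule above can be sketched as a plain predicate (a hedged sketch: the real traversal walks the module tree, and the class names here are illustrative stand-ins, not Merlin types):

```python
def is_offloaded(leaf):
    # Only leaves that can export a circuit configuration are quantum leaves.
    if not hasattr(leaf, "export_config"):
        return False
    # An explicit force_local=True pins the leaf to local execution.
    if getattr(leaf, "force_local", False):
        return False
    # A should_offload() hook returning False also keeps the leaf local.
    should = getattr(leaf, "should_offload", None)
    if callable(should) and not should():
        return False
    return True

# Illustrative leaves (hypothetical classes, for demonstration only).
class QuantumLeaf:
    def export_config(self):
        return {}

class PinnedLocalLeaf:
    force_local = True
    def export_config(self):
        return {}

class OptOutLeaf:
    def export_config(self):
        return {}
    def should_offload(self):
        return False
```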
Batching & Chunking
If B > microbatch_size, the batch is split into chunks of size <= microbatch_size. Up to chunk_concurrency chunk jobs per quantum leaf are submitted in parallel. This applies to both the RemoteProcessor and ISession paths. Failed chunks are retried up to 3 times with exponential backoff. Cancellation and timeout errors propagate immediately without retry.
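The chunk boundaries implied by microbatch_size can be sketched as follows (a minimal illustration of the splitting rule, not Merlin's internal code):

```python
def split_into_chunks(batch_size, microbatch_size):
    """Return (start, end) row ranges of at most microbatch_size rows each."""
    return [(start, min(start + microbatch_size, batch_size))
            for start in range(0, batch_size, microbatch_size)]
```

For example, a batch of 20 rows with microbatch_size=8 produces three chunk jobs covering rows 0-8, 8-16, and 16-20.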
Backends & Commands
If the backend exposes "probs", the processor queries exact probabilities and ignores nsample. Otherwise it uses "sample_count" or "samples" with nsample or DEFAULT_SHOTS_PER_CALL. Command detection is only available on the RemoteProcessor path; the ISession path always uses sampling.
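The command selection above can be sketched as a small helper (a hedged illustration: the preference order between "sample_count" and "samples", and the command assumed on the ISession path, are assumptions):

```python
def pick_command(available_commands, session_path=False):
    # The ISession path cannot detect commands, so it always samples
    # (assumption: "sample_count" is the sampling command used there).
    if session_path:
        return "sample_count"
    # Prefer exact probabilities when the backend exposes them.
    if "probs" in available_commands:
        return "probs"
    if "sample_count" in available_commands:
        return "sample_count"
    if "samples" in available_commands:
        return "samples"
    raise RuntimeError("backend exposes no supported command")
```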
Timeouts & Cancellation
Per-call timeouts are enforced as global deadlines. On expiry, in-flight jobs are cancelled and a TimeoutError is raised. future.cancel_remote() performs cooperative cancellation; awaiting the future raises concurrent.futures.CancelledError.
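A global deadline of this kind is typically tracked against a monotonic clock; a minimal sketch (illustrative only, not Merlin's internals) might look like:

```python
import time

def remaining_budget(deadline):
    """Seconds left before a global deadline; deadline=None means unlimited."""
    if deadline is None:
        return None
    left = deadline - time.monotonic()
    if left <= 0:
        # In the real processor, expiry is where in-flight jobs get cancelled.
        raise TimeoutError("per-call deadline expired")
    return left
```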
Job Naming & Traceability
Each chunk job receives a descriptive name of the form "mer:{layer}:{call_id}:{idx}/{total}:{cmd}", sanitized and truncated to 50 characters with a stable hash suffix when necessary.
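A sanitize-and-truncate scheme like the one described could look like this (a hedged sketch: the allowed character set, the separator, and the hash length are assumptions; only the template and the 50-character cap come from the text above):

```python
import hashlib
import re

MAX_JOB_NAME = 50

def make_job_name(layer, call_id, idx, total, cmd):
    name = f"mer:{layer}:{call_id}:{idx}/{total}:{cmd}"
    # Replace characters outside an assumed safe charset.
    name = re.sub(r"[^A-Za-z0-9:/_.-]", "_", name)
    if len(name) > MAX_JOB_NAME:
        # Keep the name stable across retries: derive the suffix from the
        # full name, so the same chunk always maps to the same job name.
        suffix = hashlib.sha1(name.encode()).hexdigest()[:8]
        name = name[:MAX_JOB_NAME - 9] + "-" + suffix
    return name
```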
Threading & Fresh RPs
For each chunk attempt, the processor builds a fresh RemoteProcessor:
- RemoteProcessor path: clones the original RP (independent RPC handler).
- ISession path: calls session.build_remote_processor() (independent RP per chunk).
This ensures concurrent chunks and retries never share mutable RP state.
Return Shapes & Mapping
Distribution size is inferred from the leaf graph or from (n_modes, n_photons) and the chosen computation space (UNBUNCHED or FOCK). Probability vectors are normalized if needed.
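The two computation spaces have standard combinatorial sizes, which can be computed directly (a sketch under the assumption that UNBUNCHED means at most one photon per mode and FOCK means the full Fock space over the given modes):

```python
from math import comb

def distribution_size(n_modes, n_photons, space):
    if space == "UNBUNCHED":
        # At most one photon per mode: choose which modes are occupied.
        return comb(n_modes, n_photons)
    if space == "FOCK":
        # Full Fock space: multisets of n_photons over n_modes modes.
        return comb(n_modes + n_photons - 1, n_photons)
    raise ValueError(f"unknown computation space: {space}")
```

For instance, 2 photons in 4 modes gives a 6-dimensional unbunched distribution and a 10-dimensional Fock distribution.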
Examples
Synchronous execution (RemoteProcessor)
proc = MerlinProcessor(pcvl.RemoteProcessor("sim:slos"))
y = proc.forward(model, X, nsample=20_000)
Synchronous execution (ISession)
import perceval.providers.scaleway as scw
with scw.Session("sim:ascella", project_id=..., token=...) as session:
    proc = MerlinProcessor(session=session, timeout=300.0)
    y = proc.forward(model, X, nsample=5_000)
Asynchronous with status and cancellation
fut = proc.forward_async(model, X, nsample=5_000, timeout=None)
print(fut.status()) # {'state': ..., 'progress': ..., ...}
# If needed:
fut.cancel_remote() # cooperative cancel
try:
    y = fut.wait()
except Exception as e:
    print("Cancelled:", type(e).__name__)
High-throughput chunking
proc = MerlinProcessor(rp, microbatch_size=8, chunk_concurrency=2)
y = proc.forward(q_layer, X, nsample=3_000)
Version Notes
- Both remote_processor and session paths now support chunking and chunk_concurrency. Each chunk gets an independent RemoteProcessor.
- Default chunk_concurrency is 1 (serial).
- The constructor timeout must be a float; use per-call timeout=None for an unlimited call.
- max_shots_per_call is automatically raised to match nsample when needed.
- Shots are user-controlled (no auto-shot chooser); use the estimator helper to plan values ahead of time.