merlin.algorithms.loss module

Specialized loss functions for quantum machine learning (QML).

class merlin.algorithms.loss.NKernelAlignment

Bases: _Loss

Negative kernel-target alignment loss function for quantum kernel training.

In quantum kernel alignment, the goal is to maximize the alignment between the quantum kernel matrix \(K\) and the ideal target matrix \(K^{*} = y y^{T}\), where \(y \in \{-1, +1\}^{n}\) is the vector of target labels.

The negative kernel alignment loss is given as:

\[\text{NKA}(K, K^{*}) = -\frac{\operatorname{Tr}(K K^{*})}{ \sqrt{\operatorname{Tr}(K^2)\operatorname{Tr}(K^{*2})}}\]
forward(input, target)

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type: Tensor

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
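Example (a minimal sketch: it assumes that input is the precomputed kernel matrix \(K\) and target is the vector of ±1 labels \(y\), which is not spelled out above; the reference value is computed directly from the formula):

>>> import torch
>>> from merlin.algorithms.loss import NKernelAlignment
>>> y = torch.tensor([1.0, 1.0, -1.0, -1.0])          # +/-1 labels
>>> K = torch.eye(4) + 0.1 * torch.outer(y, y)        # toy kernel matrix
>>> K_star = torch.outer(y, y)                        # ideal target kernel y y^T
>>> # Reference value computed directly from the NKA formula above.
>>> num = torch.trace(K @ K_star)
>>> den = torch.sqrt(torch.trace(K @ K) * torch.trace(K_star @ K_star))
>>> nka_reference = -num / den
>>> # Assumed usage of the loss module (the argument semantics are an assumption);
>>> # call the module instance rather than forward() directly.
>>> loss_fn = NKernelAlignment()
>>> loss = loss_fn(K, y)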

class merlin.algorithms.loss.Tensor

Bases: TensorBase

align_to(*names)

Permutes the dimensions of the self tensor to match the order specified in names, adding size-one dims for any new names.

All of the dims of self must be named in order to use this method. The resulting tensor is a view on the original tensor.

All dimension names of self must be present in names. names may contain additional names that are not in self.names; the output tensor has a size-one dimension for each of those new names.

names may contain up to one Ellipsis (...). The Ellipsis is expanded to be equal to all dimension names of self that are not mentioned in names, in the order that they appear in self.

Python 2 does not support Ellipsis but one may use a string literal instead ('...').

Args:

names (iterable of str): The desired dimension ordering of the output tensor. May contain up to one Ellipsis that is expanded to all unmentioned dim names of self.

Examples:

>>> tensor = torch.randn(2, 2, 2, 2, 2, 2)
>>> named_tensor = tensor.refine_names('A', 'B', 'C', 'D', 'E', 'F')

# Move the F and E dims to the front while keeping the rest in order
>>> named_tensor.align_to('F', 'E', ...)

Warning

The named tensor API is experimental and subject to change.

backward(gradient=None, retain_graph=None, create_graph=False, inputs=None)

Computes the gradient of current tensor wrt graph leaves.

The graph is differentiated using the chain rule. If the tensor is non-scalar (i.e. its data has more than one element) and requires gradient, the function additionally requires specifying a gradient. It should be a tensor of matching type and shape, that represents the gradient of the differentiated function w.r.t. self.

This function accumulates gradients in the leaves; you might need to zero the .grad attributes or set them to None before calling it. See Default gradient layouts for details on the memory layout of accumulated gradients.

Note

If you run any forward ops, create gradient, and/or call backward in a user-specified CUDA stream context, see Stream semantics of backward passes.

Note

When inputs are provided and a given input is not a leaf, the current implementation will call its grad_fn (though it is not strictly needed to get these gradients). It is an implementation detail on which the user should not rely. See https://github.com/pytorch/pytorch/pull/60521#issuecomment-867061780 for more details.

Args:

gradient (Tensor, optional): The gradient of the function being differentiated w.r.t. self. This argument can be omitted if self is a scalar.

retain_graph (bool, optional): If False, the graph used to compute the grads will be freed. Note that in nearly all cases setting this option to True is not needed and can often be worked around in a much more efficient way. Defaults to the value of create_graph.

create_graph (bool, optional): If True, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed. Defaults to False.

inputs (sequence of Tensor, optional): Inputs w.r.t. which the gradient will be accumulated into .grad. All other tensors will be ignored. If not provided, the gradient is accumulated into all the leaf Tensors that were used to compute this tensor.
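Example (plain PyTorch, illustrating the gradient argument for a non-scalar tensor and the accumulation of gradients in the leaves):

>>> import torch
>>> x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
>>> y = x * x                                # non-scalar output
>>> y.backward(gradient=torch.ones_like(y))  # vector-Jacobian product
>>> x.grad
tensor([2., 4., 6.])

>>> x.grad = None                            # gradients accumulate, so reset first
>>> (x * x).sum().backward()                 # scalar output: gradient may be omitted
>>> x.grad
tensor([2., 4., 6.])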

detach()

Returns a new Tensor, detached from the current graph.

The result will never require gradient.

This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.

Note

Returned Tensor shares the same storage with the original one. In-place modifications on either of them will be seen, and may trigger errors in correctness checks.
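Example (illustrating the shared storage mentioned in the note):

>>> import torch
>>> a = torch.ones(3, requires_grad=True)
>>> b = a.detach()
>>> b.requires_grad
False
>>> b[0] = 7.0                 # in-place change is visible through `a`
>>> a
tensor([7., 1., 1.], requires_grad=True)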

detach_()

Detaches the Tensor from the graph that created it, making it a leaf. Views cannot be detached in-place.

This method also affects forward mode AD gradients and the result will never have forward mode AD gradients.

dim_order(*, ambiguity_check=False)

Returns the uniquely determined tuple of int describing the dim order or physical layout of self.

The dim order represents how dimensions are laid out in memory of dense tensors, starting from the outermost to the innermost dimension.

Note that the dim order may not always be uniquely determined. If ambiguity_check is True, this function raises a RuntimeError when the dim order cannot be uniquely determined; if ambiguity_check is a list of memory formats, it raises a RuntimeError when the tensor cannot be interpreted as exactly one of the given memory formats, or when the dim order cannot be uniquely determined. If ambiguity_check is False, it returns one of the legal dim orders without checking uniqueness. Any other value raises a TypeError.

Args:

ambiguity_check (bool or List[torch.memory_format]): The check method for ambiguity of dim order.

Examples:

>>> torch.empty((2, 3, 5, 7)).dim_order()
(0, 1, 2, 3)
>>> torch.empty((2, 3, 5, 7)).transpose(1, 2).dim_order()
(0, 2, 1, 3)
>>> torch.empty((2, 3, 5, 7), memory_format=torch.channels_last).dim_order()
(0, 2, 3, 1)
>>> torch.empty((1, 2, 3, 4)).dim_order()
(0, 1, 2, 3)
>>> try:
...     torch.empty((1, 2, 3, 4)).dim_order(ambiguity_check=True)
... except RuntimeError as e:
...     print(e)
The tensor does not have unique dim order, or cannot map to exact one of the given memory formats.
>>> torch.empty((1, 2, 3, 4)).dim_order(
...     ambiguity_check=[torch.contiguous_format, torch.channels_last]
... )  # It can be mapped to contiguous format
(0, 1, 2, 3)
>>> try:
...     torch.empty((1, 2, 3, 4)).dim_order(ambiguity_check="ILLEGAL")
... except TypeError as e:
...     print(e)
The ambiguity_check argument must be a bool or a list of memory formats.

Warning

The dim_order tensor API is experimental and subject to change.

eig(eigenvectors=False)
is_shared()

Checks if tensor is in shared memory.

This is always True for CUDA tensors.

istft(n_fft, hop_length=None, win_length=None, window=None, center=True, normalized=False, onesided=None, length=None, return_complex=False)

See torch.istft()

lstsq(other)
lu(pivot=True, get_infos=False)

See torch.lu()

module_load(other, assign=False)

Defines how to transform other when loading it into self in load_state_dict().

Used when get_swap_module_params_on_conversion() is True.

It is expected that self is a parameter or buffer in an nn.Module and other is the value in the state dictionary with the corresponding key; this method defines how other is remapped before being swapped with self via swap_tensors() in load_state_dict().

Note

This method should always return a new object that is not self or other. For example, the default implementation returns self.copy_(other).detach() if assign is False or other.detach() if assign is True.

Args:

other (Tensor): value in state dict with key corresponding to self

assign (bool): the assign argument passed to nn.Module.load_state_dict()
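Example (a minimal sketch of the code path in which module_load() participates; it assumes torch.__future__.set_swap_module_params_on_conversion() is available to turn on swap-on-load):

>>> import torch
>>> from torch import nn
>>> torch.__future__.set_swap_module_params_on_conversion(True)
>>> src, dst = nn.Linear(4, 2), nn.Linear(4, 2)
>>> # For each parameter, dst's tensor and the state-dict value are combined
>>> # via module_load() and then exchanged with swap_tensors().
>>> dst.load_state_dict(src.state_dict())
<All keys matched successfully>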

norm(p='fro', dim=None, keepdim=False, dtype=None)

See torch.norm()
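Example (default Frobenius/2-norm, an explicit p, and a per-dimension reduction):

>>> import torch
>>> a = torch.arange(9, dtype=torch.float) - 4     # [-4, -3, ..., 4]
>>> a.norm()                                       # 2-norm: sqrt(60)
tensor(7.7460)
>>> a.norm(p=1)                                    # sum of absolute values
tensor(20.)
>>> a.reshape(3, 3).norm(dim=1)                    # per-row 2-norms
tensor([5.3852, 1.4142, 5.3852])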

refine_names(*names)

Refines the dimension names of self according to names.

Refining is a special case of renaming that “lifts” unnamed dimensions. A None dim can be refined to have any name; a named dim can only be refined to have the same name.

Because named tensors can coexist with unnamed tensors, refining names gives a nice way to write named-tensor-aware code that works with both named and unnamed tensors.

names may contain up to one Ellipsis (...). The Ellipsis is expanded greedily; it is expanded in-place to fill names to the same length as self.dim() using names from the corresponding indices of self.names.

Python 2 does not support Ellipsis but one may use a string literal instead ('...').

Args:

names (iterable of str): The desired names of the output tensor. May contain up to one Ellipsis.

Examples:

>>> imgs = torch.randn(32, 3, 128, 128)
>>> named_imgs = imgs.refine_names('N', 'C', 'H', 'W')
>>> named_imgs.names
('N', 'C', 'H', 'W')

>>> tensor = torch.randn(2, 3, 5, 7, 11)
>>> tensor = tensor.refine_names('A', ..., 'B', 'C')
>>> tensor.names
('A', None, None, 'B', 'C')

Warning

The named tensor API is experimental and subject to change.

register_hook(hook)

Registers a backward hook.

The hook will be called every time a gradient with respect to the Tensor is computed. The hook should have the following signature:

hook(grad) -> Tensor or None

The hook should not modify its argument, but it can optionally return a new gradient which will be used in place of grad.

This function returns a handle with a method handle.remove() that removes the hook from the tensor.

Note

See backward-hooks-execution for more information on how and when this hook is executed, and how its execution is ordered relative to other hooks.

Example:

>>> v = torch.tensor([0., 0., 0.], requires_grad=True)
>>> h = v.register_hook(lambda grad: grad * 2)  # double the gradient
>>> v.backward(torch.tensor([1., 2., 3.]))
>>> v.grad
tensor([2., 4., 6.])

>>> h.remove()  # removes the hook
register_post_accumulate_grad_hook(hook)

Registers a backward hook that runs after grad accumulation.

The hook will be called after all gradients for a tensor have been accumulated, meaning that the .grad field has been updated on that tensor. The post accumulate grad hook is ONLY applicable for leaf tensors (tensors without a .grad_fn field). Registering this hook on a non-leaf tensor will error!

The hook should have the following signature:

hook(param: Tensor) -> None

Note that, unlike other autograd hooks, this hook operates on the tensor that requires grad and not the grad itself. The hook can in-place modify and access its Tensor argument, including its .grad field.

This function returns a handle with a method handle.remove() that removes the hook from the tensor.

Note

See backward-hooks-execution for more information on how and when this hook is executed, and how its execution is ordered relative to other hooks. Since this hook runs during the backward pass, it will run in no_grad mode (unless create_graph is True). You can use torch.enable_grad() to re-enable autograd within the hook if you need it.

Example:

>>> v = torch.tensor([0., 0., 0.], requires_grad=True)
>>> lr = 0.01
>>> # simulate a simple SGD update
>>> h = v.register_post_accumulate_grad_hook(lambda p: p.add_(p.grad, alpha=-lr))
>>> v.backward(torch.tensor([1., 2., 3.]))
>>> v
tensor([-0.0100, -0.0200, -0.0300], requires_grad=True)

>>> h.remove()  # removes the hook
reinforce(reward)
rename(*names, **rename_map)

Renames dimension names of self.

There are two main usages:

self.rename(**rename_map) returns a view on tensor that has dims renamed as specified in the mapping rename_map.

self.rename(*names) returns a view on tensor, renaming all dimensions positionally using names. Use self.rename(None) to drop names on a tensor.

One cannot specify both positional args names and keyword args rename_map.

Examples:

>>> imgs = torch.rand(2, 3, 5, 7, names=('N', 'C', 'H', 'W'))
>>> renamed_imgs = imgs.rename(N='batch', C='channels')
>>> renamed_imgs.names
('batch', 'channels', 'H', 'W')

>>> renamed_imgs = imgs.rename(None)
>>> renamed_imgs.names
(None, None, None, None)

>>> renamed_imgs = imgs.rename('batch', 'channel', 'height', 'width')
>>> renamed_imgs.names
('batch', 'channel', 'height', 'width')

Warning

The named tensor API is experimental and subject to change.

rename_(*names, **rename_map)

In-place version of rename().

resize(*sizes)
resize_as(tensor)
share_memory_()

Moves the underlying storage to shared memory.

This is a no-op if the underlying storage is already in shared memory and for CUDA tensors. Tensors in shared memory cannot be resized.

See torch.UntypedStorage.share_memory_() for more details.
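Example (CPU tensor; a freshly allocated tensor is typically not in shared memory):

>>> import torch
>>> t = torch.zeros(3)
>>> t.is_shared()
False
>>> t.share_memory_()          # in-place; returns self
tensor([0., 0., 0.])
>>> t.is_shared()
True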

solve(other)
split(split_size, dim=0)

See torch.split()
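Example (an int gives equal-sized chunks with a possibly smaller final chunk; a list gives explicit chunk sizes that must sum to the dimension size):

>>> import torch
>>> x = torch.arange(10)
>>> x.split(4)
(tensor([0, 1, 2, 3]), tensor([4, 5, 6, 7]), tensor([8, 9]))
>>> x.split([3, 7])
(tensor([0, 1, 2]), tensor([3, 4, 5, 6, 7, 8, 9]))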

stft(n_fft, hop_length=None, win_length=None, window=None, center=True, pad_mode='reflect', normalized=False, onesided=None, return_complex=None, align_to_window=None)

See torch.stft()

Warning

This function changed its signature in version 0.4.1. Calling it with the previous signature may cause an error or return an incorrect result.
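Example (a real 1-D signal with a Hann window; with the default onesided=True the output has n_fft // 2 + 1 frequency bins, and return_complex=True is the recommended form):

>>> import torch
>>> x = torch.randn(800)
>>> window = torch.hann_window(64)
>>> spec = x.stft(n_fft=64, hop_length=16, window=window, return_complex=True)
>>> spec.shape
torch.Size([33, 51])
>>> x_rec = spec.istft(n_fft=64, hop_length=16, window=window, length=800)
>>> torch.allclose(x, x_rec, atol=1e-5)
True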

storage() torch.TypedStorage

Returns the underlying TypedStorage.

Warning

TypedStorage is deprecated. It will be removed in the future, and UntypedStorage will be the only storage class. To access the UntypedStorage directly, use Tensor.untyped_storage().

storage_type() type

Returns the type of the underlying storage.

symeig(eigenvectors=False)
to_sparse_coo()

Convert a tensor to coordinate format.

Examples:

>>> dense = torch.randn(5, 5)
>>> sparse = dense.to_sparse_coo()
>>> sparse._nnz()
25
unflatten(dim, sizes) Tensor

See torch.unflatten().

unique(sorted=True, return_inverse=False, return_counts=False, dim=None)

Returns the unique elements of the input tensor.

See torch.unique()
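Example (values are sorted by default; return_inverse and return_counts add the index mapping and per-value counts):

>>> import torch
>>> x = torch.tensor([1, 3, 2, 3, 1])
>>> x.unique()
tensor([1, 2, 3])
>>> x.unique(return_inverse=True, return_counts=True)
(tensor([1, 2, 3]), tensor([0, 2, 1, 2, 0]), tensor([2, 1, 2]))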

unique_consecutive(return_inverse=False, return_counts=False, dim=None)

Eliminates all but the first element from every consecutive group of equivalent elements.

See torch.unique_consecutive()
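Example (only consecutive duplicates are collapsed, so values that reappear later survive as separate groups, unlike unique()):

>>> import torch
>>> x = torch.tensor([1, 1, 2, 2, 3, 1, 1, 2])
>>> x.unique_consecutive()
tensor([1, 2, 3, 1, 2])
>>> x.unique_consecutive(return_counts=True)
(tensor([1, 2, 3, 1, 2]), tensor([2, 2, 1, 2, 1]))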