Equivariance and generalization in neural networks

Srinath Bulusu; Matteo Favoni; Andreas Ipp; David I. Müller; Daniel Schuh

doi:10.1051/epjconf/202225809001

Proceedings

Open Access

EPJ Web of Conferences 258, 09001 (2022)
https://doi.org/10.1051/epjconf/202225809001

Srinath Bulusu¹,*, Matteo Favoni¹,²,**, Andreas Ipp¹,***, David I. Müller¹,**** and Daniel Schuh¹,†

¹ Institute for Theoretical Physics, TU Wien, Wiedner Hauptstr. 8-10, 1040 Vienna, Austria
² Speaker and corresponding author

* e-mail: sbulusu@hep.itp.tuwien.ac.at
** e-mail: favoni@hep.itp.tuwien.ac.at
*** e-mail: ipp@hep.itp.tuwien.ac.at
**** e-mail: dmueller@hep.itp.tuwien.ac.at
† e-mail: schuh@hep.itp.tuwien.ac.at

Published online: 11 January 2022

Abstract

The crucial role played by the underlying symmetries of high-energy physics and lattice field theories calls for implementing such symmetries in the neural network architectures applied to the physical system under consideration. In these proceedings, we focus on the consequences of incorporating translational equivariance among the network properties, particularly in terms of performance and generalization. The benefits of equivariant networks are exemplified by studying a complex scalar field theory, on which various regression and classification tasks are examined. For a meaningful comparison, promising equivariant and non-equivariant architectures are identified by means of a systematic search. The results indicate that in most of the tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts. This holds not only for physical parameters beyond those represented in the training set, but also for different lattice sizes.
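As an illustration of what translational equivariance means in practice, the following minimal NumPy sketch checks that a convolution with periodic boundaries commutes with lattice translations, while a dense layer acting on the flattened field generally does not. It is not the architecture studied in these proceedings: the function name `periodic_conv2d`, the real-valued random field standing in for a lattice configuration, the lattice sizes, and the random weights are all assumptions made for the example.

```python
import numpy as np


def periodic_conv2d(field, kernel):
    """Cross-correlate a 2D field with a small kernel using periodic boundaries.

    Hypothetical helper for illustration only.
    """
    out = np.zeros_like(field, dtype=float)
    kernel_height, kernel_width = kernel.shape
    for i in range(kernel_height):
        for j in range(kernel_width):
            # A negative roll moves field[x + i, y + j] to position (x, y).
            out += kernel[i, j] * np.roll(field, shift=(-i, -j), axis=(0, 1))
    return out


rng = np.random.default_rng(0)
field = rng.normal(size=(8, 8))    # stand-in for one lattice configuration
kernel = rng.normal(size=(3, 3))   # random, untrained convolution weights
shift = (2, 5)                     # arbitrary lattice translation

# Equivariance: translating the input and then convolving gives the same
# result as convolving first and then translating the output.
shift_then_conv = periodic_conv2d(np.roll(field, shift, axis=(0, 1)), kernel)
conv_then_shift = np.roll(periodic_conv2d(field, kernel), shift, axis=(0, 1))
is_equivariant = np.allclose(shift_then_conv, conv_then_shift)

# A dense layer on the flattened field has no such structure built in.
weights = rng.normal(size=(field.size, field.size))


def dense(x):
    return (weights @ x.ravel()).reshape(x.shape)


dense_is_equivariant = np.allclose(
    dense(np.roll(field, shift, axis=(0, 1))),
    np.roll(dense(field), shift, axis=(0, 1)),
)

# Global average pooling after the convolution yields a translation-invariant
# scalar, and the same kernel can be applied to a different lattice size.
pooled = periodic_conv2d(field, kernel).mean()
pooled_shifted = shift_then_conv.mean()
larger_field = rng.normal(size=(16, 12))
pooled_larger = periodic_conv2d(larger_field, kernel).mean()
```

Because the convolution weights do not depend on the lattice extent, the same layer can be evaluated on the 16 × 12 field without any change. This property is one reason a convolutional architecture can be applied to lattice sizes that differ from those used in training, whereas the dense layer above is tied to the 8 × 8 input.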

© The Authors, published by EDP Sciences, 2022

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.