EPJ E Highlight - Training models with a structured data curriculum

Building a structured curriculum of data

By using physics and information theory to carefully structure the data used to train models of complex systems, researchers can significantly improve the quality of their predictions when little information about the system is available, without relying on additional principles from machine learning.

Researchers are increasingly driven to identify and model the behaviour of complex natural systems, where the interactions of many simple parts and subsystems can give rise to deeply intricate mathematical patterns. Today, machine learning is the most widely used technique for modelling these systems. In a new analysis published in EPJ E, a research team at Université Paris-Saclay shows how a ‘curriculum learning’ approach, which carefully structures the data used to train models, can significantly improve their results without relying on additional machine learning principles.

Machine learning is a form of artificial intelligence (AI) which improves its ability to model systems as it is exposed to more information about them – helping researchers to spot patterns hidden deep inside the data. For complex systems, this approach becomes more challenging when large amounts of observational data aren’t available, often because of costs or technical difficulties in obtaining the information.

The Paris-Saclay researchers’ technique is based on the idea that, like humans, machines learn best if they are first exposed to simpler situations before dealing with more complex ones later in the learning process. In this way, the information used to train a model can be structured into a carefully planned curriculum. The team first assessed the amount of data needed to guarantee an accurate model, then investigated the impact of a curriculum’s structure on the model’s reliability.
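
The easy-to-hard idea can be illustrated with a minimal Python sketch. The variance-based `difficulty` score, the function names and the synthetic data are all illustrative assumptions and are not taken from the paper, whose ranking relies on physics and information theory.

```python
import random
import statistics


def difficulty(trajectory):
    # Hypothetical complexity score (an assumption, not the paper's metric):
    # signals with larger variance are treated as harder to model.
    return statistics.pvariance(trajectory)


def build_curriculum(trajectories, n_stages):
    # Rank samples from simplest to most complex, then cut the ranking
    # into at most n_stages consecutive stages of roughly equal size.
    ranked = sorted(trajectories, key=difficulty)
    stage_size = -(-len(ranked) // n_stages)  # ceiling division
    return [ranked[i:i + stage_size] for i in range(0, len(ranked), stage_size)]


def curriculum_pools(stages):
    # Yield the cumulative training pool after each stage, so that a model
    # sees the simple samples first and the harder ones are added later.
    seen = []
    for stage in stages:
        seen.extend(stage)
        yield list(seen)


# Synthetic data: noisy signals with different amplitudes (illustrative only).
random.seed(0)
trajectories = [
    [amp * random.gauss(0, 1) for _ in range(50)]
    for amp in (3.0, 0.5, 2.0, 1.0, 0.1, 1.5)
]

stages = build_curriculum(trajectories, n_stages=3)
pools = list(curriculum_pools(stages))

for step, pool in enumerate(pools, start=1):
    # A real training loop would fit or fine-tune the model on `pool` here.
    print(f"stage {step}: training on {len(pool)} trajectories")
```

In this sketch each stage keeps the earlier, simpler samples in the training pool. Whether to keep or drop them is a design choice that a real study would need to evaluate.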

The team ultimately showed that careful structuring of the training dataset can significantly improve the quality of a model’s predictions – without any need for more intricate model architectures or additional principles from machine learning. The insights gathered by the team could lead to advanced new modelling approaches, applicable in scenarios ranging from robotics and computer vision to video games and language processing.

Bucci, M.A., Semeraro, O., Allauzen, A. et al. Curriculum learning for data-driven modeling of dynamical systems. Eur. Phys. J. E 46, 12 (2023). https://doi.org/10.1140/epje/s10189-023-00269-8

ISSN: 2100-014X (Electronic Edition)

© EDP Sciences