Optimizing ML training with metagradient descent


[Submitted on 17 Mar 2025]



Abstract: A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based approach to this problem. We first introduce an algorithm for efficiently calculating metagradients (gradients through model training) at scale. We then introduce a "smooth model training" framework that enables effective optimization using metagradients. With metagradient descent (MGD), we greatly improve on existing dataset selection methods, outperform accuracy-degrading data poisoning attacks by an order of magnitude, and automatically find competitive learning rate schedules.

Submission history
From: Andrew Ilyas
[v1] Mon, 17 Mar 2025 22:18:24 UTC (368 KB)
