bdbdbdb 19 hours ago

What does this actually mean for LLMs? Cheaper training?

  • MarkusQ 17 hours ago

    Yes. Provided it works as well as they claim.

    Not only cheaper but also faster, since in this case money ≈ hardware cost × time. They claim training throughput can even approach that of pure batch inference:

    > EGGROLL's efficiency results in a hundredfold increase in training throughput for billion-parameter models at large population sizes, nearly reaching the throughput of pure batch inference
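
    The reason it can get that close: each population member's perturbation is low-rank, so you never materialize a full weight matrix per member. The shared base matmul runs once as an ordinary batch, and each member only adds a cheap rank-r correction. Here's a toy numpy sketch of that identity (sizes and variable names are mine, not the paper's):

        import numpy as np

        rng = np.random.default_rng(0)

        d_out, d_in = 512, 512   # toy layer size
        rank, pop = 4, 64        # perturbation rank, population size

        W = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)  # shared base weights
        x = rng.normal(size=(pop, d_in))                     # one input per member

        # Member i perturbs W by a low-rank matrix sigma * A_i @ B_i.T
        A = rng.normal(size=(pop, d_out, rank))
        B = rng.normal(size=(pop, d_in, rank))
        sigma = 0.01

        # Naive: materialize every perturbed weight matrix (pop full-size copies)
        y_naive = np.stack([(W + sigma * A[i] @ B[i].T) @ x[i] for i in range(pop)])

        # Low-rank trick: y_i = W @ x_i + sigma * A_i @ (B_i.T @ x_i)
        # The first term is a single batched matmul against the shared W.
        y_fast = x @ W.T + sigma * np.einsum('por,pr->po', A,
                                             np.einsum('pir,pi->pr', B, x))

        assert np.allclose(y_naive, y_fast)

    Since the dominant cost is that one shared matmul, evaluating the whole population costs about the same as a plain inference batch of size pop.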

  • free_bip 11 hours ago

    Their technique doesn't claim to compete with gradient descent; it's a competitor to techniques like Proximal Policy Optimization, so it's better suited to things like turning an existing pre-trained model into a reasoning model.
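
    To make that concrete: it's an evolution-strategies-style black-box optimizer. You sample noisy copies of the weights, score each with a scalar reward (a verifier, a preference model, etc.), and nudge the weights toward the perturbations that scored well; no backprop through the model at all. A toy sketch of the vanilla ES update this family of methods builds on (the reward function and hyperparameters here are placeholders, not anything from the paper):

        import numpy as np

        rng = np.random.default_rng(0)

        def reward(theta):
            # Stand-in for an RL-style scalar reward, e.g. a verifier
            # score on model outputs; optimum at theta = [0, 1, ..., 7].
            return -np.sum((theta - np.arange(theta.size)) ** 2)

        theta = np.zeros(8)           # stand-in for model parameters
        pop, sigma, lr = 32, 0.1, 0.02

        for step in range(500):
            # Antithetic sampling: score +eps and -eps to cut variance
            eps = rng.normal(size=(pop, theta.size))
            r_pos = np.array([reward(theta + sigma * e) for e in eps])
            r_neg = np.array([reward(theta - sigma * e) for e in eps])
            # ES gradient estimate: reward-weighted average of the noise
            grad = ((r_pos - r_neg)[:, None] * eps).mean(axis=0) / (2 * sigma)
            theta += lr * grad

        print(np.round(theta, 2))  # approaches [0, 1, ..., 7]

    That loop is what stands in for PPO's rollout-plus-policy-gradient step, which is why it fits post-training an existing model rather than pre-training from scratch.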