Womp Labs
Uncovering strong links between training data and model behaviors.
Data quality is now the largest blocker to scaling AI performance. We train large models on trillions of tokens, but when they fail, finding the training data responsible is incredibly hard. As a result, edge cases are brutally difficult to mitigate and reliability can be a nightmare. The next order of magnitude in performance will come from progress on this problem.