The initial release of this project focuses on the Bradley-Terry reward modeling and pairwise preference model. Since then, we have included more advanced techniques to construct a preference model.
Some results have been hidden because they may be inaccessible to you