Upload a vendor-delivered dataset, review the prep contract, pick the target and judge models, and run the eval. Results land in raywardevalres.