18.21%
The average R-squared across all runs. This tells us how well a linear regression aligns human and AI judgments: 100% means the AI perfectly predicts a human vote, and 0% means it does not predict it at all.
17.14%
The R-squared when a single model is fit to all pairs from all runs combined.
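The two R-squared figures above can differ because averaging per-run values is not the same as fitting one model to the pooled pairs. The sketch below illustrates the difference on made-up data. It assumes a simple one-predictor least-squares regression and a hypothetical layout in which each run is a list of (AI score, human vote) pairs; neither the `r_squared` helper nor the data comes from the evaluation itself.

```python
from statistics import mean

def r_squared(ai_scores, human_votes):
    """R-squared of a least-squares line predicting human votes from AI scores."""
    x_bar, y_bar = mean(ai_scores), mean(human_votes)
    sxx = sum((x - x_bar) ** 2 for x in ai_scores)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ai_scores, human_votes))
    ss_tot = sum((y - y_bar) ** 2 for y in human_votes)
    if sxx == 0 or ss_tot == 0:
        # Assumption: treat a degenerate run (no variation) as 0% alignment.
        return 0.0
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(ai_scores, human_votes))
    return 1 - ss_res / ss_tot

# Hypothetical data: two runs of (ai_score, human_vote) pairs.
runs = [
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # AI predicts the human vote exactly
    [(0.0, 1.0), (1.0, 0.0), (2.0, 1.0)],  # no linear relationship
]

# Metric 1: fit each run separately, then average the R-squared values.
per_run_r2 = [r_squared(*zip(*run)) for run in runs]
average_r2 = mean(per_run_r2)

# Metric 2: pool every pair from every run and fit a single model.
pooled_pairs = [pair for run in runs for pair in run]
pooled_r2 = r_squared(*zip(*pooled_pairs))

print(f"average of per-run R-squared: {average_r2:.2%}")
print(f"pooled R-squared:             {pooled_r2:.2%}")
```

On this toy data the per-run average is 50% while the pooled fit gives about 35%, so a gap between the two dashboard figures is expected rather than a sign of an error.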
47
Total number of evaluation runs.
$331.21
The sum of costs for all evaluation runs.
$7.05
The average cost of a single evaluation run.
3578m 57s
The sum of durations for all evaluation runs.
76m 9s
The average duration of a single evaluation run.
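As a quick consistency check, the per-run averages can be reproduced from the totals. This sketch assumes each average is simply the total divided by the number of runs, rounded to the displayed precision; the variable names are illustrative.

```python
# Totals as shown above.
total_runs = 47
total_cost_usd = 331.21
total_duration_s = 3578 * 60 + 57  # 3578m 57s expressed in seconds

# Average cost per run, rounded to cents.
avg_cost_usd = round(total_cost_usd / total_runs, 2)

# Average duration per run, rounded to the nearest second.
avg_minutes, avg_seconds = divmod(round(total_duration_s / total_runs), 60)
avg_duration = f"{avg_minutes}m {avg_seconds}s"

print(f"${avg_cost_usd:.2f}", avg_duration)  # prints "$7.05 76m 9s"
```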
1708
The total number of generations across all evaluation runs.