
What if we eliminated 0.1% and 1% maximums in average result calculations?

Spectrum Twist
(@spectrum-twist)
New Member Registered

I couldn't really find an ideal place to pose this suggestion, or rather question, and while I've had pleasant conversations with adoredtv under other handle names through other means, I figured I'd drop this here first and see what other people think as well.

Please hear me out on this thought experiment, and on how it could "correct" for plenty of cases where reported results seem to mismatch real-world experience. Averages seem like a very solid foundation for a result metric when testing hardware, but in games, for example, a representative average is next to impossible to ascertain realistically and consistently. It has been the only real metric we've been able to rely on, but I think we've had plenty of years exploring the 0.1% and 1% minimums as a good gauge of what's happening at the bottom, and no sane person would ever treat maximum values as worth anything, because they're irrelevant; yet those values always influence the average.

Now, depending on the application, there are interesting situations where frame rates skyrocket. Even in Futuremark's own test benches there are specific moments where the frame rate rockets, and frankly they serve no use in the overall average beyond ballooning the result; they honestly aren't relevant to the end number. Examples would be moments where you're briefly looking at a blank wall, or the sky, or some other scene that isn't heavy on the GPU at all, causing frame rates to briefly spike to levels that aren't remotely necessary or beneficial.

Now, the question or assertion many will make is, "Well, those explosive frame rates are so damn brief, they couldn't possibly impact the end results that much!" If that were true, we'd never have considered showing minimums in the first place. We know frame times are important, and if micro-stutters occur they can definitely impact playability, so minimums matter, of course. But maximums have a detrimental impact on the average results too: frankly, I could not care less about hitting 350 or more frames per second for brief moments, yet over a period of testing those moments can certainly cause an average to creep up by several percent.

Having run through a few tests and collected some data, then performed a "correction" for abnormally high frame rates, essentially normalizing them or eliminating them from the average calculation, you can end up with wildly different results. This is especially true in situations like mine, where one of my test beds, with a higher-frequency CPU and an Nvidia card, was spitting out higher frame rate averages on its face. Looking through the actual frame time data and comparing against a slightly slower CPU, mostly only the brief maximum frame rates jumped, while the minimums and the corrected averages sat within margin of error. Correcting for that 0.1% or 1% showed this more clearly: the faster setup was on average a touch quicker, but not remotely as ballooned. Instead of the higher-end CPU showing an 8% better average, with the corrections made it showed only a 2% improvement.

 

What I propose, just for curiosity's sake, and not directed at adoredtv himself (though as I said, I'd be curious to find out), is to see if anyone else gets similar outcomes; granted, it's heavily dependent on the games being tested. But I wonder how many reviews over the years showing one product with a clear win over another may have come down to nothing but unusable, ballooned high frame rates.

I know it's far more complicated and probably quite difficult to implement, but if the same calculation that's applied to the minimums were reversed to strip out the maximums and produce a corrected average, what would the results look like?
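Something like this rough Python sketch, just to make it concrete; the function name, the 1% cutoff, and the assumption that frame times were logged in milliseconds are all purely illustrative:

```python
# Hypothetical sketch: a "trimmed" average FPS that discards the
# fastest 1% of frames, mirroring how 1% lows discard everything
# faster than the slowest 1%. Frame times assumed in milliseconds.
def trimmed_average_fps(frame_times_ms, trim_top=0.01):
    # Sort ascending: the smallest frame times are the fastest frames.
    ordered = sorted(frame_times_ms)
    # Drop the fastest trim_top fraction of samples (the "maximums").
    cutoff = int(len(ordered) * trim_top)
    kept = ordered[cutoff:] if cutoff else ordered
    # Average FPS = 1000 ms divided by the mean of the kept frame times.
    return 1000.0 / (sum(kept) / len(kept))

# A steady 10 ms (100 fps) run with a brief burst of 2 ms (500 fps)
# frames inflates the plain average; the trimmed version ignores it.
times = [10.0] * 990 + [2.0] * 10
plain = 1000.0 / (sum(times) / len(times))
print(round(plain, 1), round(trimmed_average_fps(times), 1))
```

With those toy numbers, ten 2 ms spikes in a thousand otherwise steady 10 ms frames lift the plain average from 100 fps to roughly 101 fps, while the trimmed average is unaffected.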

Posted : February 8, 2020 14:18
Kirk Johnson
(@ammaross)
Member Admin

I believe I have my frame data from some/all of the benchmarks I've run in my reviews. I'll have to dig it up and see if there's some data for you.

Posted : February 9, 2020 16:36
Spectrum Twist
(@spectrum-twist)
New Member Registered

@ammaross

Appreciate the effort, though it may be a fool's errand or a wild goose chase. Even in some older games I've seen frame rates go to full warp, into or near 1000 fps, but always only so briefly, and if I managed the basic math right, either "erasing" or simply "normalizing" the random peaked frame rates drops the average considerably in those situations. CS:GO, for example, where plenty of reviews land well into the 200+ fps range, and hell, even some Fortnite, which arguably isn't that old a game.

 

One of the methods I applied for normalization was to look at the frame times, and where they dropped considerably for brief moments, I basically flattened (normalized) them. I think this is the better method, since outright removing them would eliminate a sample entirely, but perhaps either way would work: in the end the result wasn't much different between normalized and removed.
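Roughly, the two approaches look like this. It's only a sketch: frame times are assumed to be in milliseconds, and using the fastest 1% of frames to pick the spike threshold is one arbitrary choice among many:

```python
# Sketch of both corrections, assuming millisecond frame times.
def spike_floor(frame_times_ms, fraction=0.01):
    # Frame time at the 1st percentile: anything faster is a "spike".
    ordered = sorted(frame_times_ms)
    return ordered[int(len(ordered) * fraction)]

def average_fps(frame_times_ms):
    return 1000.0 / (sum(frame_times_ms) / len(frame_times_ms))

def normalized(frame_times_ms):
    # Flatten the spikes: clamp too-fast frames up to the floor,
    # keeping the sample count intact.
    floor = spike_floor(frame_times_ms)
    return [max(t, floor) for t in frame_times_ms]

def removed(frame_times_ms):
    # Drop the spiked samples entirely instead of flattening them.
    floor = spike_floor(frame_times_ms)
    return [t for t in frame_times_ms if t >= floor]

times = [10.0] * 990 + [2.0] * 10
print(average_fps(times))               # raw average
print(average_fps(normalized(times)))   # spikes flattened
print(average_fps(removed(times)))      # spikes removed
```

On this toy data, flattening and removing land on the same corrected average, which matches what I saw in practice.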

But I don't necessarily have the ideal tools to really do it justice, nor the hardware to compare and contrast, nor a good selection of games. Like I said, it started out as a thought experiment to try and explain why, even though the averages were very high in plenty of games compared to an alternative CPU/GPU, it often felt "worse", or completely indecipherable at best.

 

I just think average frame rates are a terrible metric to use. The 0.1% and 1% lows are CERTAINLY worthwhile, however, as they really paint a picture, but people often seem to gloss over them in favor of the averages, which honestly don't tell you much of anything, given far too many circumstances and variables to consider.

Posted : February 9, 2020 21:28
Olle P
(@olle-p)
Active Member Registered

I liked the way HardOCP used to do their video card reviews: they played (the most demanding passages of) some games and subjectively tweaked the settings to find the "highest playable settings" for each game/card combination. They did have some bias toward higher resolution, but also commented on which other settings could be increased at lower resolutions.

That type of test did compensate for stutter at inopportune moments and the like.

Posted : February 21, 2020 07:42
Spectrum Twist
(@spectrum-twist)
New Member Registered

@olle-p

Getting into the realm of the subjective makes it VERY difficult to really know anything for certain without creating an obvious bias. While it's obviously relevant to provide some user perspective, it must be clearly stated as personal opinion that may not be representative, for who knows what reasons.

 

It's a lot easier to really get to the meat of a potential problem and then move forward with a real solution. I think it would be worthwhile to improve the methodologies to better illustrate products going forward, and if you can be the first to employ such models and see how the public receives them, it could be WELL worth it; granted, there will always be those who despise change regardless of its benefits or even its superiority over prior methods.

Posted : February 22, 2020 16:30
F7GOS
(@f7gos)
Eminent Member Registered

I guess a big factor in end-user experience these days comes from adaptive sync.

I'll be honest, I couldn't care squat if I get 200 fps. But keeping between 144 fps and, say, 48 fps: that's the real kicker. For me, anything above 144 isn't really tangible, and below 48 falls outside my cheaper FreeSync monitor's adaptive sync range.

Capping at 144 and measuring the 0.1%, 1%, and even 5% lows would give me a clearer indication of actual performance.
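For what it's worth, here's a sketch of one common reading of those X% lows: the average FPS of only the slowest X% of frames. The millisecond frame times and sample counts are made up, and a 144 fps cap would simply mean no frame time below roughly 6.94 ms ever gets logged; it isn't modeled here:

```python
# Sketch of "X% lows": the average FPS of only the slowest X% of
# frames. Frame times (ms) and counts are made up for illustration.
def percent_low_fps(frame_times_ms, fraction):
    # Slowest frames have the largest frame times.
    ordered = sorted(frame_times_ms, reverse=True)
    count = max(1, int(len(ordered) * fraction))
    worst = ordered[:count]
    return 1000.0 / (sum(worst) / len(worst))

# Mostly ~143 fps with 5% of frames down at 40 fps.
times = [7.0] * 950 + [25.0] * 50
for frac in (0.001, 0.01, 0.05):
    print(f"{frac:.1%} low: {percent_low_fps(times, frac):.1f} fps")
```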

You would still get deviations in the averages between cards, but in those unrealistic scenarios where one card hits 300 and the other 400, it's not going to skew the end result with less relevant data.

Even if you changed it to a percentage-based breakdown, i.e. the percentage of frames under 30, between 30 and 60, between 60 and 120, etc., you could paint a better picture of what's going on. The reality, though, is that this takes a lot of work, and not many outlets want to innovate with the numbers game, as the majority of viewers only want the metrics they expect to see.
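A quick sketch of that bucket idea, with the band edges from above and made-up frame times:

```python
# Sketch of the percentage-bucket idea: report the share of frames in
# each fps band instead of a single average. Band edges follow the
# post above; frame times again assumed to be in milliseconds.
def fps_buckets(frame_times_ms, edges=(30, 60, 120)):
    labels = ([f"under {edges[0]}"]
              + [f"{lo}-{hi}" for lo, hi in zip(edges, edges[1:])]
              + [f"over {edges[-1]}"])
    counts = [0] * len(labels)
    for t in frame_times_ms:
        fps = 1000.0 / t
        for i, edge in enumerate(edges):
            if fps < edge:       # first band the frame rate fits under
                counts[i] += 1
                break
        else:
            counts[-1] += 1      # faster than the highest edge
    total = len(frame_times_ms)
    return {label: 100.0 * c / total for label, c in zip(labels, counts)}

times = [7.0] * 900 + [12.0] * 80 + [40.0] * 20
print(fps_buckets(times))  # percentage of frames per fps band
```

On that made-up data it prints roughly 2% under 30, 8% between 60 and 120, and 90% over 120, which says far more about the experience than one average ever could.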

Feel free to check out the F7GOS YouTube channel.
www.youtube.com/c/F7GOS

Posted : February 25, 2020 08:53
Spectrum Twist
(@spectrum-twist)
New Member Registered

@f7gos

I wouldn't advise capping the frame rate, as that's an artificial ceiling created just to run benchmarks. The idea of eliminating the 0.1% or 1% maximums is to eliminate the bloating of the average, and while it might not seem like that would affect the result by any factor worth mentioning, in a few of the tests I've run it can certainly produce some pretty big swings in the average. Again, the 1% lows are certainly important to report; no one likes deep dips in the minimums.

But like you said, no one really cares, or should care, about peak frame rates, especially when an average can be heavily influenced by rampant fluctuation, spitting out a number that looks quite good or acceptable while the actual play experience would be horrid. Outright eliminating the maximum values isn't going to entirely solve that, but I think, or at least it would seem, it would be a more precise reflection of the performance individuals are looking for. I think it would be rather silly to cast the idea aside; frankly, that would set a precedent to justify removing the 1% lows as well, which would be asinine.

 

Actually, if anything, averages would simply be removed as a performance metric and the focus put on the 1% minimums instead. Personally, at the very least, I really wish reviewers would reorder their review charts so that the 1% minimums were the ordering value, with the products with the highest minimums at the top and the lowest at the bottom, regardless of their "average" frame rates.
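The reordering itself would be trivial; a toy sketch with entirely made-up numbers:

```python
# Toy sketch of reordering a review chart by 1% lows instead of
# averages; products and figures are made up for illustration.
results = [
    {"product": "Card A", "avg_fps": 144, "one_pct_low": 61},
    {"product": "Card B", "avg_fps": 131, "one_pct_low": 88},
    {"product": "Card C", "avg_fps": 150, "one_pct_low": 47},
]
for row in sorted(results, key=lambda r: r["one_pct_low"], reverse=True):
    print(f"{row['product']}: 1% low {row['one_pct_low']} fps "
          f"(avg {row['avg_fps']} fps)")
```

Card C tops the average chart but drops to the bottom here, which is exactly the point.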

Posted : February 25, 2020 13:29
