Public Discussion: Testing and Random Crits

Berry · Sep 3, 2014

RubbishyUser said:
Let me preface this by saying that I am in favour of random crits on the majority of weapons, but:

This is actually the best reason for turning off crits. See, one round is nothing in the extensive testing afforded to Snowplow, the Valve Beta maps and internal Valve testing, but many maps that we play often only get one or two tests before a new iteration is made. In some game modes, that means we only get 4 rounds in. A perfect crocket could then mean Blu wins an extra 25% of the rounds, and the mapper changes the map accordingly.

This might only happen 1 in 50 rounds, but I'd hate to think people are making a step in the wrong direction over a single rocket. Anyone who wants to see the effect of random crits, load up goldrush with some bots and spectate and see how much a crit grenade can do.

For the record, I've seen some larger maps have only a single, or sometimes two rounds.

Considering random crits can, if only rarely, majorly affect a single round's outcome, I think that's something to keep in mind, even if it is on a map-by-map basis.

Flame · Sep 3, 2014

I don't even use this forum anymore or check it but I saw the popup in my community feed so here's my two cents.

If you want balance in a map, testing with crits off is the best option. 1 crit sticky or rocket can determine a point's cap-ability? idk if that's a word, but the point is you only test 30 minutes at a time generally, 15 minutes on each team roughly, and each point of a map is dependent on good testing for balance. If a team successfully caps 50% of the time with crits on and 30% of the time with crits off, the map isn't balanced.

You'll never know the data without turning it off. But know that a vast majority of the 'community' maps added to tf2 have been added as a result of competitive testing and play.

youme, shmitz, nineaxis, mangy, icarus, ravidge, acegikmo, scorpio, and a lot of the experienced mappers who got their work implemented did so by testing in highlander/6v6/non-crit environments.

just food for thought.

edit: also back when i was into the whole mapping scene and tf2 was booming drp/nine had the servers set to crits off. the anti-competitive mentality of tf2maps.net was also strong back then

pce dudes.

-flame

TMP · Sep 3, 2014

Shocked you are subscribed still. That's pretty amazing.

I think the more important thing here is the competitive feedback, not the no-crits part. Competitive teams tend to be relatively more balanced in terms of skill variation, so when you test in these environments the impact of a "carry player" or unbalanced teams doesn't skew the data as much. Personally, I think those players are more of a source of outlier data than crits are.

There are other merits to comp testing besides reducing outliers, but for the most part that's just the overlap with crits in regards to why it's good.

A Boojum Snark · Sep 3, 2014

RubbishyUser said:
In some game modes, that means we only get 4 rounds in. A perfect crocket could then mean Blu wins an extra 25% of the rounds, and the mapper changes the map accordingly.

and like I said, with that low amount of testing, you could have just as skewed results because someone defending the point had the clouds at their house break and suddenly a glare shined on their monitor and made them get killed. or someone on the attacking team has an upset stomach and is playing poorly. sure, removing one variable might help, but there are so many variables it means nothing.

Flame said:
-flame

WHAT YEAR IS IT?! :O

Sergis · Sep 3, 2014

lettuce be real to the tea there is comp and there is tf2m and there is one not being the other

Harribo · Sep 3, 2014

As TMP has said, testing with competitive players is the relevant thing there really though. The no crits was just a side effect of testing with those groups of people, not an active choice. @Booj, Flame has been poking his head in here and there into map tests recently.

Crash · Sep 3, 2014

Honestly if anyone is relying on just the TF2Maps servers for major testing of a big project, they aren't going to be getting the best data anyway. Just getting one group of people in general isn't going to be the best data. They are going to go into the match with the same biases they developed from earlier versions. They are going to think about the map in the same way as they did before.

I've had plenty of TF2Maps tests that received a ton of negative feedback (edit: where testing elsewhere went well) because we as mappers look at things differently than the average player. This is good and bad. Good in the sense that a lot of technical ideas can be addressed, bad because you aren't making the map for mappers, you're making it for the average player.

Befriend some communities, get it in some tests that it wouldn't normally be in. That's how you really learn how it's going to play and where your problem areas are. Variety is very key in testing. Get a full server of new people on a late beta map that has been tested by the same group through its whole development cycle and you are going to find a lot of issues that the previous testers completely missed.

As for crits, it DOES balance out the high skilled players (we have a huuuge skill range on the TF2Maps servers) as it's more chances for them to be taken out by a lucky hit. Yes they are getting them more often because of their skill increasing the crit chance, but it's not like their health is increasing as well, they can still be taken out by a lucky shot.

Sergis · Sep 3, 2014

Crash said:
Befriend some communities, get it in some tests that it wouldn't normally be in. That's how you really learn how it's going to play and where your problem areas are. Variety is very key in testing. Get a full server of new people on a late beta map that has been tested by the same group through its whole development cycle and you are going to find a lot of issues that the previous testers completely missed.

i got a test of 3some where majority were the random dudes i never saw before

they didnt mind the vertical ladders and had some discussion and shit

then i made them work and tested with the usual tf2m crew

oh the vitriol

Crash · Sep 3, 2014

It's very difficult to get at times, but if you can get a group of non-mapping people together who are willing to objectively look at a map, know it's WIP, and are ready to be open about something new, you'd be amazed at the quality feedback you can get.

We give comp players a lot of crap at times (more so in the past than now) but if you can get some teams together to play a map who fit all of the above, you can learn SO much from it, just seeing how they play and hearing their reactions to things.

MystycCheez · Sep 3, 2014

There have been times where I would have got random crits and thought, "Hmm, if I didn't get that random crit to kill those 3 people with my rocket, we would have lost the game due to not capping the point, but instead, we pushed through thanks to random crits and won the game because of it."
I don't like random crits at all and that is my opinion.

Idolon · Sep 3, 2014

On the topic of data purity and "a map iteration only gets 1-3 tests," I don't see any reason why people can't have a single version of their map tested multiple times. Granted, your development process will be slower, but if you go too fast, that's your own fault. Random crits or not, there's a ton of other variables (team balance, player count, overall mood of the server, etc) that could sway tests just as easily, and you should probably test more thoroughly regardless.

As for random crits or not, I think I'd rather have them off. Getting killed because of a random crit could start an argument, but surviving because crits are off presumably would not.

DonutVikingChap · Sep 3, 2014

I'm definitely all for keeping crits enabled during tests, and I agree with what most of the people have said regarding their "necessity".

Another thing I'd like to add, which I bring up sometimes in these discussions, is that we also shouldn't forget that crits significantly boost the effectiveness of melee weapons, since their crit chance is 15% by default and scales with bonuses all the way up to 60%.

Just think about how many times you've whipped out your Übersaw as a medic and landed a nice crit on an inbound spy, saving your patient's life, or how many times a well-timed crit from your equalizer/escape plan has saved your ass as a soldier with no ammo.

Sometimes it seems like an overpowered aspect, but it is something that I actually like about melee weapons since it is basically what makes them even remotely viable in a lot of situations. This is a quote from the TF2 wiki:

Melee weapons are designed to crit far more often than ranged ones in order to encourage their usage at close range.
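
To put rough numbers on that, here is a minimal sketch of how a damage-scaled crit chance could be modelled. Only the 15% and 60% melee figures come from the post above; the linear ramp, the function name `crit_chance`, and the 800-damage threshold are assumptions made for illustration, not confirmed game behaviour.

```python
# Melee figures quoted in the post: 15% by default, up to 60% with bonuses.
MELEE_BASE = 0.15
MELEE_CAP = 0.60


def crit_chance(recent_damage: float,
                base: float = MELEE_BASE,
                cap: float = MELEE_CAP,
                damage_for_cap: float = 800.0) -> float:
    """Return an illustrative crit chance for a given amount of recent damage.

    ASSUMPTION: the chance ramps linearly from `base` to `cap` as recent
    damage goes from 0 to `damage_for_cap`. The threshold value is a
    hypothetical placeholder.
    """
    # Clamp the progress towards the cap to the range [0, 1].
    fraction = min(max(recent_damage, 0.0) / damage_for_cap, 1.0)
    return base + (cap - base) * fraction


if __name__ == "__main__":
    for dmg in (0, 200, 400, 800):
        print(f"{dmg:>4} recent damage -> {crit_chance(dmg):.1%} melee crit chance")
```

Under these assumptions, a player who has dealt no recent damage still crits on roughly one melee swing in seven, which is the gap between melee and ranged weapons the wiki quote is describing.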

Fruity Snacks · Sep 3, 2014

I was chatting with Turbo earlier about this and I think it'd be good if I share my example of why I think Random Crits should stay on during testing 99% of the time.

[[WARNING, SOME SCIENCE CONTENT]]

I'm a planetary scientist. One of the things that I had to do a lot was determine the amount of time it takes a massive cloud of dust to form a planetary system (think, a giant dust cloud forming the solar system). This is actually a relatively simple equation (which I won't get into) that has a couple of factors that you can include in it. One of them is a randomness factor, which is the amount of disruption that the gas cloud receives from random objects (like asteroids, roaming planets, miniature black holes -which do occur-, etc.) across different times. There is no control over these objects, they just randomly pop up in the environment. When we look for the average time it takes for a gas cloud of X radius to form a planetary system, we MUST include these variables. Why? Because they occur in the real/realistic environment.

If we don't include the random events, the timescale can be upwards of a billion or more years to form a solar system. If we do include the randomness, it's on the scale of maybe 10-100 million years for something to form. You can see how ignoring random variables in the environment can drastically affect the outcome of something. Without randomness: billions of years. With randomness: 10 to 100 times less.

But you may be asking "What if you find that WITH AND WITHOUT the random events, it takes 600 million years to form a solar system. Why would you include the randomness then in the first place?" Because it's bad science if we don't include environmental variables. It's VERY bad science if we ignore environmental variables because we *think* they won't affect anything. The large majority (as far as I know) of servers that play TF2 run default settings. Those settings have Random Crits on. That is the dictating environment, and thus, Random Crits are the environmental variables that we cannot ignore.

Map testing is a science. Not only are we all level designers, we're scientists. Our science is video games.

tl;dr: Random crits, like it or not, are environmental variables that CANNOT EVER be ignored. If they are, it's bad science.

RubbishyUser · Sep 3, 2014

Fr0Z3n said:
SCIENCE!

The counter argument to this is that many mappers are doing bad science anyway, with "sample sizes" of less than 5 rounds. I'll include myself on that list. Sometimes you learn something in half an hour and you don't want to see people banging on about it for another half before you fix it.

Let me explain why this is a problem with more science. If you were examining a disease, you'd like to do controlled double blind experiments with hundreds of people with the disease and hundreds without. This will average out differences from patient to patient and the results will be more accurate or true to the effect of the disease on the average man (or woman). You want the random factors - old, young, male, female, with other conditions and so on - to be more representative of the whole population of the earth.

But when the sample sizes get really small, you have to throw this randomness out the window. With diseases, this is the case with really rare diseases where only a handful of people have what you are looking for. In pharmaceuticals, you can't have a larger sample size because literally nobody else suffers from the disease. In mapping, your sample size is tiny because you don't want anyone else to suffer your poorly made alpha map with glaring design flaws. This time, doctors would really want to eliminate all randomness between patients - identical twins are ideal in that they can show whether the disease is genetic or, if only one has the disease, tests can be performed on both of them to uncover the exact effect of the disease.

Let's take this bloated metaphor and apply it to mapping. Each round is a patient, with random characteristics and showing various symptoms, such as "difficult to push C" in the same way a patient might have restricted breathing. If you have a large sample size, a huge number of patients, because you are testing snowplow or a Valve map or whatever, you want rounds with a huge variety of features - unbalanced teams, scout rushes, 6v6, critically timed critical hits and so on - so that you can construct an impression of the "average game". Then you can see what symptoms it has e.g. "difficult to push C" and act on it.

Alternatively, if you just want to act quickly, you want your rounds to be as uniform as possible, so you can make the assumption that they accurately represent the map. You'd like every round to be perfectly balanced, with great team composition and no surprise factors like random crits so that there are the minimal number of symptoms not caused by the actual map. In other words, we want perfect team composition so that we know "difficult to push C" isn't caused by red having three engineers.

TL;DR:

WITH SCIENCE
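
The sample-size point can be put in rough numbers. This is a minimal sketch, assuming each round is an independent win or loss with a fixed true win rate (a simplification, since real rounds share players and moods). The helper names `one_round_swing` and `win_rate_std_error` are made up for this example.

```python
import math


def one_round_swing(rounds: int) -> float:
    """How far the observed win rate moves if a single round flips."""
    return 1.0 / rounds


def win_rate_std_error(true_rate: float, rounds: int) -> float:
    """Binomial standard error of an observed win rate over `rounds` rounds.

    ASSUMPTION: rounds are independent and share the same true win rate.
    """
    return math.sqrt(true_rate * (1.0 - true_rate) / rounds)


if __name__ == "__main__":
    for n in (4, 8, 16, 64):
        swing = one_round_swing(n)
        err = win_rate_std_error(0.5, n)
        print(f"{n:>3} rounds: one flipped round = {swing:.3f}, "
              f"std error at a 50% true rate = {err:.3f}")
```

With 4 rounds, one flipped round moves the observed win rate by 0.25, which is the 25% figure from earlier in the thread. Because the error shrinks with the square root of the round count, halving it takes roughly four times as many rounds.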

Sergis · Sep 3, 2014

tbh a mapper should be smart enough to be able to tell the difference between a round lost because of a crit sticky on a team full of congaing scouts and a round lost due to map features

RubbishyUser · Sep 3, 2014

Sergis said:
tbh a mapper should be smart enough to tell the difference between a round lost because of a crit sticky on a team full of congaing scouts and a round lost due to map features

You might see conga scouts, or rocket jumper soldiers, but if you aren't in the right place at the right time you won't see the crit sticky. If anything, that's what makes random crits more dangerous.

>inb4 Frozen's rebuttal?

Sergis · Sep 3, 2014

i watch my stvs five times over

i see everything

EVERYTHING

Fruity Snacks · Sep 3, 2014

RubbishyUser said:
>inb4 Frozen's rebuttal?

No rebuttal.

One of the things I failed to mention in my example was that the equation we use is like doing thousands and thousands of instances of solar system formation at once (yaaaaaaay integrals). So I fully agree with you, and with what Idolon pointed out: there is nothing wrong with having a map version tested more than once. I think we have some rules in our gamedays that kind of punish that. Maybe we need to look at those rules and amend them so that people can take longer between iterations.

When it comes to mapping, more iterations doesn't necessarily mean faster development; normally you rush so much that you lose control of the map and do more harm than good. More iterations means you're less affected by the randomness of tests: crits, stacked teams or even player count.

There's a whole science to testing things that applies to everything from maps to food at restaurants and even other random things like city planning. Everyone should look into it.

Pocket · Sep 3, 2014

I personally never even notice when I happen to be playing on a server that has crits off. I only notice when they're on because I or someone I see gets one. I don't have the best understanding of what effect they have on gameplay or balance overall; probably only Valve with their ridiculous stat-gathering knows for sure. But it's my understanding that it generally gives a boost to players who aren't skilled enough to be pro, which to me says that it evens things out more. So in theory, a map that isn't balanced well might appear to have better balance when crits are on, whereas a map that's only been tested with them off probably won't develop new issues when they're turned on.

That's what we're really aiming for, isn't it? Having maps that play properly in either environment? So ideally, any given map would be tested under both conditions, but if indeed a map that works without crits is guaranteed to work just as well or better with them, then without makes the most sense. The only question, as far as I'm concerned, should be, is that actually the case?

Trotim · Sep 3, 2014

Whether random crits are a good or bad mechanic for the game overall is a whole other can of worms. But for the record I do not think they help worse players

What I will echo regarding our tests is the sample size argument. We don't get much time to test each version of each map. When after 10 minutes a round is decided because a random crit sticky happens to kill half the enemy team, that's 1/3 of the usual test length wasted. Adding such a swingy random variable into an already chaotic test environment, as people have pointed out, isn't a good idea. Doubling or tripling the amount of testing needed to average it out isn't a practical option for us
