heatmaps.tf - Heatmaps Discussion

RubbishyUser

L7: Fancy Member
Feb 17, 2013
414
488
So Geit recently gave us the excellent tool that is http://heatmaps.tf/. For those who do not know, this site is regularly updated with data from TF2maps tests and other servers with overviews of maps and "heatmaps" of death locations on them. This means that the redness of a location on the map indicates the number of players that have died there, with red locations seeing a relatively large number of deaths and blue locations seeing very few. I recommend that everyone familiarizes themselves with it.
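
For anyone who hasn't seen how such a map is built, here is a minimal sketch of the general idea: death positions are binned into grid cells and each cell's count is scaled against the hottest cell. This is not Geit's implementation; the function name `bin_deaths`, the 128-unit cell size, and the sample coordinates are all assumptions made for illustration.

```python
from collections import Counter

def bin_deaths(deaths, cell_size=128.0):
    """Count deaths per square grid cell; deaths are (x, y) map coordinates."""
    counts = Counter()
    for x, y in deaths:
        # Floor division maps each coordinate to the index of its grid cell.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

# Hypothetical sample data, not taken from heatmaps.tf.
sample = [(10, 20), (100, 90), (130, 20), (-5, 40)]
grid = bin_deaths(sample)

# Scale to 0..1 so the hottest cell would be drawn fully red
# and sparsely populated cells would sit toward the blue end.
hottest = max(grid.values())
intensity = {cell: n / hottest for cell, n in grid.items()}
```

A real renderer would also blur each death over a radius rather than using hard cells, which is presumably what the site's radius and intensity settings control.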

This discussion, however, is not the same discussion about the particular heatmaps plugin that Geit has created. I would instead like to discuss how we should use heatmaps, not just his plugin but in general. Now I would like to say first that I do not mean to be disrespectful with the following statement or unappreciative of what Geit has constructed in his own free time at no cost to us.

My contention is that while heatmaps are plenty fun for the viewer, they don't really contain any information that couldn't be gleaned by viewing a demo or playing a game. In other words, every sentry spot, sniper location, or choke point visible with certain filters in a heatmap is one that you would have found if you were playing the game anyway.

This statement is where I'd like to start the discussion and we'll work from there. I'd like to not sway too far into suggestion territory, as that is what the other thread is for, and I'd also not like to go into test sample sizes, as that is what this thread is discussing.

Let's be cordial about this.
 

Shogun

L6: Sharp Member
Jan 31, 2014
260
220
My contention is that while heatmaps are plenty fun for the viewer, they don't really contain any information that couldn't be gleaned by viewing a demo or playing a game. In other words, every sentry spot, sniper location, or choke point visible with certain filters in a heatmap is one that you would have found if you were playing the game anyway.

The point of heatmaps is that they keep track of all that for you. If you watched a demo or played the map, that's great, but where are you going to keep that info? Your head.

Also, heatmaps do more than just show you chokes, sentry spots, and sniper lines. Let's say you have a KOTH map, and 85% of player deaths are concentrated in areas far away from the point. Now you know that the flow of combat isn't taking players towards the point, or they don't want to cap the point or whatever.
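
As a rough sketch of that KOTH check, assuming you had the raw death coordinates (which the site may or may not expose), you could measure what share of deaths fall outside some radius around the point. The function name, the 500-unit radius, and the sample positions are hypothetical.

```python
import math

def share_far_from_point(deaths, point, radius):
    """Return the fraction of deaths farther than `radius` from `point`."""
    if not deaths:
        return 0.0
    far = sum(1 for d in deaths if math.dist(d, point) > radius)
    return far / len(deaths)

# Hypothetical data: control point at the origin, four recorded deaths.
control_point = (0, 0)
deaths = [(100, 0), (600, 0), (0, 900), (1000, 1000)]
far_share = share_far_from_point(deaths, control_point, radius=500)
```

A high share would hint that the fighting is happening away from the objective, though what counts as "far" depends on the map's scale.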
 

YM

LVL100 YM
aa
Dec 5, 2007
7,135
6,056
A heatmap is an aggregate of information. It contains all deaths, and all kills. Just from watching replays/playing/spectating, you could never take in 100% of the information, witness 100% of the kills and take notes of their locations. Heatmaps definitely contain information you can use.
 

Dain

L3: Member
Oct 21, 2009
106
43
With a single demo you don't know whether what you're seeing is representative of how the map plays, or whether it's some kind of outlier. Heatmaps aggregate all the info to smooth over irregularities so you can see the underlying trends.
 

Fruity Snacks

Creator of blackholes & memes. Destroyer of forums
aa
Sep 5, 2010
6,394
5,571
My contention is that while heatmaps are plenty fun for the viewer, they don't really contain any information that couldn't be gleaned by viewing a demo or playing a game. In other words, every sentry spot, sniper location, or choke point visible with certain filters in a heatmap is one that you would have found if you were playing the game anyway.

No. This is just absolutely horribly incorrect.

You can't be everywhere in a test. You can't remember every single thing. And if you think that you're all-seeing during a test, you need to think again.

Heatmaps can tell you the following, among other things: map flow, player density, timing, route viability, preferred routes, progression of difficulty (and thus difficulty of objectives), sightlines, sentry spots, AND MORE. You can TRY to glean this info during a single test, but it will not be as efficient or comprehensive as a single heatmap. You can try and go with your gut, but it's really not what you want to do on your own. Data is important, very very important.

This video is a really good video on data driven development.
 

RubbishyUser

L7: Fancy Member
Feb 17, 2013
414
488
First off I'd just like to say that that's a great video Frozen posted. It's also great at giving an insight into Valve's cautious development cycle, and as a result they can polish their games to a level no one else can.

Firstly, designing based upon the conclusions of massive data gathering is something that the heatmaps are undoubtedly excellent for. In fact, these heatmaps might even be the first actual step towards data-driven design for custom maps.

But that doesn't mean that we're using it yet as a community, or even that such a method would be palatable. I described Valve's development cycle as "cautious" earlier - certainly, every product and update they release seems to be rigorously tested and maintains the company's high standard of finished work.

And that might make sense if your audience was as extremely discerning as Valve's and would rip apart anything released that was unfinished or poorly tested. In fact, if someone complains about a Valve product it sometimes seems that the person is at fault more than the company. But at TF2maps, or internally at Valve, we have the luxury of lower standards. In other words, a2 need not fix every problem with a1 - just the most apparent, or at least the ones that struck the mapper most.

Wow, I just deleted like 5 more paragraphs that cross-examined that awful last sentence. I'll leave that as an exercise for the reader. I think ultimately that my point is that being accurately informed about the current state of the map matters little in the progression of the map because the changes made in the future will partially invalidate any conclusions drawn.

As a result, in order to save time and sanity, making data-driven decisions only works on the large scale, as a check to ensure that you have headed in the right direction. This is unsuitable for heatmaps, as you cannot reasonably combine data from several iterations of a map when its shape and the locations of deaths change. Waiting for 4000 deaths on a1 of your map to tell you that there is a massive splodge in BLU spawn is vastly less efficient than trying to fix the problem as you detect it and continuing to monitor the situation.

This post has been chopped and changed a bit so do let me know if anything is complete garbage as it's too late at night for me to check.
 

Fruity Snacks

Creator of blackholes & memes. Destroyer of forums
aa
Sep 5, 2010
6,394
5,571
But that doesn't mean that we're using it yet as a community, or even that such a method would be palatable. I described Valve's development cycle as "cautious" earlier - certainly, every product and update they release seems to be rigorously tested and maintains the company's high standard of finished work.

This is a fair point and a good thing to look at: are we as a community at the level of development where a heatmap plugin is beneficial or used? There are plenty in this community who use it... It's been a big part of snowplow's development, and I'd hope to use it on any future maps that I work on.

Another point you mention is that for an a1 you don't really need it. Yeah, it's not *as* important, but you should still glance at it. The later development goes, the more you should analyze your heatmaps. You also bring up that waiting for 4k kills isn't really viable; frankly, not many people here do that for heatmaps. The community's attitude is kind of non-promoting of that type of development, that is, repeat testing of the same version. It's very frustrating (as I think Aly has experienced a lot most recently) to put a version up to test 2-3 times over a week, only to have it rtv'd off in the first 5 minutes of the second test because 'we've played this already.' That's not how development works; one 30-minute test isn't enough. (/rant). Yes, there are some maps that have GLARING issues that one 30-minute test can outline and provide fixes for, but for more quality maps or maps later in development, each version should be tested for 2-3 hours.
 

RubbishyUser

L7: Fancy Member
Feb 17, 2013
414
488
Well, I can't vouch for using heatmaps on maps in late alpha or beta, so I can't disagree in good conscience. I can't personally say I've experienced a single map being rtv'd from a gameday because we had played it too much, so maybe it's worth seeing whether it's just a couple of troublemakers on the NA server (I'm usually in the EU gamedays).

But I have a new concern. We've talked about how heatmaps are able to verify or counter individual experiences by aggregating the results of several tests on the same version - but are we certain that the data is supporting our facts?

For example, a month or so ago we played pl_inari - Aly's contest-winning farm-based payload map - and both teams had difficulty capping and progressing from the first point. I haven't seen that trouble before or since. Because Aly has been rigorous in her testing, it was pretty clear this was anomalous - something caused by team composition or whatever instead of the map.

But I can't see how you can justify a decision based on the heatmap. Here, Aly could look at the heatmap and say "it's a problem because there's a big red splodge at first and just outside the door, better fix that" or "well, people are dying at first but the splodge isn't as large as second or third, so it's not actually a problem". In other words, can the heatmap be used to justify almost any interpretation of the map? We say that concentrated deaths are bad but a map with no chokepoints would be a complete deathmatch.

I don't have the answers here, and I'm not accusing heatmaps of being useless evidence. I genuinely don't know the answer and I fear that we use them to back up our actions without any anecdotal evidence to say they support our conclusions.

I suppose there's only one real way to test whether it's our experience of the play that determines the results from a heatmap or the power of the heatmap itself - making a map that relies entirely on feedback from the heatmap.

Hmmm. koth_heatmap on its way.
 

Fruity Snacks

Creator of blackholes & memes. Destroyer of forums
aa
Sep 5, 2010
6,394
5,571
Anomalous events are a bit weird, but you can see them. Normally, if something is stalling, you're right, it's obvious that "this is bad", but you aren't looking for 'blobs', as you say; you're looking for the area density. They sound the same, but they mean two different things.

Aly, hope you don't mind, but we're going to use inari a6 for this. Using this heatmap, you can tell a lot. I don't remember if I played this, I don't remember when it was tested, but it had the most kills, so we're going to use that.

It's not as resolved as we'd like, but I've set the intensity and radius to compensate for that. So, from this, I can hypothesize that the first point is very easy and relatively quick (as it should be). The second point is rather tough. The third point's front line is juuuuust out of place, and the finale is pretty tough, but not as tough as 2nd.

How did I do that? Comparing and filters. I use the standard view (all kills, no filters) as the baseline. You can see that the density of kills around point 2 is relatively the same as around the finale. That's pretty rough. If you were looking at blobs, you'd say that they were the same, but if you're looking at the density of kills (or, I guess, the density of blobs), you can see there are a lot more red blobs on point 2 than on 4, and they are a bit more dense. "But, fr0z3n, Red has a lot of bigger blobs!" Yes, that's fine, and it's hard to differentiate between the two. So, let's go deeper!

Engineers! The pivotal sign of a defensive area: turrets. If we turn on the killer class filter and set it to engineers, we can clearly see the sentry spots (relatively circular bright spots)! Sweet! So, let's look at the sentry spots. You can see that on final there is one very evident sentry spot, right by point 3, but not too many others on final. It's very bright red, which means that there are a lot of kills on/near that spot. There's a bright one before 3 too, and that's something to note. Looking at point 2, you see multiple bright spots, especially on some big routes. The density around this point is a little high for the 2nd point of a map. That's something important to note: across the yard, point 2 is getting more sentry kills than point 3 or 4. This leads me to believe that engineers get more rooted around 2 than around 3/4. This means that 2, on average, is harder than 3/4. This is further supported by my previous paragraph.

Want more support for this? (I don't want to explain it in full because this is long enough already.) Filter the Red team as the killer, and you can see that the sentry spots on 2 REALLY light up compared to the rest; this means they were working overtime compared to points 3/4. Want EVEN MORE? Using the Red team filter, switch it to the victims. Red is dying a lot more on 3/4 than on 2, but 2 is still more dense. This leads me to the hypothesis and conclusion that since Red is staying alive more around point 2, and dying less than at other points, they have a much stronger hold there than on points 3/4. They can survive more; they have more advantage.

Again, using the density of kills in the first paragraph, and the sentry filter, we can tell that the 3rd point into the 4th is pretty tough, but after that choke, the final becomes pretty easy to take out, with just a lot of kills because of the pit. So, with JUST TWO HEATMAPS, I would rate the progression of difficulty of the points (from LEAST difficult to MOST difficult) as 1 < 3 < 4 < 2. This is a little messed up, and Aly did make changes on this that smoothed things out. As such, I claim my hypothesis is correct, based on the supporting evidence.
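
The killer class filter step above can be roughly sketched as follows, assuming each kill is stored as a record with the killer's class and the victim's position. The field names `killer_class` and `victim_pos`, the cell size, and the sample records are hypothetical and are not the site's actual data format.

```python
from collections import Counter

def hotspots(kills, killer_class, cell_size=128):
    """Rank grid cells by the number of kills made by one killer class."""
    counts = Counter()
    for kill in kills:
        if kill["killer_class"] != killer_class:
            continue  # the filter: ignore kills made by other classes
        x, y = kill["victim_pos"]
        counts[(x // cell_size, y // cell_size)] += 1
    # most_common() sorts cells from the most kills to the fewest.
    return counts.most_common()

# Hypothetical records, not real pl_inari data.
kills = [
    {"killer_class": "engineer", "victim_pos": (260, 300)},
    {"killer_class": "engineer", "victim_pos": (270, 310)},
    {"killer_class": "engineer", "victim_pos": (300, 290)},
    {"killer_class": "engineer", "victim_pos": (900, 100)},
    {"killer_class": "soldier", "victim_pos": (260, 300)},
]
sentry_spots = hotspots(kills, "engineer")
```

Cells near the top of the list are candidate sentry spots; comparing how many of them cluster around each point is the density comparison described above.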
 

Fruity Snacks

Creator of blackholes & memes. Destroyer of forums
aa
Sep 5, 2010
6,394
5,571
tbh it doesnt tell me anything critical i didnt knew after a test or two

also, heatmaps dont care where the round ended. if point 2 is as hard as last but the game doesnt get to last half the time heatmaps will lie and tell point 2 is twice as bad as last

This is true, and while it would be nice to sort through data per-round, it's not necessarily needed. If it happens consistently, it will show up clearer than an anomalous event in a single round, so I'm not worried about it. (Note, this only applies to when a map is tested multiple times ... as it should be)
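
For what it's worth, the per-round idea could be sketched like this, assuming you knew how many rounds actually reached each point (the heatmap alone doesn't give you that). The function name and the numbers are made up to mirror the example in the quote.

```python
def deaths_per_round_reached(deaths_by_point, rounds_reached):
    """Divide each point's death count by the rounds that got that far."""
    rates = {}
    for point, deaths in deaths_by_point.items():
        reached = rounds_reached.get(point, 0)
        if reached:  # skip points that no round ever reached
            rates[point] = deaths / reached
    return rates

# Hypothetical totals: last is only reached in half of the rounds.
deaths_by_point = {"point 2": 400, "last": 200}
rounds_reached = {"point 2": 10, "last": 5}
rates = deaths_per_round_reached(deaths_by_point, rounds_reached)
```

With these numbers the raw totals make point 2 look twice as deadly as last, while the per-round rates come out equal.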