Full disclosure: I am not a level designer. I have started the slow path to becoming one, but this article is based on my knowledge and experience in a non-gaming environment. Secondly, I am publishing this earlier than I planned because Kyouryuu mentioned this exact same thing in his comment yesterday and I didn’t want to seem as if I was stealing his point.
There must be many different ways to define, describe and measure levels within a mod or game. A while back I created a large list that I planned to use as a benchmark for pattern searching within popular mods. I still haven’t completely given up the idea.
One particular aspect of mod design that interests me is the ability to build levels that elicit emotions specifically aimed for by the designer. It’s possible to separate the technical ability of creating levels and the ability to create “interesting” levels. In many designers these skills develop side by side over time but the second aspect is rarely discussed and examined – at least not amongst modders.
I tried to start a website many years ago called “The Art Of Level Design”. Its aim was to provide a central place for designers to talk not about the technical aspect; the engines, the editors, etc, but the things that are true across all games and mods.
For whatever reason it never worked and was lost in the Internet graveyard. Recently, I found a site called World Of Level Design via my referrers and it seems to have the same objective. The difference is that it is run by a real designer rather than a “wanna be” like me. I highly recommend you visit it.
Back to the subject at hand. Technical skill is easier to acquire because there are plenty of tutorials around. Learning how is not the problem; it’s the why that is interesting.
So, what happens is that modders can build walls, rooms, beaches and can place enemies, weapons and puzzles within these areas but might still lack other skills that are required to make a mod great. One of these aspects is what I call Excitement Management, although I am sure it has a proper name in designer circles.
How often do new modders know what their mod will look, feel and play like when finished? Do they know how they plan to control the players’ emotions at different points within the mod? I doubt it. I am sure that many mods “just” develop until they seem finished.
Enough preamble, let’s get to the point of this essay: Excitement.
To manage how excited players feel you must be able to measure it. There are plenty of ways to do that: heart rate, skin conductivity, brain wave patterns and probably a bunch of others I can’t even imagine. But how many amateur modders have access to those kinds of tests?
I suppose it’s quite easy to find a cheap heart rate monitor of the kind used for exercise, and modders might be able to get their friends to play their mods while wearing one, but how many would really do it?
What we need is a simple method that provides useful information.
Back in my tennis coaching days we used a method called “Player Feedback”. The player would be given a simple task that had one variable; let’s say how tightly they gripped the racket. The coach would feed the ball and try to keep the returns as consistent as possible, to minimize other variables. The player would hit each ball back and, after each shot, shout out a number that corresponded to how tightly they were holding the racket when they hit it. For example, 1 for so weakly the racket would fall out of their hands and 10 for the tightest possible.
What happens is that the player eventually finds a grip strength that works for them with that type of shot. It’s quite apparent what works and what doesn’t because the quality of the shot is immediately evident.
From a teaching point of view it works well because the player/pupil learns something without being explicitly told; they are part of the learning process rather than just being given the information. The more they do this type of exercise the better they become at judging small differences, making the process even better.
Now let’s take that concept over to amateur beta testing. I have never tried this so the whole thing may fail completely!
Imagine sitting down with a play tester and as soon as they start playing, you begin timing them. Then, every 30 seconds you ask “How excited are you?”. They respond with a number between 1 and 10. 1 is “falling asleep” and 10 is “I’m ready to explode”. After the first question you simply ask “now?” and so on.
At the end of the map you finish the timing. Now convert this information into a graph; something like this might appear:
Collect enough of these and a pattern should emerge.
Without doubt, different players will take different amounts of time to complete levels, they will use different routes and paths, and find different things. If you simply overlay the charts on top of each other by making them all the same height and length, the time taken won’t be so important.
Perhaps the players can take these measurements themselves but somehow I doubt it. Perhaps the mod team could create a recording where they say “Now?” every 30 seconds and the beta testers could have a sound recording programme running in the background, but it sounds like more trouble than it’s worth.
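For anyone who would rather do the overlay with a script than by hand, here is a minimal sketch of the idea, not a tested tool. It assumes each tester's answers are stored as a plain list of 1 to 10 ratings, one per 30-second question, so the height is already the same for everyone. The function names and the sample numbers are made up for illustration. Each list is stretched or squeezed onto the same number of points and the results are averaged.

```python
from statistics import mean


def resample(ratings, points=20):
    """Stretch or squeeze one tester's ratings onto a fixed number of
    points using linear interpolation, so sessions of different lengths
    can be laid on top of each other."""
    if not ratings:
        raise ValueError("a session needs at least one rating")
    if points < 2:
        raise ValueError("points must be 2 or more")
    if len(ratings) == 1:
        return [float(ratings[0])] * points
    out = []
    for i in range(points):
        # Position of this output point along the original list.
        pos = i * (len(ratings) - 1) / (points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ratings) - 1)
        frac = pos - lo
        out.append(ratings[lo] * (1 - frac) + ratings[hi] * frac)
    return out


def average_curve(sessions, points=20):
    """Resample every session to the same length and average them
    point by point."""
    resampled = [resample(s, points) for s in sessions]
    return [mean(column) for column in zip(*resampled)]


# Made-up ratings from three hypothetical testers who took
# different amounts of time to finish the same map.
sessions = [
    [2, 3, 5, 8, 6, 9],
    [1, 4, 4, 7, 9, 5, 8, 10],
    [3, 3, 6, 9],
]
curve = average_curve(sessions, points=5)
print([round(value, 1) for value in curve])
```

Averaging is only one way to look for the pattern. Plotting every resampled line on the same chart would show the spread between testers as well, which a single average hides.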
I accept that this is a very crude way of trying to record one particular aspect of a player’s experience, but if this essay encourages discussion, even at the expense of ridiculing this particular idea, and modders begin thinking about things like this, then I have done some good.
Think Backwards
So far I have talked about measuring the perceived excitement levels of players but what if you drew a graph and decided to build a mod around that, rather than the other way? Could this be an interesting experiment for a PlanetPhillip mapping competition?
I think it would be both interesting and fun to see how designers create levels. What are your thoughts?
What about doing this for finished mods? Would we find a pattern that these popular mods stick to? Maybe. Have the modders done this on purpose? Maybe. Is anybody interested in participating in a rough experiment?
Other methods?
Do you know of other methods modders can use to measure players’ emotions when playing mods? Could we perhaps create a list for modders to use? Would be kind of fun.
Clearly, there are lots of different ways to beta test but I am interested in writing articles about ones that community modders can use. I still feel that player commentary is very useful and more discussion about these methods can only be a good thing.
I think someone who knows almost nothing about level design can make an excellent map. Knowing how to make a map and how to make a map exciting are two entirely different things.
Though sometimes excitement isn’t the objective of a map.
Also rating your excitement intermittently would kill immersion.
If I make a map, you can be sure that gameplay is top of the list. Though you’re right, there is no way of telling because you don’t know what everyone else will think.
Yes, it would kill immersion, but remember it’s a test and only testers would be required to use it. For some tests immersion must be lost or the test won’t work.
Why not just use a basic pulse recorder that tracks heartbeat? You could clip it on the ear and then not even ask the player any questions at all. Just make sure to note what times they hit each chapter and other specific events, so that these can be correlated to peaks and dips in the measurements.
It may not be accurate enough. Also, the heart rate isn’t always connected with excitement. In addition, perceived excitement and real physiological excitement may not match. Ones that record the heart rate are more expensive than others and you would also need a baseline to work from.
Baselines aren’t hard to get when you have the basic equipment. I’ve heard that Nintendo is planning an attachment to the Wii remote that measures pulse. So that may eventually become a relatively cheap method for research.
You do have a point that heartrate isn’t everything, but it is something that should be made use of in this kind of research, even if you also have the questions on top of it. Video recording should be used along with this so that you can tell exactly when the questions are asked and how much influence that has on heartrate and the like.
It’s just a matter that if you’re going to study something like this, you should do it right.
With endless amounts of experience conducting beta-testing on my mod, I dare say you are overcomplicating things.
The face. It’s a direct interface to the mind. Imagine sitting in a room next to a tester, and not only paying attention to what he is doing ingame, but also paying attention to the facial expressions.
I often bring people in to test my mod in person. I sit there watching them, telling them that I cannot interact, I am just there to watch. They often tell me if they are confused, if they like the mod, or whatever is on their mind. With a little notepad, I can write down all their comments and my observations. This is much more powerful than monitoring the heartbeat, although I admit that could be useful.
Of course, watching people only works in person, unless you use some kind of video link. Another interesting fact is that you can gain most of this information from watching recordings of people playing. External playtesters often sent me .dem files containing their playtest. By watching, I can easily detect if they are confused, or if they are determined and know what to do.
Using these methods involves a lot of interpretation, but video game development is not a science, it’s an art.
I don’t agree that I am complicating things. I am offering another avenue for testing. Not everybody has the opportunity to bring people in to play mods and it’s possible to use many different methods to achieve a balanced result.
There is little doubt that actually being in the same room as the tester is one of the best ways, but not the only one and some testers might not like it. I would find it very annoying to have you looking at my face when I play, but sitting behind me might be ok.
Vaguely related, but here’s a video of some nifty tech that captures a player’s eye movement while playing Left 4 Dead.
http://www.youtube.com/watch?v=N9S-S3ugi8A
I found the commentary video very interesting. I like the idea of watching a recording of someone play with a running commentary. You can see what the player is doing and thinking as well as see how they approach certain situations, some of which may not be what you expected. The best part is you can go back and watch it again if you missed or forgot something.
It may not be quite as good as sitting in on someone play but it certainly makes watching a bit more convenient.
I mentioned over in the forum about this fellow who has been doing Let’s Play videos with mods:
http://www.youtube.com/user/HumanHighways
It’s pretty interesting!
I agree with Joe that the one method of asking the player every 30 seconds “on a scale of 1 to 10, how excited are you?” would be an immersion-killer. Not only that, but there would be a high chance of polluting whatever data you were gathering from the test. The simple act of observation changes the outcome of a test, but actively probing the subject for information changes it even further. Merely asking a question over and over could directly affect the data; the player might progressively get less excited and enthused because of it and answer accordingly. For something as fragile as genuine immersion and excitement, a more passive observation is probably necessary.
I have to say I like the video of you testing a mod and speaking your mind, it’s probably a more effective tool for gauging player experience. Many players won’t report exactly how they’re feeling or what they’re thinking, but it’s of more use than the constant “are you excited” questioning because the player isn’t being interrupted. It’s purer feedback.
From listening to the commentary provided in Left 4 Dead 2, Valve’s current system is an upgraded version of their previous method (of having a tester, and a quiet observer sitting with them). What they do is run webcams to all the testing computers, recording video of the tester and the screen. They pipe this video back to one observer sitting at a computer, who can look at all of the tester areas at once. Further, all the video is recorded, so they can go back to any point they want and evaluate the tests.
In my own case and with my own mod under development, I’ve had to rely on much more meagre tools. In most cases players would report back everything they could think of over Steam messages or the like, saying where they got stuck and what they think could be better, etc. I listened into the playthrough of a friend over the phone, which was interesting because he talked a lot and I gained a lot of information almost effortlessly. Sometimes I would pick up on non-verbal reactions too, such as apprehension and engrossment, which clued me in that I was headed in the right direction.
Firstly, I would like to commend the author of this article on a fine, comprehensive and thought-provoking essay. The main topic was presented well without relying on the “argument” format.
I believe completely that the most valuable tool a developer can have at their disposal is direct feedback from a testing player, especially through a one-on-one session between the two after the developer has closely observed the tester during game play while noting key reaction types and cross-referencing them to the specific occurrence points in the game. During the interview, a developer can gain information through verbal constructive criticisms, comments, and answers to specific questions from the developer regarding the tester’s experience as a whole.
While I agree that other more scientific methods are also effective in gathering various biological and psychological stimulus responses and feedback, there are such a large number of factors and variables which must be considered that I believe that, unless involved in a closed study of some type, these methods are of far less use to the developer than other more practical approaches.
Even when using a more conventional method of gauging the tester’s responses, such as direct observation (or, most practical to me, a video of the tester during game play, side by side with the synchronized in-game video of the testing), there are basic variables which must be considered: age, sex, the tester’s level of gaming experience, personality type, personal game play style, etc. These are easily gathered through a short screening, but are still important to note against the end results. I believe this basic information is vital regardless of the method chosen.
I also favor this method in that you can film the tester while having no personal contact during the tester’s game play, thus preserving the integrity of the player’s “immersion” level and subsequently gaining more accurate results.
Granted, a lot of modders and developers may not have the means and or time to use some of these methods, but there’s plenty of room for creativity and flexibility as to how they might go about gathering the information they’re looking for.
I do think, though, that for the purpose of hosting an experiment here on the site and gathering developers’ and general modders’ feedback on their personal results using different methods, it would be very interesting and fun to see what would be submitted and how it may or may not influence their creative style.
I also agree that this topic is well worth expanding and exploring further.
I read Phillip’s post several times and then decided to do a bit of research.
I found some sites very helpful in sorting out my thoughts. Other players might too; the links are at the bottom of the comment.
From these and other sites I get the impression that retail game development houses, through gritted teeth, acknowledge Valve as the leaders for playtesting techniques including measuring excitement and how to control it.
I have to agree with Joe, Sortie and Botolf. jjawinte makes very good points.
“Excitement: Measurement and Management” is something that Valve have got covered and, for sure, will build on (see links below).
Unfortunately I doubt that it is practicable for Source mod makers to try to emulate any of Valve’s playtesting techniques other than by remote feedback from playtesters during development and players’ comments after release.
Even if it is practicable, is it affordable?
Cheap gizmos are not going to get the job done reliably and there is the problem of interpreting the results, which is a science in itself.
Is it actually necessary? Just look at the mods of 2009, a great year for mods, here are some of them: Dangerous World, Unexpected Conclusion, Strider Mountain, Slums 2, Research and Development, Mission Improbable, Calamity, Project 25, No Escape, Eye of the Storm………. These developers probably did not have the instruments or the PhDs needed for interpretation but what a fine job they did.
They got me pretty excited at various points.
However Phillip said “Clearly, there are lots of different ways to beta test but I am interested in writing articles about ones that community modders can use. I still feel that player commentary is very useful and more discussion about these methods can only be a good thing.”
How right he is.
The links for those interested (or just Google “valve playtesting”):
Links:
http://www.valvesoftware.com/publications.html
In particular “Valve’s Approach to Playtesting: The Application of Empiricism,”
The rest of it is worth a look see.
http://uk.pc.ign.com/articles/966/966972p1.html
Which gives a bit of information on Valve’s testing techniques
As does: http://gamearchitect.net/2009/03/29/gdc-2009-the-year-we-went-hungry/ but with a little bit more detail.
You will need to scroll down a bit to ‘Valve’s Approach to Playtesting’
If you really want to get into this in a big way then:
http://www.bth.se/fou/forskinfo.nsf/17e96a0dab8ab6a1c1257457004d59ab/301f20d8b293f29ec12575c50058ad46/$file/Nacke-etal-Game-Metrics-Panel.pdf
You will deserve a Diploma if you go through all this lot!