Monday, November 18, 2013

[Part 1]

Welcome back, imaginary game developer. So now that we know why you might want to include physiological data in your user-testing experiments, let's get into the how. You're going to need your normal lab/experiment area, test subjects, recording equipment, and cash. This stuff can be expensive, so it's important to have a good idea of what you're looking for ahead of time.

If you want to know how people react to bullets, go to Aperture.

The first step in this process is going to be determining what kind of information you want to collect. Different monitoring devices have different strengths and weaknesses. Measuring how tightly someone grips a controller, for example, can be a good way to track how aroused someone is, but it doesn't tell you much about valence (whether that arousal is positive or negative). The physical size and shape of the equipment you will be using is also going to vary, as will its potential effects on gameplay and user experience.

Some Physiological Monitoring Methods
The items on this list are techniques which have been used in the user-testing experiments with which I am familiar and which have already been shown to produce reliable/valid data. It is by no means a completely comprehensive list of all possible user-testing methods.

Electroencephalography (EEG)
This topic actually got its own post, but to review: Electroencephalography is the recording of electrical signals along the scalp. You can track them with a sensor net, an EEG cap, or, now that there's a market for them, a headset. These devices have been shown to be pretty useful at identifying critical in-game events, but they can be rather invasive, expensive, and time-consuming. Also, the data collected from them can be fairly difficult to interpret.


Useful for: Monitoring attention/boredom, possibly identifying common patterns and behaviors, and identifying in-game events which may trigger significant changes in focus.


Eye Tracking and Electrooculography (EOG)
Eye tracking is a good way to get an idea of where your users are looking during play, as well as how fast their eyes are moving. By far the most common way to do this is with camera-based eye tracking systems.

The second method, Electrooculography, measures the resting potential of the retina, which changes based on eye orientation. EOG measurements are already used in motion capture to faithfully track the positions of actors' eyes, and have the added advantage of being fairly non-invasive, as the electrodes used do not interfere with the subjects' field of vision.

On the other hand, the lack of any standardized electrode configuration means it's difficult to compare your results with those of other researchers, the signal itself can be cluttered with blinking-artifacts (which are exactly what they sound like), and the whole setup requires a much higher sampling rate than other methods.


Useful for: Determining where people were looking and for how long, identifying distractions or areas of interest, and gauging users' ability to identify important in-game elements visually. Unfortunately, gaze is not a functional proxy for "attention," nor does it give you much information (in 3D environments) about how far "out" someone is looking.
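To make "how long they were looking at whatever was there" a bit more concrete, here is a minimal sketch of tallying dwell time per screen region. It assumes your eye tracker gives you (x, y) gaze points in pixels at a fixed sampling rate; the `AreaOfInterest` class, the `dwell_times` function, and the region names are all made up for illustration and don't come from any particular eye-tracking package.

```python
from dataclasses import dataclass


@dataclass
class AreaOfInterest:
    """A rectangular screen region, in pixels (hypothetical helper)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def dwell_times(samples, areas, sample_period_s):
    """Total time (seconds) the gaze spent inside each area.

    samples: iterable of (x, y) gaze points recorded at a fixed rate.
    sample_period_s: seconds between consecutive samples (assumed constant).
    """
    totals = {area.name: 0.0 for area in areas}
    for x, y in samples:
        for area in areas:
            if area.contains(x, y):
                totals[area.name] += sample_period_s
                break  # count each sample toward at most one area
    return totals


# Illustrative regions and gaze points, not real data.
areas = [
    AreaOfInterest("minimap", 0, 0, 100, 100),
    AreaOfInterest("health_bar", 800, 0, 1000, 100),
]
gaze = [(10, 10), (12, 11), (500, 500), (900, 50)]
print(dwell_times(gaze, areas, sample_period_s=0.5))
```

Samples that land outside every region are simply ignored here. Keep in mind the caveat above: a long dwell time tells you where the eyes were pointed, not that the player was actually paying attention.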


Electromyography (EMG)
EMG records the electrical activity produced by skeletal muscles (See: How Signals Travel Within Neurons for more detail). Facial EMG in particular can be useful because it allows one to track the muscles involved in making facial expressions like smiling or frowning, and as such, can give you a good idea of the nature (positive or negative) of subjects' emotional states during play (Giakoumis et al. 2009).

Unfortunately, even if your test subjects were totally comfortable with a bunch of sensors on their faces, you'd still have to sacrifice their ability to speak. You're going to get enough hassle from normal recording artifacts, so you don't want to create any more from people talking.


Useful for: Serving as a proxy for valence, since it captures the activity of the muscles involved in making facial expressions.


Also your test subjects may have great poker faces.

Galvanic Skin Response (GSR)
Galvanic Skin Response, also called skin conductance, is a measure of the electrical conductance of your skin, and varies based on how moist your skin is at any given point in time. GSR can be used as an indicator of arousal because your sweat glands are controlled by the sympathetic nervous system.

Possible issues with this method come from several places. The temperature and humidity in which you are operating can have a significant effect on readings, and make the task of comparing readings from different sessions rather difficult. Internal factors, both biological and psychological, can also lead to depressed readings, or a complete lack of significant variation, depending on the subject.
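One common workaround for the cross-session comparison problem is to rescale each subject's readings to their own range within a session, so that you compare relative changes rather than raw conductance. The sketch below assumes the readings arrive as a plain list of numbers; `range_correct` is a name I made up, and this is a general-purpose trick rather than something taken from the studies mentioned in this post.

```python
def range_correct(readings):
    """Rescale one session's readings to the 0..1 range.

    0.0 is the lowest value seen in the session and 1.0 the highest.
    A completely flat signal has no range to rescale, so it maps to
    all zeros instead of dividing by zero.
    """
    if not readings:
        return []
    lowest, highest = min(readings), max(readings)
    if highest == lowest:
        return [0.0 for _ in readings]
    span = highest - lowest
    return [(value - lowest) / span for value in readings]


# Two illustrative sessions with very different baselines.
cool_dry_room = [2.0, 4.0, 6.0]
warm_humid_room = [12.0, 14.0, 16.0]
print(range_correct(cool_dry_room))
print(range_correct(warm_humid_room))
```

Both sessions come out on the same scale even though their raw baselines differ. Note that this hides the "no significant variation" case rather than fixing it: a subject whose readings barely move will still be stretched to fill the full range, so it's worth checking the raw span before trusting the rescaled numbers.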


Useful for: Monitoring arousal/stress, identifying in-game situations (not one-off events) which may increase stress or arousal over time.


It depends on the person.

Cardiac Responses
There are actually a number of different cardiac responses one can track for the purposes of measuring arousal, including: interbeat intervals, heart rate, heart rate variability, and blood pressure. The obtrusiveness of these methods depends on what's being tracked, and can range anywhere from arm cuffs to video analysis. Metrics like heart rate variability and systolic blood pressure, for example, have been shown to be fairly reliable indicators of "invested effort" and can be used to detect immediate changes in mental workload. The limitations depend on the technique being used, and, as with the other metrics, readings can differ significantly between individuals.
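Several of these metrics are just different views of the same raw signal. As a rough sketch, assuming your equipment hands you a list of interbeat intervals in milliseconds, you can derive both mean heart rate and one simple heart rate variability measure. I've picked RMSSD (the root mean square of successive differences) because it is a standard time-domain measure and easy to compute; that choice, and the function names, are mine and not a recommendation from the research above.

```python
import math


def heart_rate_bpm(ibis_ms):
    """Mean heart rate in beats per minute from interbeat intervals (ms)."""
    if not ibis_ms:
        raise ValueError("need at least one interbeat interval")
    mean_ibi = sum(ibis_ms) / len(ibis_ms)
    return 60000.0 / mean_ibi  # 60,000 ms in a minute


def rmssd(ibis_ms):
    """Root mean square of successive differences between intervals (ms)."""
    if len(ibis_ms) < 2:
        raise ValueError("need at least two interbeat intervals")
    diffs = [later - earlier for earlier, later in zip(ibis_ms, ibis_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Illustrative intervals, not real recordings.
intervals = [800, 900, 800]
print(heart_rate_bpm(intervals))
print(rmssd(intervals))
```

Because individuals differ so much, numbers like these are usually more meaningful compared against the same subject's own baseline than against other people's.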


Useful for: Identifying immediate responses to game events, monitoring arousal, and serving as a proxy for subjects' mental workload.



Respiration 
While respiration is indeed sensitive to changes in mental workload and emotional states, it's also one of the few physiological responses used in user testing that almost everyone can control, consciously, without much effort or training. As such, while it can be easy to measure, it may not be as useful as the other methods outlined above if you're looking for raw physical responses. Also, as with facial EMGs, your subjects wouldn't be able to talk while you're recording their breathing.


Useful for: Testing games which may contain an actual physical element to gameplay in addition to, or in lieu of, traditional input.

You are now breathing manually.

Designing the Experiment
Once you know what you will be collecting, keep the following questions in mind when designing your experiment:

1. Are we making sure it works, looking for general input and information, or are we doing an actual game-design experiment?
User testing is not quality assurance. Quality assurance is, and should always be, conducted as a separate activity. It has a specific goal, and the people who do it need to be intimately familiar with the game they're testing, which means they literally cannot give us the kind of information we're looking for in the other two situations.

If you are looking for feedback, then you want to collect as much information as possible, about as many game elements as you can. Your goal will be to identify anything which might be interesting, troublesome, or worth expanding upon. Ask questions not only about gameplay ("Were the goals clear?") but about emotional experiences as well.

If, however, you would like to test something specific, something that you can use as an actual test variable, then you will be performing an actual experiment, and you should act accordingly. Strive for consistency and repeatability, and never change more than one variable at a time.

2. How is the physiological data we collect going to be used?
Next, make a decision about when you are going to use this information. You can collect data for post-test analysis, record data to be played back immediately as a means of guiding post-gameplay interviews, or do both.  (...or some shiny new method that you just came up with, in which case I want to hear about it.)
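For the post-test analysis route, one thing you'll probably end up doing is lining up your recordings with a log of what happened in the game. Here's a minimal sketch that pulls out the slice of a signal surrounding each logged event. It assumes both the samples and the event log use the same clock, in seconds; the function name and data layout are hypothetical.

```python
def window_around_events(samples, event_times, before_s, after_s):
    """Collect the signal values recorded near each game event.

    samples: list of (timestamp_s, value) pairs.
    event_times: timestamps (seconds) of logged in-game events.
    Returns a dict mapping each event time to the values recorded from
    before_s seconds before it to after_s seconds after it, inclusive.
    """
    windows = {}
    for event_time in event_times:
        start, end = event_time - before_s, event_time + after_s
        windows[event_time] = [
            value for timestamp, value in samples if start <= timestamp <= end
        ]
    return windows


# Illustrative one-sample-per-second signal with a single event at t = 2 s.
signal = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
print(window_around_events(signal, [2], before_s=1, after_s=1))
```

The same windows work for the playback approach: instead of analyzing them, you show them to the subject during the interview and ask what was going on at that moment.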

3. How will we attempt to control for the presence of the monitoring equipment? (If at all?)
Certain equipment can distract test subjects and affect their performance during gameplay, while other equipment may have little to no effect whatsoever. Depending on the method being used, you may not feel the need to control for the presence of your monitoring devices, and that's okay.

If you do want to control for such things, however, you can do so by creating three experimental groups: one with no physiological data collection; one with no traditional data collection; and one with both. Alternatively, if you have a whole lot of time and resources at your disposal, you could just try to integrate monitoring devices into the normal input/gameplay hardware itself.
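If you go the three-group route, assign subjects to groups at random so the groups don't end up differing in some other systematic way. A minimal sketch, assuming you have a list of subject IDs; the group labels and the `assign_groups` function are made up for illustration, and the optional `seed` is only there so you can reproduce an assignment later.

```python
import random

# The three conditions described above (labels are hypothetical).
GROUPS = ("traditional_only", "physiological_only", "both")


def assign_groups(subject_ids, seed=None):
    """Shuffle subjects, then deal them into the three groups in turn.

    Group sizes differ by at most one subject.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    assignment = {name: [] for name in GROUPS}
    for index, subject in enumerate(shuffled):
        assignment[GROUPS[index % len(GROUPS)]].append(subject)
    return assignment


groups = assign_groups(range(1, 10), seed=42)
for name, members in groups.items():
    print(name, members)
```

Comparing "traditional_only" against "both" gives you a rough read on whether the monitoring equipment itself changed how people played or reported on the game.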

It would totally work, too.  Someone give me a job helping develop these.
I'd test everything on myself, first, and my eye is already messed up, so that's one less thing to worry about.

Conclusion
Although this discussion has focused on the use of physiological data for video game user testing, integrating physiological data into user testing and research in general can have a significant impact on both the quality and quantity of the data collected.

If you are creating a good or service of any sort, it is of the utmost importance that you perform both quality assurance and user-testing as early and as often as possible. Make sure you understand what you are looking for, as well as the pros and cons of the methods you can use to gather that information so that you can frame your questions properly, and design your experiments appropriately.


Further Reading
For more information about implementing physiological monitoring methods in user-testing:
Papers and Academic Articles
