"

09-2: Theories of Choice Behavior

Psychology of Learning

Module 09: Decision-Making 1

Part 2: Theories of Choice Behavior

Looking Back

In Part 1, we explored how choice is measured in laboratory settings. Pigeons choosing between two keys revealed the Matching Law: organisms distribute responses to match reinforcement distribution. The Generalized Matching Law extends this framework with bias & sensitivity parameters. But describing choice patterns doesn’t explain why they occur. The Matching Law tells us what organisms do but not why they distribute responses this way. Does matching reflect rational optimization—organisms maximizing overall utility? Or does it emerge from simpler moment-to-moment processes?

Behavioral Economics: Bridging Psychology & Economics

Behavioral economics is the study of how organisms allocate their limited resources, including time & money. It combines principles from behavioral psychology (how consequences shape behavior) with microeconomics (how consumers make choices) to predict decision-making (Kagel, Battalio, & Green, 1995).

Traditional economics assumes perfectly rational decision-makers with unlimited computational abilities who always maximize utility. Behavioral economics recognizes that real organisms—including humans—face cognitive limitations, use heuristics & shortcuts, & make systematic errors. By incorporating psychological principles into economic models, behavioral economics provides more realistic accounts of actual choice behavior.

Utility is a measure used in economics to describe the subjective satisfaction, value, or benefit gained from consuming a good or service. In psychology, utility represents the reinforcing value or “goodness” of an outcome. Different outcomes provide different amounts of utility to different individuals (Von Neumann & Morgenstern, 1944).

In economics, utility measures the satisfaction gained from consuming a “package” of goods & services. A meal provides utility through taste, nutrition, & social enjoyment. A car provides utility through transportation, status, & comfort. Total utility from a package equals the sum of utilities from individual components, though interactions occur—a car’s utility depends partly on available roads; a meal’s utility depends on hunger level.

In psychology, utility represents reinforcing value. Food provides high utility when you’re hungry, low utility when satiated. Money provides utility through purchasing power. Social approval provides utility through belonging & status. Utility is subjective: $100 means more to someone earning $20,000/year than to someone earning $200,000/year. The same outcome provides different utility to different individuals based on their circumstances, preferences, & needs.

Critically, psychological values (utility) don’t match mathematical values. This mismatch drives much research in behavioral economics. Ten dollars isn’t worth exactly twice as much as five dollars in psychological terms. Losing $100 feels worse than gaining $100 feels good. Understanding how psychological value relates to objective quantities requires studying utility functions—mathematical representations of how satisfaction changes with quantity.

Optimization Theory: Maximizing Satisfaction

Optimization theory is a theory of choice behavior which assumes that consumers spend their resources in ways that maximize their utility. Organisms allocate time, effort, & money to produce the highest possible satisfaction given constraints they face (Von Neumann & Morgenstern, 1944).

If you decide to rent ten movies for a long weekend, optimization theory predicts you’ll select the combination providing maximum total utility. You won’t randomly select ten movies or choose based on habit. You’ll consider your preferences, moods, available options, & constraints (rental costs, viewing time) to optimize satisfaction.

The Movie Marathon Example: Law of Diminishing Marginal Value

Like anything, your movie viewing is subject to the law of diminishing marginal value. The first drama you watch gives you 5 units of utility, the second drama gives 4 units, the third gives 3 units, & so on. Each additional drama provides less satisfaction than the previous one. By the tenth drama, you’re getting only 0.5 units of utility—you’re tired of dramas.

Apparently you prefer dramas because the first comedy you watch only gives you 3 units of utility. Each additional comedy you watch gives less & less utility, with the tenth comedy giving only 0.1 units of utility.

If you watched ten dramas, total utility would be: 5 + 4 + 3 + 2 + 1.5 + 1.25 + 1 + 0.75 + 0.6 + 0.5 = 19.6 units. If you watched ten comedies, total utility would be: 3 + 2.5 + 2 + 1.5 + 1.25 + 1 + 0.75 + 0.6 + 0.4 + 0.1 = 13.1 units.

But if you create a package of dramas & comedies, their combined utility can exceed what you would receive from watching just one type of movie. With the values above, the package containing 6 dramas & 4 comedies produces the maximum utility of 25.75 units (an even split of 5 dramas & 5 comedies happens to tie at the same total): 6 dramas: 5 + 4 + 3 + 2 + 1.5 + 1.25 = 16.75 units; 4 comedies: 3 + 2.5 + 2 + 1.5 = 9 units; Total: 16.75 + 9 = 25.75 units. That exceeds both the 19.6 units from ten dramas & the 13.1 units from ten comedies. Diversification produces higher utility than specialization because of diminishing marginal value.

This illustrates why variety matters. Diminishing marginal value means that at some point, adding one more drama provides less utility than switching to a comedy. The optimal package balances different options rather than maximizing one option. This same principle explains why balanced diets are healthier (variety provides better nutrition than specialization), why investment portfolios diversify (spreading risk), & why organisms distribute foraging among multiple patches rather than depleting one patch completely.
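The package arithmetic can be checked by brute force. The short Python sketch below assumes the per-movie utilities listed in the ten-drama & ten-comedy totals earlier in this section, & simply tries every way of splitting ten movies between the two genres. The names DRAMA, COMEDY, & package_utility are illustrative only.

```python
# Marginal utilities per additional movie, taken from the totals in the text.
DRAMA = [5, 4, 3, 2, 1.5, 1.25, 1, 0.75, 0.6, 0.5]
COMEDY = [3, 2.5, 2, 1.5, 1.25, 1, 0.75, 0.6, 0.4, 0.1]
TOTAL_MOVIES = 10


def package_utility(n_dramas, n_comedies):
    """Total utility of watching the first n dramas & the first n comedies."""
    return sum(DRAMA[:n_dramas]) + sum(COMEDY[:n_comedies])


# Utility of every possible ten-movie package, keyed by the number of dramas.
utilities = {
    d: package_utility(d, TOTAL_MOVIES - d) for d in range(TOTAL_MOVIES + 1)
}
best_utility = max(utilities.values())
# Collect every split that reaches the maximum (ties are possible).
best_mixes = [
    (d, TOTAL_MOVIES - d)
    for d, u in utilities.items()
    if abs(u - best_utility) < 1e-9
]

for d, u in utilities.items():
    print(f"{d} dramas + {TOTAL_MOVIES - d} comedies: {u:.2f} units")
print("Best package(s):", best_mixes, "with", best_utility, "units")
```

With these particular numbers, the 6 & 4 package & the 5 & 5 package tie for the highest total, & every mixed package near the middle beats either all-drama or all-comedy viewing.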

Optimization in Animal Behavior: The Dung Fly Example

Studies of choice behavior with animals don’t use movie rentals, but instead tend to focus on those things that are important to animals, namely mating opportunities & food. The waiting time of male dung flies at cow patties provides an excellent example of optimization.

Female dung flies lay eggs in cow patties & prefer fresh patties over old ones. Males wait at patties to mate with arriving females. A male faces a dilemma: How long should he wait at one patty before leaving to search for another? If he leaves too soon, he misses mating opportunities. If he stays too long, he wastes time at a depleted site while fresh patties elsewhere attract females.

Parker (1970) discovered that male dung flies leave patties at precisely the time that maximizes their mating success. They stay at fresh patties longer (where females arrive frequently) & leave older patties sooner (where female arrival rates decline). The departure time matches the point where expected benefits from staying equal expected benefits from searching for a new patty. This represents optimal foraging—balancing current site quality against search costs & alternative site quality.
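The logic of an optimal departure time can be illustrated with a small Python sketch. Everything numeric here is a hypothetical assumption, not Parker's data: the gain curve (cumulative matings that level off as a patty ages), the leveling rate, & the travel times are invented for illustration. The sketch searches for the stay time that gives the highest long-run mating rate once time spent searching for the next patty is counted.

```python
import math


def mating_rate(stay, travel, leveling_rate=0.05, max_matings=1.0):
    """Long-run matings per minute when staying `stay` minutes per patty.

    Hypothetical gain curve: expected matings rise quickly at a fresh patty
    & level off as it ages (diminishing returns). `travel` is the time in
    minutes spent searching for the next patty.
    """
    gained = max_matings * (1 - math.exp(-leveling_rate * stay))
    return gained / (stay + travel)


def best_stay(travel, step=0.1, longest=600.0):
    """Grid-search the stay time (minutes) that maximizes the long-run rate."""
    candidates = [step * i for i in range(1, int(longest / step) + 1)]
    return max(candidates, key=lambda stay: mating_rate(stay, travel))


short_search = best_stay(travel=2.0)    # new patties are close by
long_search = best_stay(travel=10.0)    # new patties are far away

print(f"Best stay when the next patty is close: {short_search:.1f} min")
print(f"Best stay when the next patty is far:   {long_search:.1f} min")
```

Under these assumptions the best stay time is intermediate: leaving almost immediately or staying very long both lower the long-run rate, & the best stay grows when finding the next patty takes longer.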

This demonstrates that optimization occurs in nature through natural selection. Males who stayed too long or left too early produced fewer offspring than males who optimized departure timing. Over generations, selection favored genes producing optimal decision rules. The flies aren’t consciously calculating expected utilities, but evolutionary processes have shaped behavior to approximate optimization.

Social Learning & Decision-Making: Learning from Others’ Choices

Not all learning about choices comes from direct experience. Social learning allows organisms to acquire information about options, outcomes, & strategies by observing others rather than through personal trial & error. This dramatically expands the information available for decision-making beyond what any individual could learn alone (Bandura, 1977).

Observing others’ choices provides valuable information: What options are available? What outcomes do different choices produce? Which alternatives do experienced individuals prefer? A foraging bird can learn which foods are safe by watching other birds eat without poisoning themselves. A new employee can learn which projects to prioritize by observing which tasks successful colleagues focus on. A consumer can learn which products are reliable by noticing what experienced users purchase.

Social learning influences decisions through several mechanisms:

Observational learning allows individuals to learn behavior-outcome relationships by watching others. If you observe a coworker receive praise for a particular approach, you learn that approach produces positive outcomes without risking failure yourself.

Social proof leads people to infer that popular choices are good choices. If many others choose a particular restaurant, product, or strategy, their collective choice signals quality. “If so many people choose it, it must be good.”

Prestige bias leads people to copy the choices of successful, high-status individuals. Experts, celebrities, & leaders disproportionately influence others’ decisions. Their success suggests their choices are effective, making imitation rational.

Conformity influences choices toward group norms, even without explicit information about outcomes. People often choose what others choose simply to fit in, avoid standing out, or maintain social relationships.

Social learning is generally adaptive—it allows rapid acquisition of knowledge that would take years to learn through personal experience. However, social learning can also propagate errors. If early adopters make suboptimal choices, observers may copy those choices, spreading suboptimal behavior through populations. Information cascades occur when sequential decision-makers ignore their own information & copy predecessors, potentially locking populations into poor choices. Understanding when to learn from others versus when to rely on personal judgment is itself a crucial decision-making skill.

Optimization Theory & the Matching Law

Optimization theory provides a compelling explanation for the Matching Law. When facing concurrent VI schedules, matching the distribution of responses to the distribution of reinforcement maximizes total reinforcement obtained. Deviating from matching—either undermatching or overmatching—produces less total reinforcement than perfect matching. Therefore, organisms that match are optimizing their reinforcement intake.

Mathematical analysis confirms this: Given two concurrent VI schedules, the response distribution that produces maximum reinforcement rate equals the matching distribution. Organisms that match are behaving optimally. This suggests that matching isn’t arbitrary—it reflects rational allocation of behavior to maximize outcomes. The Matching Law emerges as a consequence of optimization under concurrent VI schedules.
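One way to see this result is a numerical sketch in Python. It rests on an assumed feedback function, not a fact about any particular experiment: each VI schedule arms a reinforcer on average every T seconds, & the reinforcer is then collected after an average wait of 1 / (response rate) seconds. The schedule values & the overall response rate are hypothetical. The sketch searches for the split of responses that earns the most reinforcement, then checks whether that split matches the split of reinforcers obtained.

```python
def vi_reinforcement_rate(mean_interval, response_rate):
    """Obtained reinforcers per second on one VI schedule.

    Assumed feedback function 1 / (T + 1/x), written as x / (T*x + 1)
    so that a response rate of zero simply yields zero reinforcement.
    """
    return response_rate / (mean_interval * response_rate + 1)


VI_1, VI_2 = 30.0, 90.0      # hypothetical schedules: VI 30 s & VI 90 s
TOTAL_RESPONSE_RATE = 1.0    # hypothetical: one response per second overall


def total_rate(share_1):
    """Total reinforcement rate when `share_1` of responses go to schedule 1."""
    rate_1 = vi_reinforcement_rate(VI_1, share_1 * TOTAL_RESPONSE_RATE)
    rate_2 = vi_reinforcement_rate(VI_2, (1 - share_1) * TOTAL_RESPONSE_RATE)
    return rate_1 + rate_2


# Try every split of responses from 0% to 100% in steps of 0.1%.
shares = [i / 1000 for i in range(1001)]
best_share = max(shares, key=total_rate)

# Share of obtained reinforcers that comes from schedule 1 at the best split.
r1 = vi_reinforcement_rate(VI_1, best_share * TOTAL_RESPONSE_RATE)
r2 = vi_reinforcement_rate(VI_2, (1 - best_share) * TOTAL_RESPONSE_RATE)
reinforcement_share = r1 / (r1 + r2)

print(f"Response share that maximizes reinforcement: {best_share:.3f}")
print(f"Reinforcement share at that allocation:      {reinforcement_share:.3f}")
```

Under this assumed feedback function, the reinforcement-maximizing response share & the obtained reinforcement share coincide, which is what the Matching Law describes. Other feedback functions can give somewhat different results, so this is an illustration of the argument rather than a proof.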

Momentary Maximization Theory: Living in the Moment

Momentary maximization theory is a theory of choice that argues that organisms make choices to maximize their satisfaction (utility) at the present moment rather than considering long-term consequences. Organisms simply choose whichever option currently offers the highest value (Shimp, 1969).

Unlike optimization theory, which assumes organisms consider overall long-term utility, momentary maximization proposes a much simpler decision rule: “Pick whichever option seems best right now.” This requires no memory of past outcomes, no calculation of long-term consequences, no complex integration of information. Just evaluate current options & choose the better one.

Momentary maximization can produce matching under certain conditions. If organisms switch between alternatives when current value drops below the alternative’s value, & if they sample alternatives frequently enough, their overall distribution of responses will approximate matching. Matching emerges not from long-term optimization but from simple moment-to-moment comparisons.

However, momentary maximization & optimization make different predictions in some situations. Consider concurrent ratio schedules: Optimization predicts exclusive preference for the richer schedule (because responding faster on ratio schedules produces proportionally more reinforcement, so concentrating all responses on the better schedule maximizes reinforcement). Momentary maximization predicts matching (because both schedules provide equal momentary value after each reinforcement). Evidence supports optimization in this case—organisms show exclusive preference on concurrent ratio schedules, not matching.
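The optimization prediction for concurrent ratio schedules follows from simple arithmetic, sketched here in Python. The ratio requirements (20 & 60 responses per reinforcer) & the total of 1,200 responses are hypothetical values chosen for illustration.

```python
def reinforcers_earned(total_responses, share_on_a, ratio_a, ratio_b):
    """Reinforcers earned when responses are split across two ratio schedules.

    On a ratio schedule, reinforcement is proportional to responding:
    reinforcers = responses / ratio requirement.
    """
    on_a = total_responses * share_on_a
    on_b = total_responses - on_a
    return on_a / ratio_a + on_b / ratio_b


RATIO_A, RATIO_B = 20, 60   # hypothetical: A needs 20 responses, B needs 60

for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    earned = reinforcers_earned(1200, share, RATIO_A, RATIO_B)
    print(f"{share:.0%} of responses on A -> {earned:.0f} reinforcers")
```

Because each response shifted from B to A trades a 1-in-60 payoff for a 1-in-20 payoff, total reinforcement keeps rising until every response goes to the richer schedule.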

This suggests that while momentary maximization may contribute to choice patterns, organisms also show sensitivity to long-term consequences & can adjust behavior to optimize over longer timescales. The debate between optimization & momentary maximization reflects deeper questions about cognitive sophistication: Do organisms plan ahead & calculate optimal strategies, or do they use simple rules producing apparently optimal outcomes? The answer likely varies across species, individuals, & contexts.

Melioration Theory: Following Local Improvements

Melioration theory proposes that organisms shift their behavior toward whichever alternative provides better local returns, even when this doesn’t maximize global outcomes. Rather than calculating overall optimal strategies, organisms simply move toward momentarily superior options (Herrnstein & Prelec, 1991).

The term “melioration” comes from the Latin word for “making better.” According to melioration theory, organisms compare the local reinforcement rates of available alternatives & shift responding toward whichever alternative currently provides better returns. This continues until local rates equalize—at which point there’s no further incentive to shift.

Melioration differs from optimization in a crucial way: Optimization asks “What distribution of behavior maximizes total reinforcement?” Melioration asks “Which alternative is currently better?” These questions can produce different answers.

Example: Consider choosing between studying & socializing. Each additional hour of studying provides diminishing returns (the tenth hour helps less than the first). Each additional hour of socializing also provides diminishing returns. Optimization would calculate the mix maximizing total satisfaction. Melioration simply compares: “Right now, would another hour of studying or socializing provide more satisfaction?” If socializing currently feels more rewarding, melioration shifts behavior toward socializing—even if this ultimately reduces total satisfaction.

Melioration can explain several puzzling phenomena. Addiction may develop because drugs provide high local reinforcement despite reducing global well-being. Each individual decision to use provides immediate pleasure exceeding alternatives, even as cumulative use produces devastating long-term consequences. Procrastination may reflect melioration: immediate leisure provides better momentary returns than working, even though working maximizes long-term outcomes.

The relationship between melioration & matching is important. Under many conditions, melioration produces matching—organisms shift toward better alternatives until local rates equalize, which occurs at the matching distribution on concurrent VI schedules. However, melioration & matching can diverge on other schedules, providing tests distinguishing between theories. Research suggests organisms sometimes meliorate even when doing so reduces total reinforcement, supporting melioration as a mechanism underlying choice.
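The way melioration can arrive at matching on concurrent VI schedules can be sketched as a simple adjustment rule in Python. The sketch makes a loud simplifying assumption: each VI schedule delivers a fixed number of reinforcers per hour regardless of how time is divided (roughly true only if both alternatives are sampled often). The rates, step size, & function name are hypothetical.

```python
def meliorate(rate_1, rate_2, start_share=0.5, step=0.001, rounds=2000):
    """Shift time toward whichever alternative has the higher local rate.

    rate_1, rate_2: reinforcers per hour delivered by each schedule
    (assumed constant regardless of allocation, a simplification).
    Returns the final share of time spent on alternative 1.
    """
    share = start_share
    for _ in range(rounds):
        local_1 = rate_1 / share          # reinforcers per hour spent on 1
        local_2 = rate_2 / (1 - share)    # reinforcers per hour spent on 2
        if local_1 > local_2:
            share += step
        elif local_2 > local_1:
            share -= step
        # Keep the share strictly between 0 & 1 so both options stay sampled.
        share = min(max(share, step), 1 - step)
    return share


final_share = meliorate(rate_1=60, rate_2=20)
matching_share = 60 / (60 + 20)

print(f"Share of time on alternative 1 after meliorating: {final_share:.3f}")
print(f"Share predicted by the Matching Law:              {matching_share:.3f}")
```

The rule never computes total reinforcement. It only compares local rates, yet it settles where the local rates are equal, which under this assumption is the matching distribution.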

Types of Decision-Making: Certainty, Risk, & Uncertainty

Decision-making contexts differ in how much information is available about outcomes & their probabilities. These differences fundamentally affect how decisions should be made & how they actually are made.

Decision-making under certainty refers to decisions in which the factors that determine the outcome of each choice alternative are known with complete accuracy. You know exactly what will happen if you choose each option (Savage, 1954).

Example: Choosing between a guaranteed $100 today or a guaranteed $150 in one week. Both outcomes are certain—no probability involved. You know exactly what you’ll receive from each choice. The decision reduces to comparing two certain outcomes, considering factors like need for immediate money, trust in the promise of future payment, & personal time preferences.

Decision-making under certainty is relatively straightforward in principle: Calculate the utility of each certain outcome & choose the option with higher utility. Complications arise from subjective utility (different people value outcomes differently) & time preferences (future outcomes may be discounted), but the outcomes themselves are known.

Decision-making under risk refers to decisions in which the factors that determine the outcome of each choice alternative occur with known probabilities. You know the possible outcomes & the likelihood of each (Von Neumann & Morgenstern, 1944).

Example: Gambling in a casino where probabilities are defined. A roulette wheel has known probabilities for each outcome. Slot machines have programmed payout rates. You know that betting on red in roulette wins 18/38 times (in American roulette) & loses 20/38 times. The outcomes are uncertain (you don’t know whether this specific spin will win), but the probabilities are known & stable.

Decision-making under risk allows calculation of expected values: multiply each outcome’s value by its probability, then sum across outcomes. This provides a rational basis for choice. Consider a $10 bet on red, where a win returns $20 (your $10 stake plus $10 in winnings) & a loss returns nothing. The expected return is (18/38)($20) + (20/38)($0) ≈ $9.47, so you expect to lose about $0.53 per bet on average. Knowing probabilities enables informed decision-making, though people don’t always follow expected value calculations (as we’ll see in Part 3).
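The expected-value calculation for the roulette bet can be reproduced in a few lines of Python. The sketch uses exact fractions to avoid rounding until the final step; the function name expected_value & the list red_bet are illustrative.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Sum of payoff * probability over (payoff, probability) pairs."""
    return sum(payoff * prob for payoff, prob in outcomes)


# $10 on red in American roulette: 18 red pockets out of 38.
# A win returns $20 (the $10 stake plus $10 in winnings); a loss returns $0.
red_bet = [(20, Fraction(18, 38)), (0, Fraction(20, 38))]

ev = expected_value(red_bet)
print(f"Expected return on a $10 bet: ${float(ev):.2f}")
print(f"Expected loss per bet:        ${10 - float(ev):.2f}")
```

The same function works for any gamble whose outcomes & probabilities are known, which is exactly what separates decisions under risk from decisions under uncertainty.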

Decision-making under uncertainty refers to decisions in which the probabilities of the factors that affect the outcome of each choice alternative are not known with precision or cannot be determined. You may know possible outcomes but not their likelihoods (Knight, 1921).

Example: Poor Farmer Brown is forced to make choices without enough information. Should he plant corn or soybeans? The outcome depends on weather (unknown), pest infestations (unknown), market prices at harvest (unknown), & numerous other factors. He knows the possible outcomes (crop success or failure, high or low prices) but cannot assign precise probabilities. This uncertainty complicates decision-making enormously.

Most real-world decisions involve uncertainty rather than risk. You don’t know the precise probability that changing jobs will work out well, that a relationship will succeed, that an investment will pay off, or that a medical treatment will work for you specifically. You have partial information, past experience, & educated guesses, but not precise probabilities. Decision-making under uncertainty requires different strategies than decision-making under risk, as we’ll explore in Part 4.

Looking Forward

We’ve explored theoretical explanations for choice behavior. Behavioral economics combines psychology & economics to understand resource allocation. Optimization theory proposes that organisms maximize utility, which explains why matching occurs on concurrent VI schedules. Momentary maximization offers a simpler explanation: organisms choose whichever option currently seems best. Melioration theory proposes that organisms shift toward whichever alternative provides better local returns, even when this reduces overall outcomes. Social learning adds information from others’ choices to what an individual learns directly. We distinguished decision-making under certainty, risk, & uncertainty. In Part 3, we’ll examine normative models prescribing optimal decision-making & descriptive models revealing how people actually decide, often deviating from rational optimization.

License

Psychology of Learning TxWes Copyright © by Jay Brown. All Rights Reserved.