analytis
Pantelis Pipergias Analytis
8 posts
I am an associate professor at the Danish Institute of Advanced Studies (DIAS) at the University of Southern Denmark (SDU). I originally come from Kefalonia, an island in the Ionian Sea. I would probably define myself as a cognitive scientist. I have a formal background in economics and cognitive psychology and I am self-taught in machine learning, computational social science and a bunch of other disciplines. In my research, I develop mathematical and computational models of individual and aggregate level behavior and I use experiments and large-scale simulations to assess their prescriptive and descriptive value. When I am not rushing with scientific work, I make documentary films, shoot photo series and harvest beehives and olive trees. Here you can find a recent Curriculum Vitae. For information on research impact, see my Google Scholar page.
analytis · 4 years ago
Photo
There are plenty of spicy results in our new Management Science paper with Lisheng He and Sudeep Bhatia, but I am particularly fond of the methodological contribution.
Model crowds can be used as tools for the history of science, and can uncover how knowledge in a field has accumulated over time. The field of risky choice grew fast in the past decades. Model evaluation has mostly relied on comparisons that contrasted contender models with a few incumbent models in experiments whose goal was predicting aggregate-level data (e.g., more than 50% of people prefer Z to Y). Only subsets of models were evaluated each time; the experiments could report on samples that were favourable to the contender models, and people’s cognitive and emotional idiosyncrasies were washed away in favour of reporting aggregate-level results. Thus, we were pretty much in the dark as to which among the dozens of models that have been proposed over time could best predict people’s choices in large and relatively unbiased stimulus sets.
Recent developments have made it possible to change this grim state of affairs. Several scholars have now conducted experiments that allow fitting models on individual-level data. They used different procedures to sample items from the space of possible dilemmas, generating large and diverse data sets. Lisheng contacted them and assembled a new collection. We then evaluated model performance by fitting the 58 models on individual-level data and predicting choices out of sample. If we adopt the usual competitive perspective in model building, knowledge accumulation has pretty much stagnated over the last decades.
Behavioural science, however, is not like physics. There is no single model that best predicts all phenomena surrounding human behaviour for everybody. People have their own idiosyncrasies. In our data, modestly performing models could still predict some individuals best. When applying model crowds, models were up- or down-weighted or selected for specific individuals depending on the data seen in the training set. What’s more, human behaviour is not deterministic but stochastic. As in the wisdom-of-crowds literature, the errors of diverse models cancelled out, and the overall error rates were substantially reduced. A different story about knowledge accumulation pops up when looking at model crowds. The field has not stagnated—we are still creating knowledge, and we now have the tools to better assess specific models and our field’s progress.
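To make the weighting and selection logic concrete, here is a minimal sketch of how a model crowd could be applied to a single decision maker. The interface (`.fit`, `.predict_proba`, the `"choices"` field) and the accuracy-based weights are illustrative assumptions, not the procedure used in the paper.

```python
import numpy as np

def crowd_predict(models, train_data, test_data, mode="weight"):
    """Model-crowd prediction for one individual.

    models: candidate choice models exposing .fit(data) and .predict_proba(data)
            (a hypothetical interface); predict_proba returns P(choose option A).
    """
    accs = []
    for m in models:
        m.fit(train_data)                               # fit on this person's training choices
        preds = m.predict_proba(train_data)
        accs.append(np.mean((preds > 0.5) == train_data["choices"]))
    accs = np.array(accs)
    if mode == "select":                                # select-crowd: best model per person
        return models[int(np.argmax(accs))].predict_proba(test_data)
    weights = accs / accs.sum()                         # weight-crowd: accuracy-weighted average
    return sum(w * m.predict_proba(test_data) for w, m in zip(weights, models))
```

Averaging the weighted predictions is what lets the errors of diverse models cancel out, while per-person weighting preserves individual idiosyncrasies.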
analytis · 7 years ago
Text
The Cornell Chronicle published a nice piece about our paper in Nature Human Behaviour
Whether you are choosing a restaurant or the destination for your next vacation, making decisions about matters of taste can be taxing.
New Cornell research points to more effective ways to make up one’s mind – and sheds light onto how we can use other people’s opinions to make our own decisions. The work may also have implications for how online recommender algorithms are designed and evaluated.
The paper, published May 28 in Nature Human Behaviour, suggests that people who have had a lot of experiences in a particular arena – whether it’s restaurants, hotels, movies or music – can benefit from relying mostly on the opinions of similar people (and discounting the opinions of others with different tastes). In contrast, people who haven’t had many experiences cannot reliably estimate their similarity with others and are better off picking the mainstream option.
“Our findings confirm that even in the domain of taste, where people’s likes and dislikes are so different, the wisdom of the crowd is a good way to go for many people,” said lead author Pantelis P. Analytis, a postdoctoral researcher in Cornell’s Department of Information Sciences.
Analytis co-wrote “Social Learning Strategies for Matters of Taste” with Daniel Barkoczi of Linköping University, Sweden, and Stefan M. Herzog of the Max Planck Institute for Human Development, Berlin.
But how many restaurants (or movies or music albums) should you try before relying on the opinions of others who seemingly share your tastes, rather than the wisdom of the crowd? It all depends on how mainstream (or alternative) a person’s tastes are and how much their peers differ in their similarity to them, Analytis said. “For people who have mainstream tastes, the wisdom of the crowd performs quite well, and there is little to be gained by assigning weights to others. Therefore, only people who have experienced lots of options can do better than using the wisdom of the crowd,” he said. “For people with alternative tastes, in contrast, the wisdom of the crowd might be a bad idea. Rather, they should do the opposite of what the crowd prefers.”
The researchers investigated the performance of different social learning strategies by running computer simulations with data from Jester, a joke-recommendation engine; developed at the University of California, Berkeley, in the late 1990s, it has been running online ever since. The interface allows users to rate up to 100 jokes on a scale from “not funny” (-10) to “funny” (+10). An early citizen science project, it is the only available recommender system dataset in which many people have evaluated all the options.
The findings suggest people could learn their own preferences in the same way that recommender systems algorithms assess which options people will like most, shedding light on our own cognition. “We humans have the most powerful computer that ever existed running algorithms all the time in our heads. We’re trying to show what those algorithms might be and when they are expected to thrive,” Barkoczi said. In that respect, the new research builds bridges between the behavioral and social sciences and the recommender systems community. The fields have looked at opinion aggregation using very different terminology, yet the underlying principles are very similar, Barkoczi said. “We’ve put a lot of effort into this work trying to develop concepts that could cross-fertilize those parallel literatures.”
The research also has implications for how online recommender algorithms are designed and evaluated. So far scientists in the recommender systems community have studied different recommender algorithms at the aggregate level, disregarding how each algorithm performs for each individual in the dataset. In contrast, this research shows that there might be potential in evaluating these strategies at the individual level. “In our work, we show the performance of the strategies diverges a lot for different individuals. These individual level differences were systematically uncovered for the first time,” Herzog said.
This implies that each individual’s data can be seen as a data set with distinct properties, nested within an overarching recommender system dataset structure. “Movie recommendations systems like the ones used by Netflix could ‘learn’ whether individuals have mainstream or alternative tastes and then select their recommendation algorithms based on that, rather than using the same personalization strategies for everybody,” Herzog said.
According to an age-old adage, there is no arguing about taste. “This work, in contrast, shows that the best learning strategy for each individual is not subjective,” Analytis said, “but rather is subject to rational argumentation.”
analytis · 9 years ago
Video
youtube
In fall 2014, Reid Hastie visited our group at the Max Planck Institute in Berlin. Reid’s comments in internal presentations and his overall presence during his visit were absolutely remarkable. In a way, we could all sense the wealth of knowledge that he has gathered in his long and exciting scientific career.
Indeed, he has witnessed the conceptual developments in Judgment and Decision Making from the 1970s to this day. Reid's imminent departure from the group prompted us to start the Talking about decisions interview series, an attempt to capture and preserve some of his wisdom.
The interview surpassed our expectations, as it revealed a systematic way of thinking about psychology and science that goes beyond academic writings. Reid unearthed anecdotes and historical details that are essential pieces of the academic process but are rarely encountered in written form. I hope you enjoy watching it as much as we enjoyed recording it.
(via https://www.youtube.com/watch?v=WpOugvqxUbE)
analytis · 9 years ago
Photo
Human behavior in contextual multi-armed bandits
George, an early-career American academic, has just accepted a new position at a European university. Somewhat of a culinary fanatic, he is determined to enjoy the local cuisine as much as possible. As there are over 1,000 restaurants in the area, he is spoiled for choice. George soon starts to try out different restaurants, sometimes leaving ecstatic and sometimes close to nauseous. Keen to avoid the latter, he notices that the quality of the food on offer is related to various pieces of information, such as the facade of the restaurant, the number of patrons, and the distance to the local market. Using this knowledge, George manages to eat out every day, never leaving disappointed.
George’s story captures the essential characteristics of numerous widely encountered decision-making problems, where (a) individuals repeatedly face a choice between a large number of uncertain options, the value of which can be learned through experience, and (b) there are various cues such that they can form an expectation about the value of an option without having tried it previously. These two characteristics are related to two learning problems that have been explored extensively in psychology and cognitive science, yet mostly in isolation. These are how people learn to make decisions from experience (Barron & Erev, 2003; Hertwig et al., 2004) and how they learn to make predictions from multiple noisy cues (Nosofsky, 1984; Speekenbrink & Shanks, 2010). The structure of “decisions from experience” problems can be formally represented in a multi-armed bandit (MAB) framework (Sutton & Barto, 1998). MAB problems involve a fine balance between taking the action that is currently believed to be the most rewarding (“exploitation”) and taking potentially less rewarding actions to gain knowledge about the expected rewards of other alternatives (“exploration”). MAB problems have proven to be a useful framework to study how people tackle this exploration–exploitation trade-off (e.g. Barron & Erev, 2003; Cohen et al., 2007; Speekenbrink & Konstantinidis, 2015; Steyvers et al., 2009).
Decision situations in real life typically contain more information than classic MAB problems, as alternatives usually have many features that are potentially related to their value. In other words, there is a function relating features of the alternatives to their value, and we assume people can learn this function. In our example, after enough visits to various restaurants, George has learned the function and with one look at the restaurant’s features can estimate the quality of the food. Strictly speaking, feature information is not needed to make good decisions. People who try an alternative many, many times have no need to engage in function learning to estimate its value. However, function learning can be very useful. There might not be time to try out alternatives many times, especially when the number of alternatives is large or the choice sets change frequently. Also, choosing a previously untried alternative might cost the decision maker dearly. In such situations it becomes important to be able to appraise an alternative’s worth without actually trying it.
Hrvoje Stojic, Maarten Speekenbrink and I designed a study that aimed to shed light on how human decision makers allocate decisions among alternatives in contexts involving both function learning and direct experiential learning. Note that this type of problem has received a lot of attention recently in the domain of machine learning due to the numerous applications in autonomous machine decision making (e.g. Li et al., 2010; Agrawal & Goyal, 2012).
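To make the setting concrete, here is a minimal sketch of a contextual bandit agent; it is not one of the models analysed in our study. It learns a linear mapping from arm features to expected reward with a simple delta rule and handles the exploration–exploitation trade-off with an epsilon-greedy rule; the learning rate, epsilon, and the example values at the bottom are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_cmab(features, true_weights, n_trials=100, epsilon=0.1, lr=0.05, noise_sd=1.0):
    """features: (n_arms, n_features) array; true_weights: (n_features,) array."""
    n_arms, n_feat = features.shape
    w_hat = np.zeros(n_feat)                     # current estimate of the reward function
    rewards = []
    for _ in range(n_trials):
        estimates = features @ w_hat             # predicted value of every arm
        if rng.random() < epsilon:               # explore a random arm...
            arm = int(rng.integers(n_arms))
        else:                                    # ...or exploit the current estimates
            arm = int(np.argmax(estimates))
        reward = features[arm] @ true_weights + rng.normal(0.0, noise_sd)
        w_hat += lr * (reward - estimates[arm]) * features[arm]   # delta-rule update
        rewards.append(reward)
    return w_hat, np.array(rewards)

# illustrative call with arbitrary feature values and weights
w_hat, rewards = run_cmab(rng.uniform(0.0, 1.0, size=(10, 3)), np.array([1.0, 0.5, -0.5]))
```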
The experiment
We investigated the influence of function learning on decision making in a stationary MAB task. There were three versions of the task: (1) a classic MAB task where feature values were not visually displayed (we refer to this as the classic condition), (2) a CMAB task where feature values were visible and participants were instructed that features might be useful for their choices (explicit contextual condition), and (3) a CMAB task where feature values were visible but participants were not informed about the relation between features and the value of alternatives (implicit contextual condition). The contextual conditions had an additional test phase with new alternatives, where we examined whether participants had learned the function and could use the acquired knowledge to make better choices when facing new alternatives.
Participants
In total, 193 participants (94 female), aged 18–73 years (M = 32.5 years, SD = 11.4), took part in this study on a voluntary basis. Participants were recruited via Amazon’s Mechanical Turk (mturk.com) and were required to be based in the United States and have an approval rate of 95% or above. Participants in the experiments earned a fixed payment of US$0.30 and a performance-dependent bonus of US$0.50 on average. Participants were randomly assigned to one of the three experimental groups: the classic \((N = 66)\), explicit contextual \((N = 64)\), and implicit contextual \((N = 63)\) conditions. As Amazon’s Mechanical Turk is an online environment that offers less control than laboratory experiments, we excluded participants who did not pay due attention to the experimental task. At the end of the instructions, participants answered four questions to check whether they recalled basic information from the instructions. Excluding participants who failed to answer all four questions correctly would have left us with too small a sample, so we excluded participants who failed to answer at least two of these correctly. Importantly, this exclusion was done before we looked at further results. In total, 47 participants were excluded from the analysis.
Task
The task consisted of a training and a test phase. The training phase comprised 100 trials and in each trial participants were presented with the same 20 alternatives (bandit arms) and asked to choose one. After making a choice in trial \(t\), they were informed of the payoff \(R(t)\) associated with their choice. For each arm \(j = 1, …, 20\), the payoffs \(R_j(t)\) on trial \(t\) were computed according to the following equation:
\(R_j(t) = w_1 \cdot x_{1,j} + w_2 \cdot x_{2,j} + e_j(t)\)
The two feature values, \(x_{1,j}\) and \(x_{2,j}\), of each alternative \(j\) were drawn from a uniform distribution \(U(0.1, 0.9)\) for each participant at the beginning of the training phase. Weights were set to \(w_1 = 2\) and \(w_2 = 1\) for all participants. The error term, \(e_j(t)\), was drawn randomly from a normal distribution \(N(0,1)\), independently for each arm. The difference between conditions was that the feature values, \(x_{1,j}\) and \(x_{2,j}\), were visually displayed in the contextual conditions but not in the classic condition. The design of the training phase can be inspected in the top image. The structure of the task was similar in the test phase, but now participants were presented with three new alternatives with randomly drawn feature values on each trial. The weights of the function were kept the same, \(w = (2, 1)\). As participants faced a new decision problem on each trial in the test phase, there was no longer an exploration–exploitation trade-off, and participants were expected to always choose the alternative they deemed best. There were five types of trials, specifically designed so that participants would reveal whether they had learned the functional form and the weights, \(w_1\) and \(w_2\). Two of the types were easy and difficult interpolation trials, where feature values were drawn from the same range as in the training phase.
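As a sketch, the payoffs of the training phase described above could be generated as follows; the seed and the helper name `payoff` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
n_arms = 20
w = np.array([2.0, 1.0])                         # w_1 = 2, w_2 = 1
x = rng.uniform(0.1, 0.9, size=(n_arms, 2))      # feature values x_{1,j}, x_{2,j} per arm

def payoff(arm):
    """Reward R_j(t) for choosing arm j on a trial: w_1*x_1 + w_2*x_2 + N(0, 1) noise."""
    return float(x[arm] @ w + rng.normal(0.0, 1.0))
```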
Results
Training Phase
Performance in the training phase is illustrated in the two bottom images. Over the course of the training phase participants in both the classic and contextual MAB conditions were able to improve their performance by choosing more promising alternatives. This is evident in the downward slopes of linear fits of the average rankings of the chosen options as a function of trial. As the training phase progressed, participants discovered alternatives that yielded higher earnings on average, and the average ranking of the alternatives they had chosen decreased as a result. Although the increase in returns was similarly steep, the participants in the CMAB conditions had a head start and identified better alternatives already in the first rounds. This seems to have been the case especially for the explicit contextual condition, where participants were instructed that the features could be used to improve their decisions. Such an early advantage may have been due to a strong prior expectation for positive linear relationships, as often found in the function learning literature (Busemeyer et al., 1997).
Test phase
While behavior in the training phase showed evidence of function learning, the true test of function learning is performance on new, previously unseen items. If participants in the contextual conditions had not learned the function, we would expect them to choose randomly among the new sets of three alternatives. Results are presented in Table 1. On easy trials, participants selected the alternative with the highest expected value almost 50% of the time, while the middle and dominated alternatives were selected much less frequently (approximately 25% of the time each). On difficult trials, in contrast, participants selected the dominating and the middle alternative equally often, approximately 37% of the time. The dominated alternative was still selected approximately 25% of the time. Extrapolation trials are crucial for establishing the extent of function learning (Busemeyer et al., 1997). In our case, performance in interpolation and extrapolation trials was similar, indicating that participants extrapolated relatively well. Performance on the “weight test” trials gives a clue as to why they chose the middle alternatives as often as they did—on average, participants seem not to have learned the feature weights correctly, which may have been because of the level of noise in the alternative values (the error term \(e_j(t)\) in the function value). Participants seem to have learned that both feature weights were positive, but not that they differed.
analytis · 9 years ago
Photo
You're special but it doesn't matter if you're a greenhorn: Social recommender strategies for mere mortals
Where should I go for my next vacation? Which car should I buy? Most choices people encounter are about “matters of taste” and thus no universal, objective criterion of the options’ quality exists. How can people increase their chances of selecting options that they will enjoy?
One promising approach is to tap into the knowledge of other individuals who have already experienced and evaluated options. The recommender systems community has leveraged this source of knowledge to develop collaborative filtering methods, which estimate the subjective quality of options that people have not yet experienced (Resnick & Varian, 1997; Adomavicius & Tuzhilin, 2005). One key insight is that building recommendations based only on the evaluations of individuals similar to the target individual often improves the quality of the recommendations (e.g., Herlocker, Konstan, Borchers, & Riedl, 1999)—where similarity between two people is typically defined as the correlation in their evaluations across options they have both evaluated.
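As an illustration of this idea, here is a minimal sketch of similarity-weighted collaborative filtering; the function name, the clipping of negative similarities to zero, and the fallback to the plain crowd average are illustrative choices rather than the exact specification used in the work cited above.

```python
import numpy as np

def predict_rating(target_train, others_train, others_item, min_overlap=3):
    """target_train: (k,) ratings of the target person on shared items;
    others_train: (n_people, k) advisers' ratings of the same items;
    others_item: (n_people,) advisers' ratings of the item to predict."""
    sims = np.array([
        np.corrcoef(target_train, o)[0, 1] if len(target_train) >= min_overlap else 0.0
        for o in others_train
    ])
    sims = np.clip(np.nan_to_num(sims), 0.0, None)   # keep only positively similar advisers
    if sims.sum() == 0:                              # fall back to the plain crowd average
        return float(np.mean(others_item))
    return float(np.average(others_item, weights=sims))
```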
Although the consumer industry enables people to benefit from recommender systems in some domains (e.g., choosing a movie on Netflix), for many everyday decisions there is neither an algorithm nor “big data” at hand. How can individuals leverage the experience of other people when they have no access to big data but access only to a relatively small community of other people with whom they share some prior experience about the available options?
In this paper, coauthored with Daniel Barkoczi and Stefan Herzog and published in the Proceedings of the Annual Conference of the Cognitive Science Society, we investigated exactly this question. First, we undertook an exercise in theory integration by mapping the striking conceptual similarities between seminal recommender system algorithms and both (i) models of judgment and categorization and (ii) models of social learning and social decision making (from psychology, cognitive science, judgment and decision making, anthropology, and biology). Second, we recast the latter two classes of models as social recommender strategies. Finally, based on this mapping, we investigated how ordinary people can leverage the experience of other people to make better decisions about matters of taste. To this end we studied the inevitable trade-off between (i) harnessing the apparent (dis)similarity between people’s tastes—to discriminate between more and less relevant advisers—and (ii) estimating those similarities accurately enough. We also investigated how this trade-off evolves with the amount of experience a decision maker has (i.e., the number of options previously evaluated).
We investigated the performance of the proposed social recommender strategies (see Table 1) by simulating their predictions for a large-scale, real-world data set. We varied the experience of the simulated decision makers. As experience increased, the strategies relying on similarity could thus base their similarity estimates on more data.
The social network from which a person could leverage vicarious experience would likely be much smaller than the thousands of people available in typical recommender system data sets. The cognitive limit of the number of stable relationships that people can maintain is estimated to be around 250 (Dunbar, 2010). We therefore opted to simulate small “communities” of 250 members each to mirror this real-world feature (as opposed to letting decision makers have access to all other individuals in the population).
Dataset
We used the funniness ratings of 100 jokes collected in the Jester data set. Jester was created by an online recommender system that allows Internet users to read and rate jokes. Users evaluated jokes on a scale ranging from not funny (–10) to funny (+10). At the beginning of the recommendation process, a set of 10 jokes was presented to the user. Thereafter, Jester recommended jokes and continued to collect ratings for each of them. The data set contains 4.1 million evaluations of 100 jokes by 73,421 participants. In contrast to other data sets studied by the recommender system community, here a large number of participants evaluated all options. Since its publication, the Jester data set has been used extensively to study collaborative filtering algorithms.
The simulation
For simplicity we worked only with participants who evaluated all jokes (reducing the number of participants from 73,421 to 14,116). We randomly selected 14,000 participants in order to partition them into evenly sized communities of 250 members each. In line with previous work in the recommender system literature, we used the Pearson correlation coefficient as a measure of similarity (Herlocker, Konstan, Terveen, & Riedl, 2004). In each simulation run, we carried out the following steps:
First, we randomly generated 56 communities with 250 members each (14,000/250). Second, we randomly divided the jokes into a training (x jokes) and a test (10 jokes) set; this assignment was the same for all individuals within all communities. The strategies were then fitted on the training set. Individuals could access only advisers within their own community. Third, for each individual (within all communities) we generated all 45 possible pair comparisons within the test set and examined the performance of the strategies in predicting which of the two jokes in a pair had a higher evaluation for that individual, resulting in 45 pair comparisons per individual, 11,250 per community, and 630,000 in total. For each strategy we recorded the proportion of correct predictions. This procedure was repeated 100 times and results were averaged. We investigated how the performance of the strategies changed as a function of experience by repeating this procedure for different numbers (x) of jokes experienced in the training set (varying from 5 to 90 in increments of 5).
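A rough sketch of one such simulation run is given below. The helper `strategy_predicts`, which should return the joke a given strategy predicts the target person will prefer in a pair, is hypothetical, and the default arguments are placeholders standing in for the values described above.

```python
import itertools
import numpy as np

def one_run(ratings, strategy_predicts, n_train=50, n_test=10, community_size=250, rng=None):
    """ratings: (n_people, n_jokes) matrix of evaluations."""
    rng = rng if rng is not None else np.random.default_rng()
    n_people, n_jokes = ratings.shape
    jokes = rng.permutation(n_jokes)
    train, test = jokes[:n_train], jokes[n_train:n_train + n_test]   # same split for everybody
    people = rng.permutation(n_people)
    correct, total = 0, 0
    for start in range(0, n_people - community_size + 1, community_size):
        community = people[start:start + community_size]
        for person in community:
            advisers = community[community != person]                # only in-community advisers
            for a, b in itertools.combinations(test, 2):             # all 45 test pairs
                pick = strategy_predicts(ratings, person, advisers, train, a, b)
                truth = a if ratings[person, a] > ratings[person, b] else b
                correct += int(pick == truth)
                total += 1
    return correct / total                                           # proportion correct
```

Repeating this over many runs and over different training-set sizes, and averaging the proportions, would mirror the procedure described above.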
Take away results
Two results stand out. First, the successful strategies all have one thing in common: They aggregate evaluations across several people (or items). Second, the amount of experience within a domain turns out to be a crucial determinant of the success of strategies using similarity information. Whereas experienced people can benefit from relying on only the opinions of seemingly similar people, inexperienced people are often well-advised to aggregate the evaluations of a large set of people (picking the option with the highest average evaluation either across all people or across at least minimally similar people) even if there are interindividual differences in taste, because reliable estimation of similarity requires considerable experience.
Experience and the bias–variance trade-off
With increasing experience with the domain, the performance of all top-notch strategies increased—except for the wisdom of crowds strategy (Follow the whole crowd), which unconditionally averages across all people and is thus—by design—unaffected by the increasing accuracy of the similarity estimates. Such an averaging strategy assumes that everybody has the same taste and performs well to the extent that the tastes in the population are indeed homogeneous. From a bias–variance trade-off perspective (e.g., Gigerenzer & Brighton, 2009; Geurts, 2010), this strategy suffers from potentially high bias to the extent that its homogeneity assumption is wrong, but exhibits zero variance in its prediction error because it does not estimate any free parameters.
In contrast, the strategies relying on similarity have a comparatively low bias because they can adapt to the homogeneity or heterogeneity of tastes in the population. However, they potentially suffer from variance because their predictions depend on the similarity estimates—to differing degrees—and thus they lie on a bias–variance continuum. At one extreme, a strategy of adopting the evaluations of only the seemingly most similar person has the potential to profit from the vicarious experiences of one’s taste Doppelgänger but is most reliant on an accurate estimation of similarity. At the other extreme, a strategy of relying on a large crowd of at least minimally similar people (i.e., with at least positively correlated tastes) is more biased but also more robust because it depends on only roughly discriminating between similar and dissimilar advisers (see also Goldstein et al., 2014; Mannes et al., 2014).
Theory integration: Reconnecting the cognitive sciences with recommender systems research
New statistical tools have often served as an inspiration for the development of new psychological theories (Gigerenzer, 1991). In the case of recommender systems, however, the insights developed within the last two decades have not been much incorporated into cognitive science—despite recommender systems being widely available and relevant for everyday decision making and seminal recommender systems being inspired by the work of cognitive scientists (Rich, 1979). We hope that the current paper initiates a cross-fertilization between these two until now largely unconnected research streams.
analytis · 10 years ago
Photo
Multi-attribute utility models as cognitive search engines
Do the following thought experiment: You are the human resources manager of a company and you are assigned the task of hiring a new employee. After advertising the position, you receive several dozen applications from candidates listing their skills and credentials (e.g., grade point average, work experience, programming skills). You can determine each candidate’s potential only after inviting him or her for an interview. Let us assume that you can interview candidates sequentially and that you can decide to stop interviewing and hire a candidate after each interview. Crucially, making the effort to interview another candidate is costly. What is the best way to organize the interview process? First, you need to decide the order in which you will be inviting candidates. Then, after each interview you need to decide whether to make an offer to one of the interviewed candidates, thus stopping your search. The first problem is an ordering problem and the second a stopping problem.
Clearly, if you could perfectly estimate the potential of the candidates on the basis of their credentials you could directly choose the best one by using decision analytic methods (e.g., Keeney & Raiffa, 1993). On the other hand, if the credentials were not at all informative, you would have to invite people at random, and your problem would reduce to an optimal stopping problem. Such models have been developed formally in statistics (DeGroot, 1970) and economics (Stigler, 1961; Lippman & McCall, 1976) and human behavior in them has been tested empirically in psychology (Rapoport & Tversky, 1966; Lee, 2006), economics (Schotter & Braunstein, 1981; Hey, 1987; Sonnemans, 1998) and marketing (Zwick, Rapoport, Lo & Muthukrishnan, 2003). Intuitively, most everyday decision-making problems lie between these two extreme cases; in reality, the attributes of the alternatives can be used to predict their utility but only imperfectly. There often remains some amount of uncertainty that cannot be explained by the attributes.
In a paper that I published in Judgment and Decision Making with Amit Kothiyal and Konstantinos Katsikopoulos we investigated how people can search intelligently when they are in this state of half-knowledge. We show that, when the decision makers’ preferences can be described by a linear utility model, the search and stopping problem has an intuitive and psychologically plausible solution. The optimal policy is to follow the estimated utility order prescribed by your subjective utility model and then stop when the expected return from seeing one more candidate for the job turns negative. In essence, the utility models play the role of cognitive search engines, generating the order in which alternatives are examined. We formally develop this approach and apply it to three models that have been studied extensively in the field of judgment and decision making: (i) multi-attribute linear utility, (ii) equal weighting of attributes and (iii) a single-attribute heuristic. Conceptually, our approach illustrates that optimal stopping problems assuming random search and one-shot choice problems are the boundary cases of an ordered search problem with imperfect information. In practice, our approach extends discrete choice models by specifying the exact search process. It is a plausible alternative to Roberts and Lattin’s (1991) theory of consideration set formation and it further advances our understanding of decision making in environments with rank-ordered alternatives. Let’s have a closer look at the theory!
The theoretical framework
The environment
There are \(n\) alternatives \(A_1,…,A_n\). Each alternative \(A_i\), \(i \in \{1,…,n\}\), is associated with a vector of attributes \(\mathbf{a_{i}} = (a_{i1},…,a_{ik})\) and a utility \(u_i\) of choosing it. The \(u_i\)s are unknown but the \(\mathbf{a_i}\)s are known to the decision maker. The decision maker estimates \(u_i\) by \(f(\mathbf{a_i})\). We assume that the estimation errors, \(\epsilon_i\), such that \(u_i = f(\mathbf{a_i}) + \epsilon_i\), are iid Gaussian with mean \(\mu\) and standard deviation \(\sigma\). We call this equation the decision maker’s subjective model. For each alternative, the decision maker can only learn the utility \(u_i\) by sampling the alternative and paying a cost \(c\). To choose an alternative, the decision maker can sample as many items as desired. If the decision maker searches for \(k\) items, the cost to be paid is \(k \cdot c\) and the decision maker will choose out of these \(k\) the alternative with the highest utility.
The optimal strategy
Let \(S\) denote the set of alternatives already searched and \(\bar{S}\) denote the set of alternatives not searched yet. That is, \(S \cup \bar{S} = \{A_1,…,A_n\}, \quad S \cap \bar{S} = \emptyset\).
The decision maker’s problem is to determine the search order and the stopping rule. Let the variable \(y\) denote the maximum utility that the decision maker can obtain from the alternatives in \(S\), \(y = \max_{A_k \in S} u_k\). If the decision maker sampled just one more item \(A_k\) before stopping search, then the subjective expected gain (i.e., the increase in utility minus cost) is (probabilities and expectations below are based on the decision maker’s subjective model):
\( R(A_k) = P(u_k > y) \times E(u_k - y - c | u_k > y) - P(u_k \leq y) \times c \) \(= P(u_k > y) \times E(u_k - y | u_k > y) - c. \)
It is intuitive that the decision maker should keep on sampling as long as there exists an alternative \(A_k \in \bar{S}\) such that its subjective expected gain \(R(A_k) > 0\) [if all \(R(A_k) < 0\), search should be stopped]. Given this, the decision maker should sample the alternative \(A_k\) that achieves the maximum subjective gain \(R(A_k)\). It turns out that to maximize \(R(A_k)\), it suffices to select the alternative with the highest \(f(\mathbf{a_k})\).
Result 1: If for two alternatives \(A_i,\, A_j \in \bar{S}\), \(f(\mathbf{a_i}) > f(\mathbf{a_j})\)  then \(R(A_i) > R(A_j)\). 
This result says that if the decision maker decides to sample one more item before terminating the search, then the choice should be the one that has maximum unconditional expectation \(E(u_i) = f(\mathbf{a_i})\). This suggests the following policy:
Selection rule: Order the alternatives based on their unconditional expectation. Select the items for sampling in this order.
Stopping rule: If at any stage the subjective expected gain from sampling the next alternative in this order is negative, terminate the search.
Note that the stopping rule can be applied only if the standard deviation of the estimation error \(\sigma\) is estimated. On the other hand, for the selection rule to be applied, only the parameters of the multi-attribute function \( f(\mathbf{a_i}) \) need to be estimated. For \(\sigma = 0\) the decision-making problem reduces to a single choice. For \(c = 0\) the decision maker searches all the alternatives. Note that for \(c = 0\) and when there are only two alternatives the model reduces to the probit model. Then, for two alternatives \(A_i\) and \(A_j\) the probability that the alternative \(A_i\) will be chosen can be written as \(\Phi\!\left(\frac{f(\mathbf{a_i}) - f(\mathbf{a_j})}{\sqrt{2}\sigma}\right)\). If \(\sigma = \infty\), \(P(u_i > y)\) is equal to 0.5.
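Under the Gaussian assumption, the expected gain has a closed form, \( R(A_k) = (f(\mathbf{a_k}) - y)\,\Phi\!\left(\tfrac{f(\mathbf{a_k}) - y}{\sigma}\right) + \sigma\,\phi\!\left(\tfrac{f(\mathbf{a_k}) - y}{\sigma}\right) - c \), which makes the selection and stopping rules straightforward to implement. The sketch below is a minimal numerical illustration of the policy, not the code used in the paper.

```python
from math import sqrt, pi, exp, erf

def norm_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2 * pi)

def norm_cdf(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

def expected_gain(f_k, y, sigma, c):
    """R(A_k): expected improvement over the current best sampled utility y, minus cost c."""
    z = (f_k - y) / sigma
    return (f_k - y) * norm_cdf(z) + sigma * norm_pdf(z) - c

def search(estimates, utilities, sigma, c):
    """Sample alternatives in order of estimated utility; stop when the gain turns negative."""
    order = sorted(range(len(estimates)), key=lambda i: -estimates[i])   # selection rule
    best, y = None, float("-inf")
    for i in order:
        if expected_gain(estimates[i], y, sigma, c) <= 0:                # stopping rule
            break
        if utilities[i] > y:                                             # sample A_i, learn u_i
            best, y = i, utilities[i]
    return best                                                          # index of chosen alternative
```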
An example
To illustrate how our approach works, consider a scenario where a decision maker is searching in an online store to buy a single album of an up-and-coming music band she just heard about on the radio. The band has produced three albums so far. The decision maker can learn the exact utility of an album by listening to its songs. Her subjective beliefs are described by the single-attribute (SA) model. As represented by the straight line in the figure above, the decision maker believes that the expected utility of an alternative can be estimated by \(u_i = 0.3 \times a_i\), where \(a_i\) is the average rating of the album by other users of the site. As represented by the bell-shaped curves, she believes that the estimation error \(\epsilon_i\) of her model is iid Gaussian with mean \(\mu = 0\) and standard deviation \(\sigma = 0.5\). The decision maker first samples album K, which has the highest expected utility. She finds out that the utility of album K is 1.87, which is slightly less than its expected utility. Then she has to decide whether it is worthwhile to examine album L. Following the equation above, the expected return from sampling album L can be written as \( P(u_L > u_K) \times E(u_L - u_K| u_L > u_K) - c \). \(P(u_L > u_K) = 0.33\), \(E(u_L - u_K| u_L > u_K) = 0.329\), and their product equals 0.109. Thus, the decision maker will examine the second album if the cost of search is lower than 0.109; otherwise she will stop search, choose album K, and never learn the actual utility of album L. Let us assume that the cost is 0.05. Then, the decision maker samples album L. She finds out that the utility of L for her is 2.27, which is higher than her expectation and the utility of album K. Thus, L replaces K as the sampled album with the highest utility (\(y\) in the equation above). Now the return from sampling album M can be written as \( P(u_M > u_L) \times E(u_M - u_L| u_M > u_L) - c \). \( P(u_M > u_L) = 0.003 \) and \( E(u_M - u_L| u_M > u_L) = 0.152 \). The product of these two parts equals 0.0005. Thus the overall return is negative and the decision maker will stop search after sampling album L and choose it. She will never learn the realized utility of album M.
Deterministic optimization and random search: A possible compromise
In optimization problems, widely studied in economics, decision makers can determine the alternative that maximizes their utility. This vision of decision making contrasts with search models in which decision makers sample alternatives at random and stop search after encountering a good enough alternative (e.g., Simon, 1955; Chow, Robbins & Siegmund, 1971; Caplin, Dean & Martin, 2011). Random sampling may lead to violations of the revealed preference principle and unpredictability in regard to the choices of individual decision makers. Ordered search models provide a possible compromise between these two approaches. Decision makers have a well-defined utility function before the search starts. However, as long as there is some uncertainty about the exact utility of the alternatives, it may pay to sample some of them to learn their utility. In our model, the initial preferences guide the search process but are also subject to revision when the true utility of the sampled alternatives is revealed. For an external observer, such as a firm or a market analyst, ordered search is more predictable, at the level of an individual decision maker, than random search. In ordered search, if the model of the decision maker and the actual utility of the alternatives are known, the external observer could also predict the final decisions made, as well as the preference reversals that would occur along the way.
analytis · 10 years ago
Video
youtube
It has been a while since I started to study how people make decisions and almost three years since I began to experiment with cameras and documentary filmmaking. A few months ago, Perke Jacobs, Astrid Kause and I, in collaboration with the PR department of the Max Planck Institute for Human Development, embarked on a new project and set out to interview leading decision-making researchers on camera.
The resulting videos are an attempt to bring together in one single project the epistemic and aesthetic values of two strikingly different crafts — scientific research and filmmaking. Our goal is to record the history of decision-making research and to explore the potential of casual interviews in communicating life stories and scientific results to a wider public.
The first interview we released features Tom Wallsten. Astrid asked the questions while Perke and I were behind the cameras. We asked Tom about the reasons that led him to pursue an academic career, the content and applications of his work and, amusingly, about how he makes his own decisions. Although Tom has mentored superstar psychologists like Dan Ariely and is one of the most respected researchers in the JDM community, this was the first time in his career that he gave an interview on camera.
analytis · 10 years ago
Photo
Social influence and the collective dynamics of opinion formation
Social influence is the process by which individuals adapt their opinion, revise their beliefs, or change their behavior as a result of social interactions with other people. In our strongly interconnected society, social influence plays a prominent role in many self-organized phenomena such as herding in cultural markets, the spread of ideas and innovations, and the amplification of fears during epidemics. Yet, the mechanisms of opinion formation remain poorly understood, and existing physics-based models lack systematic empirical validation.
In a paper that Mehdi Moussaïd, Juliane Kämmer, Hansjörg Neth and I published in PLOS ONE back in November 2013, we drew on experimental methods inspired by social psychology and theoretical concepts of complex systems typical of statistical physics. First, we conducted two controlled experiments to describe the micro-level mechanisms of social influence, that is, how individuals revise their initial beliefs after being exposed to the opinion of another person. Then, we elaborated an individual-based model of social influence, which served to investigate the collective dynamics of the system. These methods enabled us to go beyond physics-based allegories and to ground a model of collective behavior directly on a plausible psychological model of opinion revision.
In a first experiment, 52 participants were instructed to answer a series of 32 general knowledge questions and evaluate their confidence level on a scale ranging from 1 (very unsure) to 6 (very sure). This baseline experiment was used to characterize the initial configuration of the system before any social influence occurs. In a second experimental session, 59 participants answered 15 questions in the same way but were then exposed to the estimate and confidence level of another participant and asked to revise their initial answer. This procedure renders opinion changes traceable, and the effects of social influence measurable at the individual level. Moreover, changes in confidence were tracked as well, by asking participants to evaluate their confidence level before and after social influence.
On the basis of the observations collected in the second experiment, we formulated an opinion revision model that took into account both the opinion and confidence differences between the advisor and the receiver. The model can be easily visualised as a decision tree that subsumes three fundamental heuristic strategies: (i) retaining the same opinion, (ii) adopting the opinion of the advisor, and (iii) compromising and moving toward the opinion of the advisor.
Then we designed an agent-based simulation, where we introduced pair-wise interactions between agents. In this way we were able to study the opinion dynamics in large populations of agents. The simulations allowed us to identify two major attractors of opinion: (i) the expert effect, induced by the presence of a highly confident individual in the group, and (ii) the majority effect, caused by the presence of a critical mass of laypeople sharing similar opinions. Further simulations revealed the existence of a tipping point at which one attractor dominates over the other, driving collective opinion in a given direction.
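A toy version of such an agent-based simulation is sketched below; the keep/adopt/compromise probabilities, the confidence threshold, and the compromise weight are illustrative placeholders rather than values estimated from the experimental data.

```python
import numpy as np

rng = np.random.default_rng(1)

def revise(opinion, confidence, other_opinion, other_confidence):
    """One social-influence step: keep, adopt, or compromise toward the adviser."""
    if other_confidence > confidence + 1:                 # much more confident adviser: adopt
        return other_opinion, min(confidence + 1, 6)
    if rng.random() < 0.5:                                # otherwise keep the initial opinion...
        return opinion, confidence
    return 0.5 * (opinion + other_opinion), confidence    # ...or compromise halfway

def simulate(opinions, confidences, n_interactions=10_000):
    """opinions, confidences: 1-D numpy arrays, one entry per agent."""
    opinions, confidences = opinions.copy(), confidences.copy()
    n = len(opinions)
    for _ in range(n_interactions):
        i, j = rng.choice(n, size=2, replace=False)       # random pairwise interaction
        opinions[i], confidences[i] = revise(opinions[i], confidences[i],
                                             opinions[j], confidences[j])
    return opinions, confidences
```

Running such a simulation with a single highly confident agent or with a critical mass of like-minded agents is the kind of setup that reveals the expert and majority effects described above.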
As people are embedded in strongly connected social networks and permanently influence one another, we believe that these results constitute a first step toward a better understanding of the mechanisms of propagation, reinforcement, or polarization of ideas and attitudes in modern societies.