### In brief: 5 useful measures

- What proportion of all contributions by all others were *received* by each participant?
- What proportion of each participant’s contributions *built on* others’ contributions?
- What proportion of all of a participant’s contributions are in the surviving storylines?
- What proportion of all participants contributed to each surviving storyline?
- What proportion of all storylines became extinct?

# In-depth: Analysis of Contents

The contents generated through the ParEvo process are expected to be of direct value to the participants. Stages 10 and 11 (Review and Follow-Up) of the ParEvo process are points where that value will be most explicitly identified.

Generated content can also be analysed in various ways:

- Open discussion in a workshop context, facilitated and structured as needed
- Content analysis of ParEvo storylines, by coding themes within the contributions made to storylines. This is a more theory-led approach.
- Cluster analysis using machine learning methods such as Topic Modelling. This is a more data-led approach.

# In-depth: Analysis of Participation

### Participation data

When a participant makes a new contribution to existing storylines, they make two types of connections: (a) they connect to another participant – the one whose contribution they are immediately adding to, and (b) they connect to a specific storyline – a string of contributions that have been built on, one after the other.

In a ParEvo exercise using the web application (under development), data on these connections is accumulated in the form of two exportable matrices, known in social network analysis (SNA) jargon as (a) an adjacency matrix and (b) an affiliation matrix, respectively. An example of each is shown below. In the adjacency matrix, the cell values are the number of times the row actor has added a contribution to an existing contribution made by the column actor. In the affiliation matrix, the cell values indicate the number of times each column participant has contributed to a row storyline. The two anonymised examples below are based on the MSC pretest exercise, which only ran for four iterations.

**Adjacency matrix**

**Affiliation matrix**

In the ParEvo app, both kinds of matrices will be generated automatically during each exercise and made available to the Facilitator. Their analysis will then be in the hands of the Facilitator, informed, hopefully, by the suggestions below.

PS: This Excel file illustrates how many of the measures discussed on this web page can be generated from the above matrices.

In future versions, the matrices will be supplemented by a small number of automatically generated measurements summarising important features.

## Measuring the diversity of participants’ responses

This can be analysed from three perspectives:

- Variations across the whole matrix
- Variations across rows
- Variations across columns

In all of these analyses, we can look at what is happening in terms of diversity. There is a large literature on the measurement of diversity. Here I make use of Stirling’s (1998) influential paper. He suggested that diversity can be measured on three dimensions: variety, balance, and disparity. In the discussion below the focus will be mainly on variety and balance.

Why examine diversity?

- Variation is intrinsic to an evolutionary process
- Diversity is indicative of a degree of agency
- Lots of research has been done on diversity & group performance
- Simple but sophisticated measures are available, already used in other fields:
  - Ecology
  - Social Network Analysis

### The ecology of ideas – a whole matrix view

#### Diversification

In the adjacency matrix, showing relationships between contributors and recipients, a simple aggregate measure of variety can be based on a count of the cells with non-zero values. There are 23. This is 88% of the possible maximum, given that there were 26 contributions in total (the sum of all the cells). The whole matrix represents all the possible combinations of types of ideas. One could argue that a higher variety score means participants have been more willing to explore a wider range of ideas. In the Brexit pretest the variety measure was lower, at 66%.

A measure of the balance of these contributions looks at how evenly spread they were. Eyeballing the matrix above shows that most values are 1s, and only three cells have a different value (2s). Balance can be quantified using a simple measure like the Standard Deviation (SD), or more complex measures. A value of zero would indicate complete balance. In the example above the SD is 0.33. In the Brexit pretest it was 0.75, indicating a much more uneven spread of contributions.

More aggregate measures of diversity, like Simpson’s Diversity Index, combine both balance and variety (there termed evenness and richness).
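As a sketch, these whole-matrix measures can be computed directly from an exported adjacency matrix. The matrix below is made up for illustration (it is not the pretest data), and Simpson’s index is taken in its standard 1 − Σp² (Gini-Simpson) form:

```python
import statistics

# Illustrative adjacency matrix (made-up values, not the actual pretest data):
# cell [i][j] = times participant i built on a contribution by participant j
matrix = [
    [0, 1, 1, 2],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [2, 0, 1, 0],
]

cells = [v for row in matrix for v in row]
total = sum(cells)                       # total contributions (here 12)
nonzero = [v for v in cells if v > 0]

# Variety: cells with values, as a share of total contributions
variety = len(nonzero) / total           # 10 / 12, i.e. about 83%

# Balance: SD of the non-zero cell values; 0 means a perfectly even spread
balance_sd = statistics.pstdev(nonzero)

# Simpson's Diversity Index combines both (richness and evenness)
simpson = 1 - sum((v / total) ** 2 for v in nonzero)
```

Whether zero cells are included in the SD is a design choice; the sketch above uses only the filled cells, which reproduces the scale of the SD values quoted in the text.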

#### Specialisation

The opposite of diversification is specialisation. This could take the form of many different local coalitions between particular participants, where each built on the others’ contributions. These relationships can be visualised using social network analysis software. An example can be seen in the network diagram below, showing relationships between participants in a 1990s pretest of the ParEvo process. There a “clique” of three participants built on each other’s contributions (shown connected by red links).

One particular type of specialisation can be seen when participants build on their own previous contributions. This is evident in the green diagonal cells in the adjacency matrix. These can be measured as a proportion of all cells in the matrix with values. In the MSC pretest, this proportion was 27%. In the Brexit pretest, it was much higher at 65%.

#### Disparity

Disparity is the distance between two types, in terms of differences in their attributes. An ape and a human are not very disparate, compared to an ape and a frog. One way of conceptualising and measuring disparity in a ParEvo exercise is to use the SNA measure known as “closeness”. Closeness is the average length, in links, of the shortest paths connecting a given actor in a network to each other actor. In the network diagram above, C is the most distant, and so could be seen as the most disparate. E is the closest and could be seen as the least disparate. Disparity may be a useful measure of how central or peripheral different participants are in the collective construction of storylines.

A similar application of the concept of disparity is to think of links in the tree of storylines as links in a network structure. The distance between any two nodes at the ends of different storylines can be treated as a measure of disparity. The closeness measure mentioned above may be useful here as well, though it looks at the distance between all nodes, not just those at the ends of storylines.
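As a sketch, closeness in this sense can be computed with a plain breadth-first search over an undirected network of who-built-on-whom links. The network below is invented for illustration and is not the pretest diagram:

```python
from collections import deque

# Made-up undirected network of participants
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")]
nodes = sorted({n for e in edges for n in e})
adj = {n: set() for n in nodes}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

def avg_distance(start):
    """Mean shortest-path length from `start` to every other node (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return sum(dist.values()) / (len(nodes) - 1)

# Lower average distance = more central, i.e. less disparate
closeness = {n: avg_distance(n) for n in nodes}
```

In this toy network B has the lowest average distance (most central) and E the highest (most disparate), mirroring the reading of the diagram in the text.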

One way of thinking about disparity is to see it as a description of the space of possibilities that has been incrementally explored by participants. The two imagined examples below show two extreme possibilities. In the first, all the current surviving storylines are close to each other, only 2 x 1 links apart. In the second, all the (same number of) surviving storylines are much further apart, being 2 x 10 links apart.

### Participants as contributors – variations across rows

The same measures of variety and balance can be constructed for individual participants. Individual participants varied in the way they contributed to others’ existing contributions. Variety in this context refers to the range of other participants they contributed to: variety = (count of row cells with values > 0) / (sum of row cell values). In the MSC pretest variety ranged from 67% to 100%, with an average of 91%, whereas in the Brexit pretest variety ranged from 50% to 100% with an average of 70%. Variety was greater in the MSC pretest.

Balance in this context refers to the extent to which their contributions were evenly spread across those they had contributed to. In the MSC pretest, the SD of values ranged from 0.0 to 0.5, with an average of 0.15, whereas in the Brexit pretest SD values ranged from 0.0 to 1.0 with an average of 0.39. Balance was greater in the MSC pretest.
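A sketch of these row-wise calculations on made-up data, with variety and balance computed per participant as described above (balance taken as the SD of that row’s non-zero values):

```python
import statistics

def row_measures(matrix):
    """Per-row variety (filled cells / row total) and balance (SD of
    the row's non-zero values; 0.0 when there is one value or none)."""
    results = []
    for row in matrix:
        nonzero = [v for v in row if v > 0]
        variety = len(nonzero) / sum(row)
        balance = statistics.pstdev(nonzero) if len(nonzero) > 1 else 0.0
        results.append((variety, balance))
    return results

# Illustrative data: A spreads 3 contributions evenly, B spreads 4 unevenly
measures = row_measures([
    [0, 1, 1, 1],   # participant A: variety 1.0, balance 0.0
    [3, 0, 0, 1],   # participant B: variety 0.5, balance 1.0
])
```

The same function applied to the transposed matrix gives the column-wise (recipient) measures discussed below.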

These two measures can be combined into a single measure known as Simpson’s Diversity Index. There is a useful online calculator here: https://www.alyoung.com/labs/biodiversity_calculator.html. This is a more sophisticated measure, suitable when there is a larger and more varied set of values in the matrix.

Another, much simpler, measure which does not make these distinctions between variety and balance is *the proportion of a participant’s contributions which built on others’ contributions* (and not their own). This is probably the most suitable for feedback to participants, and one which might, if publicised, encourage such behaviour. In the MSC pretest this percentage ranged from 33% to 100%. In the Brexit pretest it ranged from 0% to 100%. Averaged over all participants, 73% of MSC pretest contributions built on others’ contributions, whereas in the Brexit pretest the proportion was much lower, at 33%.
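This simpler measure is just each participant’s row total minus the diagonal (self-building) cell, as a share of the row total. A minimal sketch on made-up data:

```python
# Illustrative adjacency matrix; row[i][i] is participant i's self-builds
matrix = [
    [1, 2, 0],   # A: 2 of 3 contributions built on others
    [0, 0, 2],   # B: all contributions built on others
    [1, 1, 2],   # C: 2 of 4 contributions built on others
]

built_on_others = [(sum(row) - row[i]) / sum(row)
                   for i, row in enumerate(matrix)]
```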

### Participants as recipients – variations across columns

Individual participants also varied in the way others contributed to their existing contributions. Variety in this context refers to the range of other participants they received contributions from: variety = (count of column cells with values > 0) / (sum of column cell values). In the MSC pretest, this variety ranged from 67% to 100%, with an average of 90%, whereas in the Brexit pretest it ranged from 33% to 100% with an average of 70%. Variety was greater in the MSC pretest.

Balance in this context refers to the extent to which the contributions of others were evenly received. In the MSC pretest SD values ranged from 0.0 to 0.5, with an average of 0.15, whereas in the Brexit pretest SD values ranged from 0.0 to 0.5 with an average of 0.18. The difference in the balance of received contributions was very small.

Another measure which does not make these distinctions is *the proportion of all contributions by all others which were received by a participant*. As above, this is probably the most suitable for feedback to participants, and one which might act as a motivator. In the MSC pretest, the proportion ranged from 8% to 15%. In the Brexit pretest it ranged from 3% to 17%. These ranges might be expected to grow as the number of iterations increases. In the pretests there were only four iterations each.

## Measuring the diversity of storylines

The same threefold perspective can be applied to the affiliation matrix, showing how participants contributed to different storylines:

- Variations across the whole matrix
- Variations across rows
- Variations across columns

### The whole matrix view

The same variety and balance measures used above can also be applied to the affiliation matrix, showing the relationship between storylines and participants. Here variety is lower, at 71% of the maximum possible. Balance is also lower, with an SD of 0.8. Overall diversity is lower than in the adjacency matrix. In the Brexit pretest, the corresponding values were 57% for variety and, for balance, an SD of 0.98.

### Participants as contributors to storylines – variations across columns

In the MSC pretest affiliation matrix the variety measure for individual contributors ranged from 25% to 100% with an average of 74%. Balance of their contributions ranged from an SD of 0.00 to 1.00 with an average of 0.33.

Another contribution measure is the proportion of all of a participant’s contributions that are present in the surviving storylines to date. In the MSC pretest participants’ scores on this measure ranged from 0% to 80%, with an average of 51%. This might serve as an achievement measure for individual participants if a gamified approach were being taken.

### Storylines as recipients – variations across rows

In the MSC pretest affiliation matrix, the measure of variety of contributions received by different storylines ranged from 25% to 100% with an average of 80%. Balance of their contributions ranged from an SD of 0.00 to 1.00 with an average of 0.22. An SD of 1.00 occurred where the storyline received 3 out of 4 contributions from one participant.

Another recipient measure is the proportion of participants contributing to each surviving storyline (relative to the number possible given the number of iterations completed). In the MSC pretest, storyline scores on this measure ranged from 25% to 75%. If wide ownership of storylines is desired then high scores on this measure would be valued.

### Other storyline measures: Exploration and Exploitation

At its simplest, exploration is the process of searching out and testing multiple alternatives. In contrast, exploitation involves focusing on one option, to extract its full potential.

The distinction, and tension, between exploration and exploitation strategies has been around for a long time, but is perhaps most strongly associated with a paper of that name by James March, published in 1991. Here is a recent review of the impact of that 1991 paper, showing just how wide its influence has been.

It seems possible that the prevalence of these contrasting strategies could be identified at two levels: Within individual storylines and within the whole set of storylines in an exercise.

#### Exploration within storylines

The number of side-branching storylines produced by a storyline could be significant. A higher number means there was a wider exploration of alternatives in the course of a given storyline’s development. In the MSC pretest one storyline had 3 side branches developed over four iterations (see here). In the Brexit pretest, 5 storylines had 2 side branches each, developed over four iterations. In an exercise with four iterations and 10 participants the maximum possible number of side branches for a given storyline would be, I think, 27, i.e. 9 per iteration, excluding the final iteration.
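The arithmetic here can be sketched as a one-line formula: in each iteration except the last, up to P − 1 other participants can branch off a given storyline (assuming, as in the pretests, that everyone contributes once per iteration):

```python
def max_side_branches(participants, iterations):
    """Upper bound on side branches for one storyline: the (P - 1) other
    participants can each branch off it in every iteration except the last."""
    return (participants - 1) * (iterations - 1)

# The example in the text: 10 participants, 4 iterations
limit = max_side_branches(10, 4)  # 27
```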

#### Exploration across all storylines

The proportion of extinct versus surviving storylines as a whole is another potentially useful measure. A higher proportion means there was a wider exploration of alternatives. If all participants contributed to their own storylines only, there would be no dead storylines at all per generation (see the 5th diagram from the top). On the other hand, if all participants contributed to the same storyline in each new iteration, there would be the highest possible proportion of dead storylines per iteration (= ((N-1)*(N-1))/(N*N) = 81% with N = 10 participants; see the 4th diagram from the top). In the MSC pretest, 61% of all storylines became extinct. In the Brexit pretest, 47% became extinct, i.e. there was less diversity in the form of exploration of alternatives.
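As a sketch under an assumed data model (contributions stored as id, parent id, and iteration number; none of these field names come from the app), the extinct proportion can be counted by finding storyline tips, i.e. contributions nobody built on, that predate the final iteration:

```python
# Made-up contribution records: (id, parent_id, iteration)
contributions = [
    ("a", None, 1), ("b", None, 1),
    ("c", "a", 2), ("d", "a", 2),
    ("e", "c", 3),
]

built_on = {parent for _, parent, _ in contributions if parent is not None}
last_iteration = max(it for _, _, it in contributions)

# Tips = contributions with no children; a tip before the last iteration
# marks an extinct storyline
tips = [(cid, it) for cid, _, it in contributions if cid not in built_on]
extinct_share = sum(1 for _, it in tips if it < last_iteration) / len(tips)
# "b" and "d" are extinct tips, "e" survives: 2 of 3 storylines extinct
```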

There is a bit of a puzzle here. In the two extreme possible tree structures shown above, the top one has the highest level of exploration, i.e. proportion of extinct storylines (81%), whereas the tree at the bottom has no exploration behaviour at all. Yet the top tree has the lowest level of diversity, measured in terms of variety, balance and disparity. I would have expected exploration behaviour to be associated with high levels of diversity, and exploitation to be associated with low levels of diversity. Any ideas?

**Probability and desirability of storylines**

Each participant’s end-of-exercise summary judgment of each surviving storyline involves judgments on these two criteria. See more on this here: Evaluation
