The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliographie: http://dnb.d-nb.de
Stephan Salinger, Lutz Prechelt:
Understanding Pair Programming: The Base Layer
Typeset with LaTeX in Palatino font
Published and printed by:
BoD — Books on Demand, Norderstedt, Germany
www.bod.de
ISBN 978-3-7322-0270-6
© Copyright 2013 by Stephan Salinger and Lutz Prechelt
This work is licensed under a Creative Commons
Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0)
http://creativecommons.org/licenses/by-nc-nd/4.0/
Note
The PDF version of this book contains very many cross-reference hyperlinks. It may be convenient to use the paper version for learning but then the PDF version for actually working with the base layer.
Acknowledgments
Sincere thanks to Laura Plonka for collecting a large part of our session recordings and for working closely with Stephan in the early stage of our analysis, to Franz Zieris for the first serious third-party use of the base layer, to Franz Zieris, David Socha, and Helen Sharp for their feedback on the book draft, to Gesine Milde for proofreading, and to all the pairs that agreed to be recorded and scrutinized.
… in which we explain what this book is all about, how to best use it, and what notation we will use for the examples.
This book is a handbook for researchers attempting to make sense of what is going on in pair programming sessions; it is based on Stephan Salinger’s Ph.D. dissertation [11]. The present chapter will introduce pair programming (in Section 1.1), summarize what research has so far found out about it (Section 1.2), explain the raw data we have used (Section 1.3) and the research approach we propose (Section 1.4), propose how to make use of the book (Section 1.5), and introduce a few key terms and notations (Section 1.6).
Assume you have a Ph.D. in dancing science and are the only non-programmer at a party full of programmers. According to the stereotype, it is hard to talk to these people. Your best bet would be to grab two or three of them at once and ask
“Is pair programming a good engineering practice?”
The ensuing discussion will be lively, and despite talking to techies, you can play your part in it!
Pair programming is a subtle matter and so any good answer to the question ought to begin with “Well… ”, but (and that is what makes the discussion so lively) many people appear to have a simplified notion of it and a correspondingly clear opinion.
Why is that so? And what, exactly, is pair programming anyway?
Pair programming is an old technique. Fred Brooks (of Mythical Man-Month fame) reports: “Fellow graduate student Bill Wright and I first tried pair programming when I was a grad student (1953–56). We produced 1500 lines of defect-free code; it ran correctly first try.” [15, p.8]. Its modern popularity is largely due to Kent Beck’s 1999 book on eXtreme Programming (XP) [2], a holistic method for small-team software development consisting of twelve practices, a core one of which is pair programming. In the section on pair programming, Beck states “Pair programming really deserves its own book. It’s a subtle skill” [2, p.100], and indeed such a book appeared in 2002: “Pair Programming Illuminated”. It offers the following characterization:
“Pair programming is a style of programming in which two programmers work side by side at one computer, continually collaborating on the same design, algorithm, code, or test. One of the pair, called the driver, is typing at the computer or writing down a design. The other partner, called the navigator, has many jobs, one of which is to observe the work of the driver, looking for tactical and strategic defects.” [15, p.3]
Note that more than half of this definition is concerned with describing the roles of driver and navigator (the latter is now more (Google-)commonly called “observer”). But once you have read this book (or any substantial part of it), you will know that while the first part of the definition is alright, the second part is misleading: The description of both roles is wrong in many respects and the whole driver/observer distinction does not go far in characterizing the pair programming process anyway.1
Kent Beck’s description is shorter: “Pair programming—All production code is written with two programmers at one machine.” [2, p.54]. There is elaboration later, but this is arguably his definition of this all-important practice. The 2004 second edition of the book is more explicit:
“Write all production programs with two people sitting at one machine. Set up the machine so the partners can sit comfortably side-by-side. Move the keyboard and mouse back and forth so you are comfortable while you are typing. Pair programming is a dialog between two people simultaneously programming (and analyzing and designing and testing) and trying to program better.” [3, p.26]
“Sitting comfortably” sounds like trivial information compared to the presumably illuminating driver/observer characterization, but it is relevant. And once you have read the present book, you will appreciate that the above definition captures, very inconspicuously, a key property of pair programming: “Pair programming is a dialog”. Yes!
At this point, we have nothing to add to that.
This leaves the other question: Why do some people have such a simplified (and then strong) notion of whether pair programming is a good engineering practice? The strongest opinions tend to come from the strict opponents: their attitude usually rests on the belief that the obvious cost of pair programming (occupying two precious software developers rather than just one) is so large that no corresponding benefits can possibly outweigh it.
More thoughtful discussants will not readily agree because the list of potential benefits is impressive. Here is (in paraphrased form) the one presented in “Pair Programming Illuminated” [15, p.4]:
How much do we know about which of these are true and to what degree? Not much.
Since Nosek’s pioneering 1998 study [10] (which appeared even before Kent Beck’s book), there have been many empirical studies on pair programming, in particular controlled experiments comparing it to solo programming, but the amount of knowledge produced by these studies is not large; an overview of research until 2007 is provided by Hannay et al. [8]. We do not aim at a detailed overview here. Roughly speaking, there is good evidence that pairs tend to be faster than solo programmers, some evidence that their work tends to have fewer defects, and beginning evidence that the designs produced are better. The size of each of these effects, however, is hardly understood: The results of individual studies differ so much (and those differences remain unexplained) that, taken together, the results are inconclusive.
What is worse, their validity is highly questionable, as the conditions under which most of them were created are highly unrealistic: mostly nonprofessional programmers, normally non-gelled pairings, usually either development from scratch or work on fairly small programs, generally little or no relevance of domain knowledge. Even the most ambitious of the controlled experiments, which hired 295 professionals for one day, concluded: “It is possible that the benefits of pair programming will exceed the results obtained in this experiment for larger, more complex tasks and if the pair programmers have a chance to work together over a longer period of time.” [1]. This statement is also one of the few exceptions to the disturbing tendency that most studies tacitly assume there is no such thing as a specific pair programming skill distinct from general software development skill. We believe that this assumption is wrong and that successful pair programming research needs to reflect that. This implies that a lot of qualitative pair programming research will be required before meaningful designs for quantitative pair programming studies can even be formulated.
For such qualitative types of research questions, the amount of work done so far is much smaller,3 although the number of questions is larger: There is evidence that different capability levels of the pair members play a role [5] and some evidence that personality characteristics of the pair members may play a modest role, too [9]. Only a few studies discuss high-level behaviors or mechanisms, and those do not do much decomposition or analysis yet, e.g. [6], or are even based on anecdotal evidence only, e.g. [16].
In our view, the most conclusive of the qualitative studies showed that the description of the driver and navigator roles from the above definition does not represent reality: Rather than working on different levels of abstraction (low and high for the navigator versus medium for the driver) as the definition assumes, the partners in fact strongly tend to move through these abstraction levels together [4, 6]. Work towards a more meaningful roles model is still in its infancy [13].
The results and all examples presented in this book are based on complete recordings of individual pair programming sessions. The recordings consist of audio, a pixel-precise recording of all screen activity, and a webcam recording of the pair (usually recorded from atop the monitor). We use Techsmith Camtasia Studio4 for recording and place the webcam video into the lower-right corner of the screen video. See [12] for a few more details.
We possess a substantial collection of such recordings, typically one to three hours in length. 55 recordings stem from pairs of 48 different volunteer industrial software developers (called A1 to K4, see session descriptions below) doing their normal work in their usual environment (domain, code, task, tools, hardware, office, etc.) in one of 11 different companies (called A to K). The reality distortion of these videos is presumably negligible; the pairs do not show (nor report when interviewed afterwards) any acute awareness of being recorded beyond a minute into their work. The videos reflect a variety of domains, developer constellations, and task types; most tasks can be subsumed under extension programming. They reflect only small cultural variety, though: All sessions are from German companies and involve German-speaking developers. See the note on translation in Section 1.6.
A further 28 recordings stem from pairs of 56 different volunteer graduate students (called Z1 to Z56) working in one of 5 different controlled laboratory settings (called ZA to ZE). The advantage of these recordings is that the researcher has a good understanding of the code base, the task, and correct solutions for the task, which often makes it much easier to understand what is really going on in the session.
Only 7 of these recordings (6 professional/industrial and 1 student/laboratory) were used for the research reflected here. For the concepts reported here, we reached theoretical saturation (see Section 1.4.4) with only this many.5 For the examples presented in this book, we even confine ourselves to only three of these sessions, so that over time you can get better acquainted with their respective topics; some of the examples are even related to one another, which aids understanding. These three sessions are the following:
An industrial session (with a duration of 1:47 hours) of two professional programmers B1 and B2 who worked for a large community portal operator B and had paired several times before. They built an extension to the community portal, which is implemented in PHP. The task difficulty had several aspects including understanding the design and design rationale of the pre-existing code, which had been written by nearshore programmers.
An industrial session (duration 1:16 hours) of two professional programmers C2 and C5 who worked for a software product company C. The product they worked on is a geographic information system (GIS) desktop GUI application written in Java. The design of this software uses abstraction elaborately; the task involved a small functional extension, and its main difficulty lay in understanding and properly applying the existing design abstractions.
A laboratory session (duration 2:58 hours) of two graduate students Z19 and Z20 who had worked together as a pair several times before. They built a small extension to a cleanly designed Java EE web shop system with which they were modestly familiar. The main task difficulty lay in the need to apply certain Java EE technologies (JMS, JNDI, JBoss application server) that the developers had learned about in a recent graduate course but had not applied often beforehand.
The purpose of this book is to lay the groundwork for a stream of research aiming at thoroughly understanding pair programming. We will now explain why we believe this is relevant from the perspective of basic software engineering research (Section 1.4.1) as well as from a practitioner perspective (1.4.2), what the overall architecture of this research will look like (1.4.3), which specific research method we suggest using primarily (1.4.4), and what the benefits are with respect to science’s principle of knowledge accumulation (“standing on the shoulders of giants”, Section 1.4.5).
Several decades after research began that attempted to understand what is going on in the activity we call “programming”, this understanding is still very much in its infancy. Pair programming provides a wonderful opportunity for making a lot of progress there, because rather than having to rely on artificial think-aloud data gathering techniques, pair programmers verbalize naturally much of the time.
Pair programming will surely be different from solo programming in many respects, but probably also fundamentally similar. And while think-aloud studies may occasionally be possible even in industrial work contexts, they tend to be difficult to arrange. In comparison, pair programming data can be gathered more easily and almost noninvasively in industrial work contexts on real work tasks; see Section 1.3. This whole basic research aspect, however, is more of a fringe benefit than the core reason why we started this line of work.
Our overall research goal is to understand the mechanisms of pair programming sufficiently well to provide practitioners with detailed advice regarding (a) in which situations to use pair programming and (b) how pair members might behave to make pair programming effective, smooth, and efficient.
The basic idea for achieving this is to understand many sub-behaviors at work within pair programming and formulate this understanding into one or more patterns or antipatterns of behavior for each. This research will be almost purely qualitative; better quantitative research can then be started based on this differentiated and advanced understanding.
The goals described in Sections 1.4.1 and 1.4.2 are far too ambitious for a single research project; the work needs to be modularized somehow. This, however, will not be easy: Initially, many fundamentals need to be understood before even the first few useful patterns will emerge. Later on, of the various topics studied, many will be interdependent or at least layered on top of each other.
Our overall approach is therefore to first lay a foundation of elementary concepts useful for analyzing and understanding pair programming sessions. This is what the current book is about. We call this foundation the base layer. It consists of a set of base concepts (surprisingly called the base concept set and introduced in Chapters 3 to 20) and rules for its use (Chapter 21) and extension (Chapter 22).
On top of this foundation, a subsequent study of some pair programming topic X (such as “decision-making”) can then build an X-layer of concepts that together characterize X. While working on the X-layer, the study can make use of the base layer and of the concepts found in studies performed earlier on other topics A, B, C (say, “pair programming roles” and others). If, for understanding X, some other topic Y (say, “knowledge transfer”) is relevant, the study on X will obtain the minimal understanding of Y it requires internally but need not work Y out fully.
Once the study of Y has been performed later (which may also use the X-layer fully), the X-layer can be consolidated into also using the Y-layer. This will break the layering for the overall results (pair programming is a holistic activity after all!), but still keeps a convenient mostly-layered work style for the individual sub-studies.
Each such study may provide a number of behavioral patterns and antipatterns. The role of the base layer is special because it provides common terminology that not only jumpstarts but also connects the other studies such as to form a whole rather than a set of separate pieces. The number of concepts in the base layer is sufficiently small to allow the various researchers to stay on top of them, so there are good chances of actual (near-)consistency between studies even of different researchers rather than only formal pseudo-consistency.
When we started with this work, we felt that many of the common statements made about pair programming were likely misleading or at least naive, but we had no expectation of what a better characterization would be like. We shared Kent Beck’s view that pair programming is “a subtle skill”. So once we had made the decision to analyze session recordings such as those described in Section 1.3, we had no idea which aspects of them would be relevant: The dialog content? Its wording? Phrasing? Intonation? Screen content? Changes of screen content? Human activity on the computer? Gestures? Facial expressions? The list went on and on. We quickly decided it would be important to pick a research method that was as empty of assumptions as possible.
Ethnographic approaches are rather far away from software engineering thinking, so we settled on Grounded Theory Methodology (GTM) [14] as our basic research approach. We selected the Straussian variety because we expect its higher degree of structure to be more appealing to software engineers than the Glaser style, and we believe that both varieties, if understood correctly, lead to similarly valuable results.
We will not give a primer on Grounded Theory Methodology here. If you have not used GTM before, you might want to get a textbook about it and work through it; there are a number of such books. The Strauss/Corbin book (or its second edition, but preferably not the third) is a possibility, although other books may be easier to work with. In a nutshell, GTM suggests working as follows:
• GTM work aims at a conceptual explanation (theory) of some phenomenon of interest for which each element of the explanation (called a concept or category; we will only use the former term) is directly connected to one or more raw observations (grounding).
• Formulate your research interest. In our case this was “Define the elementary behaviors which constitute pair programming.”6 The research question is allowed to drift freely during GTM work.
• Obtain some observation data. In our case this was the first handful of session recordings. GTM work requires neither pre-planning the data collection nor achieving any kind of representativeness. Additional data will be collected once the researcher has found out for which sub-phenomena more data are needed (theoretical sampling). For instance, a study of knowledge transfer in pair programming might find that the general knowledge level difference within the pair appears to be highly relevant. If no recording of an expert working with a true novice is yet available, the researcher would look for such a context and make a recording there. Representativeness is not required because GTM results focus on explaining things that exist, not on making claims about their frequency.
• Work through the observation data and attach labels to phenomena that appear “interesting” with respect to the research focus. You need theoretical sensitivity to select relevant phenomena and appropriate labels. The phenomenon can be anything and of any granularity. Each label is the name of a (preliminary) concept; see Chapters 4 to 20 for examples. It is meant to be reused in several places in the data. Each concept is chosen so as to help explain some aspect of the phenomenon (theoretical coding). This process is called open coding.
• When assigning the same label again, make sure the phenomena are similar so that you will obtain a consistent concept. To do so, compare to all previous annotations of this concept (constant comparison), determine the commonalities, and record them in a memo. Make sure your concept assignment is fully grounded, that is, is based only on phenomena actually present in your data, not on any prior knowledge you might have (or rather: assume). The ungrounded use of any prior assumption when assigning a concept is called forcing.
• If the differences between phenomena annotated with the same concept appear relevant, represent them by auxiliary concepts: attributes (properties) and attribute values (also properties); apply constant comparison and memoing to them as well. This process is called dimensionalization. Avoid forcing.
• If you have accumulated enough isolated concepts, start discovering relevant relationships between concepts and validate them for specific phenomena. The relationships may pertain to context factors, constraints, causes, effects, the actor’s strategies, etc. This process is called axial coding and should also involve constant comparison as well as a lot of memoing. Avoid forcing. Meanwhile, open coding continues as well.
• If you have accumulated enough relationships, determine the core of the subject matter and extract those concepts around it that allow you to formulate a narrative (grounded theory) explaining what is going on around this core concept. This process is called selective coding. Beware of forcing! Selective coding can start as soon as you have the first idea for it and should start no later than when you find you are only detecting known concepts, not creating new ones (theoretical saturation). Selective coding will often point out gaps in your conceptualization and hence trigger theoretical sampling, in particular if you start it early.
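To make the bookkeeping behind open coding, memoing, and constant comparison concrete, the data involved can be sketched as a small data model. This is purely an illustrative sketch, not the tooling we actually use (we annotate directly in ATLAS.ti, see below); all class, field, and concept names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    """One label attached to a phenomenon in the raw data (open coding)."""
    concept: str      # name of the (preliminary) concept
    session: str      # which recording the phenomenon occurs in
    start_s: float    # start of the phenomenon in the recording, in seconds
    end_s: float      # end of the phenomenon, in seconds
    note: str = ""    # what exactly was observed

@dataclass
class Concept:
    """A concept with its memo and its dimensionalization (attributes)."""
    name: str
    memo: str = ""    # recorded commonalities of the annotated phenomena
    attributes: dict[str, set[str]] = field(default_factory=dict)

class Codebook:
    """Holds all concepts and annotations of one analysis."""
    def __init__(self) -> None:
        self.concepts: dict[str, Concept] = {}
        self.annotations: list[Annotation] = []

    def annotate(self, ann: Annotation) -> list[Annotation]:
        """Record an annotation and return all previous annotations of the
        same concept, so the analyst can compare them (constant comparison)
        and update the concept's memo."""
        previous = [a for a in self.annotations if a.concept == ann.concept]
        self.concepts.setdefault(ann.concept, Concept(ann.concept))
        self.annotations.append(ann)
        return previous
```

In such a model, each call to `annotate` hands back the earlier phenomena labeled with the same concept, which is exactly the material constant comparison operates on.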
Working in this manner (with mostly open coding, some dimensionalization, a little axial coding, and no selective coding) and considering all of the abovementioned aspects of the data, we were initially totally overwhelmed by the amount of information residing in our recordings. To cope with this, we developed several additions to plain GTM (see [12] for details), in particular:
• A perspective on the data: GTM suggests initially conceptualizing “everything” that may be of relevance and only starting to focus on fewer concepts during selective coding. This approach does not work for data as rich as ours with a research question as open as ours. We decided early on that we would need to constrain ourselves to behavioristic concepts as much as possible (see Section 2.3.4 for details) and soon thereafter to conceptualize verbal interaction in far more detail than other behaviors (see Section 2.3.1).
• Structured concept names to further constrain and structure the applicable concept universe in order to make it manageable. See Section 2.1.1 and Section 3.1 for details.
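For illustration, a naming discipline of this kind can even be checked mechanically. The scheme sketched below (a name consisting of two parts drawn from small fixed vocabularies) is merely an assumed example of such structuring; both the pattern and the vocabularies are hypothetical, not the book’s actual scheme (which Section 3.1 introduces).

```python
import re

# Hypothetical vocabularies; any real verb and object lists would come
# from the base concept set itself.
VERBS = {"propose", "agree", "ask", "explain", "decide", "disagree"}
OBJECTS = {"step", "design", "knowledge", "finding", "completion"}

NAME_PATTERN = re.compile(r"^([a-z]+)_([a-z]+)$")

def is_valid_concept_name(name: str) -> bool:
    """Accept only names of the form <verb>_<object> whose parts are
    taken from the fixed vocabularies above."""
    m = NAME_PATTERN.match(name)
    return bool(m) and m.group(1) in VERBS and m.group(2) in OBJECTS
```

The point of such a constraint is not the validation itself but that it shrinks the universe of admissible concept names to a manageable, systematically searchable set.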
• Pair conceptualizing: Doing GTM in pairs (which we originally called pair coding) helps to quickly weed out or improve inadequate conceptualizations, in particular early in a study when the concept set is still small and hence open to a multitude of possible additions, including additions that lead astray. This practice can save inordinate amounts of time and frustration.
• Furthermore, most GTM books recommend transcribing the data, but adequate transcription of hour-long audio/video data as fine-grained and feature-rich as ours is hardly practical. So we annotate these data directly (without transcription) in the ATLAS.ti7 data analysis software.
When doing GTM, knowing a lot about your phenomenon in advance is a mixed blessing: On the one hand, such prior knowledge can greatly enhance your theoretical sensitivity and hence speed up the research process a lot. On the other hand, it can lead to forcing and thus ruin the validity of your results if you are not careful.
all is data.