Notes

Notes to Introduction

1. See Kolata 2002a; also see Moseley, O’Malley, et al. 2002; Ashton and Wray 2013.

2. Felson and Buckwalter 2002; Gerber and Patashnik 2006; David T. Felson, personal communication, September 27, 2004.

3. Gerber and Patashnik 2006.

4. Quoted in Kolata 2002b.

5. IOM 2007.

6. Ashton and Wray 2013, 3.

7. Brownlee 2008.

8. Ashton and Wray 2013.

9. J. Wennberg and A. M. Gittelsohn 1973; J. E. Wennberg 2010.

10. Balas 1998; Morris, Wooding, and Grant 2011.

11. Carey 2006.

12. Some more precise definitions are in order. Comparative Effectiveness Research (CER): “The direct comparison of two or more existing healthcare interventions to determine which interventions work best for which patients and which interventions pose the greatest benefits and harms. The core question of CER is which treatment works best, for whom, and under what circumstances.” Patient-Centered Outcomes Research (PCOR): “Research that helps people and their caregivers communicate and make informed healthcare decisions, while allowing their voices to be heard in assessing the value of healthcare options. This research answers patient-centered questions.” See http://www.pcori.org/funding-opportunities/how-apply/glossary. Cost-effectiveness analysis, which is used by the National Institute for Health and Care Excellence (NICE) in the United Kingdom: “An economic analysis that compares the relative costs and outcomes of two or more courses of action (or nonaction).” The PCORI is not permitted to develop or employ dollars per quality-adjusted life year (or similar) measures as a threshold to establish what type of health care is cost-effective or recommended. For background, see “Health Policy Brief: Comparative Effectiveness Research” 2010.

13. PCORI website: http://www.pcori.org/about-us.

14. www.pcori.org/sites/default/files/PCORI_Authorizing_Legislation.pdf

15. See Fairbrother et al. 2014.

16. Gray, Gusmano, and Collins 2003; J. Avorn 2009; Gerber and Patashnik 2010.

17. Buchbinder et al. 2009; Kallmes et al. 2009.

18. McGlynn, Asch, et al. 2003.

19. Aaron and Ginsburg 2009; Berwick and Hackbarth 2012; D. Cutler 2013; D. M. Cutler 2014a; Laugesen and Glied 2011.

20. J. Wennberg and A. M. Gittelsohn 1973.

21. J. E. Wennberg 2010.

22. Skinner et al. 2009.

23. Chandra, Finkelstein, Sacarny, and Syverson 2016.

24. Cooper et al. 2015.

25. Cooper et al. 2015.

26. Cooper et al. 2015.

27. Skinner, Goodman, and Fisher 2015.

28. Sackett, Rosenberg, Gray, et al. 1996, 71; see also Ashton and Wray 2013, chapter 5.

29. M. Rodwin 2001, 439.

30. IOM 2011; see http://nationalacademies.org/hmd/Reports/2011/Learning-What-Works-Infrastructure-Required-for-Comparative-Effectiveness-Research.aspx. The difference between the United States and other countries in their approach to EBM reflects, “at least in part, different approaches to health care financing” (Fairbrother et al. 2014, 1). While health care in Europe, Australia, and Canada is, generally speaking, organized and financed centrally, the United States has a complex mix of public and private payers. But there are also many points of similarity between the U.S. health care financing system and those of other advanced nations, including the use of fee-for-service and prospective payment systems and expert concerns about unwarranted variation, quality, and costs. The weakness of countervailing pressure to address the bad science and waste endemic to U.S. medical care requires careful explanation, especially with respect to the Medicare program.

31. De Vries and Lemmens 2006.

32. On the benefits and limits of standardization in medicine, see Timmermans and Berg 2003.

33. Tanenbaum 2012.

34. We thank David Mechanic for helpful insights on these points.

35. Schlesinger and Gray 2016.

36. We thank Mark Schlesinger for this formulation.

37. Gawande 2015, 42.

38. Sirovich, Woloshin, and Schwartz 2011.

39. McGlynn, Asch, et al. 2003; Welch 2015.

40. Blumenthal and Squires 2014.

41. Wilson 1973.

42. Skowronek, Engel, and Ackerman 2016.

43. See Baumgartner and Jones 1993.

44. For detailed information on the medical product industry’s campaign contributions and lobbying activities, see: http://www.opensecrets.org/industries/indus.php?cycle=2014&ind=H04.

45. Jerry Avorn and Kesselheim 2015; Jerry Avorn and Kesselheim 2017.

46. Mazer and Curfman 2017; see also Jerry Avorn and Kesselheim 2015.

47. Mechanic 2006.

48. “Health Policy Brief: Reducing Waste in Health Care” 2012.

49. Callahan 2009, 7.

50. See, e.g., T. R. Marmor 2000; M. Peterson 2001.

51. For a similar approach, see Terry Moe’s recent essay on the need to study the power of vested interests (2015).

52. Starr 1982; T. R. Marmor 2000; M. Peterson 2001; Laugesen 2016.

53. Cruess and Cruess 2004.

54. On the role of expertise, values, and institutions in health care, see Weimer 2010.

55. Starr 1982; M. Peterson 2001; but see Laugesen 2016.

56. Feldstein 2011; Mechanic 2004b; Robinson 2001.

57. Hall 2003.

58. Ferguson, Dubinsky, and Kirsch 1993.

59. Laugesen and Rice 2003; Laugesen 2009.

60. Arrow 1963.

61. Parsons 1939.

62. Dzur 2008, 56.

63. Dzur 2008, 65; Freidson 1970.

64. Freidson 1970; Starr 1982; M. Peterson 2001.

65. Cruess and Cruess 2004.

66. ABIM 2005, 2.

67. T. R. Marmor 2000.

68. M. Peterson 2001, 1156; see also L. R. Jacobs 1993.

69. Susan Bartlett Foote 2002; Bagley, Chandra, and Frakt 2015.

70. On path dependence, see Pierson 2000.

71. Bagley 2013, 533.

72. Oberlander 2016.

73. Mayes and Berenson 2006.

74. Goldhill 2013.

75. Chambers, Chenoweth, Thorat, and Neumann 2015.

76. See Orren and Skowronek 2004; but see also Frakt 2015a.

77. Swenson n.d.

78. M. Peterson 2001.

79. Mechanic 2004a, 1419.

80. Knox 2012.

81. AAOS 2013a.

82. Rosenberg et al. 2015.

83. Gray, Gusmano, and Collins 2003.

84. Roman and Asch 2014; Ubel 2015.

85. Blumenthal 2014.

86. Tilburt et al. 2013.

87. M. Peterson 2001.

88. Wilson 1973; Sheingate 2003; Posner 2003; Volden and Wiseman 2014.

89. Gerber and Patashnik 2006.

90. Stokes 1963.

91. Lee 2009.

92. J. Avorn 2009.

93. Moe 2015.

94. E.g., Delli Carpini and Keeter 1996.

95. E.g., Zaller 1992.

96. Skowronek, Engel, and Ackerman 2016.

97. Belkin 1997; Morone and Belkin 1995.

98. Stone 2011.

Notes to Chapter 1

1. McGlynn, Meltzer, and Hacker 2008, 88.

2. McGlynn 2004; Fuchs and Milstein 2011.

3. Chandra, Holmes, and Skinner 2013, 261.

4. See Chandra, Holmes, and Skinner 2013.

5. Squires 2011.

6. Fuchs 1974; Skinner et al. 2009.

7. See A. M. Garber and Skinner 2008.

8. Aaron and Ginsburg 2009, 1262.

9. D. M. Cutler 2014a.

10. D. M. Cutler 2014a, 21.

11. A. M. Garber and Skinner 2008.

12. IOM 2013, 102; see also Skinner 2011. For a thoughtful critique of the literature on waste and inefficiency in the U.S. healthcare system, see Glied and Sacarny 2017.

13. IOM 2013, 103.

14. D. M. Cutler 2014a.

15. Berwick and Hackbarth 2012.

16. “Health Policy Brief: Reducing Waste in Health Care” 2012.

17. Berwick and Hackbarth 2012.

18. Kolata 2017.

19. As quoted in Kliff 2013.

20. As quoted in Kliff 2013; see also Sirovich and Welch 2004.

21. Kliff 2013.

22. J. Miller 2014.

23. Quoted in J. Miller 2014.

24. Welch, Schwartz, and Woloshin 2011.

25. Gawande 2015.

26. Brownlee 2008.

27. Quoted in Brown 2011.

28. D. Epstein 2017.

29. D. Carpenter 2010.

30. Olson 2014.

31. Ashton and Wray 2013.

32. Ruger 2012b, 252.

33. CBO 2007a, 4 (note 3).

34. Jerry Avorn 2005.

35. We thank Mark Schlesinger for this point.

36. Jerry Avorn and Kesselheim 2015.

37. Ashton and Wray 2013, 28.

38. Chalkidou et al. 2009, 364.

39. For example, there are stark differences between the United States and European countries in how evidence shapes drug coverage decisions. A 2013 Health Affairs study found that European authorities “systematically assess most newly approved cancer drugs and use a variety of methods of comparative effectiveness research to arrive at recommendations on prescribing and reimbursement…. In contrast, drug reimbursement decisions in the US Medicare program appear less based on evidence” (J. Cohen, Malins, and Shahpurwala 2013, 767–68). For a proposal to reform U.S. drug pricing using the German reference price model, see Bahr and Huelskoetter 2014.

40. Bagley, Chandra, and Frakt 2015, 8.

41. There is some evidence that CMS is scrutinizing evidence more tightly for national coverage decisions under Medicare (Chambers, Chenoweth, Cangelosi, et al. 2015), but the vast majority of coverage decisions continue to be made by local contractors hired by CMS. These contractors have limited capacity and incentive to restrict coverage of low-value technologies. It remains to be seen what posture CMS will adopt for coverage determinations under the Medicare reforms that may be signed by President Donald Trump.

42. Tunis, Berenson, Phurrough, and Mohr 2011, 3.

43. CBO 2007b, 32.

44. Pearson and Bach 2010.

45. Pear 1991.

46. Susan Bartlett Foote 2002.

47. Susan Bartlett Foote 2002; Chandra, Jena, and Skinner 2011.

48. Neumann 2006.

49. Neumann 2006.

50. Neumann 2006.

51. IOM 2007.

52. CBO 2007b, 11.

53. Deyo 2014.

54. Mello and Brennan 2001.

55. Skinner 2011.

56. Fisher, Bell, et al. 2010.

57. Finkelstein, Gentzkow, and Williams 2016.

58. Westfall et al. 2007.

59. McGlynn, Asch, et al. 2003.

60. Skinner 2006.

61. For a synopsis by a leading figure in the field, see J. E. Wennberg 2010.

62. Brown 2011; Rosenthal 2014.

63. Welch et al. 2011.

64. Deyo and Patrick 2005.

65. Brownlee 2008.

66. Abramson 2008.

67. Clifton 2009.

68. Callahan 2009.

69. Chandra, Jena, and Skinner 2011, 30.

70. The application of scientific research on what works best for patients for every possible treatment option would move the country to point B. The nation would be spending more on medical care, but all possible health-related gains would be exhausted. Economists recommend point C over point B because the marginal gains in population health that would be realized by moving to the peak of the production frontier would not be worth it from a cost-effectiveness standpoint: too much consumption of nonhealth goods (e.g., education, clothing, housing, and consumer goods) would have to be sacrificed to reach the peak (Chandra, Jena, and Skinner 2011).

71. Chandra, Jena, and Skinner 2011, 32; Perlroth, Goldman, and Garber 2010.

72. Ashton and Wray 2013, 77.

73. Department of Justice 2012.

74. Makary 2012.

75. Chandra, Jena, and Skinner 2011.

76. Groopman 2010.

77. See Chandra, Jena, and Skinner 2011.

78. Garber and Tunis 2009, 1926.

79. Wilson 1989.

80. Derthick and Quirk 1985; Patashnik 2008.

81. P. Peterson and West 2003.

82. Patashnik 2008.

83. Chalkidou et al. 2009; Stabile et al. 2013; Bevan and Brown 2014; Sorenson et al. 2014.

84. Stabile et al. 2013, 648.

85. Sorenson et al. 2014.

86. T. Marmor, Oberlander, and White 2009; Chalkidou et al. 2009.

87. Sorenson 2015, 205.

88. Sorenson 2015.

89. Steinbrook 2008.

90. McKee 2016.

91. We thank Michael Gusmano for helpful conversations on these issues.

92. On the limited responsiveness of American government to low-income Americans, see Gilens 2012.

Notes to Chapter 2

1. Timbie et al. 2012.

2. Timbie et al. 2012.

3. Jennings 1986.

4. Moseley, O’Malley, et al. 2002, 81; Katz, Brownlee, and Jones 2014.

5. Moseley, O’Malley, et al. 2002.

6. Arrow 1963.

7. Sihvonen et al. 2013.

8. Sihvonen et al. 2013.

9. Belluck 2013, A16.

10. Rosenberg et al. 2015.

11. Felson and Buckwalter 2002.

12. Burman, Finkelstein, and Mayer 1934; Yang and Nisonson 1995.

13. Hanssen et al. 2000.

14. Sprague 1981.

15. Subjective measures that rely on remembering something like “pain” levels many months earlier are unlikely to be as accurate as those that ask individuals to reflect on their present state or to engage in more immediate recall. A number of the subsequent studies employed a prospective design and so obtained measures of pain and functionality before the operations, against which postoperative measures could be compared.

16. For a comprehensive list of clinical studies published up to 2003, along with a brief description of their designs and main results, see appendix B of CMS 2003.

17. Yang and Nisonson 1995; also see Moseley, O’Malley, et al. 2002.

18. Kalunian et al. 2000; Ravaud et al. 1999; Dawes et al. 1987.

19. Moseley, O’Malley, et al. 2002.

20. Yang and Nisonson 1995.

21. Bernstein and Quach 2003; also see Moseley, O’Malley, et al. 2002.

22. Hanssen et al. 2000, 1769–70.

23. Quoted in McGinty et al. 1992, 1573.

24. Ashton and Wray 2013, chapter 4; Harris 2016; Langreth 2003. We wish to emphasize that our claim is not that orthopedic surgery is generally ineffective. For example, convincing evidence supports major joint replacement operations, such as total knee replacement and hip replacement. These operations improve quality of life and may also be cost saving, because they replace the frequent doctor visits and drug expense associated with chronic care, reduce costs by eliminating falls, and permit the patient to resume more normal activity including employment (David T. Felson, personal communication, October 27, 2004).

25. Buchbinder et al. 2009.

26. Wulff, Miller, and Pearson 2011.

27. Deyo, Nachemson, and Mirza 2004. We are aware of no studies testing whether spinal fusion works better than placebo.

28. Bunker, Hinkley, and McDermott 1978, 937.

29. Spodick 1975, 35–36.

30. In 1959 the New England Journal of Medicine published the results of a sham surgery trial that found that a procedure known as internal mammary artery ligation worked no better than a placebo (L. Cobb et al. 1959). A more recent placebo-controlled trial of implantation of embryonic neurons in patients with Parkinson’s disease found a very strong placebo effect, though those who received the real procedure did experience slightly greater improvement along certain dimensions (McCrae et al. 2004). On the case for sham surgery studies, see Carroll 2014.

31. The strong taboo against the use of placebo controls in the evaluation of surgeries raises an intriguing ethical question: If fake operations make people feel better, should surgeons perform them? We are skeptical. If patients were told that their operations did not involve real medical interventions, they would likely experience a much smaller placebo benefit. Moreover, because all medical interventions likely produce some placebo benefits, the knowledge that placebos were now an accepted course of personal medical therapy might lead to worse patient outcomes in general. Keeping patients in the dark about the use of placebos (outside of control arms in clinical trials) would only create more problems. Such deceptions would inhibit open communication between physicians and their patients and erode trust in the health care system. It would also damage the medical research enterprise, discouraging the identification of procedures with true medical benefits.

32. F. Miller 2004, 159. As Baruch Brody, the ethicist who participated in the Moseley study, points out, this ethical conclusion “is not based upon a commitment to a utilitarian philosophy that allows for the mistreatment of subjects if it is sufficiently socially or scientifically valuable.” Risks still must be sufficiently minimized, and special steps must be taken to ensure that prospective participants in the trial really understand the nature of the placebo control group. See Baruch A. Brody, “Criteria for Legitimate Placebo Controlled Surgical Trials” (www.bcm.edu/pa/knee-drbrody.htm [October 25, 2004]).

33. AMA cited in F. Miller 2004; also see Tenery et al. 2002.

34. Spodick 1975, 36.

35. Moseley, O’Malley, et al. 2002, 82.

36. Quoted in Burling 2002, A1.

37. At the time of the trial, the Houston VA (unlike many other VA hospitals around the nation) did not routinely cover the procedure for patients with knee arthritis because an excess of patient demand relative to the supply of available surgical suites forced local hospital officials to set priorities, and other surgical procedures were considered more important. The only way Houston area veterans could receive the procedure from the VA for this condition was by entering the trial (Nelda P. Wray, personal communication, October 5, 2004).

38. Felson and Buckwalter 2002.

39. Okie 2002; Kowalczyk 2002.

40. Bandolier (http://www.bandolier.org.uk/booth/Arthritis/arthrokn.html).

41. The authors believe that the placebo effect was responsible for the improvement in part because most of the patients in the study had fairly stable symptoms at the time they entered the trial. They did not have acutely worsened symptoms that would be expected to naturally regress to the mean (J. Bruce Moseley, personal communication, December 7, 2004).

42. The study was conducted at a VA hospital, and therefore most subjects were men, but OA of the knee affects women more than men. There is, however, no medical reason to think surgery response varies according to sex, and critics did not press this argument very hard.

43. Moseley and Wray classified “popping,” “clicking,” “locking,” and “giving way” as mechanical symptoms for purposes of their subgroup analysis (J. Bruce Moseley, personal communication, April 10, 2006). David T. Felson states that orthopedic surgeons generally regard locking and giving way as mechanical indications for arthroscopic surgery, but not popping and clicking (David T. Felson, personal communication, March 15, 2006). We are unaware of any systematic data on surgeons’ actual clinical decisions or behavior in this area.

44. As it turned out, however, some patients with a torn meniscus inadvertently entered the study. (Such patients often had false-negative magnetic resonance imaging findings.) For ethical reasons, an unstable meniscal tear found among patients in the lavage and debridement arms was treated. Although the number of such patients was too small for firm conclusions, they did not appear to do substantially better than patients in the placebo arm. The authors believed this required further study, but that, until clear evidence emerged, arthroscopy should continue to be performed on those with conditions like a bucket-handle tear (J. Bruce Moseley, personal communication, November 1, 2004).

45. David T. Felson, personal communication, September 27, 2004.

46. “Arthroscopic Surgery for Osteoarthritis of the Knee” (multiple letters) 2002; also see Wray et al. 2003.

47. Dervin et al. 2003.

48. There is a general consensus that the procedure is appropriate for patients with an unstable meniscal tear, though Moseley and Wray believed more research is needed on this question (J. Bruce Moseley, personal communication, November 1, 2004).

49. Felson and Buckwalter 2002.

50. David T. Felson, personal communication, September 27, 2004; also see Felson 2010.

51. Quoted in Burling 2002, A1; see also Jackson 2002.

52. Nelda P. Wray, personal communication, October 8, 2004.

53. Felson and Buckwalter 2002.

54. Quoted in Kolata 2002b.

55. Bagley 2013.

56. Kolata 2002c.

57. Shamiram Feinglass, personal communication, October 22, 2004.

58. The report (AAOS 2002) was endorsed by the American Academy of Orthopaedic Surgeons, the American Association of Hip and Knee Surgeons, the Arthroscopy Association of North America, the American Orthopaedic Society for Sports Medicine, and the Knee Society.

59. CMS 2003.

60. See, for example, AETNA 2003; CIGNA 2004.

61. Shamiram Feinglass, personal communication, October 22, 2004.

62. Sung 2003.

63. Sung 2003.

64. Moseley and Wray’s results may have been strong enough that CMS would have been justified in denying coverage of debridement for patients with OA, with an exception for those patients with anatomic abnormalities, such as one that produces locking of the joint, preventing a complete extension of the knee. Ideally, CMS coverage decisions should be based on strong medical evidence, yet CMS acknowledged that the clinical evidence for benefits to subgroups came from case series studies that the agency considered to be methodologically deficient. In light of the agency’s own concerns about the existing level of evidence, one reasonable approach would have been for CMS to have made no coverage changes regarding debridement for the time being but to have announced that it would stop paying for the procedure after some time period (say, three years) unless more rigorous evidence was presented to demonstrate the procedure’s benefits. This would have created a powerful economic incentive for defenders of the procedure to replicate Moseley and Wray’s study.

65. CMS has taken some steps to address this information problem. In November 2004, the CMS chief administrator, Dr. Mark McClellan, who is both an internist and an economist, announced that the agency would make payments for certain new expensive treatments, such as implantable defibrillators for heart patients, conditional on agreement by companies and other actors to pay for studies on whether these new methods are effective in the Medicare population. But this “coverage with evidence development” initiative has struggled owing to poor study designs and insufficient funding and statutory authority. See Neumann and Chambers 2013.

66. Pearson and Bach 2010. Pearson and Bach propose a new Medicare payment model in which CMS would pay more for services demonstrated by research to provide superior clinical benefits compared to alternatives. New services without such evidence would be reimbursed under standard Medicare rates for a limited time but then be reevaluated as evidence emerged.

67. Kirkley et al. 2008.

68. Katz, Brownlee, and Jones 2014.

69. David T. Felson, personal communication, September 27, 2004; also see Felson 2010.

70. S. Kim et al. 2011.

71. D. Howard et al. 2012.

72. Quoted in Kolata 2008.

73. AAOS 2013b.

74. Englund et al. 2008.

75. Englund et al. 2008.

76. Sihvonen et al. 2013.

77. Katz, Brownlee, and Jones 2014, 152.

78. Järvinen, Sihvonen, and Englund 2014, 216 (citing Moseley, O’Malley, et al. 2002, Herrlin, Hållander, et al. 2007, Kirkley et al. 2008, and S. Kim et al. 2011). In 2015, the British Medical Journal, a leading medical journal, published a meta-analysis and systematic review of the benefits and harms of arthroscopic surgery for the degenerative knee. It found no significant benefit for physical function, and some of the included studies found significant harms. Yet the procedures remain in use. The authors concluded: “Available evidence supports the reversal of a common medical practice. However, disinvestment of commonly used procedures remains a challenge, and use of arthroscopy seems to be undiminished, in analogy with use of vertebroplasty following the publication of trials showing absence of benefit of this procedure. Surgeon confirmation bias in combination with financial aspects and administrative policies may be factors more powerful than evidence in driving practice patterns” (Thorlund et al. 2015, 7).

79. Sihvonen et al. 2013.

80. Sihvonen et al. 2013.

81. Katz, Brownlee, and Jones 2014.

82. Katz, Brophy, et al. 2013.

83. Sihvonen et al. 2013; Belluck 2013.

84. As described by Järvinen, Sihvonen, and Malmivaara 2014. Two other studies showed that APM and exercise therapy were not superior to exercise alone in the treatment of degenerative meniscal tears in patients with varying degrees of knee OA (Herrlin, Wange, et al. 2013; Yim et al. 2013).

85. Quoted in Emery 2013.

86. Bhatia 2014.

87. Sihvonen et al. 2013, table 1.

88. Elattrache et al. 2014, 542.

89. Quoted in Emery 2013.

Notes to Chapter 3

1. Unless otherwise noted, all results are from surveys we conducted. The first survey was conducted November 5–December 31, 2009 (N=1,100); the second May 21–24, 2010 (N=2,200); the third July 30–31, 2010 (N=2,000); the fourth February 17–23, 2011 (N=1,500); the fifth November 9–22, 2011 (N=3,600).

2. The surveys were conducted by YouGov/Polimetrix, an international Internet-based survey research firm based in the United Kingdom. For each survey, YouGov/Polimetrix interviews more respondents than are required (from their panel of over 1.5 million participants) and then uses a combination of sampling and matching techniques to approximate a nationally representative sample. The surveys were contracted by the researchers, and approved by the institutional review board at Yale University.

3. All the analysis presented below uses the analytical weights provided with each data set. Although we cannot rule out the possibility of bias in this sampling method, it is reassuring that the nationally representative survey samples (i.e., the weighted data) produce responses similar to other surveys on baseline questions about insurance coverage and health status: in the May 2010 and February 2011 surveys, for example, 20 and 23 percent report being uninsured, and 72 and 76 percent report their health as “good” or better.

4. As Mark Peterson writes, “Based on the same claims to science and knowledge that medicine has used to invite our dyadic trust in physicians at the individual level, the medical profession has long sought, and often obtained, broad-based social trust in its leadership of health care policy making by local, state, and federal governments” (2001, 1146).

5. M. Peterson 2001.

6. For a review of agency models in political science, see Moe 1984.

7. Maynard and Bloor 2003; see also Arrow 1963.

8. Lupia and McCubbins 1998.

9. Mechanic 1998b; 2004a.

10. Blendon, Hyams, and Benson 1993; L. R. Jacobs and Shapiro 1994.

11. Buhr and Blendon 2011, 21.

12. See Hetherington 2005.

13. Buhr and Blendon 2011.

14. Buhr and Blendon 2011.

15. Cited in Buhr and Blendon 2011, 22.

16. Blendon, Hyams, and Benson 1993; L. R. Jacobs and Shapiro 1994; Krause 1996; M. Peterson 1993; 2001; Schlesinger 2002.

17. Gallup 2009; Rasmussen Reports 2010.

18. All the differences in the public’s assessment of doctors compared to other professions are statistically significant at p<.05, two-tailed, with the exception of the differences between doctors and school teachers on the “interested in helping people” (p=.26) and “can be trusted” (p=.25) items.

19. See Schlesinger 2002.

20. Lillis 2010.

21. Wulff, Miller, and Pearson 2011.

22. Frakt 2015b, A27.

23. Currie, MacLeod, and Van Parys 2016.

24. Carman et al. 2010.

25. On average, we found that people were 5.2 percentage points more likely to find the arguments in favor of treatment guidelines to be somewhat or very convincing when the block of arguments against was presented first. The order of the blocks did not affect evaluations of the arguments against treatment guidelines.

26. Deyo and Patrick 2005; Hadler 2008.

27. Carman et al. 2010.

28. Men were substantially more likely than women to state that more than half of their own care (59.8 percent versus 41.3 percent) and more than half of the care received by others (53.3 percent versus 33.7 percent) is backed by evidence. This suggests that women were more skeptical than men about the evidence basis of medicine.

29. Zaller 1992.

30. Additional analysis revealed that this stylized debate had more of an effect on certain groups. Overall (across all conditions), the mean change in support among Republicans was -9.0; among Independents it was -7.2; and among Democrats the average change was only -5.2. Because Democrats began more supportive of CER than Republicans, exposure to the stylized debate led to greater partisan polarization of opinion regarding comparative effectiveness research. We also found that those with a college degree or more were, on average, less responsive to the debate (-5.6) than individuals with a high school diploma or less (-8.0). We did not find any differences across age groups or between voters and nonvoters.

31. Full argument wording: [doctors want it] “Many doctors’ groups and medical associations are calling for comparative effectiveness research because the research will give doctors the information they need to identify the best treatments for their patients.” [scare tactic (one-size-fits-all)] “The argument that this research will lead to one-size-fits-all medicine is just a scare tactic. Doctors will be free to treat patients in the way they think is best.” [works best for most] “It is unrealistic to expect doctors to view every patient as completely unique. Instead it is important to provide doctors with scientific evidence about what works best for most patients with a given medical condition.” [can incorporate group differences] “Medical studies can be designed not only to identify which treatments work best for the average patient, but also which work best for patients with different medical conditions and backgrounds.”

32. The mean persuasiveness ratings for the scare tactic and works best for most arguments were statistically indistinguishable from 50 (p-values=.522 and .216, respectively). We also tested the effectiveness of three rebuttals to the “ration care” argument. An argument that “there is so much waste in the health care system that we can reduce costs without harming patients” was most effective (mean score of 57.2). The other two arguments were statistically indistinguishable from 50: “In a time of budget deficits, difficult choices must be made to get health care costs under control. Every patient cannot get every possible treatment. Comparative effectiveness research will make sure that limited resources are allocated in the fairest and most effective way” (mean score of 51.8); “The rationing argument is just a scare tactic because Congress can prohibit research findings from being used to deny patients access to effective treatments” (mean score of 50.9).

33. Druckman, Hennessy, et al. 2010; Eagly and Chaiken 1993; Lupia 1994.

34. Like other work on public opinion concerning health care politics and policy (e.g., Gollust and Lynch 2011), we adopt Druckman, Hennessy, et al.’s definition of a “cue”: Information “that enable[s] individuals to make simplified evaluations without analyzing extensive information” (2010, 137).

35. The full question wording stated, “A variety of public policies have been proposed to help reduce the amount we spend on health care. Suppose you learned that a proposal was [American Medical Association cue conditions] and [political cue conditions]. Would this make you more or less likely to support the proposal?”

36. The effects of the AMA cues are very similar for Republicans (a .41 unit difference between AMA support and opposition), Democrats (.46 unit difference), and Independents (.38 unit difference). The differences across partisan groups are not statistically significant (p>.10 for all pairwise comparisons). This suggests that public support for a proposal to help reduce health care spending is likely to be influenced significantly, and similarly across party lines, by the position of the AMA: no partisan group is more or less likely than the others to respond to it.

37. Twenty percent of the people who received the bipartisan commission supports cue said they were (somewhat or much) less likely to support the proposal, while 17 percent of the people who received no political cue did so; 31 percent of the people who received the bipartisan commission supports cue said they were (somewhat or much) more likely to support the proposal, while 27 percent of the people who received no political cue said the same.

38. The effect of support from a bipartisan commission does not vary across respondents with differing partisan identities, including those who identify as Independent (p>.10 for all pairwise comparisons). In fact, none of the political cue treatment conditions significantly affected Independents relative to the no political cue condition.

39. This finding is consistent with prior work on the effect of partisan cues (e.g., Kam 2005; Popkin 1994; Rahn 1993).

40. Moreover, in additional analysis, we found evidence of “cue substitution”—respondents giving particular weight to the cues they see as most informative (e.g., Schaffner, Streb, and Wright 2001; Ansolabehere et al. 2006). Respondents who identified as Republicans or Democrats (rather than as Independents) relied particularly heavily on cues from the AMA when no directional party cue was provided (i.e., in the Bipartisan Commission Supports, Both Parties Support, and No Political Cue conditions). In the absence of a directional party cue, the effect of an AMA endorsement—rather than AMA opposition—was .566 (p<.01) among partisan respondents. However, when a directional party cue (i.e., the Republicans Support or Democrats Support conditions) was given to partisan respondents, the effect of the AMA position was substantially smaller and not statistically significant (estimated effect=.171; p=.305). In other words, partisans relied on cues from their party when available; but when cues from their own party were not available, partisans were influenced by information about the AMA’s position. As we would expect, there was less evidence of cue substitution among self-identified Independents. When a directional partisan cue was given to Independents, the estimated effect of an AMA endorsement was somewhat larger (estimated effect=.548) than it was when no directional partisan cue was given (estimated effect=.234). However, this difference in effect size falls short of conventional levels of statistical significance (p=.135). This analysis was conducted by estimating a regression model predicting support for the proposal with indicators for each of five treatment conditions: (1) Directional Party Cue, AMA Supports; (2) Directional Party Cue, No AMA Cue; (3) No Directional Party Cue, AMA Supports; (4) No Directional Party Cue, No AMA Cue; (5) No Directional Party Cue, AMA Opposes. The model also included an indicator set to 1 for respondents who identified as either Democrats or Republicans and 0 otherwise. Finally, the model included interactions between this “partisan indicator” and each of the treatment indicators.

41. Some limitations should be kept in mind. The effect sizes we identify in the survey experiments are modest (e.g., the estimated effect of receiving “leading doctors support” rather than “leading doctors oppose” was approximately 6 points on a 100-point scale [0.25 standard deviations]). This is particularly important to note given that these experiments present respondents with a highly simplified representation of the world and, thus, may overstate the size of the effects that would occur outside of the experimental context (Barabas and Jerit 2010). Thus, although this stripped-down framework demonstrates that physician and medical association endorsements are potentially important to public support of health policy proposals, we cannot assume that such endorsements would result in similar effects when other information is available to citizens. We believe the public’s high level of confidence in doctors and medical associations is robust to alternative models, but further experimentation that embeds more detailed and complex information about the health care system may provide a more complete picture of the types of information people encounter in the real world and the potential effects that doctors and medical associations can have on public opinion.

Notes to Chapter 4

1. USPSTF 2008.

2. Harris 2011.

3. Welch, Schwartz, and Woloshin 2011, 58.

4. Parker-Pope 2012a.

5. Ablin 2010, A27.

6. USPSTF 2012.

7. Quoted in Jaslow 2013.

8. Quoted in Marshall 2012.

9. Quoted in Jaslow 2012.

10. Pollack 2013.

11. The change in recommendation was “based in part on additional evidence that increased the USPSTF’s certainty about the reductions in risk of dying of prostate cancer and risk of metastatic disease.” USPSTF 2017.

12. Drazer, Huo, and Eggener 2015. See also Jemal et al. 2015; Sammon et al. 2015.

13. Drazer, Huo, and Eggener 2015.

14. D. H. Howard, Tangka, Guy, et al. 2013.

15. Quoted in Azvolinsky 2015.

16. In October 2015, the American Cancer Society announced that it was recommending that women at average risk of developing breast cancer begin getting annual mammograms at age 45, rather than 40 (the previous recommendation), and that at age 55 women transition to screening every two years. The group also recommended that women no longer receive physical breast exams where doctors feel around for bumps (Grady 2015). Yet the recommendations did not sit well with many doctors. In the New York Times, a trio of prominent radiologists and breast surgeons wrote that they “profoundly disagree” with the new guidelines, and that they would continue to recommend that women receive annual screening mammograms starting at age 40 (Drossman, Port, and Sonnenblick 2015). The main message that critics prompt the public to hear is that recommendations “will be used ‘to prevent me or someone I care about from getting something that I believe is important for my health and well-being’” (quoted in Ashton and Wray 2013, 246).

17. M. Kim, Blendon, and Benson 2001.

18. Ubel and Asch 2015; Bach 2012.

19. Arrow 1963.

20. D. Carpenter 2012, 298; M. Peterson 2001.

21. Abbott 1988; Freidson 1988; Starr 1982.

22. M. Peterson 2003, 273.

23. Dzur 2002, 178.

24. J. E. Wennberg 2010, 24.

25. J. E. Wennberg 2010, 24.

26. J. E. Wennberg 2010, 24.

27. Regional variation in utilization and spending has been a major concern among both researchers and health policy makers. For a political analysis of targeting variation as a policy strategy, see Tanenbaum 2012.

28. As medical ethicist Daniel Callahan observes, the relationship between physicians and the medical products industry is symbiotic: “Physicians need industry to provide the technologies and treatments to pursue their profession. Industry needs physicians as the necessary pathway to patients” (2009, 140). In addition, some doctors are de facto small businesspersons (they run and manage their own offices), and all physicians have a stake in their own careers and incomes (M. Rodwin 2011).

29. J. E. Wennberg 2010, 7.

30. J. E. Wennberg 2010.

31. The failure to give patients effective services is a serious issue, but it does not appear to be a major factor in the regional variation in Medicare utilization or spending (see J. E. Wennberg 2010, 9).

32. Makary 2012. The challenge of professional self-regulation is not unique to doctors. As journalist Megan McArdle writes,

Professionals tend to deal with some of the most sensitive and important issues that our society has, like treating illness and educating our children. It’s no accident that these people generally end up being regulated by their peers—and that the rest of us are frequently unsatisfied with the results. When professional groups decide what’s good for the rest of us, it usually turns out that what they think is good for the rest of us is what’s best for them.

This doesn’t have to be nakedly venal, and it often isn’t. College professors genuinely care about their students, lawyers about their clients, doctors about their patients, journalists about their readers, and yes, police care about the communities they serve. But when a proposal comes up that will hurt them in some way, it’s very easy for the professionals to see all the reasons against it, and to convince themselves that the world will be better off without it. And when it comes time to discipline a member for some offense, unless it is straightforwardly heinous, they will naturally sympathize with the accused, thinking of all the times they made mistakes that could have landed them in the same place (2015).

33. Lyu et al. 2016.

34. Swenson n.d.

35. Callahan 2009.

36. Laugesen 2016, 24.

37. Tyssen et al. 2013, 518.

38. A. M. Garber and Skinner 2008, 46.

39. Stone 1977, 38.

40. Starr 1982, 140.

41. Skowronek, Engel, and Ackerman 2016.

42. Starr 1982, 140–2.

43. Starr 1982, 19–20 and 127–34.

44. Morone 1990, 255.

45. Starr 1982, 134.

46. Starr 1982, 140.

47. Ruger 2012a, 231.

48. D. Carpenter 2012.

49. Wilsford 1991.

50. Ruger 2011, 352.

51. Ruger 2011, 353.

52. See Wennberg International Collaborative 2011.

53. Stevens et al. 2006.

54. T. R. Marmor 2000; M. Peterson 2001.

55. According to the Centers for Medicare and Medicaid Services, “U.S. health care spending grew 5.8 percent in 2015, reaching $3.2 trillion or $9,990 per person. As a share of the nation’s Gross Domestic Product, health spending accounted for 17.8 percent” (see https://www.cms.gov/research-statistics-data-and-systems/statistics-trends-and-reports/nationalhealthexpenddata/nationalhealthaccountshistorical.html; accessed March 15, 2017).

56. Ruger 2011.

57. Moe 1989, 267.

58. Bagley 2013; Abelson and Lichtblau 2014.

59. Bagley 2013, 521–22.

60. Bagley 2013, 568.

61. Tunis and Pearson 2006.

62. For an optimistic view, see Robinson 2015.

63. M. Peterson 2001.

64. Starr 1982; M. Peterson 2001.

65. For a general discussion of the power of vested interests in U.S. politics, see Moe 2015.

66. Laugesen 2016; Laugesen et al. 2012.

67. Whoriskey and Keating 2013.

68. Quoted in Whoriskey and Keating 2013.

69. Clemens and Gottlieb 2013.

70. Tuohy 1999.

71. Belkin 1998.

72. M. Peterson 2001, 1158.

73. Mechanic 2004b.

74. Blendon, Donelan, et al. 1993, 1015 (table 4).

75. D. M. Cutler 2014a, 129.

76. Mechanic 2004b; Kronebusch, Schlesinger, and Thomas 2009.

77. Mechanic 2006, 28.

78. Stevens 2001, 348–49.

79. We partnered with Medical Marketing Services (MMS), a firm that specializes in e-mail marketing within the health care industry. The firm maintains a list, derived from the AMA Physician Masterfile, of every physician in the United States (M.D. and D.O.), including both members and nonmembers of the AMA. This list of over 900,000 physicians, updated weekly, takes the AMA Masterfile, which contains demographic, education, and current practice information gleaned from over 2,100 sources, and appends to it demographic, behavioral, and psychographic data from a number of sources. We purchased a random sample of 4,000 physicians. From this list, which included 1,400 primary care and 2,600 non–primary care physicians, we called offices in a random order to verify contact information (i.e., the address to which a survey should be mailed). Once we had verified the contact information for 750 physicians, we began mailing surveys.

80. We have little information about the characteristics of the sample as a whole, but those characteristics we do have suggest that the 374 individuals who responded to our survey are relatively similar to those who did not. For example, the average age of nonrespondents and respondents is around 53 years; the average nonrespondent and respondent graduated medical school 20–21 years ago; and a similar percentage (87–88 percent) of respondents and nonrespondents were also in office-based practices. There are some regional differences between respondents and nonrespondents, but such differences are small—respondents (Midwest, 18.8 percent; South, 32.1 percent; West, 23.8 percent; Northeast, 25.2 percent) vs. nonrespondents (Midwest, 22.7 percent; South, 30.5 percent; West, 22.1 percent; Northeast, 24.6 percent). Finally, non–primary care physicians (i.e., specialists) were more likely to respond than primary care physicians—specialists constituted 65 percent of our total sample, but 72 percent of our respondents. In short, our survey respondents and nonrespondents are broadly similar for the observed characteristics we do have.

81. A response rate of 50 percent is comparable to other mail surveys of physicians (see, e.g., Keyhani, Woodward, and Federman 2010).

82. Grande et al. 2007.

83. Bonica, Rosenthal, and Rothman 2014.

84. Full question wording, for both our survey and the Pew report (see http://www.people-press.org/2014/06/26/section-10-political-participation-interest-and-knowledge/): “Some people seem to follow what’s going on in government and public affairs most of the time, whether there’s an election going on or not. Others aren’t that interested. Would you say you follow what’s going on in government and public affairs most of the time, some of the time, only now and then, or hardly at all?”

85. Full question wording: “Here is a list of federal government officials. For each one, please tell whether or not you have initiated any contacts with that type of official, or someone on the staff of such an official, in the last twelve months.”

86. Unlike the political interest question, the question wording in the Pew report differed slightly from ours in that it asked generally about contacting “elected officials,” rather than about specific elected officials, over a two-year, rather than one-year period.

87. See Davis et al. 2014.

88. In 2013, Americans had a life expectancy at birth of 78.8 years, compared with a median of 81.2 years in OECD nations (see: http://www.commonwealthfund.org/publications/issue-briefs/2015/oct/us-health-care-from-a-global-perspective). Most doctors recognized this. Sixty-five percent of doctors said that people in the United States have a lower life expectancy (1.5 or more years less than people in France and Germany). Six percent indicated that people in the United States had a higher life expectancy, with 29 percent saying the United States fared “about the same” as its Western European counterparts.

89. Large regional differences have been documented in the U.S. Veterans Affairs system (Ashton et al. 1999; CBO 2008; Subramanian et al. 2002), in private insurance markets (Cooper et al. 2015), and in other countries including the U.K. and Canada (McPherson et al. 1981).

90. Gawande 2009.

91. Roy 2010.

92. Skinner 2011; J. E. Wennberg 2010.

93. Cooper et al. 2015.

94. IOM 2013.

95. Baker et al. 2014.

96. J. E. Wennberg 2010, 183; also see D. Cutler et al. 2017.

97. J. E. Wennberg 2010.

98. D. Cutler et al. 2017.

99. D. M. Cutler 2014a.

100. D. Cutler et al. 2017.

101. Finkelstein et al. 2016.

102. Forsythe et al. 2015; Keyhani, Woodward, and Federman 2010.

103. In an earlier pilot study, we surveyed physicians attending a medical society meeting in the Charlottesville, Virginia, area. Many of the physicians did not report knowing anything about the regional variation literature, but of those who did self-report knowledge, most said they had learned about it from Gawande’s 2009 New Yorker article. This suggests the limits of medical society leadership in physician education as well as the importance of outlets other than medical journals for informing rank-and-file doctors about health policy issues.

104. This analysis was conducted by estimating a regression model (table A4.1) predicting reported familiarity with studies about geographic variation in health care spending on a scale from zero (no familiarity) to four (most familiarity) with the following information about the doctors: female, region (South, Northeast, Midwest, West) indicators, political interest, partisan identification, years in practice, whether residency took place at a VA, whether their practice is affiliated with an academic medical center, specialty (medical, surgical, or primary care) indicators, income source (salary, salary plus bonus, billing only, or other) indicators, and practice type (office based, office-based specialty group, hospital based, or other) indicators.

105. All p-values that we report are two-tailed.

106. Hersh and Goldenberg 2016.

107. In a separate regression analysis, we also found that the more familiar doctors reported they were with the studies on regional variation, the more likely they were to say that the availability of expensive medical technologies contributed “a lot” to regional variation in Medicare spending (p<.05). This was the only item for which we observed a statistically significant association between reported knowledge of the studies on regional variation in Medicare spending and the items presented in figure 4.1.

108. Walker 2011. A regression analysis estimating trust in the AMA using the same predictors listed for the previous regressions discussed in this chapter reveals a statistically significant (p<.05) association between more years in practice and less trust in the AMA. No other covariates were statistically significant.

109. Bagley 2013, 534–6.

110. This analysis was conducted by estimating a regression model (table A4.3) predicting beliefs about the importance of (1) protecting the clinical autonomy of physicians in the society’s area of specialization and (2) identifying physicians in the society’s area of specialization who are not following best medical practices and bringing them to the attention of disciplinary boards for medical societies on a scale from one (not that important) to four (extremely important) with the eight beliefs about the causes of regional variation in Medicare spending and the following information about the doctors: female, region (South, Northeast, Midwest, West) indicators, political interest, partisan identification, years in practice, whether residency took place at a VA, whether their practice is affiliated with an academic medical center, specialty (medical, surgical, or primary care) indicators, income source (salary, salary plus bonus, billing only, or other) indicators, and practice type (office based, office-based specialty group, hospital based, or other) indicators.

111. Hersh and Goldenberg 2016.

112. There are also some stark regional differences in response to this item. Specifically, doctors from the South rated discouraging clinical interventions with minor or no benefit to patients as more important than doctors from the Northeast (p<.01) and West (p<.10).

113. There were no significant differences among the medical specialties on whether medical societies should argue that individual physicians should be permitted to continue practicing as they think best or disseminate the results but not take a position on them.

114. Currie, MacLeod, and Van Parys 2016.

115. Currie, MacLeod, and Van Parys 2016.

116. A. Epstein and Nicholson 2009.

117. Lipitz-Snyderman et al. 2016.

118. Van Parys and Skinner 2016, 1549.

119. Currie, MacLeod, and Van Parys 2016.

120. Van Parys and Skinner 2016, 1550.

121. This study found that heart attack patients with more aggressive doctors who use invasive procedures consistently have better health outcomes, although also higher costs (Currie, MacLeod, and Van Parys 2016).

Notes to Chapter 5

1. Beane, Gingrich, and Kerry 2008, A31.

2. Jasanoff 2016, 383.

3. R. Cobb, Ross, and Ross 1976.

4. On the origins, goals, accomplishments, and limitations of the Progressive reform model, see Knott and Miller 1987; and Skowronek, Engel, and Ackerman 2016. On the application of this model to the rise of the medical profession, see Starr 1982.

5. Gerber and Patashnik 2006; Mayhew 1974.

6. Schumpeter 1942.

7. Mayhew 2006.

8. Mayhew 2006, 223.

9. Kingdon 2003, 122.

10. T. Oliver 2004; Sheingate 2003.

11. Posner 2003, 194.

12. See, e.g., Schumpeter 1942; Dahl 1961; Kingdon 2003; R. W. Cobb and Elder 1983; Baumgartner and Jones 1993; Schickler 2001; Sheingate 2003; Mintrom and Norman 2009; Volden and Wiseman 2014.

13. But see Schneider, Teske, and Mintrom 1995.

14. Baumol 2010.

15. Scholars have found that not only can a source influence the credibility of a message, but “messages also influence perceptions concerning the credibility of the source” (Slater and Rouner 1996, 975). For studies on how assessments of message sources are made based on the quality and content of the message, see, for example, Brehm and Lipsher 1959; Combs and Keller 2010; Eisinger and Mills 1968; Hosman and Siltanen 2011; and Reimer, Mata, and Stoecklin 2004.

16. Gerber and Patashnik 2006.

17. In their study of legislative effectiveness, Volden and Wiseman 2014 find that political entrepreneurs are critical to policy change, but that policy sectors, including health, characterized by entrepreneurial politics—diffuse benefits/concentrated costs—are more prone to gridlock.

18. Wilson 1973.

19. Derthick and Quirk 1985.

20. Arnold 1990.

21. Patashnik 2008.

22. Wilson 1973, 322–4.

23. Mayhew 1974.

24. We note that another aspect of district representation—bringing grants and projects to the district—does not rate particularly high; however, there is a partisan divide—43 percent of Democrats, 33 percent of Republicans, and only 23 percent of Independents say that this factor would make them much or a great deal more likely to vote for a representative. This pattern may reflect the anger of Republican voters about the Obama administration’s economic stimulus spending. In recent years, many GOP lawmakers have declined to claim credit for projects that benefited their districts in order to inoculate themselves against primary challenges (Grimmer, Westwood, and Messing 2014).

25. Wilson 1974.

26. Both the heart disease and PSA vignettes were introduced with the following text: “Please consider the following scenario based loosely on a recent medical controversy. This question is designed to understand public opinion. You should not make any health decisions based on this scenario.”

27. We note that when doctors oppose the task force recommendation in the absence of political disagreement (i.e., Rep. A supports the task force recommendation, but there is no mention of Rep. B), Representative A’s job approval does not decline as much (Rep. A Supports Task Force and Rep. A Supports Task Force / Doctors Oppose Task Force are statistically indistinguishable, p>.10).

28. For insightful historical analyses of these developments, see Ashton and Wray 2013; Sorenson, Gusmano, and Oliver 2014.

29. Wilensky 2006.

30. As previously noted, new drugs and devices generally do not have to be shown to be superior (or even equivalent) to alternative products to gain FDA marketing approval. Pharmaceutical firms and medical device makers therefore gain little by paying for rigorous studies to determine if their products lead to superior patient outcomes. For example, the maker of a positron emission tomography (PET) scanner may be able to sell this $2.5 million machine to hospitals by marketing the PET scan’s impressive ability to find areas of abnormal metabolic activity that may indicate malignancy. An expensive study to determine whether the PET scan actually leads to more accurate disease staging, more focused treatments, and better patient outcomes is unlikely to be worth the costs (Ashton and Wray 2013).

31. CBO 2007b, 8.

32. Wilson 1974.

33. D. Carpenter and Fendrick 2004.

34. Markman 2008.

35. Angell 2004; Rose 2013.

36. Keller and Packel 2014.

37. Different terminology has been used historically to describe activities intended to promote evidence-based medicine (EBM), including health technology assessment (HTA) and comparative effectiveness research (CER). As Sorenson, Gusmano, and Oliver note,

Both HTA and CER address the question “Does it work?” and involve evidence synthesis. However, they can be distinguished by the fact that CER also focuses on evidence generation and is principally concerned with the comparative assessment of effectiveness of a broad range of interventions and care delivery approaches in routine settings, whereas HTA considers evidence on effectiveness, safety, cost-effectiveness, and, when broadly applied, social, ethical, and legal aspects of health technologies. Because HTA often (but not always) includes an economic dimension (cost-effectiveness), it also addresses the question “Is it worth it and should it be paid for?” and is often used to inform coverage and reimbursement decisions. As previously discussed, these are aspects that are not included in current conceptions of CER…. Taken together, CER can be viewed as a potential input into HTA and EBM (2014, 144).

38. Brody 1979.

39. Blumenthal 1983.

40. Perry 1982.

41. Blumenthal 1983.

42. Heritage Foundation 2005.

43. As quoted in Perry 1982, 1098.

44. Perry 1982.

45. J. Wennberg and A. M. Gittelsohn 1973; J. Wennberg and A. Gittelsohn 1982; see also Brownlee 2008; J. E. Wennberg 2010.

46. Skinner and Fisher 2010.

47. Kingdon 2003.

48. Gray 1992.

49. Gray 1992.

50. Gray, Gusmano, and Collins 2003.

51. Bimber 1996.

52. Gray, Gusmano, and Collins 2003.

53. Leary 1994.

54. Gray, Gusmano, and Collins 2003.

55. Quoted in Gray, Gusmano, and Collins 2003, W3–307.

56. Quoted in Rich 1996, A27.

57. J. Avorn 2005, 278.

58. Sorenson, Gusmano, and Oliver 2014, 150.

59. Sorenson, Gusmano, and Oliver 2014.

60. Allen 2013, 114.

61. Allen 2013, 108. Earlier in his career, as an Oregon state senator, Kitzhaber had developed a plan prioritizing the state’s Medicaid spending through an explicit process of ranking medical services based on their clinical effectiveness and net benefits. The goal was to expand the access of low-income citizens to health insurance while saving money by rationing care. But the program unraveled as a result of economic and budgetary pressures and eroding political support (see Oberlander 2007).

62. Allen 2013, 116.

63. Allen 2013, 115.

64. Pear 2003.

65. Unlike the Allen-Emerson bill, the Clinton proposal did not authorize cost-effectiveness research, only comparative effectiveness studies.

66. Confidential interview with congressional staff member.

67. Congressional Record, June 25, 2003, S8529.

68. Ashton and Wray 2013.

69. In the late 1990s, pharmaceuticals were one of the fastest growing components of national health care spending (Oberlander 2003). An explosion of new drugs for treating arthritis, heart disease, and many other conditions had emerged on the market. Eight in ten seniors took medications, each getting an average of twenty prescriptions filled per year (Oberlander 2003). Yet the original Medicare benefit package did not cover outpatient prescription drugs, creating a widening gap between Medicare and the standards of private insurance plans. In the 2000 presidential campaign, all the major candidates endorsed some version of a Medicare prescription drug benefit (Oliver, Lee, and Lipton 2004). For the winning presidential candidate, George W. Bush, adding an expensive prescription drug benefit was both clever politics and good policy. By signing onto an expansion of the Medicare benefit package, the Bush administration believed it could co-opt seniors and neutralize an issue that played to the advantage of the Democratic Party. In addition, the White House believed it could use a prescription drug coverage bill as a vehicle to restructure Medicare along conservative lines by increasing the role of market competition in the program (Oberlander 2012).

70. In a memo to members of Congress, the industry group also argued that federal studies would stymie innovation, inevitably influence private insurers, and, in focusing on the benefits of drugs for the average patient, overlook the value of medicines for individuals or subgroups such as racial minorities (Pear 2003).

71. Ashton and Wray 2013.

72. Wilensky 2006.

73. Heritage Lectures 2007, 13.

74. CBO 2007b.

75. Suskind 2011, 140.

76. CBO 2007b, 1.

77. Cited in Carey 2007.

78. Subcommittee on Health, “Hearing on Strategies to Increase Information on Comparative Clinical Effectiveness,” June 12, 2007. Serial No. 110-46. https://www.gpo.gov/fdsys/pkg/CHRG-110hhrg45994/html/CHRG-110hhrg45994.htm.

79. Ashton and Wray 2013, 165.

80. CBO letter to Pete Stark, September 5, 2007.

81. CBO 2007b.

82. Ashton and Wray 2013.

83. Confidential interview with congressional staff member.

84. McDonough 2012, 194.

85. Reichard 2007.

Notes to Chapter 6

1. McCarty, Poole, and Rosenthal 2008.

2. Binder 2003.

3. Lee 2009.

4. Stokes 1963; Lee 2009. Our argument here is complementary to Frances Lee’s (2009) insightful argument that parties strategically exploit good government issues to embarrass opponents and burnish their own party’s image. Whereas Lee highlights partisan behavior on valence issues, we focus on how partisanship creates incentives to transform valence issues into position issues.

5. See Druckman, Peterson, and Slothuus 2013.

6. Polsby 1985.

7. Heclo 1996, 23. Heclo uses the term “gestation” rather than incubation.

8. Polsby 1985.

9. R. Cobb, Ross, and Ross 1976.

10. Quoted in Patel 2010, 1778.

11. Quoted in Patel 2010, 1778.

12. Daschle et al. 2008.

13. Orszag 2010, A39.

14. Orszag 2014.

15. Mooney 2005.

16. See Tierney 2016.

17. Quoted in Volsky 2009.

18. Quoted in Volsky 2009.

19. Chalkidou et al. 2009, 363.

20. Quoted in Iglehart 2010, 1759.

21. Mayhew 2006.

22. Mayhew 2006, 220.

23. Zaller 1992.

24. Zaller 1992. The Zaller model of elite opinion leadership should not be understood to suggest that citizens should never reject elite cues. As Jennifer Hochschild points out in a thoughtful essay, whether public followership is normatively desirable or undesirable depends on whether leaders’ empirical claims are empirically supported and morally justified. For example, one wishes, in retrospect, that the mass public had not followed elite leadership on the need for Jim Crow laws; conversely, there are strong reasons to wish that more of the public would accept the elite scientific consensus on global warming (Hochschild 2013).

25. Lee 2009.

26. Petrocik 1996.

27. A rebuttable implication of our argument is that Republicans will soften their opposition to CER if and when the electoral incentives surrounding health reform change. While a wholesale shift in the GOP position seems unlikely until the legislative fate of the ACA is completely settled, early signs have been interesting. When the Republicans took back the House in 2010, the new chairman of the Oversight Committee, Darrell Issa (R-CA), told the Wall Street Journal that Republicans should talk less about death panels and focus more on reducing the overuse of expensive medical procedures. Issa said that his own doctor told him that spine surgeons have an incentive under Medicare to implant joint and bone screws to support patients’ spines, when fewer implants would be equally effective. Issa even praised having some form of CER board: “Medical panels of people who care about what’s best for their patients … is good science and good medicine…. Republicans have to step back from the words ‘death panels,’” he said (quoted in Mundy 2010).

28. Pear 2009b.

29. Gimpel, Lee, and Thorpe 2012, 582.

30. Alonso-Zaldivar 2008.

31. Confidential interview with Republican congressional staff member.

32. Fiorina 2006, 250.

33. Quoted in Grunwald 2012, 173.

34. Ashton and Wray 2013.

35. Quoted in Levey 2009. In fact, ambitious reform ideas for linking evidence to cost control were percolating, such as HHS secretary nominee Daschle’s proposal to deny federal tax benefits to private insurers that failed to comply with a board’s recommendations (Daschle et al. 2008, 179).

36. Quoted in Edney 2009. Representative Emerson, who had introduced CER legislation in previous Congresses with Tom Allen, stated that she did not support the stimulus language (Edney 2009).

37. The CBO reinforced the view that CER was unlikely to significantly increase spending on effective but underused treatments because “current incentives already favor the adoption and spread of more-expensive treatments, so new research that found those treatments to be more effective or more cost effective would probably increase their use only modestly” (2007b, 29).

38. Quoted in Ashton and Wray 2013, 193.

39. McCaughey 2009.

40. Pear 2009b.

41. The Senate Appropriations Committee even tried to address concerns that research would be used for cost-effectiveness analysis by placing the word “clinical” before every mention of comparative effectiveness research, but the final version did not include this change.

42. Quoted in Ashton and Wray 2013, 195.

43. Authors’ calculations based on Lexis/Nexis data.

44. For a political history and analysis of the ACA, see L. Jacobs and Skocpol 2010; and Starr 2011.

45. Quoted in Leonhardt 2009, MM36.

46. Grunwald 2012, 442.

47. Patel 2010.

48. Quoted in McDonough 2012, 195.

49. Patel 2010, 1779–80.

50. See, e.g., Selker and Wood 2009.

51. Pew Research Center 2009.

52. Stein and Eggen 2009.

53. Ashton and Wray 2013, 202.

54. PCORI 2010.

55. Sox 2012, 2176.

56. Section 937(a)(2)(B) within the PPACA statute.

57. For an excellent discussion of PCORI’s authorizing language, structure, and financing, see Ashton and Wray 2013, chapter 10.

58. Quoted in Iglehart 2010, 1759.

59. Quoted in Iglehart 2010, 1759.

60. Ashton and Wray 2013, 213.

61. Bagley 2013, 574.

62. Stokes 1963.

63. Zaller 1992.

64. Nicholson 2012; Green, Palmquist, and Schickler 2002.

65. Levendusky 2010, 114–15. In addition to cues from polarized leaders, the effort to cultivate enlightened public opinion is undermined by “motivated reasoning”—the tendency of people “to seek out information that confirms prior beliefs (i.e., a confirmation bias), view evidence consistent with prior opinions as stronger or more effective (i.e., a prior attitude effect), and spend more time arguing and dismissing evidence inconsistent with prior opinions, regardless of objective accuracy (i.e., a disconfirmation bias)” (Druckman, Peterson, and Slothuus 2013, 59).

66. Lenz 2012.

67. Fiorina and Abrams 2008, 581.

68. Or, in addition, consider net favorability ratings of Russian president Vladimir Putin. According to YouGov/Economist polls, from July 2014 to December 2016 Putin’s net favorability among Republicans rose 56 points (from -66 to -10); during the same timeframe, Putin’s net favorability among Democrats dropped by 8 points (from -54 to -62). During that time, Republican presidential candidate, and subsequently president-elect, Donald Trump spoke highly of Putin.

69. Dunlap and McCright 2008, 27.

70. Dunlap and McCright 2008, 27.

71. McCright and Dunlap 2011.

72. All bivariate relationships discussed in this chapter are statistically significant (p < .05), unless otherwise noted. The relationships we discuss are also statistically significant in multivariate analysis controlling for turnout, party identification, age, education, and race as well as income and gender. Exceptions are noted.

73. Race indicators are not jointly significant in multivariate analysis.

74. This survey was conducted as part of the 2014 Cooperative Congressional Election Study (CCES) on Yale University’s private team content (Gerber 2014).

75. An alternative explanation is simply that public awareness of the CER concept, which was originally opaque to the general public and therefore led to undifferentiated survey responses among non-voters in 2010, grew by 2014 to the point that the CER concept was more understandable to the general public. We thank Mark Schlesinger for this point.

76. Interestingly, support for CER among Independents also increases with the Obamacare reference (from 54.7 to 60.1, a 10 percent increase).

77. This drop in support is even more pronounced among self-reported voters from the Republican Party (from 53.6 to 41.6, a 22 percent decrease).

Notes to Conclusion

1. This chapter discusses the political challenge of endowing EBM reforms with durability. It does not provide a policy analysis of specific reform options. For an excellent discussion of the case for the kind of reforms we believe are needed—such as strengthening the Medicare coverage process to reduce or eliminate funding for low-value treatments—see Bagley, Chandra, and Frakt 2015.

2. Amitabh Chandra states: “PCORI may do good work occasionally, but they’re not cost-effective. We’ve not learnt much relative to the $$$ they receive.” https://twitter.com/amitabhchandra2/status/798993185682821120.

3. The ACA does not explicitly prohibit the consideration of costs; rather, it forbids PCORI from using a dollar per QALY (quality-adjusted life year) metric “as a threshold” for establishing cost-effectiveness or for making recommendations.

4. Quoted in Longman 2013, 22–23.

5. Gray, Gusmano, and Collins 2003.

6. Wilensky 2006, w579.

7. McGinn 2015, 2.

8. Jerry Avorn 2011; McCulloch et al. 2013; Reames, Shubeck, and Birkmeyer 2014.

9. Schlesinger and Grob 2017, 70.

10. Goldberg et al. 2001.

11. M. Rodwin 2001; Swenson n.d.

12. Patashnik 2008.

13. Kaufman 1976, 64.

14. Berry, Burden, and Howell 2010.

15. Bimber 1996; Sorenson, Gusmano, and Oliver 2014.

16. Patashnik 2008.

17. On policy sustainability, see Heclo 1998.

18. Maltzman and Shipan 2012. Of course, it is also important to examine the direction of postenactment policy change, not merely its existence. Some amendments are friendly, while others weaken preexisting statutes.

19. Mayhew 2012, 263.

20. Skocpol 1992; Pierson 1995.

21. Mayhew 2012, 263.

22. Bagley 2013.

23. Campbell 2005.

24. Patashnik 2008.

25. Patashnik and Zelizer 2013; Oberlander and Weaver 2015.

26. This is not to suggest that diffuse-benefit reforms can never become self-reinforcing. The key point is that the direct beneficiaries of such policies will rarely develop into an effective organizational force because the per capita stakes are too small. When such policies do generate positive feedback, it is typically either because they bring about institutional shifts that privilege common interests or because they produce substantial resources for service providers that obtain an incentive to protect their “spoils” (Patashnik and Zelizer 2013).

27. Patashnik 2008.

28. Patashnik 2008; Moe 2015.

29. Levine 2006; Patashnik 2008.

30. Ironically, the criticisms against CER and PCORI were reinforced in 2015 when the Obama administration announced a new Precision Medicine Initiative to advance medical research that takes into account the specific characteristics of individuals, such as a person’s genetic makeup or the genetic profile of an individual tumor. While CER and precision medicine should be mutually reinforcing—both are intended to ensure that patients receive the treatments that work best—the administration defended its new initiative by focusing on the need to avoid “one-size-fits-all” medicine. This is precisely the language that critics have used to attack CER (see http://www.whitehouse.gov/the-press-office/2015/01/30/fact-sheet-president-obama-s-precision-medicine-initiative).

31. Sedensky and Alonso-Zaldivar 2015.

32. The Independent Payment Advisory Board (IPAB) was established as an institution to constrain Medicare spending. Congress must consider Medicare reforms proposed by the board under special fast-track legislative rules, including limits on debate, designed to ensure speedy action. If Congress does not enact legislation containing those proposals or alternative policies that achieve the same savings, the IPAB’s recommendations go into effect. Republicans and some Democrats have argued that IPAB represents an unacceptable delegation of legislative authority to an unelected body. Six years after the passage of the ACA, Congress had not appointed any members to IPAB’s board, and President Obama had not nominated anyone. It remains to be seen whether IPAB will endure and shape health policy—or become an irrelevancy (Oberlander 2016).

33. Keller et al. 2015.

34. Keller et al. 2015.

35. Brill 2015, 139; see also Patel 2010; Longman 2013. The ACA does not prohibit Medicare from using CER, but it limits how study results can be used. Specifically, the CMS is allowed under the ACA to use CER to make a determination concerning Medicare coverage if (1) such use is through an iterative and transparent process and (2) a determination to deny Medicare coverage for a product or service is not based solely on CER. In addition, the ACA does not prohibit PCORI from considering costs. Rather, the statute forbids PCORI from using a dollar per quality-adjusted life year metric “as a threshold” for establishing cost-effectiveness or for making recommendations.

36. Orren and Skowronek 2004.

37. We avoid entering into the debate over whether agencies like the FDA were at their birth (or later became) “captured” by drug companies seeking to use government regulations to limit entry and preserve market power (Stigler 1971). Our claim here is simply that an indicator (and cause) of policy sustainability is compatibility between a policy’s goals and the identities, interests, and incentives of firms subject to the policy’s mandates. On theories of regulatory capture and the FDA, see D. Carpenter 2013.

38. Abrams 2012.

39. Abrams 2012.

40. Mahinka 2013, 19–20.

41. Mahinka 2013, 20.

42. Mahinka 2013, 20.

43. Temple 2012, 56.

44. Mahinka 2013, 19.

45. For a more optimistic view, see Robinson 2015.

46. Swenson n.d.

47. McCubbins, Noll, and Weingast 1989.

48. Lewis 2012.

49. Moe 1989.

50. D. P. Carpenter 2001, 359.

51. Eskridge and Ferejohn 2013, 190.

52. Moe 1989.

53. To be sure, scholars recognized during the 1980s and 1990s that political deals could be undermined by future Congresses (this risk is known as “legislative drift”) as well as by the bureaucracy (“agency drift”). Nonetheless, the assumption was that the original bargain had a reasonable level of support in the enacting Congress; there was little discussion of the sustainability risks put into play when ambitious laws pass by razor-thin, partisan majorities. On legislative drift and agency drift, see Shepsle 1992.

54. Lee 2009.

55. Berry, Burden, and Howell 2010.

56. Millenson 2015.

57. Patashnik 2015.

58. Wilson 1989.

59. D. Carpenter 2010, 33.

60. D. Carpenter 2010.

61. The FDA’s reputation has, however, been tarnished in recent years as a result of scandals, including its failure to quickly recall the arthritis drug Vioxx (Teles 2010).

62. Teles 2010, 49.

63. Tanden et al. 2014. The American Enterprise Institute had an even less forgiving assessment. “PCORI has attracted a skilled leadership team that rivals many similar private institutions. But even with its talent, and its $3.5 billion, ten-year trust fund … PCORI never had enough resources to fund the rigorous kinds of clinical trials that would actually inspire change in clinical practice. It never aimed to make grants on a scale to accomplish this mission. [Its] proponents and opponents alike didn’t want it to. Proponents didn’t really want definitive clinical answers, just policy screeds that government payers could peg decisions to. And opponents didn’t really want to see it work at all” (Gottlieb 2014).

64. Tanden et al. 2014. PCORI disputed these conclusions. See http://www.pcori.org/content/statement-pcori-executive-director-joe-selby-md-mph-following-center-american-progress-event.

65. See Emanuel, Spiro, and Huelskoetter 2016.

66. Rice 2015.

67. PCORI, “Comparison of Peer-Facilitated Support Group and Cognitive Behavioral Therapy for Hoarding Disorder.”

68. NPC 2015.

69. Quoted in Schulte 2015.

70. Ashton and Wray 2013.

71. PCORI 2013, 2.

72. Sox 2012, 2181.

73. Selby 2014.

74. Klein 2010.

75. Skocpol 2003.

76. Moore 2015. See also Carpenter’s thoughtful remarks at a New America Foundation panel: https://www.youtube.com/watch?v=8vGusVXMgxY.

77. Finegold and Skocpol 1995.

78. Twedt 2016.

79. J. Avorn 2009.

80. Dayoub 2014.

81. Indeed, it is worth noting that the AHRQ back surgery study that generated such controversy was carried out through one of the agency’s Patient Outcomes Research Teams (PORTs), multidisciplinary centers that focused on particular medical problems and reviews (see Gray, Gusmano, and Collins 2003). However, PORTs were fragmented and not closely tied to specific hospitals and medical centers. We are imagining locally based centers that would have stronger network ties to politicians and offer attractive credit-claiming opportunities.

Notes to Appendix to Chapter 3

1. There is a general professional consensus, supported by the findings of the Institute of Medicine (IOM), that physicians lack adequate information on the relative effectiveness of different treatment modalities (see AMA 2011). However, some doctors may be wary of CER for conceptual reasons, believing that studies focus on “average” treatment effects and miss the idiosyncratic ways in which an intervention works for a particular subgroup of patients. In addition, physicians who earn a significant portion of their income from performing a given procedure may fail to “implement the results of a study that found the procedure to be no better than a less costly or safer alternative” (J. Avorn and Fischer 2010, 1894).

These diverging possibilities are reflected in both the opinions of individual doctors and the formal statements of major physician associations. For example, a survey of physician opinion found that more than half of physicians agree that having more hard data would improve the quality of care, but that two-thirds are concerned that CER will be used to restrict their freedom to select treatments for their patients (Keyhani, Woodward, and Federman 2010; also see Ray and Sokolovsky 2009). A 2011 survey of primary care physicians and specialists found seven out of ten doctors believe that implementing CER will be made difficult by conflict between clinical effectiveness and cost-effectiveness (Deloitte 2011). And while groups such as the AMA have released official statements expressing support for CER, these endorsements are contingent on the CER agency maintaining “physician discretion in the treatment of individual patients” (AMA 2011). A 2012 letter from two former AMA presidents to the director of PCORI stated, “Physicians have always been hopeful that CER will be done in ways that actually support us in making decisions with our patients. But at the same time we’ve been deeply concerned that the research could easily be skewed towards a government cost-cutting agenda and misused in ways that come between doctors and patients” (Coalition to Protect Patients’ Rights 2011).

Similar concerns that CER will lead to government interference with the doctor-patient relationship were raised in a statement from a Tennessee oncologist: “Comparative effectiveness research done right is a good thing for our country’s health care system. However, when the government begins telling physicians what medicines they should or should not prescribe, ultimately it’s the patient who suffers” (Patton 2013). The American Academy of Orthopaedic Surgeons (AAOS) and other doctors’ groups have given CER similarly qualified endorsements (AAOS 2009). Because levels of support for the implementation of CER among both individual doctors and medical associations remain in flux, it is important to understand how changes in the public positions such actors and groups adopt—including whether they adopt a position at all—may affect public support for CER.

2. The outcome measure was respondents’ support for a policy that would “allow the government and insurance companies to refuse payment for treatments or procedures if their effectiveness has not been demonstrated by rigorous scientific evidence” and was measured using a 100-point sliding scale ranging from “strongly oppose” (0) to “strongly support” (100) where the midpoint of the scale indicated that respondents “neither support nor oppose this policy.”

3. These nine conditions were randomly assigned with equal probability, except that 20 percent of respondents were randomly assigned to receive no group cue and every other condition (e.g., leading doctors oppose) was assigned to 10 percent of respondents.

4. In this experiment we did assign respondents to the condition in which neither a political cue nor a group cue was provided.

5. The 5.4 net difference between leading doctors support (48.1) and leading doctors oppose (42.7) in column F is also statistically significant (p=.07).

6. The difference between the support and opposition of high-level government administrators is also statistically significant in column F (p=.07), whereas the difference between the support and opposition of top drug companies in column F is not (p=.32).

7. Aggregate public support for the proposal is also not significantly affected by the support of both parties (p=.30) or the support of Republicans and opposition of Democrats (p=.48).
