Thursday, March 21, 2013

Comparing Research, Quality Improvement, and Lean Startup: A Crosswalk between Approaches to Innovation

Innovation in healthcare is fundamentally changing.[1] In order to understand the impact of new approaches to innovation on health, we need to be able to measure these approaches. In order to measure them, we need to be able to define them. The three approaches that will be reviewed are research, quality improvement (QI), and lean startup methodology (LSM). Research is fundamentally different from both QI and LSM. But the differences between QI and LSM are more subtle. The discussion below primarily focuses on the similarities and differences between QI and LSM. By providing a crosswalk between the three approaches to innovation, we hope to create a framework for measuring their comparative effectiveness on optimizing value for patients, providers, and the health system as a whole.


Research is the rigorous application of the scientific method to determine a causal relationship between an intervention and an outcome. [2] This approach stringently controls for variability. The gold standard for this approach to scientific experimentation is the randomized controlled trial (RCT).

QI is a systematic, iterative method of creating better care for a specific, local patient population. [3] Care can be improved by making it more safe, timely, effective, equitable, efficient, and/or patient-centered.

LSM is an approach to launching a product or service that relies on iterative scientific experimentation to shorten product development cycles and minimize risk to the business model. [4] This set of techniques emphasizes achievement of validated learning from customers as efficiently as possible.


Research is often used to test the efficacy and/or effectiveness of various types of intervention within a patient population. RCTs in particular may provide an opportunity to gather useful information about adverse effects, such as drug reactions. This study design typically helps to answer very focused, prescribed questions which remain fixed throughout the duration of the study. Similarly, the hypothesis is an educated guess that is also rigorously defined and is not subject to much change throughout the trial.

RCTs typically have four phases: enrollment, intervention allocation, follow-up, and data analysis. The cornerstone of RCTs is the randomization process which eliminates substantial bias from the study. RCTs are the gold standard for clinical trials and carry the most weight in influencing the establishment of clinical guidelines and best practices.

Quality Improvement 
QI is a model of improvement that tries to achieve a measurable change in local care delivery for a specific patient population.[3] The initial approach requires identifying an aim that is measurable over a specific timeframe for a definitive target population. The measures for assessing achievement of that aim include outcome measures, process measures, or a combination of the two.

To determine the potential impact of an intervention through QI, it is important to understand the basic elements of the system in which the intervention is being deployed. According to Deming’s work, understanding the system requires appreciation of the system, understanding of variation, the theory of knowledge, and the psychology of change.[3]

This understanding of the system is used to identify the primary and secondary drivers of the underlying problem. Primary drivers are the system components that contribute directly to the chosen aim or goal. Secondary drivers are elements of the primary drivers, which can be used to create change projects.

Once the aim is identified, measures established, and impact of the intervention contemplated, the aim is then transformed into an improvement goal that is made actionable through the Plan, Do, Study, Act (PDSA) cycle.

After each PDSA cycle, the learning is compiled and assessed for whether the intervention will be taken to scale, dropped completely, or will be subjected to further PDSA cycles. The purpose of the PDSA is to learn what works or does not work and why it did or did not work.

Lean Startup Methodology
The Lean Startup Methodology is an approach that combines fast-release, iterative development methodologies, customer-centric testing for value, systematic elimination of risk from the business model, and achievement of scalability through repeatable revenue-generation. [5]

LSM has its roots in manufacturing but was more recently refined in the IT startup community to inexpensively and quickly create software that people are willing to pay for. Although ideally suited for technological innovation, LSM is a generalizable process of validated learning that can be applied in myriad settings including non-profit and non-technological development of products and services.

The initial step of LSM is identifying a global vision for a large-scale problem that the innovator is passionate about solving. Once a vision is clarified, there is a rigorous process of selecting strategies for exploring how that vision can be achieved. Within each strategy, multiple product ideas are tested for value creation and execution of the vision.

Figure source: Eric Ries, The Lean Startup.
Each strategy to achieve the vision of the innovator is processed through the lens of a business model. Heuristic tools accelerate traditional entrepreneurial thinking about business models into concise thought-experiments to identify the greatest risks to scaling a business model to the largest audience possible.

Figure source: Ash Maurya, Running Lean.

The business model canvas is a great example of how a several-hour business plan can be distilled into a 20-minute exercise. Once the basic elements of the business model canvas are identified, the next step is to determine the highest-risk elements of the business model. It is helpful to think about the 3 major types of risk to a business: 1) product risk (can something be built that other people want?); 2) customer risk (is there a customer who perceives a problem big enough to make it worth solving and paying for the solution?); and 3) market risk (are there enough customers who can reliably purchase the product at a high enough price to grow the company fast enough to make it worth investors’ time to fund the idea in the first place?).

Once the 3 risks are identified, they can be mitigated through a four-step process of innovation accounting that tracks the progress of each experiment conducted in the value discovery and risk mitigation process. The four steps are: 1) understanding the problem, 2) defining the solution, 3) validating qualitatively, and finally 4) validating quantitatively.

Within each of these four steps, there is a series of smaller experiments, each of which consists of building a test, measuring the impact of that test, and learning from the test. The learning from the build-measure-learn (BML) test is then used to inform the next series of tests.

Another way to look at LSM is through the lens of customer development, which is the rapid process of discovering what customers are willing to pay for. There are 4 elements of customer development: Customer Discovery, Customer Validation, Company Creation, and Company Building. These four stages overlap with the 4 steps of innovation accounting and highlight important thresholds of value creation, including problem-solution fit (does the proposed solution address the assumed problem?) and product-market fit (will the customer actually pay for the solution being offered?). Customer development is intimately intertwined with product development, a set of techniques for building technology in rapid iterative sprints to complement the learning achieved through customer development.


Similarities Between QI and LSM

One of the few similarities among all three approaches is that they are grounded in the scientific method: question, hypothesis, methods, metrics, conclusions, and use of outcomes to inform the next hypothesis. Otherwise, QI and LSM are fundamentally different from research.

An important similarity between QI and LSM is their grounding in improvement theory, particularly the theory employed in lean manufacturing made famous by Toyota. [4] Because of their common origins, QI and LSM overlap in many ways. To begin the comparison at the macroscopic level, both approaches start with a broad problem and focus on a precise, actionable intervention to fix that problem.
The process of identifying a vision and executing a strategy in LSM serves a similar function as creating an aim in QI. Strategy creation in LSM also shares common elements with measurement and identifying drivers in QI. Identifying the intervention that will be evaluated through QI is analogous to identifying a product that will be tested for problem-solution fit in LSM. Once the vision/aim, strategy/methods, and products/intervention have been identified, both LSM and QI begin the testing process.

The experimentation phase of QI and LSM is based on PDSA and BML cycles, respectively. The Plan and Do Stages in QI correspond with the Build Stage in LSM. These stages represent the definition and setup phase of the experiment.

The Do phase falls between the Build and the Measure phase of LSM and is represented by the experiments that are executed, the prototypes that are demoed, and the mock ups that are put in front of a customer.

The Study phase in QI and the Measure phase in LSM are similar. In LSM, the moment the experiment is shown to at least one customer, the measure stage begins and data is analyzed. The Study phase of QI also bleeds into the Learn phase in LSM because it entails comparing data to predictions and summarizing what was learned.

The Act Phase ties together the original visioning into a final intervention.

The next steps after the Act phase of QI and the Learn phase of LSM are based on the following: if the hypothesis was clearly validated, then the intervention is deployed in QI or scaled in LSM. If the hypothesis is clearly invalidated, then the intervention is dropped in QI or a pivot occurs in LSM. If the learning is unclear, both approaches propose refining the question and conducting further testing.
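The decision rule in the paragraph above can be written down as a small lookup. This is only an illustrative sketch: the names `Outcome`, `NEXT_STEP`, and `next_step` are hypothetical, and in practice deciding whether a hypothesis was "clearly" validated is a judgment call rather than a flag.

```python
from enum import Enum


class Outcome(Enum):
    VALIDATED = "validated"
    INVALIDATED = "invalidated"
    UNCLEAR = "unclear"


# Next step for each approach, taken from the paragraph above.
# The dictionary layout itself is an assumption made for illustration.
NEXT_STEP = {
    ("QI", Outcome.VALIDATED): "deploy the intervention",
    ("QI", Outcome.INVALIDATED): "drop the intervention",
    ("LSM", Outcome.VALIDATED): "scale the intervention",
    ("LSM", Outcome.INVALIDATED): "pivot",
}


def next_step(approach: str, outcome: Outcome) -> str:
    """Return the next step after a PDSA cycle (QI) or BML cycle (LSM)."""
    if outcome is Outcome.UNCLEAR:
        # Both approaches respond to unclear learning in the same way.
        return "refine the question and test again"
    return NEXT_STEP[(approach, outcome)]
```

For example, `next_step("LSM", Outcome.INVALIDATED)` returns `"pivot"`, while an unclear result sends either approach back into another cycle.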

In addition to overlap between the component parts of QI and LSM cycles, these approaches are also similar in the way they sequentially aggregate small scale tests to build knowledge over time.

The iterative tests in LSM are performed along the 4 key activities of eliminating risk in the business model (understand the problem, define the solution, validate qualitatively, and validate quantitatively). There are direct overlaps between LSM and QI along all 4 domains. Although many examples exist to demonstrate this overlap, we will highlight just a few.

Within the first domain of understanding the problem, an important question that needs to be answered is how to decrease product risk by having customers rank their problems. The analogous exercise in QI is identifying the primary and secondary key drivers of care delivery.

In the second domain of defining the solution, LSM addresses customer risk by defining the target audience for an intervention. This process has parallels with QI’s approach of creating an aim for a specific target population.

The third stage of mitigating risk through LSM is validating an intervention qualitatively. During this stage, a key LSM concept is the Minimum Viable Product (MVP). This is the product with the fewest number of features needed to get users to “pay” in some form of a scarce resource. Identification of the MVP is similar to clearly identifying the change that can be made through the QI intervention that will result in improvement.

The last stage of eliminating risk through experimentation in LSM is quantitative validation. Product risk is mitigated in this stage by identifying rigorous metrics for determining not only whether an intervention provides replicable value but also whether that value can be grown to a large scale. In QI, the analog is the establishment of measures for assessing achievement of the aim. As in LSM, these measures may be outcome measures, process measures, or a combination of the two.

Data Analysis
Research takes a very different approach to data analysis relative to QI and LSM. In research, the goal is to gather as much data as possible to support or refute a hypothesis through rigorous statistical measures. Testing is one large ‘blind’ test. Outcomes are explicit and are measured for statistical significance with a strict threshold for type I error, typically requiring p-values < 0.05.

Both QI and LSM are very different from research. For both QI and LSM, the goal is to gather just enough data to learn and complete another cycle. They both entail testing that has many sequential, observable tests. And the certainty of the data in QI and LSM is implicit.

With LSM, testing is also sequential and observable, but it is guided by where the largest risk for the business model lies. The level of evidence in LSM depends on where the tests fall in the process of innovation accounting. As the testing progresses from understanding a problem to quantitative validation, the level of rigor of the data increases.

There is further nuance in healthcare because adoption of an intervention often requires a higher level of validation than a startup outside the walls of a hospital would need. So the level of validation also depends on the end user or purchaser of the intervention.

In general, though, data needed for early validation is closer to the level of precision of smoke signals or gross trends rather than that of p-values.

Differences Between QI and LSM

Cycle Time

Research takes the longest of the three approaches to deploy, typically lasting years. At the end of a cycle, whether the study gets published or not, there may be a role for follow up studies, but rarely does a research trial result in a direct improvement in care.

QI, on the other hand, does not take years to complete. Rather, it can take weeks to months. This approach employs rapid, small tests of change. Invalidation is a welcome outcome. After the cycle is complete, the next step is either local deployment or another cycle.

LSM has a very rapid cycle and is driven by rapid customer-centric iteration synchronized with 2-3 week agile product development sprints. The pace of discovery is typically accelerated by the (unlikely) opportunity for substantial financial gain and by typically very scarce resources. The goal of each cycle is rapid validation or invalidation. If validation happens, the company attempts to reproduce and scale the intervention. If a hypothesis is invalidated, a pivot occurs, in which just one element of the business plan changes while previous learning is retained. If neither validation nor invalidation happens, then the questions being asked need to be refined in order to achieve definitive validation or invalidation of a hypothesis.

Research is very expensive to conduct. On average, excluding overhead expenses, an industry-sponsored trial costs about $6,094 (range, $2,098 to $19,285) per enrolled subject, including $1,999 devoted to nonclinical costs. [6] Overhead typically adds 60-90% on top of that cost. So the average cost per patient to conduct a clinical trial is approximately $10,000. RCTs often require enormous resource investments. The recently completed Clinical Outcomes Utilizing Revascularization and Aggressive Drug Evaluation (COURAGE) trial, which compared optimal medical therapy with and without percutaneous coronary intervention (PCI) in 2,287 patients, resulted in nearly $60 million in total costs shared by both public and private sponsors. [7]
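The arithmetic behind these figures can be checked with a few lines. This is a rough sketch using only the numbers quoted above; the variable and function names are made up for illustration, and the COURAGE total is treated as a flat $60 million even though the text says "nearly."

```python
# Figures quoted in the paragraph above.
DIRECT_COST_PER_SUBJECT = 6094        # dollars, excluding overhead [6]
OVERHEAD_LOW, OVERHEAD_HIGH = 0.60, 0.90

COURAGE_TOTAL_COST = 60_000_000       # dollars, approximate ("nearly $60 million") [7]
COURAGE_PATIENTS = 2287


def loaded_cost(direct_cost: float, overhead_rate: float) -> float:
    """Per-subject cost once overhead is added on top of direct cost."""
    return direct_cost * (1 + overhead_rate)


low = loaded_cost(DIRECT_COST_PER_SUBJECT, OVERHEAD_LOW)
high = loaded_cost(DIRECT_COST_PER_SUBJECT, OVERHEAD_HIGH)
courage_per_patient = COURAGE_TOTAL_COST / COURAGE_PATIENTS

print(f"Per-subject cost with overhead: ${low:,.0f} to ${high:,.0f}")
print(f"COURAGE cost per patient: about ${courage_per_patient:,.0f}")
```

With these inputs, the per-subject cost with overhead lands between roughly $9,750 and $11,600, which is where the "approximately $10,000" figure comes from, and COURAGE works out to about $26,000 per patient.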

The cost of conducting QI is much lower than that of an RCT. Unlike RCTs, which have a high cost floor and a high cost ceiling, QI generally has a low floor and a low ceiling. Some literature suggests that an average QI project yielding a cost savings of $100,000 can cost about $15,000 to implement. [8]
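To put the QI figures in return-on-investment terms, here is a minimal sketch. It assumes the $100,000 savings and $15,000 cost cited above are the only cash flows, and the `roi` helper is a hypothetical name, not something taken from the cited literature.

```python
QI_SAVINGS = 100_000   # dollars saved by an average QI project [8]
QI_COST = 15_000       # dollars to implement it [8]


def roi(gain: float, cost: float) -> float:
    """Simple return on investment: net gain divided by cost."""
    return (gain - cost) / cost


net_savings = QI_SAVINGS - QI_COST
qi_roi = roi(QI_SAVINGS, QI_COST)

print(f"Net savings: ${net_savings:,}")
print(f"ROI: {qi_roi:.1f}x the implementation cost")
```

Under those assumptions the project nets $85,000, or a return of a little under six times its cost.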

Similar to QI, the cost of LSM has a very low floor. In fact, with current prototyping software available for free, the majority of validation through LSM can be done for free, excluding opportunity costs.

Although typically low, the ceiling for LSM has no real limit. The point of LSM is not to financially bootstrap the learning process, but rather to learn efficiently. Some forms of learning may be expensive, particularly when it comes to health systems innovation and policy creation. But LSM ensures the path of least resistance to discovering value which minimizes waste and in turn minimizes costs. A common mantra along this theme in the lean startup community is “fail fast, fail cheap, fail often.”

In addition to the differences in cycle time and cost, QI and LSM have profound differences in intent and outcome.

The purpose of research is to discover new knowledge that is generalizable and advances the practice of medicine. For QI, the purpose is to bring new knowledge to daily practice and achieve local improvement in process or outcome.
LSM, however, has a very different purpose. The intent of LSM is to discover and/or create enough value in a service or product that someone will be willing to pay for it or exchange some other scarce resource such as time or attention for its use.

Example Illustrating Difference in Purpose between QI and LSM

Traditionally employed in the for-profit sector, the purpose of LSM was to efficiently discover the highest ROI products or services. This was achieved through developing repeatable and scalable business models as quickly and inexpensively as possible. At its core, the purpose of LSM is validated learning that informs commercial value creation. And validation for LSM is payment from the patient or end-user for a product or service.

Validation for QI, on the other hand, is improving care for the specific local patient population through making it more safe, timely, effective, equitable, efficient, and/or patient-centered. The important point to highlight here is that better care is not necessarily something a customer would want to pay for because they may want to spend their limited resource (money, time, attention) on something different, like paying rent.

An example of the difference in purpose between QI and LSM is no-shows to the clinic. A patient may want good healthcare, but they may not be willing to pay for it. For example, they may not be able to afford the co-pay, or they would rather spend their co-pay on a cheeseburger. In the latter scenario, QI may lead to higher quality care, but it is not care the end user is willing to pay for with their time or money, even though it may be better for them.

Another scenario demonstrating the difference between QI and LSM is when the end user is not the customer. For example, insurance companies, Medicare, or Medicaid (payers) typically cover the cost of most medical expenditures in the US. QI may lead to better care for the patient. But if better care does not result in lower cost or higher revenue for the payer, then that service may not be reimbursed and ultimately may never be used. In fact, a recent analysis by the Commonwealth Fund found that insurers paid less than 1 percent of premiums on quality improvement activities in 2011. [9]

Conversely, with LSM, an intervention would be developed through validation from the ultimate customer so that the health intervention (perhaps a wellness app) shows cost savings and ultimately would lead to reimbursement because the payer would find that valuable. The end vision is the same: improve health. The strategy of executing that vision is very different.

Most research publications provide small contributions to a larger body of research that may eventually lead to development of an intervention that may benefit patients. So there is usually no immediate, direct benefit to patients on a large scale from any one successful research study cycle.

In QI, the objective is to improve local problems, so the resultant solutions are very context dependent and are limited in their scalability beyond the immediate system in which they are deployed. Because QI aims to improve the patient experience through drivers at the “C-level,” or microsystem level, down to the “D-level,” or patient-experience level, QI solutions are by definition local and would need to be adapted and evolved before being scaled.

Since the origins of LSM are in the for-profit realm, the goal is to create products and services that will have the largest possible return on investment (ROI). Consequently, the methodology is designed to generate as much generalizability and scalability as possible.

Choosing the Right Approach

For logistical reasons, certain types of questions are more suitable for QI or LSM than for RCTs, such as care delivery models, clinical workflow interventions, or technology assessments. But more broadly, the choice of evaluation method can be viewed through the lens of the degree of belief that the intervention will lead to improvement, the cost of failure, and the commitment to the intervention within the organization. [3]

As pretest likelihood of success decreases, cost of failure increases, and/or level of commitment decreases, aim to do smaller tests of validation. As pretest likelihood of success increases, cost of failure decreases, and/or level of commitment increases, aim to do larger tests of validation. With RCTs, it’s very hard to do a small-scale test of validation.
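One way to make this heuristic concrete is to count how many of the three factors favor a larger test. The sketch below is purely illustrative: the equal weighting of the factors, the yes/no inputs, and the four scale labels are assumptions made for the example and are not drawn from the cited curriculum.

```python
# Labels ordered from smallest to largest test; the labels are hypothetical.
SCALES = ["very small", "small", "medium", "large"]


def suggested_test_scale(
    high_belief: bool,
    high_cost_of_failure: bool,
    high_commitment: bool,
) -> str:
    """Suggest a test size from the three factors described above.

    Each factor that favors a larger test adds one step:
    strong belief in success, a low cost of failure, and
    strong organizational commitment.
    """
    favorable = sum([high_belief, not high_cost_of_failure, high_commitment])
    return SCALES[favorable]
```

For example, `suggested_test_scale(False, True, False)` returns `"very small"`, the situation where LSM-style experiments are most useful, while `suggested_test_scale(True, False, True)` returns `"large"`.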

LSM is particularly well suited for testing when there is a low degree of belief because this approach is best at minimizing down-side risk, or the cost associated with a failure to validate. LSM is also well suited when the goal of creating an innovation is to meet the needs of the end user, to generate revenue, to build scalable solutions, and to move quickly.


Although one of the primary drivers of LSM is to discover commercial value, LSM can simultaneously be used as a vehicle to discover social or clinical value. LSM enables value optimization through the fastest and least expensive route to discovering value. And while there may not be sufficient validation for commercialization of an intervention developed through LSM, it can still provide value to patients or providers. So using LSM can simultaneously achieve the QI goal of creating better care and explore whether better care can also lead to the generation of revenue. And the ability to generate revenue has an indirect effect on improving care because it facilitates sustainability. In a time when hospitals are increasingly being squeezed due to healthcare payment reform, hospitals may benefit from exploring LSM as a vehicle for achieving the Triple Aim as well as protecting their bottom and top lines.

Special thanks to Dr. James Moses for his insights into QI.


[1] Zuckerman et al. Health Services Innovation. JAMA 2013.


[3] IHI. QI Curriculum. 2013.

[4] Ries E. The Lean Startup. 2012.

[5] Vlaskovits & Cooper. The Entrepreneur’s Guide to Customer Development. 2012.

[6] Emanuel et al. The Costs of Conducting Clinical Research.

[7] Nallamothu et al. Key Issues in Outcomes Research. Circulation.

[8] Juran. The Quality Improvement Process.

[9] Hall & McCue. Insurers’ Medical Loss Ratios and Quality Improvement Spending in 2011. Commonwealth Fund. 
