“Balanced incomplete block designs: selected business-related applications and usage caveats”

Whenever respondents must rank-order a large number of items and/or the reliability of their rankings may be questionable, balanced incomplete block designs (BIBDs) offer business and marketing researchers a more effective means of collecting rankings than either complete rankings or paired comparisons. By balancing and replicating items across respondents, BIBDs significantly reduce the number of subjective evaluations each individual must make, while still allowing a limited number of respondents, as a group, to rank many items. This balancing and replication also reduces standard deviation, which increases the precision of a study. BIBDs, therefore, can improve response rates as well as increase the accuracy and reliability of the data collected. After discussing the general nature of BIBDs and statistical techniques for analyzing preference data collected by BIBDs, three business-related applications are presented to illustrate the benefits of BIBDs. Next, caveats concerning the use of BIBDs are presented. In the last section, advantages of BIBDs are discussed.


Introduction
Business and marketing researchers often confront situations where they need to collect subjective or judgmental information from respondents. To further compound the situation, these individuals must rank-order objects in terms of preference or importance. In addition, there are many items to rank (e.g., 15). As a result, the ability of respondents to effectively and reliably rank-order these objects may be questionable. Finally, only a few qualified subjects who are sufficiently knowledgeable exist or are available (e.g., 10). Together, these factors elevate the researcher's task to one of Herculean proportions.

Purpose
The purpose of this paper is to provide business and marketing researchers with a mathematically accepted alternative for collecting subjective data from a small number of respondents who must rank many items (e.g., product attributes) in order of importance or preference. This alternative can result in higher response rates, reduce study costs, and increase the accuracy and reliability of the data. After discussing this method, with its accompanying statistical techniques, three real-world examples will be presented, which illustrate the application of this method. Next, caveats concerning the use of this approach will be provided. In the last section of the paper, advantages of this method will be discussed.

Development of the problem
As a means for determining the relative importance of a set of items, business and marketing researchers commonly ask respondents to rank these from most to least preferred. However, if the number of items to be ranked is relatively large (e.g., 15), and the number of respondents is limited (e.g., 10), it may be impractical for all items to be presented to each individual for comparison at the same time (e.g., Conover, 1999; Gisbrecht and Gumbertz, 2004). Otherwise, this may be more than respondents can effectively handle. Feeling overwhelmed, they may not assign a rank to each item.
The alternative typically used in lieu of complete rankings is paired comparisons. Here individuals indicate their preference for one item in successive pairs of items. While paired comparisons substantially reduce the number of items respondents must evaluate at one time, they dramatically increase the number of judgments respondents must make. Continuing the previous example of 15 items, paired comparisons would necessitate 105 individual sets of paired items to be evaluated (i.e., 15 × 14 / 2). Such a laborious undertaking would likely tire and/or frustrate respondents. In either case, the accuracy and reliability of respondents' answers may be questionable (e.g., Gibbons, 1971; Green et al., 1988; Green et al., 1989; Stinson, 2003). The ability of respondents, therefore, "to rank objects effectively and reliably may be a function of the number of comparative judgments to be made. For example, after 10 different brands of bourbon have been tasted, the discriminatory powers of the observers may legitimately be questioned" (Gibbons, 1971, p. 257).
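The count of 105 pairs is simply the number of ways to choose two items out of 15. As a minimal sketch (the function name `paired_comparisons` is an illustrative assumption, not something taken from the studies cited), the growth in respondent burden can be checked as follows:

```python
from math import comb

def paired_comparisons(n_items: int) -> int:
    """Number of distinct pairs a respondent must judge under paired comparisons."""
    return comb(n_items, 2)

# Burden for a few item counts, including the 15-item example in the text.
for n_items in (6, 11, 15):
    print(n_items, paired_comparisons(n_items))
# 15 items -> 105 pairs
```

The number of pairs grows roughly with the square of the number of items, which is why paired comparisons quickly become impractical as the item list lengthens.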

General nature of balanced incomplete block designs
Several business and marketing researchers have found a more effective means for collecting judgmental data than complete rankings and paired comparisons. Labelled incomplete block design, this preference data collection method is especially suited to cases where the number of items is relatively large and the number of respondents is limited. In this approach, items are first divided into blocks according to a specific design. Then, the items in each block are presented to respondents for evaluation (e.g., Cox, 1958; Wu and Hamada, 2000; Aloke, 2010).
If the design is balanced such that each block contains a specified number of experimental units, each item appears in a certain number of blocks, and every item appears with every other item an equal number of times, then the design is called a balanced incomplete block design (BIBD) (e.g., Federer, 1955; Conover, 1971; Cochran and Cox, 1992; Conover, 1999; Colbourn and Dinitz, 2007). Every BIBD must satisfy these two defining relations:

$$rt = bk \qquad \text{and} \qquad \lambda(t - 1) = r(k - 1)$$

where t = number of treatments (items) to be examined, b = total number of blocks (respondents), k = number of experimental units per block (k < t), r = number of times each treatment appears (r < b), and λ = number of blocks in which the i th treatment and the j th treatment appear together (λ is the same for all pairs of treatments).
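The two defining relations, and the balance property itself, are easy to check mechanically. The sketch below is illustrative only: the function names are assumptions, items are numbered 0 to t - 1 for convenience, and the 11-block layout is built by cyclically shifting a base block modulo 11 rather than taken from any table in this paper.

```python
from collections import Counter
from itertools import combinations

def satisfies_bibd_relations(t, b, k, r, lam):
    """Necessary conditions: r*t == b*k and lam*(t-1) == r*(k-1), with k < t."""
    return k < t and r * t == b * k and lam * (t - 1) == r * (k - 1)

def is_bibd(blocks, t, k, r, lam):
    """Check a concrete layout by counting item and pair occurrences."""
    item_counts = Counter(i for blk in blocks for i in blk)
    pair_counts = Counter(p for blk in blocks for p in combinations(sorted(blk), 2))
    return (
        all(len(set(blk)) == k for blk in blocks)                  # k distinct items per block
        and all(item_counts[i] == r for i in range(t))             # each item appears r times
        and all(pair_counts[p] == lam for p in combinations(range(t), 2))  # each pair lam times
    )

# Assumed example layout: shift the base block {0, 2, 6, 7, 8, 10} modulo 11.
BASE = (0, 2, 6, 7, 8, 10)
blocks = [tuple(sorted((x + s) % 11 for x in BASE)) for s in range(11)]

print(satisfies_bibd_relations(t=11, b=11, k=6, r=6, lam=3))   # True
print(satisfies_bibd_relations(t=21, b=30, k=7, r=10, lam=3))  # True
print(is_bibd(blocks, t=11, k=6, r=6, lam=3))                  # True
```

The first two parameter sets correspond to the designs used in the applications described later (11 items ranked six at a time by 11 respondents, and 21 items ranked seven at a time by 30 respondents). Satisfying the two relations is necessary but not sufficient for a design to exist, which is why the second function counts occurrences in an actual layout.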
BIBDs are suitable in cases involving "subjective ranking by a small panel of judges for the detection of differences" (Bradley and Terry, 1952, p. 335); that is, in situations where "individuals are asked to make a comparative rating of different objects that are presented to them" (Cochran and Cox, 1957, p. 440). Through balancing and replication of items and respondents, BIBDs reduce standard deviation. This, in turn, increases the precision of the study, thereby increasing the accuracy and reliability of respondents' answers (e.g., Green et al., 1988; Green et al., 1989; Colbourn and Dinitz, 2007).

Statistical analysis techniques for BIBD data
Several approaches have been suggested for analyzing data in BIBDs. The traditional technique for doing so has been analysis of variance. However, the specific form of analysis of variance for BIBDs "differs according to the nature of the design, the number of replications, and the restrictions" (Banks, 1974, p. 493). An analytical procedure that is computationally simpler than traditional analysis of variance techniques incorporates the Durbin test (e.g., Durbin, 1951; Conover, 1971; Hollander and Wolfe, 1999), coefficient of concordance (Kendall, 1955; Gibbons, 1971), and Guttman scale (Guttman, 1946).
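
Durbin test.
The Durbin test evaluates the null hypothesis of no difference among treatments. The test statistic is (e.g., Durbin, 1951; Conover, 1971):

$$T = \frac{12(t - 1)}{rt(k - 1)(k + 1)} \sum_{j=1}^{t} \left[ R_j - \frac{r(k + 1)}{2} \right]^2$$

where $R_j$ = sum of the ranks assigned to the r observed values under the j th treatment. In terms of a decision rule, the null hypothesis should be rejected at the α level of significance if the Durbin test statistic T exceeds the (1 - α)th quantile of a chi-square random variable with t - 1 degrees of freedom (e.g., Conover, 1971; Hollander and Wolfe, 1999).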
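To make the computation concrete, the following sketch evaluates the Durbin statistic in its standard form for a small design. Everything about the data is hypothetical: the layout is a seven-item, seven-respondent design (k = 3, r = 3, λ = 1) built by shifting the base block {0, 1, 3} modulo 7, and every respondent is assumed to prefer lower-numbered items. The function name is likewise an assumption.

```python
def durbin_statistic(rankings, t, k, r):
    """Durbin test statistic T for rank data collected with a BIBD.

    rankings: one dict per block (respondent), mapping item index -> rank (1..k).
    Returns (T, rank_sums), where rank_sums[j] is R_j for item j.
    """
    rank_sums = [0] * t
    for block in rankings:
        for item, rank in block.items():
            rank_sums[item] += rank
    centre = r * (k + 1) / 2  # expected rank sum when treatments do not differ
    scale = 12 * (t - 1) / (r * t * (k - 1) * (k + 1))
    T = scale * sum((R - centre) ** 2 for R in rank_sums)
    return T, rank_sums

# Hypothetical data: cyclic (t=7, b=7, k=3, r=3, lambda=1) layout;
# within each block, lower-numbered items receive better (smaller) ranks.
blocks = [sorted((x + s) % 7 for x in (0, 1, 3)) for s in range(7)]
rankings = [{item: rank for rank, item in enumerate(block, start=1)} for block in blocks]

T, rank_sums = durbin_statistic(rankings, t=7, k=3, r=3)
print(rank_sums)    # [3, 4, 5, 6, 7, 8, 9]
print(round(T, 6))  # 12.0
```

With t - 1 = 6 degrees of freedom, the .95 quantile of the chi-square distribution is approximately 12.59, so T = 12.0 in this toy example falls just short of significance at the .05 level.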

Coefficient of concordance.
The coefficient of concordance provides a measure of the degree of agreement among respondents regarding their rankings of items. The coefficient of concordance is defined in convenient computing form as (e.g., Kendall, 1955; Gibbons, 1971):

$$W = \frac{12 \sum_{j=1}^{t} R_j^2 - 3r^2 t(k + 1)^2}{\lambda^2 t(t^2 - 1)}$$

The value of W is statistically significant at the same level as the Durbin test statistic, because the coefficient of concordance is simply a linear transformation of the Durbin test statistic.

Guttman scale.
If the null hypothesis of no difference among treatments is rejected, a Guttman scale can be developed.⁴ A Guttman scale is appropriate whenever an ordered set of statements exists, and agreement with one statement implies agreement with all statements that are less positive. This notion of agreement is best handled quantitatively by using scalogram analysis developed by Guttman (1946). Because the researcher is also interested in determining the intensity with which respondents rank-ordered objects, a Guttman scale can be derived directly from the rankings. The Guttman scale, therefore, is applicable to preference testing where the primary interest is in the objects under comparison (e.g., David, 1988; Cochran and Cox, 1992). After plotting the results of these calculations on a linear scale, the researcher can visually determine those items respondents considered most important, as well as ascertain the intensity with which they ranked these items.
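The statement that W is a linear transformation of the Durbin statistic can be verified directly. The short derivation below is a sketch that assumes the standard forms of both statistics for BIBD rank data, written in terms of the common sum of squared deviations S, and uses the defining relation λ(t - 1) = r(k - 1) twice.

```latex
\begin{align*}
S &= \sum_{j=1}^{t}\left[R_j - \frac{r(k+1)}{2}\right]^2, \qquad
T = \frac{12(t-1)}{rt(k-1)(k+1)}\,S, \qquad
W = \frac{12\,S}{\lambda^{2}\,t\,(t^{2}-1)} \\[4pt]
\frac{T}{W} &= \frac{\lambda^{2}(t-1)^{2}(t+1)}{r(k-1)(k+1)}
  = \frac{r(k-1)(t+1)}{k+1}
  = \frac{\lambda\,(t^{2}-1)}{k+1} \\[4pt]
T &= \frac{\lambda\,(t^{2}-1)}{k+1}\,W
\end{align*}
```

For a design with t = 11, k = 6, and λ = 3, for instance, the multiplier is 3(120)/7, or roughly 51.4, so testing W for significance is equivalent to testing T.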

Selected business-related applications
In the remainder of this paper, three business-related applications are presented to illustrate the benefits of BIBDs.⁵ These applications range from dimensions for selecting real estate brokers to procurement strategies for each stage of a product's sales cycle to location sites for a distribution center.

Attributes used in the selection of real estate brokers.
Green (1975) asked corporate real estate managers from several different large-sized companies in a major American Southern metropolitan area to identify and rank in order of importance attributes they used when selecting independent brokers. Because of qualitative considerations (e.g., real estate managers would have trouble ranking more than seven selection characteristics, and sample constraints associated with the pilot study), Green felt the most appropriate BIBD was the following: 11 selection dimensions (treatments) would be evaluated; 11 corporate real estate managers (blocks) would be sampled; each selection factor (treatment) would be repeated six times; each manager would rank-order six selection characteristics; and each selection factor would be compared with every other attribute by three managers. In ranking their specific set of six broker selection characteristics, corporate real estate managers were asked to assign the rank of "1" to the characteristic they felt was most important in selecting a broker, the rank of "2" to the second most important attribute, and so on until the rank of "6", which represented the least important dimension.

⁴ An alternative method to computing a Guttman scale is to rank item columnar sums in order of increasing magnitude. However, relative to the Guttman scale approach, this procedure is ad hoc in nature. Further, while the ranking of treatment columnar sums has strong theoretical support in the literature on complete rankings (e.g., Kendall, 1955; Siegel, 1956), it is not clear how well this technique applies to blocks of incomplete rankings. Therefore, the Guttman scale reflects more accurately the intensity of respondents' incomplete rankings of items. This will simplify the researcher's task of differentiating which items respondents judged of greatest importance.

⁵ BIBDs can be applied to any situation where systematic comparisons are possible. To date, a plethora of applications of BIBDs have been made in the fields of agriculture, biology, engineering, medicine, physical and chemical sciences, communications systems, cryptology, business, education, healthcare, mathematics, and pharmaceuticals, to name a few (e.g., Yang, 1985; Stinson, 2003; Van der Linden et al., 2004; Bose and Mukerjee, 2006; Dey, 2010; DeMuth, 2014). Some of the more recent and specific applications of BIBDs include software testing, biological assay, medical clinical trials, sensory analysis, networking, quality control, image analysis, coding theory, bioenergy, algorithms and analysis, mathematical biology, signal processing, genetics, and industrial experimentation (e.g., Yang, 1985, ...).
The results of the pilot survey among corporate real estate managers concerning broker selection criteria are summarized in Table 1. The first row of this table means Corporate Manager # 1 ranked his/her set of six real estate broker selection attributes from most to least important as follows: Dimension #s 2, 1, 5, 3, 6, and 4. And so on, until Corporate Manager # 11, whose ranking of his/her set of six broker selection characteristics was: Dimension #s 7, 1, 10, 4, 11, and 2.
After adding the ranks for each broker selection factor (or column) in Table 1, the Durbin test statistic was computed. Next, the coefficient of concordance was calculated; with λ = 3 and t = 11, its denominator is $\lambda^2 t(t^2 - 1) = (3)^2(11)[(11)^2 - 1]$. At the .05 level, both were found to be significant. This meant there was a preferred order of broker selection characteristics among corporate real estate managers. In addition, these managers' rankings exhibited some degree of consistency. Since the rank of "1" signified the selection dimension each manager felt was most important while the rank of "6" indicated the least important, the attribute with the lowest sum of ranks would represent the characteristic corporate real estate managers felt was most important in selecting a broker. As shown in Table 1, this dimension was the "broker's real estate and business experience" (Dimension # 7).

Procurement strategies by product life cycle stage.
Berenson (1967) [...]

Having defined the number of strategies and sample size, Raghavarao's BIBD layout table was consulted again. Of the four qualifying designs, the one with the highest efficiency value (i.e., tλ/rk) was selected as the most appropriate design. In this particular BIBD, each purchasing executive would rank seven procurement strategies, and each strategy would be repeated 10 times. Each strategy would be compared with every other strategy by three executives. With the BIBD parameters defined, the specific design for each PLC stage was formulated (Table 3). The first row of this table lists the seven procurement strategies that Purchasing Executive # 1 would rank from most to least important. And so on, until Purchasing Executive # 30 ranked Procurement Strategy #s 8, 9, 10, 11, 12, 13, and 14 from most to least important. Within this BIBD, three elements were randomized: order of strategies presented to each purchasing executive; assignment of executives to blocks; and assignment of identification numbers to strategies. The survey was conducted through a combination of personal interview and paper-pencil methodology.
In the interests of brevity, only the results from the Introduction stage of the PLC will be presented with regards to purchasing executives' rankings of procurement strategies (Table 4). The first row of this table shows how Purchasing Executive # 1 ranked his/her set of seven procurement strategies from most to least important. Following the summation of the ranks for each procurement strategy in the Introduction phase of the PLC, the Durbin test statistic was computed. Next, the coefficient of concordance was calculated. Both were found to be significant at the .05 level for the Introduction phase of the PLC (Table 5). At least one procurement strategy within the Introduction stage yielded a larger value than at least one other strategy. That is, executives viewed some procurement strategies as being more important than other strategies in the Introduction phase of the PLC. Also, purchasing executives were utilizing the same criterion in evaluating procurement strategies in the Introduction stage.
For each of the remaining PLC phases (growth, maturity, saturation, decline, and abandonment), the Durbin test statistic and coefficient of concordance were both found to be significant at the .05 level (Table 5). This meant at least one purchasing strategy in each of these stages yielded a larger value than at least one other strategy. That is, for each of these remaining PLC phases, managers perceived some procurement strategies as being more important than other strategies. In addition, these purchasing executives were using the same criterion in rating procurement strategies in each of these PLC stages.
As a means for determining the order and intensity with which executives ranked the 21 strategies in each PLC phase, a Guttman score was derived for each strategy (Table 6). If the most important procurement strategy received a rank of "1" and the least important strategy was accorded a rank of "7", then those procurement strategies possessing the most negative Guttman scores would represent the most important strategies for these purchasing executives. Next, these Guttman scores were plotted on a linear scale for each PLC stage.
Again, in the interests of brevity, only the results from the Introduction stage of the PLC will be shown. In Table 4, PS # 3 was rated more important in terms of sum of ranks by executives than PS # 2. But the opposite was true with regard to Guttman scores. Several similar instances occurred (e.g., PS #s 5 and 14). In addition, some pairs of procurement strategies in Table 4 had the same sum of ranks (e.g., PS #s 1 and 6). The strategies in each pair, however, had different Guttman scores. This phenomenon transpired because sum of ranks is merely a summation of the ranks for each procurement strategy, while Guttman scores reflect the configuration of ranks within each strategy. Thus, Guttman scaling affords a more accurate picture of the actual order and distribution of purchasing strategies than sum of ranks. Of the 21 procurement strategies evaluated by purchasing executives for the Introduction phase of the PLC, these six strategies were rated most important (in order): PS #s 2, 3, 14, 5, 10, and 4 (Table 6 and Figure 1). Four of these procurement strategies (i.e., PS #s 2, 3, 5, and 4) coincided with the six strategies recommended by Berenson (1967) for the Introduction stage of the PLC (Table 2).

Location for a new distribution center.
A large multi-national conglomerate wanted to determine where to locate a new distribution center. After gathering preliminary information on 14 potential sites, the CEO eliminated three from further consideration. For long-term strategic planning purposes, the CEO wanted his top managers to evaluate and rank the remaining 11 possible locations in order of preference. Realizing these executives may have difficulty doing so, he sought the assistance of a consultant (Rink, 2006) to make this task more manageable. This individual recommended the CEO use a BIBD.
Consulting a BIBD layout table (e.g., Raghavarao and Padgett, 2005) for 11 potential sites (treatments), four possible BIBDs emerged. Selection of the optimal BIBD depended upon both quantitative criteria (i.e., highest efficiency value, or tλ/rk) and qualitative considerations (e.g., number of objects respondents could reliably rank). The CEO felt top management could rank six locations more accurately than seven. As there were 11 executives, the CEO was constrained to a BIBD with 11 blocks. As a result, the consultant was able to select the optimal BIBD. Each site (or treatment) would be repeated six times, and each location would be compared with every other location by three top managers.
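The selection logic just described (screen on the qualitative constraints, then take the highest efficiency value) can be sketched as follows. The candidate list is an assumption made for illustration: the text does not enumerate the four designs found in the layout table, so three valid 11-item parameter sets are used instead, and the names `Design`, `efficiency`, and `choose_design` are hypothetical.

```python
from typing import NamedTuple, Sequence

class Design(NamedTuple):
    t: int    # items (treatments)
    b: int    # respondents (blocks)
    k: int    # items ranked by each respondent
    r: int    # times each item is ranked
    lam: int  # times each pair of items appears together

def efficiency(d: Design) -> float:
    """Efficiency factor t * lambda / (r * k)."""
    return d.t * d.lam / (d.r * d.k)

def choose_design(candidates: Sequence[Design], n_respondents: int, max_k: int) -> Design:
    """Apply the qualitative screens first, then pick the most efficient design."""
    feasible = [d for d in candidates if d.b == n_respondents and d.k <= max_k]
    if not feasible:
        raise ValueError("no candidate design satisfies the constraints")
    return max(feasible, key=efficiency)

# Assumed candidates for 11 items (not the four designs from the layout table).
candidates = [
    Design(t=11, b=11, k=5, r=5, lam=2),
    Design(t=11, b=11, k=6, r=6, lam=3),
    Design(t=11, b=55, k=2, r=10, lam=1),
]
best = choose_design(candidates, n_respondents=11, max_k=6)
print(best, round(efficiency(best), 3))
```

Under these assumptions the design with k = 6, r = 6, and λ = 3 is selected, which matches the parameters reported for the distribution center study; its efficiency factor is 33/36.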
With the establishment of these parameters, the consultant was able to develop the BIBD layout (Table 7). Specifically, the first row of this table means Manager # 1 would rank Location Site #s 1, 2, 3, 4, 5, and 6 in order of preference from the most to the least preferred. And so on, until Manager # 11 ranked Location Site #s 1, 2, 4, 7, 10, and 11. Within this BIBD, the consultant randomized three elements: order of distribution center location numbers presented to each executive to be ranked; assignment of top managers to blocks; and assignment of identification numbers to location sites.
After conducting the survey, the consultant obtained the results summarized in Table 8. The first row of this table means Manager # 1 ranked his/her set of six distribution center locations from most preferred to least preferred as follows: Location Site #s 1, 3, 5, 2, 4, and 6. And so on, until Manager # 11, whose ranking of his/her set of six sites was: Location Site #s 7, 1, 10, 4, 2, and 11. After summing the ranks for each distribution center, the Durbin test statistic and the coefficient of concordance were computed. At the .05 level, both results were found to be significant. This meant at least one location site tended to yield a larger observed value than at least one other site. Hence, there appeared to exist at least a partial ordering of location sites among top management. Also, the rankings of distribution center locations by executives exhibited some degree of consistency. In other words, these managers were using the same criterion in evaluating location sites.
As a means for determining the order and magnitude with which top management ranked the 11 distribution center locations, the consultant derived a Guttman score for each location site, which is shown in the last row of Table 8. Since the most preferred site received a rank of "1" while the least preferred location site was awarded a rank of "6" by each executive, those sites possessing the most negative Guttman scores would represent the most preferred location sites across all managers. Next, these Guttman scores were plotted on a linear scale (Figure 2). On the basis of sum of ranks, Site # 11 was rated more preferred than Site # 8; but, in terms of Guttman scores, the opposite was true (Table 8). In addition, two distribution center location sites had the same sum of ranks (i.e., Site #s 1 and 9); however, they had different Guttman scores (Table 8). This phenomenon transpired because Guttman scores reflect the configuration of ranks within each location site while the sum of ranks does not. Hence, Guttman scaling provides a more detailed and accurate picture of the actual order and spread among distribution center location sites than sum of ranks. In the present case, executives overwhelmingly rated Site # 10 as the most preferred location site for the new distribution center. Although Site # 7 was a distant second, the consultant recommended the company retain it in case negotiations for Site # 10 deteriorated.

Caveats concerning BIBDs
Before using BIBDs, several caveats warrant attention. Items being considered for ranking must be amenable to rank-ordering according to some criterion of interest. Usually, this is not a major problem. A prerequisite for applying BIBDs is the existence of a high degree of commonality among respondents. Otherwise, there is likely to be much variation across individuals' rankings. This potential problem was averted in each of the previously described applications. In the first example, respondents were corporate real estate managers from different large-sized companies in a major American Southern metropolitan area. In the second case, respondents were purchasing executives from various major manufacturers in the Southwestern part of the U.S. But, in the last application, all of the respondents were top-level executives from the same large multi-national conglomerate.
When consulting a BIBD layout table (e.g., Raghavarao and Padgett, 2005), several possible BIBDs may emerge. In order to objectively determine the optimal BIBD, first, compute each design's efficiency factor (i.e., tλ/rk), and second, select the BIBD with the largest value. However, in some instances, the researcher has to consider qualitative factors (e.g., respondents may have difficulty ranking nine product attributes) in selecting the "best" BIBD. This is exactly what occurred in the second application before the number of purchasing strategies to be ranked was reduced from 34 to 21. Other things being equal, λ should be greater than or equal to two, because each treatment is then compared at least twice with every other treatment rather than once if λ = 1. Also, the efficiency factor will be higher in the former case than in the latter. Both of these recommendations were followed by Rink (1987).
Once the number of respondents has been determined in the optimal BIBD, this exact number (no more, no less) must be obtained. This may necessitate some form of personal interview procedure, which can be time-consuming and expensive, as both Green (1975) and Rink (1976) discovered. Following the selection of the "optimal" BIBD layout and prior to data collection, these three things should be randomized: (1) order of items presented to each respondent; (2) assignment of respondents to each set of items to be ranked; and (3) assignment of identification numbers to items.
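The three randomization steps can be carried out in a few lines. The sketch below is one possible implementation under stated assumptions: the layout is a hypothetical 11-block design built cyclically (not Table 7), the site and manager labels are invented, and the function name `randomize_plan` is illustrative.

```python
import random
from collections import Counter

def randomize_plan(blocks, respondents, items, seed=None):
    """Return {respondent: items in presentation order} for a BIBD layout.

    blocks: design rows given as tuples of treatment numbers 0..t-1.
    """
    rng = random.Random(seed)
    # (3) random assignment of identification (treatment) numbers to items
    numbered = rng.sample(items, len(items))
    # (2) random assignment of respondents to blocks
    shuffled_people = rng.sample(respondents, len(respondents))
    plan = {}
    for person, block in zip(shuffled_people, blocks):
        shown = [numbered[i] for i in block]
        rng.shuffle(shown)  # (1) random presentation order within the block
        plan[person] = shown
    return plan

# Hypothetical layout and labels: 11 items, 11 respondents, six items each.
blocks = [tuple((x + s) % 11 for x in (0, 2, 6, 7, 8, 10)) for s in range(11)]
sites = [f"Site {chr(ord('A') + i)}" for i in range(11)]
managers = [f"Manager {i}" for i in range(1, 12)]

plan = randomize_plan(blocks, managers, sites, seed=2024)
appearances = Counter(site for shown in plan.values() for site in shown)
print(set(appearances.values()))  # {6}: every site is still ranked six times
```

Because the randomization only relabels items, reassigns respondents, and reorders items within blocks, the balance of the design (each item ranked r times, each pair appearing together λ times) is preserved.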
After the requisite rank-order data have been collected, relying solely upon sum of ranks to identify the most preferred item may lead to an erroneous conclusion, especially if two items have the same columnar sum of ranks. Because Guttman scaling incorporates the configuration of ranks within each item instead of simply summation of ranks, Guttman scores more accurately reflect the intensity with which respondents ranked items, thereby making it easier to correctly ascertain the most preferred item (Guttman, 1946). For example, in the second application, four different sets of two purchasing strategies (e.g., PS #s 11 and 19) had the same columnar sum of ranks; however, they had different Guttman scores. Finally, while a significant coefficient of concordance means individuals applied the same criterion in ranking items, it "does not mean that the orderings observed are correct. In fact, they may all be incorrect with respect to some external criterion" (Siegel, 1956, p. 238). Replicating a study using "similar" respondents is one way to generate the required external criterion for subsequent investigations.

Summary and conclusions
Whenever respondents are asked to rank a large number of items and/or the reliability of their rankings may be questionable, balanced incomplete block designs (BIBDs) should be considered. They are relatively easy to construct and analyze. BIBDs substantially reduce the number of items each individual must subjectively evaluate. Through balancing and replication of items and respondents, a small group of individuals is able to rank many items. Because balancing and replication reduce standard deviation, BIBDs also increase the precision of a study, even with a small sample. If the population is homogeneous, then a small sample will likely result in more valid inferences than one selected from a large, heterogeneous population.
Since data collected by BIBDs are ordinal scale in nature, nonparametric statistical techniques may be used. Nonparametric statistics are also appropriate when sample sizes are small, which oftentimes is the case with business-related situations. Because nonparametric statistics are "distribution-free", they are easier to learn and use than their parametric counterparts. By incorporating Guttman scaling in BIBDs (and plotting Guttman scores on a linear scale), the intensity of individuals' rankings can be accurately and readily determined, which will simplify the identification of items rated most important. Moreover, with the rank-order procedure, a scale of measurement does not need to be invented. Finally, from an administration standpoint, since each respondent is ranking a subset of the total number of items, BIBDs will save respondents' time, which can result in higher response rates, reduce study costs, and increase the accuracy and reliability of the data.

Notes to Table 1: a Dimensions for selecting real estate brokers are coded as follows (Green, 1975):
1 = Broker's professional affiliation and/or achievement;
2 = Broker's favorable position in the industry;
3 = Reference by a third party and/or previous contact with the broker;
4 = Broker's flexibility to customer's needs;
5 = Broker's geographical location and/or convenience factors;
6 = Broker's marketing innovativeness;
7 = Broker's real estate and business experience;
8 = Commission to/broker quality ratio;
9 = Broker's general reputation;
10 = Broker's knowledge as a source of information; and
11 = Auxiliary services offered by the broker.

Notes to Table 6: a The entry in each cell represents a procurement strategy number corresponding to Table 2 while the decimal figure in parentheses symbolizes the Guttman score for that strategy. b Purchasing executives' evaluation of this procurement strategy coincided with Berenson's model (Table 2).

Appendices


Fig. 1. Linear scale depicting Guttman scores for procurement strategies for the introduction stage


Table 1. Ranking of dimensions for selecting real estate brokers

Table 2. Berenson's product life cycle-procurement strategy model
31 Be prepared to dispose of surplus materials that will no longer be needed
32 Be prepared to assume purchasing responsibilities for new or replacement items in the firm's line
33 Be ready to recommend alternatives that will avoid the necessity for dropping the product (e.g., the firm stops manufacturing certain materials or products, and instead acts only to resell such items produced by other organizations)

Table 3. General BIBD for procurement strategies in each product life cycle stage

Table 4. Results from BIBD for procurement strategies for the introduction stage of the product life cycle

Table 5. Summary of statistical results from BIBD for procurement strategies for each PLC stage

Table 6. Ranking of procurement strategies by Guttman scores a by product life cycle stage

Table 7. General BIBD for ranking of distribution center locations

Table 8. Results from BIBD for distribution center location preference study