## Based upon the comparative ROC plot for two competing models, which is the champion model and why?

A . Candidate 1, because the area outside the curve is greater
B . Candidate 2, because the area under the curve is greater
C . Candidate 1, because it is closer to the diagonal reference curve
D . Candidate 2, because it shows less overfitting than Candidate 1
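
Comparing models by area under the ROC curve can be illustrated with a small sketch. The labels and scores below are made up, and `auc` is a hypothetical helper. It computes the area as the proportion of (event, non-event) pairs in which the event receives the higher score, which equals the area under the ROC curve.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise concordance (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

# Made-up validation labels and scores from two hypothetical models.
labels   = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.4, 0.35, 0.6, 0.3, 0.2]
scores_b = [0.9, 0.8, 0.55, 0.6, 0.3, 0.2]

auc_a = auc(labels, scores_a)
auc_b = auc(labels, scores_b)

# The model with the larger area ranks events above non-events more often.
better = "model B" if auc_b > auc_a else "model A"
```

A curve that hugs the diagonal has an area near 0.5, which is what random ranking would produce.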

## Refer to the confusion matrix: Calculate the sensitivity. (0 – negative outcome, 1 – positive outcome)

A . 25/48
B . 58/102
C . 25/89
D . 58/81
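
Sensitivity is the proportion of actual positives that the model classifies as positive: true positives divided by the sum of true positives and false negatives. The sketch below uses hypothetical counts, not the counts from the confusion matrix in the question.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical confusion-matrix counts (1 = positive outcome, 0 = negative outcome).
tp, fn = 40, 10   # actual positives: predicted 1, predicted 0
tn, fp = 30, 20   # actual negatives: predicted 0, predicted 1

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```

The denominator is the count of actual positives, not the count of predicted positives. Dividing by predicted positives would give the positive predictive value instead.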

## Which SAS program computes the profit for each customer in the data set VALID?

Assume a \$10 cost for soliciting a non-responder and a \$200 profit for soliciting a responder. The logistic regression model gives a probability score named P_R on a SAS data set called VALID. The VALID data set contains the responder variable Pinch, a 1/0 variable coded as 1 for responder. Customers will be solicited when their probability score is more than 0.05.

A . Option A
B . Option B
C . Option C
D . Option D
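
The profit rule described above can be sketched in Python rather than SAS. `customer_profit` and the sample rows are hypothetical, and the sketch assumes a customer who is not solicited contributes a profit of 0.

```python
def customer_profit(p_r, responder, cutoff=0.05,
                    profit_responder=200.0, cost_nonresponder=10.0):
    """Profit for one customer under the solicitation rule in the question."""
    if p_r > cutoff:                      # solicited only when the score exceeds the cutoff
        if responder == 1:
            return profit_responder       # solicited responder: $200 profit
        return -cost_nonresponder         # solicited non-responder: $10 cost
    return 0.0                            # assumption: no solicitation means no profit or cost

# Made-up (probability score, responder flag) rows standing in for VALID.
valid = [(0.30, 1), (0.10, 0), (0.02, 1), (0.04, 0)]

profits = [customer_profit(p, r) for p, r in valid]
total_profit = sum(profits)
```

Note that the comparison is strict (`>`), matching "more than 0.05" in the question.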

## As you move along the ROC curve, what changes?

A . The priors in the population
B . The true negative rate in the population
C . The proportion of events in the training data
D . The probability cutoff for scoring
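
Each point on an ROC curve is a (false positive rate, true positive rate) pair computed at one classification cutoff. The sketch below, with made-up labels and scores and a hypothetical helper `roc_point`, shows how the point moves as the cutoff changes while the data stay fixed.

```python
def roc_point(labels, scores, cutoff):
    """Return (false positive rate, true positive rate) at the given cutoff."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos

# Made-up labels and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.55, 0.6, 0.3, 0.2]

# Same data, three cutoffs: three different points on one curve.
points = {c: roc_point(labels, scores, c) for c in (0.1, 0.5, 0.7)}
```

Lowering the cutoff classifies more cases as events, so both rates rise toward (1, 1). Raising it moves the point toward (0, 0).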

## What is the purpose of the training data set?

An analyst has a sufficient volume of data to perform a 3-way partition of the data into training, validation, and test sets to perform honest assessment during the model building process.

A . To provide an unbiased measure of assessment for the final model.
B . To compare models and select and fine-tune the final model.
C . To reduce total sample size to make computations more efficient.
D . To build the predictive models.
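
A 3-way partition can be sketched as a shuffled split of row indices. The function name, the 50/25/25 fractions, and the seed are illustrative assumptions, not values from the question.

```python
import random

def three_way_partition(rows, train_frac=0.5, valid_frac=0.25, seed=42):
    """Randomly split rows into disjoint training, validation, and test lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)                 # random order, each row used exactly once
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                       # used to fit candidate models
    valid = shuffled[n_train:n_train + n_valid]      # used to compare and tune models
    test = shuffled[n_train + n_valid:]              # held back for a final assessment
    return train, valid, test

train, valid, test = three_way_partition(range(100))
```

Because the three lists never overlap, a score computed on `test` is not influenced by any fitting or tuning decision.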

## Which SAS programs can be used to find the p-value for comparing men’s salaries with women’s salaries?

An analyst compares the mean salaries of men and women working at a company.

The SAS data set SALARY contains variables:

- Gender (M or F)
- Pay (dollars per year)

(Choose two.)

A . Option A
B . Option B
C . Option C
D . Option D
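
For a two-level factor such as Gender, an equal-variance (pooled) two-sample t-test and a one-way ANOVA F-test are equivalent: the F statistic equals the square of the t statistic, so both give the same p-value. The stdlib-only sketch below checks that identity numerically on made-up salaries. It computes the test statistics only, not the p-value, and the helper names are hypothetical.

```python
from statistics import mean, variance

def pooled_t(x, y):
    """Two-sample t statistic assuming equal variances."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5

def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up annual pay in dollars.
men = [52000.0, 61000.0, 58000.0, 49500.0, 63000.0]
women = [50500.0, 57000.0, 60000.0, 48000.0, 55500.0]

t_stat = pooled_t(men, women)
f_stat = one_way_f([men, women])
```

The equivalence holds only for the pooled test. An unequal-variance (Satterthwaite) t-test generally gives a different p-value.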

## Which SAS procedure provides a viable solution?

There are missing values in the input variables for a regression application.

A . GLM
B . VARCLUS
C . STDIZE
D . CLUSTER
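
One common remedy for missing inputs is to replace each missing value with a location estimate such as the median of the observed values (in SAS, PROC STDIZE with the REPONLY option performs this kind of replacement). A minimal Python sketch of the idea, using a made-up input and a hypothetical helper `impute_median`:

```python
from statistics import median

def impute_median(values):
    """Replace None entries with the median of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

# Made-up input variable with two missing values.
income = [42.0, None, 55.0, 38.0, None, 61.0]
imputed = impute_median(income)
```

Without some replacement, a regression procedure typically drops every observation that has a missing input, which can discard a large share of the data.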

## In partitioning data for model assessment, which sampling methods are acceptable? (Choose two.)

A . Simple random sampling without replacement
B . Simple random sampling with replacement
C . Stratified random sampling without replacement
D . Sequential random sampling with replacement
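
Stratified sampling without replacement can be sketched as follows: rows are grouped by the target value, each group is shuffled, and the same fraction of each group goes to the training set. The function name, the 50% fraction, and the data are illustrative assumptions.

```python
import random
from collections import defaultdict

def stratified_split(rows, key, train_frac=0.5, seed=1):
    """Split rows into two disjoint sets, preserving the proportion of each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    train, holdout = [], []
    for members in strata.values():
        rng.shuffle(members)                       # random order within the stratum
        cut = round(len(members) * train_frac)
        train.extend(members[:cut])                # each row lands in exactly one set
        holdout.extend(members[cut:])
    return train, holdout

# Made-up data: (row id, target), with 20 events among 100 rows.
rows = [(i, 1 if i < 20 else 0) for i in range(100)]
train, holdout = stratified_split(rows, key=lambda r: r[1])
```

Sampling without replacement keeps the partitions disjoint. With replacement, the same row could appear in both sets, and the holdout assessment would no longer be independent of the training data.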

## What is a drawback to performing data cleansing (imputation, transformations, etc.) on raw data prior to partitioning the data for honest assessment as opposed to performing the data cleansing after partitioning the data?

A . It violates assumptions of the model.
B . It requires extra computational effort and time.
C . It omits the training (and test) data sets from the benefits of the cleansing methods.
D . There is no ability to compare the effectiveness of different cleansing methods.

## Which statement is correct at an alpha level of 0.05?

A . School*Gender should be removed because it is non-significant.
B . Gender should be removed because it is non-significant.
C . School should be removed because it is significant.
D . Gender should not be removed due to its involvement in the significant interaction.