## Based upon the comparative ROC plot for two competing models, which is the champion model and why?

Refer to the exhibit: Based upon the comparative ROC plot for two competing models, which is the champion model and why?
A. Candidate 1, because the area outside the curve is greater
B. Candidate 2, because the area under the curve is greater
C. Candidate 1, because it is closer to the diagonal reference curve
D. Candidate 2, because it shows less overfitting than Candidate 1
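
One common criterion for comparing ROC curves is the area under the curve (AUC). The exhibit is not reproduced here, so the following is only a minimal sketch using hypothetical (false positive rate, true positive rate) points; the names `model_a` and `model_b` are illustrative and do not correspond to the candidates in the plot. It estimates each area with the trapezoidal rule:

```python
def auc_trapezoid(points):
    """Estimate the area under an ROC curve from (fpr, tpr) points."""
    pts = sorted(points)  # order the points by false positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two neighboring points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical ROC points, NOT taken from the exhibit.
model_a = [(0.0, 0.0), (0.2, 0.5), (0.5, 0.8), (1.0, 1.0)]
model_b = [(0.0, 0.0), (0.2, 0.3), (0.5, 0.6), (1.0, 1.0)]
diagonal = [(0.0, 0.0), (1.0, 1.0)]  # reference line for a random model

auc_a = auc_trapezoid(model_a)
auc_b = auc_trapezoid(model_b)
auc_diagonal = auc_trapezoid(diagonal)
```

With these made-up points, `model_a` has the larger area and sits farther from the diagonal, whose area is 0.5.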

## Refer to the confusion matrix: Calculate the sensitivity. (0 – negative outcome, 1 – positive outcome)

Refer to the confusion matrix: Calculate the sensitivity. (0 – negative outcome, 1 – positive outcome)
A. 25/48
B. 58/102
C. 25/89
D. 58/81
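
Sensitivity is the fraction of actual positive outcomes (coded 1) that the model classifies as positive: true positives divided by the sum of true positives and false negatives. The confusion matrix itself is not reproduced here, so this minimal sketch uses hypothetical counts that are unrelated to the answer options:

```python
def sensitivity(true_positives, false_negatives):
    """Return TP / (TP + FN), the true positive rate."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("no actual positive outcomes in the data")
    return true_positives / actual_positives

# Hypothetical counts, NOT taken from the exhibit.
example_tp = 30  # actual 1, predicted 1
example_fn = 10  # actual 1, predicted 0
example_sensitivity = sensitivity(example_tp, example_fn)  # 30 / 40 = 0.75
```

Note that the denominator is the count of actual positives, not the count of predicted positives.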

## Which SAS program computes the profit for each customer in the data set VALID?

Assume a \$10 cost for soliciting a non-responder and a \$200 profit for soliciting a responder. The logistic regression model gives a probability score named P_R on a SAS data set called VALID. The VALID data set contains the responder variable Purch, a 1/0 variable coded as 1 for responder. Customers will be solicited when their probability score is more than 0.05.

Which SAS program computes the profit for each customer in the data set VALID?
A. Option A
B. Option B
C. Option C
D. Option D
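
The program options are not reproduced here, but the decision rule in the question can be sketched in Python (not SAS). This is a minimal illustration under two assumptions that the question does not state explicitly: a customer who is not solicited contributes a profit of 0, and the \$10 solicitation cost is recorded as -10. The function name, parameter names, and sample rows are hypothetical.

```python
def customer_profit(p_r, responder, cutoff=0.05,
                    responder_profit=200, nonresponder_cost=10):
    """Profit for one customer given a probability score and a 1/0 responder flag."""
    if p_r > cutoff:  # solicit only when the score is strictly above the cutoff
        return responder_profit if responder == 1 else -nonresponder_cost
    return 0  # assumption: customers who are not solicited yield no profit

# Hypothetical (score, responder flag) rows, not taken from VALID.
rows = [(0.10, 1), (0.08, 0), (0.03, 1), (0.05, 0)]
profits = [customer_profit(p, r) for p, r in rows]
total_profit = sum(profits)
```

The last row scores exactly 0.05, which is not "more than 0.05", so that customer is not solicited.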

## As you move along the curve, what changes?

Refer to the ROC curve: As you move along the curve, what changes?
A. The priors in the population
B. The true negative rate in the population
C. The proportion of events in the training data
D. The probability cutoff for scoring
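
Each point on an ROC curve pairs a false positive rate with a true positive rate obtained by classifying the same scored data at one cutoff. A minimal sketch with hypothetical scores and outcomes (the data behind the exhibit is not available) shows how the point moves as the cutoff changes:

```python
def roc_point(scores, labels, cutoff):
    """Return (false positive rate, true positive rate) at one cutoff."""
    tp = fp = fn = tn = 0
    for score, actual in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    return fp / (fp + tn), tp / (tp + fn)

# Hypothetical scored data: three events (1) and three non-events (0).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

strict = roc_point(scores, labels, 0.95)   # nothing classified as 1
middle = roc_point(scores, labels, 0.50)
lenient = roc_point(scores, labels, 0.20)  # almost everything classified as 1
```

The data and the model stay fixed throughout; only the cutoff varies, and lowering it moves the point up and to the right.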

## What is the purpose of the training data set?

An analyst has a sufficient volume of data to perform a 3-way partition of the data into training, validation, and test sets to perform honest assessment during the model building process.

What is the purpose of the training data set?
A. To provide an unbiased measure of assessment for the final model.
B. To compare models and select and fine-tune the final model.
C. To reduce total sample size to make computations more efficient.
D. To build the predictive models.

## Which SAS programs can be used to find the p-value for comparing men’s salaries with women’s salaries?

An analyst compares the mean salaries of men and women working at a company.

The SAS data set SALARY contains variables:

- Gender (M or F)
- Pay (dollars per year)

Which SAS programs can be used to find the p-value for comparing men’s salaries with women’s salaries? (Choose two.)
A. Option A
B. Option B
C. Option C
D. Option D

## Which SAS procedure provides a viable solution?

There are missing values in the input variables for a regression application.

Which SAS procedure provides a viable solution?
A. GLM
B. VARCLUS
C. STDIZE
D. CLUSTER

## In partitioning data for model assessment, which sampling methods are acceptable? (Choose two.)

A. Simple random sampling without replacement
B. Simple random sampling with replacement
C. Stratified random sampling without replacement
D. Sequential random sampling with replacement
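
To make the idea of partitioning without replacement concrete, here is a minimal Python sketch of a stratified split into training, validation, and test sets. The function name, the 50/25/25 fractions, and the sample data are hypothetical. Each observation is shuffled within its outcome class and assigned to exactly one partition, so no row is drawn twice and the event rate is roughly preserved in every partition.

```python
import random

def stratified_partition(labels, fractions=(0.5, 0.25, 0.25), seed=0):
    """Split row indices into (train, valid, test), stratified by label."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, valid, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # random order within the stratum
        cut1 = round(len(indices) * fractions[0])
        cut2 = cut1 + round(len(indices) * fractions[1])
        train.extend(indices[:cut1])     # each index lands in one
        valid.extend(indices[cut1:cut2])  # partition only, so sampling
        test.extend(indices[cut2:])       # is without replacement
    return train, valid, test

# Hypothetical target: 20 events and 80 non-events.
labels = [1] * 20 + [0] * 80
train, valid, test = stratified_partition(labels)
```

Sampling with replacement would allow the same row to appear in more than one partition, which the slicing above rules out by construction.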

## What is a drawback to performing data cleansing (imputation, transformations, etc.) on raw data prior to partitioning the data for honest assessment as opposed to performing the data cleansing after partitioning the data?

A. It violates assumptions of the model.
B. It requires extra computational effort and time.
C. It omits the training (and test) data sets from the benefits of the cleansing methods.
D. There is no ability to compare the effectiveness of different cleansing methods.
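
For reference, cleansing after partitioning typically means deriving the cleansing parameters from the training partition and then applying them unchanged to the other partitions. A minimal sketch, assuming median imputation and hypothetical values (neither comes from the question):

```python
from statistics import median

def fit_median(values):
    """Compute the imputation value from the observed (non-missing) values."""
    return median(v for v in values if v is not None)

def impute(values, fill):
    """Replace missing values (None) with a previously computed fill value."""
    return [fill if v is None else v for v in values]

# Hypothetical input variable after partitioning; None marks a missing value.
train_values = [1.0, None, 3.0, 5.0]
valid_values = [None, 4.0]

fill = fit_median(train_values)              # learned from training data only
train_imputed = impute(train_values, fill)
valid_imputed = impute(valid_values, fill)   # same fill value reused
```

Because the fill value is computed per method on the training data, a different cleansing method can be swapped in and assessed on the validation partition.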

## Which statement is correct at an alpha level of 0.05?

Given the following GLM procedure output: Which statement is correct at an alpha level of 0.05?
A. School*Gender should be removed because it is non-significant.
B. Gender should be removed because it is non-significant.
C. School should be removed because it is significant.
D. Gender should not be removed due to its involvement in the significant interaction.