Based upon the comparative ROC plot for two competing models, which is the champion model and why?

Refer to the exhibit:

A . Candidate 1, because the area outside the curve is greater
B . Candidate 2, because the area under the curve is greater
C . Candidate 1, because it is closer to the diagonal reference curve
D . Candidate 2, because it shows less overfitting than Candidate 1

Answer: B
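
The "area under the curve" comparison can be made concrete with the trapezoidal rule. The following is a minimal Python sketch, not part of the exam material: the ROC points are made up for illustration (they are not read from the exhibit), and `auc_trapezoid` is a hypothetical helper name.

```python
def auc_trapezoid(points):
    """Estimate the area under a ROC curve.

    points: iterable of (false_positive_rate, true_positive_rate) pairs.
    """
    pts = sorted(points)  # order by false positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two neighbouring ROC points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points, NOT taken from the exhibit.
candidate_1 = [(0.0, 0.0), (0.2, 0.4), (0.5, 0.7), (1.0, 1.0)]
candidate_2 = [(0.0, 0.0), (0.2, 0.6), (0.5, 0.9), (1.0, 1.0)]
diagonal = [(0.0, 0.0), (1.0, 1.0)]  # reference line of a random classifier

auc_1 = auc_trapezoid(candidate_1)
auc_2 = auc_trapezoid(candidate_2)

# The model with the larger area under its ROC curve is preferred.
champion = "Candidate 2" if auc_2 > auc_1 else "Candidate 1"
print(auc_1, auc_2, champion)
```

The diagonal reference line has an area of 0.5, so a curve that hugs the diagonal signals a weak model rather than a strong one.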

Which SAS program computes the profit for each customer in the data set VALID?

Assume a $10 cost for soliciting a non-responder and a $200 profit for soliciting a responder. The logistic regression model gives a probability score named P_R on a SAS data set called VALID. The VALID data set contains the responder variable Pinch, a 1/0 variable coded as 1 for responder. Customers will be solicited when their probability score is more than 0.05.

A . Option A
B . Option B
C . Option C
D . Option D

Answer: A
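
The SAS option listings are not reproduced here, so the following Python sketch only illustrates the profit rule stated in the question; it is not a transcription of any of the options. The function name `customer_profit`, the sample rows, and the choice to assign a profit of 0 to customers who are not solicited are assumptions made for illustration.

```python
SOLICIT_CUTOFF = 0.05     # solicit when P_R is more than 0.05
PROFIT_RESPONDER = 200    # profit for soliciting a responder
COST_NON_RESPONDER = 10   # cost for soliciting a non-responder


def customer_profit(p_r, responder):
    """Profit for one customer under the rule described in the question."""
    if p_r > SOLICIT_CUTOFF:
        # Solicited: gain 200 from a responder, lose 10 on a non-responder.
        return PROFIT_RESPONDER if responder == 1 else -COST_NON_RESPONDER
    # Assumption: a customer who is not solicited contributes no profit or cost.
    return 0


# Hypothetical rows standing in for the VALID data set.
valid = [
    {"P_R": 0.30, "Pinch": 1},
    {"P_R": 0.08, "Pinch": 0},
    {"P_R": 0.02, "Pinch": 1},
    {"P_R": 0.05, "Pinch": 0},
]

profits = [customer_profit(row["P_R"], row["Pinch"]) for row in valid]
print(profits)
```

Note that a score of exactly 0.05 does not trigger a solicitation, because the rule requires the score to be more than 0.05.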

What is the purpose of the training data set?

An analyst has a sufficient volume of data to perform a 3-way partition of the data into training, validation, and test sets to perform honest assessment during the model building process.

A . To provide an unbiased measure of assessment for the final model.
B . To compare models and select and fine-tune the final model.
C . To reduce total sample size to make computations more efficient.
D . To build the predictive models.

Answer: D
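
The roles of the three partitions can be sketched in a few lines of Python. This is an illustration only: the function name `three_way_partition`, the 50/25/25 split, and the seed are assumptions, not values given in the question.

```python
import random


def three_way_partition(rows, train_frac=0.5, valid_frac=0.25, seed=42):
    """Randomly split rows into training, validation, and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                    # used to build (fit) the models
    valid = shuffled[n_train:n_train + n_valid]   # used to compare and fine-tune models
    test = shuffled[n_train + n_valid:]           # used once, for an unbiased final assessment
    return train, valid, test


train, valid, test = three_way_partition(range(100))
print(len(train), len(valid), len(test))
```

Every row lands in exactly one partition, which is what keeps the final assessment on the test set honest.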

Which SAS programs can be used to find the p-value for comparing men’s salaries with women’s salaries?

An analyst compares the mean salaries of men and women working at a company.

The SAS data set SALARY contains variables:

– Gender (M or F)

– Pay (dollars per year)

Which SAS programs can be used to find the p-value for comparing men’s salaries with women’s salaries? (Choose two.)

A . Option A
B . Option B
C . Option C
D . Option D

Answer: A,B
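
More than one program can produce the same p-value because a pooled two-sample t-test and a one-way ANOVA with two groups are equivalent: the ANOVA F statistic equals the square of the t statistic. The Python sketch below checks that identity on made-up salaries; the data and the function names `pooled_t` and `anova_f` are illustrative assumptions, and the sketch does not reconstruct the SAS option listings.

```python
from statistics import mean


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def anova_f(a, b):
    """One-way ANOVA F statistic for two groups."""
    grand = mean(list(a) + list(b))
    ma, mb = mean(a), mean(b)
    ss_between = len(a) * (ma - grand) ** 2 + len(b) * (mb - grand) ** 2
    ss_within = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    df_within = len(a) + len(b) - 2
    # With two groups the between-groups degrees of freedom is 1.
    return ss_between / (ss_within / df_within)


# Hypothetical salaries in dollars per year, NOT taken from the SALARY data set.
men = [52000, 61000, 58000, 49500, 63000]
women = [50000, 57000, 54500, 60000, 48000]

t = pooled_t(men, women)
f = anova_f(men, women)
print(t ** 2, f)  # the two values agree up to floating-point rounding
```

In SAS, tests of this kind are typically run with PROC TTEST or with a one-way ANOVA procedure such as PROC GLM.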

What is a drawback to performing data cleansing (imputation, transformations, etc.) on raw data prior to partitioning the data for honest assessment as opposed to performing the data cleansing after partitioning the data?

A . It violates assumptions of the model.
B . It requires extra computational effort and time.
C . It omits the training (and test) data sets from the benefits of the cleansing methods.
D . There is no ability to compare the effectiveness of different cleansing methods.

Answer: D