SAS System

Important Note:
• Please submit your SAS code in one file. Use your name as the filename of the SAS file. Use /* */
comments to separate each question. See format below.
• Please work alone on the final assignment. Do not discuss the exam contents with your classmates
or anyone else. Evidence revealing identical solutions will be considered cheating and students will
receive an F for the term grade.

• All the exam contents are related to the lecture notes. There is more than one way to solve
each problem. However, you must use what you’ve learned from this class to solve the
problems; otherwise, you will receive 0 credit.
• I will not provide any hints for this final exam. If you are unclear about the exam problems, please
email me directly.
/*
Name: Your name
*/
/*
Question 1
*/

Your SAS code

/*
Question 2
*/

Your SAS code




/*
Question 5
*/

Your SAS code

Final Assignment
Problem 1 (7 points)
You will use two data sets: geocode.sas7bdat and households.sas7bdat. These data sets were originally
downloaded from the US Census Bureau. The description of these two data sets is listed below:
geocode.sas7bdat:
VARIABLE    TYPE   DESCRIPTION                          EXAMPLE
GEOID       CHAR   9-digit Geography Code               04000US34
STATE       CHAR   State Name                           New Jersey
households.sas7bdat:
VARIABLE    TYPE   DESCRIPTION                          EXAMPLE
GEOID       NUM    1- or 2-digit Geography Code         34
TOTHOUSE    NUM    Total Households                     3064645
UNMARRIED   NUM    Total Unmarried-Partner Households   151318
The data set geocode.sas7bdat contains 51 observations and the data set households.sas7bdat contains 52 observations. For this problem, create a single data set that contains the variables STATE,
GEOID (in 1- or 2-digit form), TOTHOUSE, and UNMARRIED, and that contains only the observations that occur
in both data sets. Notice that the last two characters of the 9-digit geography code match the
1- or 2-digit geography code. When you combine the two data sets, be careful about the variable types: GEOID is
character in one data set and numeric in the other. The first five observations of your final data set should look similar to the ones below:
The SAS System
Obs   state        id   tothouse   unmarried
  1   Alabama       1    1737080       58537
  2   Alaska        2     221600       16568
  3   Arizona       4    1901327      118196
  4   Arkansas      5    1042696       40543
  5   California    6   11502870      683516
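The character-to-numeric conversion and the "both data sets" condition described above can be sketched as follows. This is an illustration under stated assumptions, not a model answer: the two small DATALINES steps are stand-ins for the real files, the Alabama code 04000US01 is inferred from the pattern in the problem text, the record with code 99 is a hypothetical unmatched row, and the names geo_num, house_sorted, and combined are arbitrary.

```sas
/* Stand-in data; the real files have 51 and 52 observations. */
data geocode;
    input geoid :$9. state & $20.;
    datalines;
04000US01  Alabama
04000US34  New Jersey
;
run;

data households;
    input geoid tothouse unmarried;
    datalines;
1 1737080 58537
34 3064645 151318
99 . .
;
run;

/* Character GEOID -> numeric ID taken from characters 8-9. */
data geo_num;
    set geocode;
    id = input(substr(geoid, 8, 2), 2.);
    drop geoid;
run;

proc sort data=geo_num; by id; run;
proc sort data=households out=house_sorted; by geoid; run;

/* Keep an observation only when both inputs contribute to it. */
data combined;
    merge geo_num(in=in_geo)
          house_sorted(rename=(geoid=id) in=in_house);
    by id;
    if in_geo and in_house;
run;

proc print data=combined; run;
```

The IN= flags make the match condition explicit; a PROC SQL inner join on the converted key would be another reasonable route.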
Problem 2 (8 points)
You will use the base.sas7bdat for this problem. Here are the complete observations of the data set:
Obs   ID   SBP   visit_time   trtmt_time
  1    1   140   02/13/2013   05/15/2013
  2    1   130   09/30/2013   05/15/2013
  3    1   132   07/13/2013   05/15/2013
  4    1   138   05/15/2013   05/15/2013
  5    2   122   04/05/2013   06/05/2013
  6    2   128   06/05/2013   06/05/2013
  7    2   130   07/09/2013   06/05/2013
  8    2   125   04/30/2013   06/05/2013
In the BASE data set, the variable VISIT_TIME is the visit date. Keep in mind that the
visits are not in chronological order in the data set. TRTMT_TIME is the treatment date, which is also the baseline
measurement date. SBP is the systolic blood pressure measured at each visit. Based
on this data set, create the following two variables:
• B_SBP: the SBP value measured at the treatment time. For visits before the
treatment time, B_SBP is set to missing.
• C_SBP: the difference between the current SBP measurement and the baseline SBP measurement.
For visits before or on the treatment date, C_SBP is set to
missing.
The final data set should look similar to the one below:
Obs   ID   SBP   visit_time   trtmt_time   b_sbp   c_sbp
  1    1   140   02/13/2013   05/15/2013       .       .
  2    1   138   05/15/2013   05/15/2013     138       .
  3    1   132   07/13/2013   05/15/2013     138      -6
  4    1   130   09/30/2013   05/15/2013     138      -8
  5    2   122   04/05/2013   06/05/2013       .       .
  6    2   125   04/30/2013   06/05/2013       .       .
  7    2   128   06/05/2013   06/05/2013     128       .
  8    2   130   07/09/2013   06/05/2013     128       2
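One way to carry a baseline value forward within each subject is a sorted DATA step with RETAIN, sketched below as an illustration only. The DATALINES step re-enters the eight observations listed above so the example runs on its own; the data set names base_sorted and result are arbitrary, and the sketch assumes each ID has exactly one visit on its treatment date.

```sas
data base;
    input id sbp visit_time :mmddyy10. trtmt_time :mmddyy10.;
    format visit_time trtmt_time mmddyy10.;
    datalines;
1 140 02/13/2013 05/15/2013
1 130 09/30/2013 05/15/2013
1 132 07/13/2013 05/15/2013
1 138 05/15/2013 05/15/2013
2 122 04/05/2013 06/05/2013
2 128 06/05/2013 06/05/2013
2 130 07/09/2013 06/05/2013
2 125 04/30/2013 06/05/2013
;
run;

/* Put each subject's visits in chronological order. */
proc sort data=base out=base_sorted;
    by id visit_time;
run;

data result;
    set base_sorted;
    by id;
    retain b_sbp;                              /* carried across rows   */
    if first.id then b_sbp = .;                /* reset per subject     */
    if visit_time = trtmt_time then b_sbp = sbp;
    /* c_sbp is not retained, so it starts as missing on every row. */
    if visit_time > trtmt_time then c_sbp = sbp - b_sbp;
run;

proc print data=result; run;
```

Because B_SBP is reset at FIRST.ID and filled only on the treatment date, visits before that date keep a missing baseline.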
Problem 3 (6 points)
Write a macro named impute_num that replaces the missing values of a numeric variable with
either the mean or the median of that variable. The macro takes four arguments:
dat : the name of the data set.
var_name : the name of the numeric variable that you want to impute.
method : either mean or median. If you specify mean, the macro uses the
mean to replace the missing values; if you specify median, the macro uses the
median. Set the default value to mean.
result : either var_only or all. With var_only, keep only
the newly imputed variable in the result data. With all, keep the newly imputed
variable in addition to all the variables from the input data. Set the default value
to var_only.
Also, add new as a suffix to the name of the newly imputed variable. For example, if you are imputing
the variable HDL, the newly imputed variable is named HDLnew.
The following example imputes the HDL variable by replacing its missing values with the mean of HDL. Only
the newly imputed variable HDLnew is kept in the output data.
%impute_num(dat=patients, var_name=HDL)
The SAS System
Obs    HDLnew
  1   32.0000
  2   60.0000
  3   55.6667
  4   65.0000
  5   55.6667
  6   32.0000
  7   55.6667
  8   70.0000
  9   55.6667
 10   75.0000
The following example imputes the TGL variable by replacing its missing values with the median of TGL. The
output data contains all the variables from the original data plus the newly imputed variable TGLnew.
%impute_num(dat=patients, var_name=TGL, method=median, result=all)
The SAS System
Obs   ID   GLUC   TGL   HDL   LDL   HRT   MAMM   SMOKE   TGLnew
  1   A      88     .    32    99   Y            ever       180
  2   B       .   150    60     .         no     never      150
  3   C     110     .     .   120   N                       180
  4   D       .   200    65   165         yes    never      200
  5   E      90   210     .   150   Y            never      210
  6   F      88     .    32   210         yes    ever       180
  7   G     120   164     .     .   Y     yes               164
  8   H     110   170    70   188                ever       170
  9   I       .   190     .   190   N     no                190
 10   J      90     .    75     .         yes    never      180
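A minimal sketch of a macro with this interface is shown below, as one possible approach rather than the expected answer. Several details are assumptions: the PATIENTS step rebuilds the data from the printed output (blank fields are missing character values), the result is written to a data set named &dat._imputed, the helper names _stat and _value are arbitrary, and the value of method is passed directly to PROC MEANS as a statistic keyword.

```sas
/* PATIENTS rebuilt from the printed output; empty fields are missing. */
data patients;
    infile datalines dsd truncover;
    input id $ gluc tgl hdl ldl hrt $ mamm $ smoke $;
    datalines;
A,88,.,32,99,Y,,ever
B,.,150,60,.,,no,never
C,110,.,.,120,N,,
D,.,200,65,165,,yes,never
E,90,210,.,150,Y,,never
F,88,.,32,210,,yes,ever
G,120,164,.,.,Y,yes,
H,110,170,70,188,,,ever
I,.,190,.,190,N,no,
J,90,.,75,.,,yes,never
;
run;

%macro impute_num(dat=, var_name=, method=mean, result=var_only);
    /* One-row data set holding the requested statistic. */
    proc means data=&dat noprint;
        var &var_name;
        output out=_stat(keep=_value) &method=_value;
    run;

    data &dat._imputed;
        if _n_ = 1 then set _stat;   /* read once; SET variables are retained */
        set &dat;
        &var_name.new = coalesce(&var_name, _value);
        %if %upcase(&result) = VAR_ONLY %then %do;
            keep &var_name.new;
        %end;
        %else %do;
            drop _value;
        %end;
    run;

    proc print data=&dat._imputed; run;
%mend impute_num;

%impute_num(dat=patients, var_name=HDL)
%impute_num(dat=patients, var_name=TGL, method=median, result=all)
```

COALESCE returns the original value when it is present and the statistic otherwise, which avoids a separate IF/ELSE branch.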
Problem 4 (6 points)
Write a macro named impute_freq that replaces the missing values of a variable (numeric or
character) with the most frequent value of that variable. The macro takes three arguments:
dat : the name of the data set.
var_name : the name of the variable that you want to impute.
result : either var_only or all. With var_only, keep only
the newly imputed variable in the result. With all, keep the newly imputed
variable in addition to all the variables from the input data. Set the default value
to var_only.
Also, add new as a suffix to the name of the newly imputed variable. For example, if you are imputing
the variable smoke, the newly imputed variable is named smokenew.
The following example imputes the smoke variable by replacing its missing values with the most frequent
value of smoke. Only the newly imputed variable smokenew is kept in the output data.
%impute_freq(dat=patients, var_name=smoke)
The SAS System
Obs   smokenew
  1   ever
  2   never
  3   never
  4   never
  5   never
  6   ever
  7   never
  8   ever
  9   never
 10   never
The following example imputes the HRT variable by replacing its missing values with the most frequent value
of HRT. The output data contains all the variables from the original data plus the newly imputed
variable HRTnew.
%impute_freq(dat=patients, var_name=HRT, result=all)
The SAS System
Obs   ID   GLUC   TGL   HDL   LDL   HRT   MAMM   SMOKE   HRTnew
  1   A      88     .    32    99   Y            ever    Y
  2   B       .   150    60     .         no     never   Y
  3   C     110     .     .   120   N                    N
  4   D       .   200    65   165         yes    never   Y
  5   E      90   210     .   150   Y            never   Y
  6   F      88     .    32   210         yes    ever    Y
  7   G     120   164     .     .   Y     yes            Y
  8   H     110   170    70   188                ever    Y
  9   I       .   190     .   190   N     no             N
 10   J      90     .    75     .         yes    never   Y
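A mode-based imputation with this interface might be sketched as follows; it is an illustration under assumptions, not the required solution. It presumes a data set named patients already exists in the WORK library with the columns printed above, writes its result to a data set named &dat._imputed, and uses the arbitrary helper names _freq and _mode. When two values tie for the highest count, this sketch picks the one that sorts first.

```sas
%macro impute_freq(dat=, var_name=, result=var_only);
    /* Count nonmissing levels only, so a missing value can never win. */
    proc freq data=&dat(where=(not missing(&var_name))) noprint;
        tables &var_name / out=_freq;
    run;

    /* Most frequent level first. */
    proc sort data=_freq;
        by descending count;
    run;

    data &dat._imputed;
        /* Read the top level once; it keeps the type and length of the
           original variable, so this works for numeric and character. */
        if _n_ = 1 then
            set _freq(obs=1 keep=&var_name rename=(&var_name=_mode));
        set &dat;
        if missing(&var_name) then &var_name.new = _mode;
        else &var_name.new = &var_name;
        %if %upcase(&result) = VAR_ONLY %then %do;
            keep &var_name.new;
        %end;
        %else %do;
            drop _mode;
        %end;
    run;

    proc print data=&dat._imputed; run;
%mend impute_freq;

%impute_freq(dat=patients, var_name=smoke)
%impute_freq(dat=patients, var_name=HRT, result=all)
```

The IF/ELSE assignment is used instead of COALESCE or COALESCEC because the variable type is not known when the macro is written.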
Problem 5 (3 points)
Write a macro named impute that imputes the missing values of one or more numeric variables
with either the mean or the median of each variable, and/or imputes one or more variables
with their most frequent value. The macro takes four arguments:
dat : the name of the data set.
num_vars : the name(s) of one or more numeric variables. For this group of variables, replace
the missing values with either the mean or the median.
method : either mean or median. If you specify mean, the macro uses the
mean to replace the missing values; if you specify median, the macro uses the
median. Set the default value to mean.
freq_vars : the name(s) of one or more variables whose missing values are to be replaced with the
most frequent value.
Make sure the result data contains all the newly imputed variables in addition to all the variables
from the input data. Test all the macro calls below to ensure your macro works properly.
The following macro call imputes the HRT variable with the most frequent value of HRT.
%impute(dat=patients, freq_vars=HRT)
The SAS System
Obs   ID   GLUC   TGL   HDL   LDL   HRT   MAMM   SMOKE   HRTnew
  1   A      88     .    32    99   Y            ever    Y
  2   B       .   150    60     .         no     never   Y
  3   C     110     .     .   120   N                    N
  4   D       .   200    65   165         yes    never   Y
  5   E      90   210     .   150   Y            never   Y
  6   F      88     .    32   210         yes    ever    Y
  7   G     120   164     .     .   Y     yes            Y
  8   H     110   170    70   188                ever    Y
  9   I       .   190     .   190   N     no             N
 10   J      90     .    75     .         yes    never   Y
The following macro call imputes the HRT, MAMM, and SMOKE variables, each with its own most frequent
value.
%impute(dat=patients, freq_vars=HRT MAMM SMOKE)
The SAS System
Obs   ID   GLUC   TGL   HDL   LDL   HRT   MAMM   SMOKE   HRTnew   MAMMnew   SMOKEnew
  1   A      88     .    32    99   Y            ever    Y        yes       ever
  2   B       .   150    60     .         no     never   Y        no        never
  3   C     110     .     .   120   N                    N        yes       never
  4   D       .   200    65   165         yes    never   Y        yes       never
  5   E      90   210     .   150   Y            never   Y        yes       never
  6   F      88     .    32   210         yes    ever    Y        yes       ever
  7   G     120   164     .     .   Y     yes            Y        yes       never
  8   H     110   170    70   188                ever    Y        yes       ever
  9   I       .   190     .   190   N     no             N        no        never
 10   J      90     .    75     .         yes    never   Y        yes       never
The following macro call imputes GLUC with the mean value of this variable.
%impute(dat=patients, num_vars=GLUC)
The SAS System
Obs   ID   GLUC   TGL   HDL   LDL   HRT   MAMM   SMOKE   GLUCnew
  1   A      88     .    32    99   Y            ever     88.000
  2   B       .   150    60     .         no     never    99.429
  3   C     110     .     .   120   N                    110.000
  4   D       .   200    65   165         yes    never    99.429
  5   E      90   210     .   150   Y            never    90.000
  6   F      88     .    32   210         yes    ever     88.000
  7   G     120   164     .     .   Y     yes            120.000
  8   H     110   170    70   188                ever    110.000
  9   I       .   190     .   190   N     no              99.429
 10   J      90     .    75     .         yes    never    90.000
The following macro call imputes the GLUC, TGL, HDL, and LDL variables with their median values, and imputes
the HRT and SMOKE variables with their most frequent values.
%impute(dat=patients, num_vars=GLUC TGL HDL LDL, method=median, freq_vars=HRT SMOKE)
The SAS System
Obs   ID   GLUC   TGL   HDL   LDL   HRT
  1   A      88     .    32    99   Y
  2   B       .   150    60     .
  3   C     110     .     .   120   N
  4   D       .   200    65   165
  5   E      90   210     .   150   Y
  6   F      88     .    32   210
  7   G     120   164     .     .   Y
  8   H     110   170    70   188
  9   I       .   190     .   190   N
 10   J      90     .    75     .
Obs   MAMM   SMOKE   GLUCnew   TGLnew   HDLnew   LDLnew   HRTnew   SMOKEnew
  1          ever         88      180     32.0       99   Y        ever
  2   no     never        90      150     60.0      165   Y        never
  3                      110      180     62.5      120   N        never
  4   yes    never        90      200     65.0      165   Y        never
  5          never        90      210     62.5      150   Y        never
  6   yes    ever         88      180     32.0      210   Y        ever
  7   yes                120      164     62.5      165   Y        never
  8          ever        110      170     70.0      188   Y        ever
  9   no                  90      190     62.5      190   N        never
 10   yes    never        90      180     75.0      165   Y        never
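The part of this problem that differs from the single-variable macros is walking through a space-separated list of names. A hedged sketch of that loop, using %SCAN inside %DO %WHILE, is given below. It is one possible arrangement, not the intended answer: it assumes a data set named patients already exists in the WORK library, writes to a data set named &dat._imputed, and uses arbitrary helper names (_stat, _value, _freq, _mode). An empty list simply skips its loop.

```sas
%macro impute(dat=, num_vars=, method=mean, freq_vars=);
    %local i v;

    /* Start from a copy so the input data set is left untouched. */
    data &dat._imputed;
        set &dat;
    run;

    /* Numeric variables: one pass per name in &num_vars. */
    %let i = 1;
    %do %while (%length(%scan(&num_vars, &i, %str( ))) > 0);
        %let v = %scan(&num_vars, &i, %str( ));
        proc means data=&dat noprint;
            var &v;
            output out=_stat(keep=_value) &method=_value;
        run;
        data &dat._imputed(drop=_value);
            if _n_ = 1 then set _stat;
            set &dat._imputed;
            &v.new = coalesce(&v, _value);
        run;
        %let i = %eval(&i + 1);
    %end;

    /* Frequency-based variables: one pass per name in &freq_vars. */
    %let i = 1;
    %do %while (%length(%scan(&freq_vars, &i, %str( ))) > 0);
        %let v = %scan(&freq_vars, &i, %str( ));
        proc freq data=&dat(where=(not missing(&v))) noprint;
            tables &v / out=_freq;
        run;
        proc sort data=_freq;
            by descending count;
        run;
        data &dat._imputed(drop=_mode);
            if _n_ = 1 then set _freq(obs=1 keep=&v rename=(&v=_mode));
            set &dat._imputed;
            if missing(&v) then &v.new = _mode;
            else &v.new = &v;
        run;
        %let i = %eval(&i + 1);
    %end;

    proc print data=&dat._imputed; run;
%mend impute;

%impute(dat=patients, freq_vars=HRT)
%impute(dat=patients, num_vars=GLUC TGL HDL LDL, method=median, freq_vars=HRT SMOKE)
```

Testing the length of the scanned word, rather than counting words up front, keeps the loop safe when a list argument is left blank.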
