
Suppose you want to investigate whether child’s birth weight (bwt) is related to weight of mother at last menstrual period (lwt)


Academic year: 2021


LAB 5 - SOLUTIONS

Aim of the lab: correlation and linear regression

Scatter plot

1. Suppose you want to investigate whether child’s birth weight (bwt) is related to weight of mother at last menstrual period (lwt). Draw a scatter plot of these two variables (h scatter) and describe it.

. scatter bwt lwt

[Figure: scatter plot of Birth Weight (grams), y-axis from 1000 to 5000, against Weight of Mother at Last Menstrual Period (pounds), x-axis from 50 to 250]

There may be an increasing linear trend. The observations with the highest values of mother’s weight may have substantial leverage.

2. Calculate the Pearson’s (h pwcorr) and Spearman’s (h spearman) correlation coefficients and test the hypotheses that the population counterparts are zero.

. pwcorr bwt lwt, sig

             |      bwt      lwt
-------------+------------------
         bwt |   1.0000
             |
             |
         lwt |   0.1858   1.0000
             |   0.0105
             |

. spearman bwt lwt

 Number of obs =     189
Spearman's rho =       0.2483

Test of Ho: bwt and lwt are independent
    Prob > |t| =       0.0006

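Both Pearson’s and Spearman’s correlation coefficients are small, indicating a weak positive linear relationship (Pearson’s r = 0.19) and a weak positive rank correlation (Spearman’s rho = 0.25). Both p-values (0.0105 and 0.0006) are below 0.05, so in each case we reject the hypothesis that the population coefficient is zero.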
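As a rough cross-check, the test for Pearson’s coefficient can be reproduced by hand with t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below uses Python purely for the arithmetic (the lab itself uses Stata); r = 0.1858 is copied from the pwcorr output and n = 189 from the spearman output, and the function name pearson_t is illustrative.

```python
import math

def pearson_t(r, n):
    """t statistic for H0: population correlation = 0 (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Values copied from the Stata output above (assumed to refer to the same 189 observations)
r, n = 0.1858, 189
t_stat = pearson_t(r, n)
print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
```

The result, about 2.59, is the same t statistic that Stata reports for the slope of lwt in the simple regression of question 3, as expected when there is a single predictor.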

3. Estimate a linear regression model with child’s birth weight (bwt) as dependent variable and mother’s weight (lwt) as independent variable. Interpret the coefficients’ estimates and the value of R-squared (i.e. coefficient of determination). Write any assumptions you have to make.

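. regress bwt lwt

      Source |       SS       df       MS              Number of obs =     189
-------------+------------------------------           F(  1,   187) =    6.69
       Model |   3448881.3     1   3448881.3           Prob > F      =  0.0105
    Residual |  96468171.3   187  515872.574           R-squared     =  0.0345
-------------+------------------------------           Adj R-squared =  0.0294
       Total |  99917052.6   188  531473.684           Root MSE      =  718.24

------------------------------------------------------------------------------
         bwt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lwt |   4.429264   1.713025     2.59   0.010     1.049927      7.8086
       _cons |   2369.672   228.4306    10.37   0.000      1919.04    2820.304
------------------------------------------------------------------------------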

The estimate for the constant term is 2369.7 grams. It corresponds to the estimated child’s birth weight when mother’s weight is zero, an impossible case. One should not extrapolate outside the range of observed mother’s weight values (80 to 250 pounds).
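To make the fitted line concrete, the following minimal sketch evaluates it at a few mother’s weights inside the observed range. It is written in Python only for the arithmetic; the intercept and slope are copied from the regress output, while the function name predicted_bwt and the example weights are illustrative choices, not part of the lab.

```python
def predicted_bwt(lwt_pounds, intercept=2369.672, slope=4.429264):
    """Estimated mean birth weight (grams) for a given mother's weight (pounds)."""
    return intercept + slope * lwt_pounds

# Example weights chosen inside the observed range of 80 to 250 pounds
for lwt in (100, 130, 200):
    print(f"lwt = {lwt} lb -> estimated mean bwt = {predicted_bwt(lwt):.1f} g")
```

Evaluating the function at lwt = 0 simply returns the constant term, which is why that estimate has no practical interpretation here.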

We estimate that mean child’s birth weight increases by 4.43 grams for every one-pound increase in mother’s weight. We are 95% confident that the true increase is between 1.05 and 7.81 grams. Given that the 95% confidence interval does not include zero, or equivalently that the p-value is less than 0.05, we conclude that the increase in child’s birth weight associated with mother’s weight is statistically significant.

The R-squared value (0.03) is very small: mother’s weight accounts for only about 3% of the variability in child’s birth weight, so a large proportion of that variability remains unaccounted for.
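The summary statistics in the regression table can be recovered from its sums of squares. The sketch below, in Python for the arithmetic only, recomputes R-squared and F from the values reported by regress and compares them with the squared Pearson correlation from question 2 and the squared t statistic of the slope; all numbers are copied from the Stata output, and the variable names are illustrative.

```python
# Values copied from the Stata output of "regress bwt lwt" and "pwcorr bwt lwt, sig"
ss_model, ss_total = 3448881.3, 99917052.6
ms_model, ms_resid = 3448881.3, 515872.574
coef_lwt, se_lwt = 4.429264, 1.713025
pearson_r = 0.1858

r_squared = ss_model / ss_total   # proportion of variability explained
f_stat = ms_model / ms_resid      # overall F test with (1, 187) degrees of freedom
t_slope = coef_lwt / se_lwt       # t statistic for the slope

print(f"R-squared = {r_squared:.4f}, r^2 = {pearson_r ** 2:.4f}")
print(f"F = {f_stat:.2f}, t^2 = {t_slope ** 2:.2f}")
```

With a single predictor, R-squared equals the squared Pearson correlation and F equals the squared t statistic of the slope, up to rounding of the reported values.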


4. Inspect the validity of the model in question 3 by assessing the residuals. Use the command rvpplot lwt (h rvpplot).

. rvpplot lwt

[Figure: residuals, y-axis from -2000 to 2000, plotted against Weight of Mother at Last Menstrual Period (pounds), x-axis from 50 to 250]

The residual plot shows no evident signs of lack of fit of the linear model. The residuals show no clear trend, and there is no clear indication that either the equal-variance assumption or the normality assumption is violated.

5. Suppose you want to test whether mean birth weight (bwt) varies over race categories. Verify that the linear regression model (regress bwt i.race) is equivalent to ANOVA (oneway bwt race). Interpret the regression coefficients.

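. regress bwt i.race

      Source |       SS       df       MS              Number of obs =     189
-------------+------------------------------           F(  2,   186) =    4.97
       Model |  5070607.63     2  2535303.82           Prob > F      =  0.0079
    Residual |    94846445   186  509927.124           R-squared     =  0.0507
-------------+------------------------------           Adj R-squared =  0.0405
       Total |  99917052.6   188  531473.684           Root MSE      =  714.09

------------------------------------------------------------------------------
         bwt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
          2  |  -384.0473   157.8744    -2.43   0.016    -695.5019   -72.59266
          3  |  -299.7247   113.6776    -2.64   0.009    -523.9878    -75.4615
             |
       _cons |    3103.74   72.88169    42.59   0.000     2959.959    3247.521
------------------------------------------------------------------------------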


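. oneway bwt race

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      5070607.63      2   2535303.82      4.97     0.0079
 Within groups        94846445    186   509927.124
------------------------------------------------------------------------
    Total           99917052.6    188   531473.684

Bartlett's test for equal variances:  chi2(2) =   0.6545  Prob>chi2 = 0.721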

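The sum of squares table from linear regression is identical to that from ANOVA, and so is the inference we can draw. The linear regression, however, also provides estimates of the differences in means across race groups. The constant (3104 grams) is the estimated mean birth weight in the referent group, and each race coefficient is the difference between that group’s mean and the referent mean. For example, we can conclude that mean birth weight is significantly lower in blacks (group 2 in the Stata output) than in whites (referent group) by 384 grams (95% confidence interval: 73, 696 grams). Similarly, mean birth weight in group 3 is lower than in the referent group by 300 grams (95% confidence interval: 75, 524 grams).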
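The link between the two outputs can be sketched numerically: the group means implied by the regression are the constant plus each race coefficient, and the F statistic is the ratio of the between-groups to the within-groups mean square. The Python snippet below only redoes this arithmetic with values copied from the Stata output; the dictionary names are illustrative, and group 1 is taken to be the referent group as stated in the text.

```python
# Values copied from "regress bwt i.race" and "oneway bwt race"
cons = 3103.74                          # estimated mean in the referent group (group 1)
race_coef = {2: -384.0473, 3: -299.7247}

group_means = {1: cons}
for group, coef in race_coef.items():
    group_means[group] = cons + coef    # referent mean plus the estimated difference

ms_between, ms_within = 2535303.82, 509927.124
f_stat = ms_between / ms_within         # same F as in the regression table

for group, mean in group_means.items():
    print(f"group {group}: estimated mean bwt = {mean:.1f} g")
print(f"F(2, 186) = {f_stat:.2f}")
```

The model mean square in the regression table and the between-groups mean square in the ANOVA table are the same number, which is why both commands report the same F and p-value.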
