• Subject Code : AM238
  • Subject Name : Business Data Analysis

Business Data Analysis

Problem 1.

a) The reported mean is 28.4697 minutes. On average, patients wait about 28.5 minutes to see a doctor in the general practitioner (GP) clinics.

b) The reported median is 30 minutes. It is the middle value of the waiting times in the data. Half of the patients in the sample wait more than 30 minutes, and half wait less than 30 minutes, to see a doctor in the GP clinics.

c) The reported first quartile is 24.5 minutes. It shows that 25% of the patients in the sample wait less than 24.5 minutes to see a doctor in the GP clinics, whereas 75% wait more than 24.5 minutes.

d) The histogram is likely to be slightly negatively skewed (skewed to the left). This is because the mode is greater than the median, which in turn is greater than the mean of the data. Most patients have a relatively long waiting time to see the doctors, while a few short waits pull the mean down. However, the reported skewness of 0.2673 is positive and small in magnitude, which implies that the distribution is approximately symmetric.

e) The sample standard deviation can be calculated as the square root of the sample variance:

S² = ∑(Xi − X̅)² / (N − 1) = (∑Xi² − N X̅²) / (N − 1)

But since we are not given the sum of squares, we cannot use this formula directly. Instead, we can recover the value from the standard error, which is defined as SE = S / √N, where S is the sample standard deviation:

Standard deviation = Standard error × √N

Standard deviation = 1.5784 × √33 = 1.5784 × 5.745 = 9.067.

Multiplying by √(N − 1) = √32 = 5.657 instead would give 1.5784 × 5.657 = 8.929. The difference is due to (N − 1). The (N − 1) divisor belongs inside the sample variance formula, because it is a sample standard deviation, whereas the conversion from the standard error uses N.
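The conversion from standard error to standard deviation can be checked numerically. The sketch below assumes the usual definition SE = S / √N and takes N = 33, which is inferred from the square-root factor in the working (√32 ≈ 5.657) rather than stated directly; the variable names are illustrative.

```python
import math

# Standard error as reported; N = 33 is an inferred value, not given in the text.
standard_error = 1.5784
n = 33

# Usual definition: SE = S / sqrt(N), so S = SE * sqrt(N).
sd_from_sqrt_n = standard_error * math.sqrt(n)

# Multiplying by sqrt(N - 1) instead, for comparison.
sd_from_sqrt_n_minus_1 = standard_error * math.sqrt(n - 1)

print(f"SE * sqrt(N)     = {sd_from_sqrt_n:.3f}")
print(f"SE * sqrt(N - 1) = {sd_from_sqrt_n_minus_1:.3f}")
```

The two results differ by only about 1.5%, which is why the choice of N or (N − 1) matters little for samples of this size.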

Problem 2.

a) Step 1: Setting the null hypothesis: H0: p = 0.65. This means that 65% of Queenslanders agree that the state should re-open non-essential businesses.

Step 2: Alternative hypothesis: HA: p > 0.65. This means that more than 65% of Queenslanders agree that the state should re-open non-essential businesses.

The sample proportion is calculated as p̂ = 1020 / 1500 = 0.68

Step 3: Set alpha, the level of significance. Let α = 0.01, which corresponds to a 99% confidence level.

Step 4: Calculation of the test statistic. It is given as Z = (p̂ − p) / √[p(1 − p)/N]

Z = (0.68 − 0.65) / √(0.65 × 0.35 / 1500) = 0.03 / 0.012315

Z = 2.436

Step 5: Construction of acceptance and rejection regions.

A threshold (critical) value of Z is established. This Z value can be obtained from statistical tables and is referred to as Z critical or Zα. This critical value is the minimum value of the test statistic for us to be able to reject the null. It depends only on α and the direction of the test, not on N. From the Z table, for an upper-tailed test with α = 0.01, the Z critical is 2.326.

Step 6: Comparison of the critical Z and the calculated Z statistic. Since the calculated Z statistic is greater than the critical Z, we can reject the null hypothesis at the 1% significance level. Therefore, there is statistically significant evidence that more than 65% of Queenslanders agree that the state should re-open non-essential businesses.

b) The p-value is the probability of observing a Z statistic at least as large as the calculated value (about 2.44) when the null hypothesis is true.

P value = P(Z > 2.43) = 0.0075 (from the Z table)

For the hypothesis test, we compare the p-value with alpha.

If the p-value is less than alpha, we reject the null; otherwise, we do not reject it. Here,

P value = 0.0075 and α = 0.01. Since the p-value is less than alpha, we reject the null.
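The test in parts a) and b) can be reproduced with Python's standard library. This is a minimal sketch that takes the critical value and p-value from the standard normal distribution rather than a printed table, so the last digits can differ slightly from table lookups; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem: 1020 of 1500 respondents agree; H0: p = 0.65.
successes, n = 1020, 1500
p0 = 0.65
alpha = 0.01

p_hat = successes / n
std_error = sqrt(p0 * (1 - p0) / n)   # standard error under H0
z = (p_hat - p0) / std_error

std_normal = NormalDist()                      # mean 0, standard deviation 1
z_critical = std_normal.inv_cdf(1 - alpha)     # upper-tail critical value
p_value = 1 - std_normal.cdf(z)                # P(Z > z)

print(f"z = {z:.3f}, z critical = {z_critical:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Do not reject H0")
```

Both decision rules (comparing Z with the critical value, or the p-value with α) lead to the same conclusion, as they always do for the same test.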

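c) The condition for the sampling distribution of the sample proportion to be approximately normal is that the sample size should be large enough. According to the Central Limit Theorem, as the sample size increases, the sampling distribution approaches the normal distribution. A common rule of thumb is that both Np and N(1 − p) should be at least 5. Here N × p = 1500 × 0.65 = 975 and N × (1 − p) = 1500 × 0.35 = 525, so the sample size of 1500 is large enough. The sample is also a very small fraction of the population of 5.1 million, so the observations can be treated as independent.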

d) There is a possibility of a Type I error, because the null hypothesis has been rejected. A Type I error is the serious error of rejecting the null hypothesis when it is true. With a 1% level of significance, the probability of rejecting a true null hypothesis is limited to 1%, so the chance of this error is small.

Problem 3.

a) The confidence interval formula to be used here is:

X̅ ± Z × (S/√n)

The reason to use this formula is that we are provided with the sample mean and the sample standard deviation, and the sample size (n = 96) is large enough for the normal approximation. This interval gives a range of plausible values for the actual mean net weight of the 20 kg rice bags.

b) For a 95% confidence level, Z = 1.96, so the confidence interval is given by:

[20.30 − 1.96 × (1.2 / √96), 20.30 + 1.96 × (1.2 / √96)]

[20.06, 20.54]
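The interval can be checked with a short calculation. The sketch below assumes a two-sided 95% interval based on the standard normal distribution, so it uses the 97.5th percentile (about 1.96) as the multiplier; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample summary from the problem: mean, standard deviation, sample size.
x_bar, s, n = 20.30, 1.2, 96
confidence = 0.95

# Two-sided interval: put half of (1 - confidence) in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * s / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(f"z = {z:.3f}, margin of error = {margin:.4f}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```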

c) The confidence interval provides a range of plausible values for the actual mean net weight of the 20 kg rice bags. It tells us that we can be 95% confident that the population mean lies in this range.

d) Since the lower limit of the confidence interval is more than 20 kg, the XYZ Consumer Protection Authority should not worry about the weight of the rice bags. On average, the net weight of the rice bags is more than the weight advertised on the bag.

e) When the sample size increases, keeping everything else constant, the width of the confidence interval decreases. This is due to the fall in the standard error, S/√n. It implies that the sample mean estimates the population mean more precisely.
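The effect of the sample size on the interval width can be illustrated numerically. In this sketch the sample sizes 384 and 1536 are hypothetical values chosen for illustration (each four times the previous one), and `ci_width` is an illustrative helper, not part of the original problem.

```python
from math import sqrt
from statistics import NormalDist

s = 1.2                              # sample standard deviation from the problem
z = NormalDist().inv_cdf(0.975)      # multiplier for a two-sided 95% interval


def ci_width(n):
    """Full width of the 95% confidence interval for a sample of size n."""
    return 2 * z * s / sqrt(n)


# 96 is the actual sample size; 384 and 1536 are hypothetical.
for n in (96, 384, 1536):
    print(f"n = {n:5d}  width = {ci_width(n):.4f}")
```

Because the width is proportional to 1/√n, quadrupling the sample size halves the width of the interval.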

Problem 4.

a) The dependent variable is the quantity demanded of boxes of apples and the independent variable is the price of a box of apples. This follows from the law of demand, which says that price affects the quantity demanded: as the price increases, the quantity demanded falls, and vice versa.

b) S²x is the sample variance of X. With X̅ = 128.5/9 = 14.2778, it is calculated as:

S²x = ∑(Xi − X̅)² / (N − 1) = (∑Xi² − N X̅²) / (N − 1) = (1896.73 − 9 × (128.5/9)²) / 8

S²x = 62.0356 / 8 = 7.7544

S²y is the sample variance of Y. With Y̅ = 295.9/9 = 32.8778, it is calculated as:

S²y = ∑(Yi − Y̅)² / (N − 1) = (∑Yi² − N Y̅²) / (N − 1) = (9808.03 − 9 × (295.9/9)²) / 8

S²y = 79.4956 / 8 = 9.9369

Sxy is the sample covariance between X and Y. It is calculated as:

Sxy = ∑[(Xi − X̅)(Yi − Y̅)] / (N − 1) = (∑XiYi − N X̅ Y̅) / (N − 1)

= (4164.6 − 9 × (128.5/9) × (295.9/9)) / 8

Sxy = −60.1944 / 8 = −7.5243
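These shortcut formulas are easy to evaluate directly from the summary sums. The sketch below uses the unrounded totals ∑X = 128.5 and ∑Y = 295.9 that appear in the correlation working, so its results can differ from a hand calculation that rounds the means first; the variable names are illustrative.

```python
# Summary sums from the problem (N = 9 observations).
n = 9
sum_x, sum_y = 128.5, 295.9
sum_x2, sum_y2, sum_xy = 1896.73, 9808.03, 4164.6

# Shortcut forms: sum of squared deviations = sum of squares - (sum)^2 / N.
var_x = (sum_x2 - sum_x**2 / n) / (n - 1)
var_y = (sum_y2 - sum_y**2 / n) / (n - 1)
cov_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

print(f"sample variance of X = {var_x:.4f}")
print(f"sample variance of Y = {var_y:.4f}")
print(f"sample covariance    = {cov_xy:.4f}")
```

These shortcut forms subtract two large, nearly equal numbers, so even a small rounding of the mean (for example 14.27 instead of 14.2778) changes the result noticeably.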

c) The sample correlation coefficient (r) can be calculated as:

r = [N ∑XiYi − ∑Xi ∑Yi] / √{[N ∑Xi² − (∑Xi)²] [N ∑Yi² − (∑Yi)²]}

r = [(9 × 4164.6) − (128.5 × 295.9)] / √{[9 × 1896.73 − (128.5)²] [9 × 9808.03 − (295.9)²]}

r = (37481.4 − 38023.15) / √(558.32 × 715.46)

r = (−541.75) / 632.025021

r = −0.8572

d) The sample correlation coefficient is approximately −0.86. The negative sign of the correlation coefficient shows an inverse relationship between the variables. Since the value of the correlation coefficient is close to −1, the relationship between the variables is strong. Therefore, there is a strong negative linear relationship between the quantity demanded and the price of apples.

e) Slope coefficient can be calculated as:

Estimated β2 = [N ∑ Xi Yi - ∑Xi ∑Yi ] / [N ∑Xi2 – (∑Xi )2 ]

= ( 9*4164.6 – 128.5*295.9) / (9*1896.73 - 128.5*128.5)

= (37481.4 - 38023.15) / (17070.57 - 16512.25)

= ( - 541.75) / 558.32

= (-0.97032)

f) The slope coefficient is −0.97. This means that when the price of a box of apples rises by one dollar, the quantity demanded of apples falls, on average, by about 970 boxes, and when the price falls by one dollar, the quantity demanded rises by about 970 boxes.

g) The intercept term can be calculated as:

Estimated β1 = Y̅ − β2 × X̅ = (295.9/9) − (−0.97032) × (128.5/9) = 46.7317

h) The intercept term implies that if the price of apples were to fall to zero, the quantity demanded of apples would be about 46,731 boxes. This tells us that there are other factors besides price which influence the quantity demanded of apples.
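The slope and intercept can be recomputed together from the summary sums. This is a minimal sketch of the least-squares formulas used above; the variable names are illustrative, and the unrounded slope is carried into the intercept, so the last decimal place can differ slightly from a hand calculation.

```python
# Summary sums from the problem (N = 9 observations).
n = 9
sum_x, sum_y = 128.5, 295.9
sum_x2, sum_xy = 1896.73, 4164.6

# Least-squares slope: [N*sum(XY) - sum(X)*sum(Y)] / [N*sum(X^2) - (sum(X))^2].
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)

# Intercept: mean of Y minus slope times mean of X.
intercept = sum_y / n - slope * (sum_x / n)

print(f"slope     = {slope:.5f}")
print(f"intercept = {intercept:.4f}")
```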

i) The estimated sample regression equation can be written as:

Ŷ = β1 + β2 × X

Quantity Demanded of Apples = 46.7317 + (−0.97032) × Price of Apples

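j) If the price of a box of apples is $19, then the predicted quantity demanded will be:

Quantity Demanded of Apples = 46.7317 + (−0.97032) × 19 = 28.29562

The predicted quantity demanded when the price of a box of apples is $19 is 28,295 boxes. Compared with the observed 27,500 boxes at a price of $18.2, this looks like a rise in the quantity demanded as the price rises, which would oppose the law of demand. However, that comparison mixes an observed value with a fitted value. On the fitted line, the predicted quantity at $18.2 is 46.7317 + (−0.97032) × 18.2 = 29.072, or about 29,072 boxes, so the prediction falls as the price rises to $19. The observed 27,500 boxes simply lies below the regression line, which reflects factors other than price that influence the quantity demanded.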
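Predictions from the fitted line are simple to tabulate. The sketch below uses the rounded estimates from parts e) and g); `predict_quantity` is an illustrative helper, and the unit of thousands of boxes is inferred from how the text converts 28.29562 into 28,295 boxes.

```python
# Estimated intercept and slope, as reported in parts (g) and (e).
intercept, slope = 46.7317, -0.97032


def predict_quantity(price):
    """Fitted quantity demanded (assumed to be in thousands of boxes)."""
    return intercept + slope * price


# Compare fitted values at the two prices discussed in the text.
for price in (18.2, 19.0):
    print(f"price = ${price:.2f}  predicted quantity = {predict_quantity(price):.3f}")
```

Comparing two fitted values, rather than a fitted value with an observed one, shows the direction of the relationship implied by the model.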

k) Step 1: Setting the null hypothesis: H0: β2 = 0. This means there is no linear relationship between the price of apples and the quantity demanded of apples.

Step 2: Alternative hypothesis: HA: β2 < 0. This means there is a negative relationship between the price and the quantity demanded of apples.

Step 3: Set alpha, the level of significance. Let α = 0.05, which corresponds to a 95% confidence level.

Step 4: Calculation of the test statistic. It is given as: t = −4.40

Step 5: Construction of acceptance and rejection regions.

A threshold (critical) value of t is established. This t value can be obtained from statistical tables and is referred to as t critical or tα. This critical value is the minimum absolute value of the test statistic for us to be able to reject the null. From the t table with N = 9, df = N − 2 = 7 and α = 0.05 (one-tailed), the t critical is 1.895.

Step 6: Comparison of the critical t and the calculated t statistic. Since the absolute value of the t statistic is greater than the critical t, we can reject the null hypothesis at the 5% significance level. Therefore, there is statistically significant evidence of a negative relationship between the price and the quantity demanded of apples.
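The text reports the t statistic without showing how it was obtained. One standard route in simple linear regression, assumed here rather than taken from the source, is t = r√(N − 2) / √(1 − r²), which is algebraically the same as dividing the slope by its standard error. The critical value is the table value quoted in the text, not computed.

```python
from math import sqrt

n = 9
r = -0.857            # sample correlation from part (c), rounded to three decimals

# t statistic for H0: slope = 0, with N - 2 degrees of freedom.
t_stat = r * sqrt(n - 2) / sqrt(1 - r**2)

# Table value quoted in the text (df = 7, one-tailed, alpha = 0.05).
t_critical = 1.895

# Lower-tailed test: reject H0 when t falls below -t_critical.
reject_h0 = t_stat < -t_critical

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```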

l) The coefficient of determination is calculated as:

R² = ESS / TSS = β2² × [∑(Xi − X̅)² / ∑(Yi − Y̅)²]

= β2² × [(N ∑Xi² − (∑Xi)²) / (N ∑Yi² − (∑Yi)²)]

= (−0.97032)² × [(9 × 1896.73 − 128.5²) / (9 × 9808.03 − 295.9²)]

= 0.94152 × (558.32 / 715.46)

= 0.94152 × 0.78037

= 0.735

m) The coefficient of determination is the proportion of the variation in the dependent variable that is explained by the independent variable. It is a measure of goodness of fit. R² in our model is 0.735, which equals the square of the sample correlation coefficient. It tells us that our model is a moderately good fit: about 73.5% of the total variation in the quantity demanded of apples is explained by the changes in the price of apples.
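As a cross-check, R² can be computed from the summary sums and compared with the square of the correlation coefficient, since the two are equal in a regression with one independent variable. Because this sketch works from the unrounded sums, its result can differ from a hand calculation that rounds the means and the slope first; the variable names are illustrative.

```python
from math import sqrt

# Summary sums from the problem (N = 9 observations).
n = 9
sum_x, sum_y = 128.5, 295.9
sum_x2, sum_y2, sum_xy = 1896.73, 9808.03, 4164.6

# N-scaled sums of squares and cross-products (the factor N cancels in ratios).
sxx = n * sum_x2 - sum_x**2
syy = n * sum_y2 - sum_y**2
sxy = n * sum_xy - sum_x * sum_y

slope = sxy / sxx
r_squared = slope**2 * sxx / syy      # ESS / TSS
r = sxy / sqrt(sxx * syy)             # sample correlation coefficient

print(f"R^2 = {r_squared:.4f}, r^2 = {r**2:.4f}")
```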
