Paper 1 (38511)
Paper 2 (28560)
Paper 3 (Summer 2022)
Paper 4 (Dec 2022)
01
Introduction to Statistics
Functions · Importance · Uses and Limitations · Classification · Tabulation · Diagrammatic & Graphic Representation of Data
6 HRS
Define "Statistics". Explain the Uses and Limitations of Statistics.
Define Statistics and list the limitations of statistics.
Define the term "Statistics" and discuss its use in business and trade. Also point out its limitations.
Represent the following data by a percentage sub-divided bar diagram.
| Item of Expenditure | Family A (Income ₹500) | Family B (Income ₹300) |
|---|---|---|
| Food | 150 | 150 |
| Clothing | 125 | 60 |
| Education | 25 | 50 |
| Miscellaneous | 190 | 70 |
| Saving / Deficits | +10 | −30 |
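A percentage sub-divided bar diagram needs each item expressed as a percentage of the family's income first. The sketch below shows that conversion only; the names `to_percentages`, `pct_a` and `pct_b` are illustrative, and treating the deficit as a negative share is an assumption about how it would be plotted.

```python
# Expenditure in rupees, copied from the table above.
items = ["Food", "Clothing", "Education", "Miscellaneous", "Saving / Deficit"]
family_a = [150, 125, 25, 190, 10]   # income 500
family_b = [150, 60, 50, 70, -30]    # income 300; the deficit is entered as negative

def to_percentages(amounts, income):
    """Return each amount as a percentage of total income."""
    return [100 * amount / income for amount in amounts]

pct_a = to_percentages(family_a, 500)
pct_b = to_percentages(family_b, 300)

# Print the percentage table that the two bars would be drawn from.
for item, a, b in zip(items, pct_a, pct_b):
    print(f"{item:<18}{a:>8.2f}{b:>8.2f}")
```

Each bar then has a total height of 100, with Family B's deficit usually drawn below the base line.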
Explain Bar chart with the following example. The following table shows the number of books of different subjects in a library.
| Subject | Phy. | Chem. | Bio. | Hist. | Geo. | Eng. | Math. | Comp. |
|---|---|---|---|---|---|---|---|---|
| No. of Books | 100 | 125 | 75 | 75 | 50 | 200 | 250 | 175 |
Draw the Histogram and Frequency Polygon for the following frequency distribution of weekly wages (in '00 Rs.) of 100 workers in a factory.
| Weekly Wages ('00 Rs.) | 20–24 | 25–29 | 30–34 | 35–39 | 40–44 | 45–49 | 50–54 | 55–59 | 60–64 |
|---|---|---|---|---|---|---|---|---|---|
| No. of Workers | 4 | 5 | 12 | 23 | 31 | 10 | 8 | 5 | 2 |
The frequency distribution of scores obtained by 250 candidates in an entrance test is as follows. Draw the "less than" and "more than" cumulative frequency curves (ogives). Also explain the significance of the point of intersection of the two ogives.
| Scores | 400–450 | 450–500 | 500–550 | 550–600 | 600–650 | 650–700 | 700–750 | 750–800 |
|---|---|---|---|---|---|---|---|---|
| No. of Candidates | 25 | 30 | 45 | 37 | 30 | 33 | 15 | 35 |
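The points for the two ogives come from cumulative frequencies, and the curves cross at the median. A minimal sketch of that calculation follows; the variable names are illustrative, and the median is obtained by the usual linear interpolation within the median class.

```python
from itertools import accumulate

lower = [400, 450, 500, 550, 600, 650, 700, 750]
upper = [450, 500, 550, 600, 650, 700, 750, 800]
freq = [25, 30, 45, 37, 30, 33, 15, 35]
n = sum(freq)

# "Less than" totals are plotted against the upper class limits.
less_than = list(accumulate(freq))
# "More than" totals are plotted against the lower class limits.
more_than = [n - c for c in [0] + less_than[:-1]]

# Median by interpolation: the x-coordinate where the two ogives meet.
half = n / 2
i = next(k for k, c in enumerate(less_than) if c >= half)
cf_before = less_than[i - 1] if i > 0 else 0
median = lower[i] + (half - cf_before) / freq[i] * (upper[i] - lower[i])

print(less_than)
print(more_than)
print(round(median, 2))
```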
What is diagrammatic representation of data? Explain its advantages.
Write a short note on Pie chart and its advantages and disadvantages.
The mode of the calls received on 7 consecutive days 11, 13, 13, 17, 19, 23, 25 is:
A) 11 B) 13 C) 17 D) 23
"More than type Ogive" and "less than type Ogive" for a distribution intersect at:
A) Mean B) Median C) Mode D) Origin
In ________ method, the upper limit of one class is the lower limit of the next class.
A) Inclusive B) Exclusive C) Inter D) Intra
Find the Mean Deviation from the Median for the following data:
| Age of Workers | 20–25 | 25–30 | 30–35 | 35–40 | 40–45 | 45–50 | 50–55 | 55–60 |
|---|---|---|---|---|---|---|---|---|
| No. of Workers | 120 | 125 | 175 | 160 | 150 | 140 | 100 | 30 |
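One way to check a hand calculation for this question is sketched below. It assumes the standard grouped-data formulas: the median is interpolated within the median class, and the mean deviation is \(\sum f\,|m - \text{Median}| / N\) with \(m\) the class mid-point. All names are illustrative.

```python
lower = [20, 25, 30, 35, 40, 45, 50, 55]   # lower class limits
width = 5
freq = [120, 125, 175, 160, 150, 140, 100, 30]
n = sum(freq)

# Locate the median class and interpolate within it.
cum = 0
for lo, f in zip(lower, freq):
    if cum + f >= n / 2:
        median = lo + (n / 2 - cum) / f * width
        break
    cum += f

# Mean deviation about the median, using class mid-points.
midpoints = [lo + width / 2 for lo in lower]
mean_deviation = sum(f * abs(m - median) for f, m in zip(freq, midpoints)) / n

print(median, mean_deviation)
```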
02
Data Collection & Sampling Methods
Primary & Secondary Data · Sources · Methods of Collection · Census & Sample Methods · Probability & Non-Probability Sampling
6 HRS
Distinguish between primary data and secondary data. What precautions should be taken in the use of secondary data?
Explain sampling and the purpose of sampling.
Explain primary data and secondary data in detail.
What are the various methods of collecting statistical data? Which of these is most reliable and why?
What is Stratified sampling? Explain the merits and limitations of stratified sampling.
What do you mean by a questionnaire? What is the difference between a questionnaire and a schedule? State the essential points to be remembered in drafting a questionnaire.
In a sample study about coffee habits in two towns, A and B, the following information is given. Present the data in a table format.
Town A: Females were 40%, total coffee drinkers were 45%, and female non-coffee drinkers were 20%.
Town B: Males were 55%, male non-coffee drinkers were 30%, and female coffee drinkers were 15%.
A survey of 370 students from the Commerce Faculty and 130 from the Science Faculty revealed that 180 students were studying for C.A. Examinations only, 140 for Costing only, and 80 for both C.A. and Costing. The rest offered part-time Management Courses. Of those studying Costing only, 13 were girls and 90 boys from Commerce. Out of 80 studying both, 72 were from Commerce, 70 were boys. Among those with part-time Management, 50 boys from Science, 30 boys and 10 girls from Commerce. The total number of boys in Science was 110. Present the above information in tabular form. Find the number of students from the Science Faculty studying for part-time Management Courses.
Inspectors for a hospital chain with multiple locations randomly select some of their locations for a cleanliness check of their operating rooms. This is an example of:
A) Cluster sampling B) Stratified Sampling C) Quota Sampling D) Snowball Sampling
03
Introduction to Regression
Mathematical & Statistical Equation · Intercept & Slope · Error Term · Model Fit — R², MAE, MAPE
8 HRS
The equations of two lines of regression obtained in correlation analysis are given below. Obtain the value of the correlation coefficient:
\(2X = 8 - 3Y\) and \(2Y = 5 - X\)
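The question does not say which equation is the regression of Y on X, so a common approach is to try both labellings and keep the one for which \(b_{yx} \cdot b_{xy} \le 1\). The sketch below follows that approach; `correlation_from_lines` is a hypothetical helper, and each line is written as a tuple `(a, b, c)` meaning \(aX + bY = c\).

```python
from math import sqrt

def correlation_from_lines(line1, line2):
    """Each line is (a, b, c), meaning a*X + b*Y = c.

    Tries both ways of labelling the lines and keeps the labelling
    for which b_yx * b_xy lies between 0 and 1.
    """
    for y_on_x, x_on_y in ((line1, line2), (line2, line1)):
        b_yx = -y_on_x[0] / y_on_x[1]   # slope when the line is solved for Y
        b_xy = -x_on_y[1] / x_on_y[0]   # slope when the line is solved for X
        product = b_yx * b_xy
        if 0 <= product <= 1:
            sign = 1 if b_yx > 0 else -1   # r takes the sign of the coefficients
            return b_yx, b_xy, sign * sqrt(product)
    raise ValueError("no consistent labelling of the two lines")

# 2X = 8 - 3Y  ->  2X + 3Y = 8;   2Y = 5 - X  ->  X + 2Y = 5
b_yx, b_xy, r = correlation_from_lines((2, 3, 8), (1, 2, 5))
print(b_yx, b_xy, round(r, 4))
```

Here the second equation turns out to be the regression of Y on X, and both coefficients are negative, so r is negative.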
In a laboratory experiment on correlation research study, the equations to the two regression lines were found to be \(2x - y + 1 = 0\) and \(3x - 2y + 7 = 0\).
Find the mean of x and y. Also work out the values of regression coefficients and correlation coefficient between the two variables x and y.
Equations of the two lines of regression are: \(x + 6y = 6\) and \(3x + 2y = 10\). Find:
i) Mean of x and mean of y
ii) Regression coefficients \(b_{yx}\) and \(b_{xy}\)
iii) Correlation coefficient between x and y
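A possible working is outlined below. It assumes \(x + 6y = 6\) is the regression of y on x; the opposite labelling gives \(b_{yx} \cdot b_{xy} = 9 > 1\), which is not admissible.

\[
\begin{aligned}
&\text{Means (point of intersection): } x + 6y = 6,\quad 3x + 2y = 10 \;\Rightarrow\; \bar{x} = 3,\ \bar{y} = 0.5 \\
&y \text{ on } x:\quad y = 1 - \tfrac{1}{6}x \;\Rightarrow\; b_{yx} = -\tfrac{1}{6} \\
&x \text{ on } y:\quad x = \tfrac{10}{3} - \tfrac{2}{3}y \;\Rightarrow\; b_{xy} = -\tfrac{2}{3} \\
&r = -\sqrt{b_{yx} \cdot b_{xy}} = -\sqrt{\tfrac{1}{9}} = -\tfrac{1}{3}
\end{aligned}
\]

The negative root is taken because both regression coefficients are negative.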
From the data given below find: (a) The two regression coefficients, (b) The two regression equations, (c) The coefficient of correlation between marks in Economics and Statistics, (d) The most likely marks in Statistics if marks in Economics are 30.
| Marks in Economics | 25 | 28 | 35 | 32 | 31 | 36 | 29 | 38 | 34 | 32 |
|---|---|---|---|---|---|---|---|---|---|---|
| Marks in Statistics | 43 | 46 | 49 | 41 | 36 | 32 | 31 | 30 | 33 | 39 |
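The four parts of this question all follow from three sums of squares and products. The sketch below computes them directly from the table, treating Economics as X and Statistics as Y; the variable and function names are illustrative.

```python
from math import sqrt

economics = [25, 28, 35, 32, 31, 36, 29, 38, 34, 32]          # X
statistics_marks = [43, 46, 49, 41, 36, 32, 31, 30, 33, 39]   # Y

n = len(economics)
mean_x = sum(economics) / n
mean_y = sum(statistics_marks) / n

# Sums of products and squares of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(economics, statistics_marks))
s_xx = sum((x - mean_x) ** 2 for x in economics)
s_yy = sum((y - mean_y) ** 2 for y in statistics_marks)

b_yx = s_xy / s_xx                # regression coefficient of Y on X
b_xy = s_xy / s_yy                # regression coefficient of X on Y
r = s_xy / sqrt(s_xx * s_yy)      # coefficient of correlation

def predict_statistics(x):
    """Regression line of Y on X: Y - mean_y = b_yx * (X - mean_x)."""
    return mean_y + b_yx * (x - mean_x)

print(round(b_yx, 4), round(b_xy, 4), round(r, 3), round(predict_statistics(30), 2))
```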
The following table gives the age of cars and annual maintenance costs. Obtain the regression equation for Maintenance costs (age as independent variable). Also find the maintenance cost when age = 5 years.
| Age (Years) | 2 | 4 | 6 | 8 |
|---|---|---|---|---|
| Maintenance Cost (₹ thousands) | 10 | 20 | 25 | 30 |
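A worked outline using the least squares formulas, with x as age and y as maintenance cost, might run as follows:

\[
\begin{aligned}
&\bar{x} = 5,\qquad \bar{y} = 21.25 \\
&b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{65}{20} = 3.25 \\
&a = \bar{y} - b\,\bar{x} = 21.25 - 3.25 \times 5 = 5 \\
&\hat{y} = 5 + 3.25x \;\Rightarrow\; \hat{y}(5) = 21.25
\end{aligned}
\]

That is, an estimated maintenance cost of about ₹21.25 thousand for a car aged 5 years.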
Perform simple linear regression. Determine slope and intercept.
| X | 1 | 2 | 3 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Y | 8 | 4 | 5 | 2 | 2 | 0 |
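The slope and intercept can be checked with a few lines of code. This is a minimal sketch of ordinary least squares for one predictor; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 3, 4, 5], [8, 4, 5, 2, 2, 0])
print(round(slope, 2), round(intercept, 2))
```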
A departmental store gives in-service training to salesmen, followed by a test. The following data gives test scores and sales by nine salesmen. Calculate the coefficient of correlation. If a minimum sales volume of Rs. 30,000 is required, what minimum test score ensures continuation of service? Also estimate the most probable sales volume for a salesman scoring 28.
| Test Scores | 14 | 19 | 24 | 21 | 26 | 22 | 15 | 20 | 19 |
|---|---|---|---|---|---|---|---|---|---|
| Sales ('000 Rs.) | 31 | 36 | 48 | 37 | 50 | 45 | 33 | 41 | 39 |
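The sketch below works through the three parts numerically. It assumes the usual textbook convention: sales are estimated from the regression of sales on score, and the minimum score is estimated from the regression of score on sales. Variable names are illustrative.

```python
from math import sqrt

scores = [14, 19, 24, 21, 26, 22, 15, 20, 19]
sales = [31, 36, 48, 37, 50, 45, 33, 41, 39]   # in '000 Rs.

n = len(scores)
mean_score = sum(scores) / n
mean_sales = sum(sales) / n

s_xy = sum((x - mean_score) * (y - mean_sales) for x, y in zip(scores, sales))
s_xx = sum((x - mean_score) ** 2 for x in scores)
s_yy = sum((y - mean_sales) ** 2 for y in sales)

r = s_xy / sqrt(s_xx * s_yy)

# Regression of sales on score: estimate sales for a test score of 28.
sales_at_28 = mean_sales + (s_xy / s_xx) * (28 - mean_score)

# Regression of score on sales: estimate the score matching sales of 30 ('000 Rs.).
score_for_30 = mean_score + (s_xy / s_yy) * (30 - mean_sales)

print(round(r, 3), round(sales_at_28, 2), round(score_for_30, 2))
```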
Write a detailed note on least square regression.
Explain the following methods to check the performance of a Regression Model:
i) MAE (Mean Absolute Error)
ii) MAPE (Mean Absolute Percentage Error)
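Both measures average the size of the prediction errors: MAE in the units of the data, MAPE as a percentage of the actual values. The sketch below uses made-up numbers purely for illustration; the function names and the data are not from any exam paper.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, in percent.

    Undefined when any actual value is zero.
    """
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical actual values and model predictions.
actual = [100, 200, 400]
predicted = [110, 190, 400]

mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
print(round(mae, 2), round(mape, 2))
```

The same absolute error of 10 counts for 10% against an actual value of 100 but only 5% against 200, which is why MAPE is often preferred when the values differ widely in scale.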
If the regression coefficients are \(b_{yx} = 0.5\) and \(b_{xy} = 0.46\), then the value of correlation coefficient (r) is:
A) 0.39 B) 0.48 C) 0.23 D) 0.48
Formula: \(r = \sqrt{b_{yx} \cdot b_{xy}} = \sqrt{0.5 \times 0.46} \approx 0.48\)
A linear regression (LR) analysis produces the equation \(Y = 0.4X + 3\). This indicates that:
A) When Y = 0.4, X = 3 B) When Y = 0, X = 3 C) When X = 3, Y = 0.4 D) When X = 0, Y = 3
In regression analysis, if the independent variable is measured in Kilometers, the dependent variable:
A) Must also be in Kilometers B) Must be in some unit of Distance C) Cannot be in Kilometers D) Can be any units
If all the dots of a scatter diagram lie on a straight line rising from the lower left corner to the upper right corner, the correlation is called:
A) Zero correlation B) High degree of positive correlation C) Perfect negative correlation D) Perfect positive correlation
What is regression analysis? How does it differ from correlation?
04
Introduction to Multiple Linear Regression
MLR Model · Partial Regression Coefficients · Testing Overall Significance · Testing Individual Regression Coefficients
8 HRS
What are the assumptions of Multiple Linear Regression?
The data below relates to the cost of production (Y), cost of ingredients (X₁), and packaging cost (X₂) for 8 different drugs:
a) Fit a regression \(\hat{y} = a + b_1x_1 + b_2x_2\)
b) Find the coefficient of multiple determination (R²)
c) Test the significance of regression. (Given F = 5.786, α = 0.05)
| Sr No | Y (Rs.) | X₁ (₹ thousands) | X₂ (Rs.) |
|---|---|---|---|
| 1 | 100 | 17 | 19 |
| 2 | 79 | 50 | 54 |
| 3 | 100 | 90 | 75 |
| 4 | 129 | 30 | 36 |
| 5 | 158 | 15 | 16 |
| 6 | 106 | 20 | 25 |
| 7 | 58 | 20 | 24 |
| 8 | 78 | 50 | 53 |
Data about weights (X₁, Kgs), distances moved (X₂, Km), and damage incurred (Y, ₹ thousands) for 10 shipments:
i) Fit a regression \(\hat{Y} = a + b_1X_1 + b_2X_2\)
ii) Find R²
iii) Test significance of regression (F = 9.55, α = 0.01)
| Shipment | Y | X₁ | X₂ |
|---|---|---|---|
| 1 | 12 | 17 | 10 |
| 2 | 15 | 15 | 6 |
| 3 | 14 | 15 | 10 |
| 4 | 19 | 10 | 21 |
| 5 | 8 | 13 | 8 |
| 6 | 16 | 15 | 13 |
| 7 | 15 | 11 | 9 |
| 8 | 25 | 6 | 25 |
| 9 | 10 | 15 | 10 |
| 10 | 11 | 7 | 8 |
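A fit of this kind can be checked by solving the normal equations \((X^{\top}X)\,\beta = X^{\top}y\). The sketch below does so in exact fractions for the shipment data; `solve` and `fit_two_predictors` are hypothetical helpers, and it assumes the F value quoted in the question (9.55) is the table critical value for (2, 7) degrees of freedom.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination, exactly."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]

def fit_two_predictors(y, x1, x2):
    """Return (a, b1, b2), R^2 and the F statistic for y = a + b1*x1 + b2*x2."""
    rows = [(1, u, v) for u, v in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    a, b1, b2 = solve(xtx, xty)
    fitted = [a + b1 * u + b2 * v for u, v in zip(x1, x2)]
    mean_y = Fraction(sum(y), len(y))
    sse = sum((t - f) ** 2 for t, f in zip(y, fitted))   # unexplained variation
    sst = sum((t - mean_y) ** 2 for t in y)              # total variation
    r2 = 1 - sse / sst
    k = 2                                                # number of predictors
    f_stat = (r2 / k) / ((1 - r2) / (len(y) - k - 1))
    return (a, b1, b2), r2, f_stat

# Shipment data from the table above.
y = [12, 15, 14, 19, 8, 16, 15, 25, 10, 11]
x1 = [17, 15, 15, 10, 13, 15, 11, 6, 15, 7]
x2 = [10, 6, 10, 21, 8, 13, 9, 25, 10, 8]

coeffs, r2, f_stat = fit_two_predictors(y, x1, x2)
significant = f_stat > Fraction("9.55")   # compare with the assumed critical value

print([round(float(c), 4) for c in coeffs], round(float(r2), 4), round(float(f_stat), 3))
print("significant at the 1% level:", significant)
```

The regression is judged significant when the computed F exceeds the table value. The same function can be reused for the other two data sets by swapping in their columns and critical values.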
Data regarding output of gram (Y), cost of seed (X₁), and cost of labour (X₂) per hectare for 8 farmers' fields:
a) Fit \(\hat{y} = a + b_1x_1 + b_2x_2\)
b) Find R²
c) Test significance. (Given F = 13.27, α = 0.01)
| Sr No | Y (Rs./hectare) | X₁ (Rs./hectare) | X₂ (Rs./hectare) |
|---|---|---|---|
| 1 | 190 | 50 | 10 |
| 2 | 50 | 30 | 10 |
| 3 | 300 | 150 | 15 |
| 4 | 100 | 50 | 20 |
| 5 | 150 | 40 | 10 |
| 6 | 90 | 40 | 35 |
| 7 | 300 | 100 | 14 |
| 8 | 120 | 60 | 14 |
Given \(r_{12} = 0.7\), \(r_{13} = 0.61\) and \(r_{23} = 0.4\). Compute:
i) \(r_{23.1}\) ii) \(r_{13.2}\) iii) \(r_{12.3}\)
In a trivariate distribution, the simple coefficients of correlation are: \(r_{12} = 0.86\), \(r_{13} = 0.65\) and \(r_{23} = 0.72\). Calculate the coefficient of partial correlation \(r_{12.3}\).
In a certain trivariate distribution: \(r_{12} = 0.7\), \(r_{23} = r_{31} = 0.6\). Find the partial correlation coefficient \(r_{12.3}\).
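The partial correlation questions above all use the same formula, \(r_{12.3} = \dfrac{r_{12} - r_{13}\,r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}\), with the subscripts permuted as needed. A small sketch for checking the arithmetic follows; `partial_corr` is a hypothetical helper name.

```python
from math import sqrt

def partial_corr(r_ab, r_ac, r_bc):
    """r_ab.c: correlation between a and b with c held constant."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

# r12 = 0.7, r23 = r31 = 0.6
print(round(partial_corr(0.7, 0.6, 0.6), 4))

# r12 = 0.86, r13 = 0.65, r23 = 0.72
print(round(partial_corr(0.86, 0.65, 0.72), 4))

# r12 = 0.7, r13 = 0.61, r23 = 0.4: the three partial coefficients
r12, r13, r23 = 0.7, 0.61, 0.4
r23_1 = partial_corr(r23, r12, r13)
r13_2 = partial_corr(r13, r12, r23)
r12_3 = partial_corr(r12, r13, r23)
print(round(r23_1, 4), round(r13_2, 4), round(r12_3, 4))
```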
Write a short note on Multiple Regression.
In MLR, the square of the multiple correlation coefficient or R² is called the:
A) Coefficient of determination B) Variance C) Covariance D) Cross-product
05
Statistical Inference
Random Sample · Parametric Point Estimation · Unbiasedness & Consistency · Method of Moments · Maximum Likelihood
6 HRS
Explain the following Point Estimation Properties with examples:
i) Consistency
ii) Unbiasedness
Explain with illustration the concept of Point estimation.
Explain the following Point Estimation Properties with examples:
i) Consistency
ii) Unbiasedness
Explain the following Point Estimation Properties with examples:
i) Consistency
ii) Unbiasedness
Show that sample variance (S²) is an unbiased estimator of population variance (σ²). Also illustrate with an example.
Write a short note on Method of Moments.
Explain the method of maximum likelihood estimation.
Explain the method of maximum likelihood with its advantages and disadvantages.
A point estimator is defined as:
A) A single value from the sample B) Average of all sample values C) Average of all population values D) A single value that is the best estimate of an unknown population parameter
A random sample of size 100 has a standard deviation of 5. What can you say about the maximum error with 95% confidence (Z = 1.96)?
Formula: \(E = Z \cdot \dfrac{\sigma}{\sqrt{n}} = 1.96 \times \dfrac{5}{\sqrt{100}} = 0.98\)
Define a random variable and its mathematical expectation.
06
Tests of Hypotheses
Null & Alternative Hypotheses · Types of Errors · Neyman-Pearson Lemma · MP & UMP Tests
5 HRS
Distinguish between Null and Alternative hypothesis.
Differentiate between Null Hypothesis and Alternative Hypothesis.
Differentiate between Critical Region and Region of Acceptance.
What is hypothesis testing? Explain:
i) Z-Test for Single Mean
ii) Z-Test for Difference of Mean
What is Hypothesis Testing? Explain:
i) Z-Test for single mean
ii) Z-Test for Difference of Mean
What is Hypothesis Testing? For large samples, explain:
i) Test of significance for a single mean
ii) Test of significance for difference between two means
The manufacturer of electric bulbs claims a mean life of 25 months with σ = 5 months. A random sample of 6 bulbs gave: 24, 26, 30, 20, 20, 18.
Is the manufacturer's claim valid at 1% level of significance?
(Given: table values of the appropriate test statistic at said level are 4.032, 3.707, and 3.499 for 5, 6, and 7 degrees of freedom respectively)
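Since the question supplies t-table values, the sketch below assumes a two-tailed one-sample t-test with the sample standard deviation (divisor n − 1) and 5 degrees of freedom, rather than a Z-test with the stated σ. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

lifetimes = [24, 26, 30, 20, 20, 18]   # months
claimed_mean = 25
critical_t = 4.032                     # table value for 5 d.f., from the question

n = len(lifetimes)
sample_mean = mean(lifetimes)
sample_sd = stdev(lifetimes)           # uses the n - 1 divisor

# t = (sample mean - claimed mean) / (s / sqrt(n))
t_stat = (sample_mean - claimed_mean) / (sample_sd / sqrt(n))
reject_claim = abs(t_stat) > critical_t

print(round(t_stat, 3), reject_claim)
```

Under these assumptions |t| falls well below the table value, so the sample gives no reason to reject the manufacturer's claim at the 1% level.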
Explain the Neyman-Pearson Lemma.
Write a short note on the Neyman-Pearson Lemma.
What are the tests of skewness?
A survey over 25 years indicates 10 mild winters, 8 cold, 7 very cold. A company sells 1000 woollen coats in mild years, 1300 in cold, 2000 in very cold. A coat costs Rs. 1730 and is sold at Rs. 2480. Find the yearly expected profit of the company.
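The expected profit is the probability-weighted average of the profit in each type of winter. The sketch below assumes the relative frequencies over the 25 years serve as probabilities and that every coat stocked is sold, so the profit per coat is simply selling price minus cost. Names are illustrative.

```python
from fractions import Fraction

years = 25
# winter type: (number of such years, coats sold in such a year)
winters = {"mild": (10, 1000), "cold": (8, 1300), "very cold": (7, 2000)}
profit_per_coat = 2480 - 1730

# E[coats] = sum over winter types of P(type) * coats sold
expected_coats = sum(Fraction(count, years) * coats for count, coats in winters.values())
expected_profit = expected_coats * profit_per_coat

print(expected_coats, expected_profit)
```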