Probability and Statistics: Questions and Answers

Chapter 1

1. Define

a. Mutually exclusive event [1] (76 Ch, 80 Ba)

Answer:
Two events, A and B, are said to be mutually exclusive (or disjoint) if they cannot occur at the same time. This means that the occurrence of one event precludes the occurrence of the other. Mathematically, their intersection is an empty set, and the probability of both events happening together is zero. $$P(A \cap B) = 0$$

b. Independent events [1] (76 Ch, 80 Ba)

Answer:
Two events, A and B, are said to be independent if the occurrence of one event does not affect the probability of the occurrence of the other event. Mathematically, the probability of both events occurring is the product of their individual probabilities. $$P(A \cap B) = P(A) \cdot P(B)$$

c. Conditional Probability [1] (79 Ba)

Answer:
Conditional probability is the probability of an event (A) occurring, given that another event (B) has already occurred. It is denoted by $P(A|B)$ and is read as “the probability of A given B”. The formula is: $$P(A|B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) > 0$$
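The formula above can be checked with a short Python sketch; the die-roll events below are illustrative, not from the text:

```python
# Conditional probability from counts on a fair six-sided die (illustrative).
# Event A: roll is even {2, 4, 6}; event B: roll is greater than 3 {4, 5, 6}.
outcomes = range(1, 7)
A = {x for x in outcomes if x % 2 == 0}   # {2, 4, 6}
B = {x for x in outcomes if x > 3}        # {4, 5, 6}

p_B = len(B) / 6                          # P(B) = 3/6
p_A_and_B = len(A & B) / 6                # P(A ∩ B) = 2/6, since A ∩ B = {4, 6}
p_A_given_B = p_A_and_B / p_B             # P(A|B) = (2/6) / (3/6) = 2/3
```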

2. What are the merits and demerits of mode? [2] (80 Bh)

Answer:
The mode is the value that appears most frequently in a data set.

Merits of Mode:

  1. Easy to Understand and Calculate: It is the simplest measure of central tendency to identify, often just by observation.
  2. Not Affected by Extreme Values: The mode is not influenced by outliers, making it a stable measure for skewed data.
  3. Applicable to Categorical Data: It is the only measure of central tendency that can be used for nominal (categorical) data, such as a favorite color or brand.
  4. Can be Located Graphically: The mode can be easily determined from a histogram or frequency polygon as the value corresponding to the highest peak.

Demerits of Mode:

  1. Not Rigidly Defined: A dataset can have no mode (all values occur once), one mode (unimodal), or multiple modes (bimodal/multimodal), which can make it ambiguous.
  2. Not Based on All Observations: It only considers the most frequent value and ignores all other values in the dataset.
  3. Not Suitable for Further Mathematical Treatment: The mode is not used in more advanced statistical calculations like variance or regression.
  4. Sampling Fluctuations: The mode is more susceptible to sampling fluctuations compared to the mean or median.

3. State

a. Bayes theorem for conditional probability [2] (78 Ka, 79 Bh, 81 Ba, 80 Bh)

Answer:
Bayes’ Theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event. If $A_1, A_2, …, A_n$ are a set of mutually exclusive and exhaustive events, and B is another event associated with them, then the conditional probability of an event $A_i$ given that B has occurred is: $$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n} P(B|A_j)P(A_j)}$$ Where:

  • $P(A_i|B)$ is the posterior probability.
  • $P(A_i)$ is the prior probability.
  • $P(B|A_i)$ is the likelihood.
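As a numerical illustration of the theorem, consider three machines producing defective items; the output shares and defect rates below are hypothetical:

```python
# Bayes' theorem: posterior P(A_i|B) from priors P(A_i) and likelihoods P(B|A_i).
priors = [0.5, 0.3, 0.2]          # P(A_i): hypothetical share of output per machine
likelihoods = [0.01, 0.02, 0.03]  # P(B|A_i): hypothetical defect rate per machine

# Denominator: total probability of event B, sum_j P(B|A_j) P(A_j)
p_B = sum(p * l for p, l in zip(priors, likelihoods))

# Posterior probability of each machine given a defective item was observed
posteriors = [p * l / p_B for p, l in zip(priors, likelihoods)]
```

Note that the posteriors always sum to 1, since the denominator normalizes over all the mutually exclusive and exhaustive events $A_j$.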

b. Law of addition of probability [2] (78 Bh)

Answer:
The law of addition of probability is used to find the probability that at least one of two events will occur.

  1. For Non-Mutually Exclusive Events: If events A and B are not mutually exclusive (i.e., they can happen at the same time), the probability of A or B occurring is the sum of their individual probabilities minus the probability of both occurring together. $$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
  2. For Mutually Exclusive Events: If events A and B are mutually exclusive (i.e., they cannot happen at the same time, so $P(A \cap B) = 0$), the probability of A or B occurring is simply the sum of their individual probabilities. $$P(A \cup B) = P(A) + P(B)$$

c. Various measures of central tendency [2] (78 Bh)

Answer:
Measures of central tendency are summary statistics that represent the center point or typical value of a dataset. The main measures are:

  1. Mean (Arithmetic Average): The sum of all values divided by the number of values. It is sensitive to outliers.
  2. Median: The middle value in a dataset that has been arranged in ascending or descending order. It is not affected by outliers.
  3. Mode: The most frequently occurring value in a dataset. It can be used for categorical data and is not affected by outliers.
Other measures include the Geometric Mean and the Harmonic Mean.

Chapter 2

4. Define random variable. Differentiate between discrete and continuous random variables with suitable examples [5] (79 Bh)

Answer:
Random Variable:
A random variable is a variable whose possible values are numerical outcomes of a random phenomenon. In other words, it is a function that assigns a real number to each outcome in the sample space of a random experiment. It is typically denoted by an uppercase letter, such as $X$.

Differentiation between Discrete and Continuous Random Variables:

| Basis of Difference | Discrete Random Variable | Continuous Random Variable |
| --- | --- | --- |
| Definition | A variable that can take on a finite or countably infinite number of distinct values. | A variable that can take on any value within a given range or interval. |
| Values | The values are typically integers and there are “gaps” between them. | The values can be any real number within a range, and there are no gaps. |
| Measurement | The values are obtained by counting. | The values are obtained by measuring. |
| Probability Function | Its probability distribution is described by a Probability Mass Function (PMF), $P(X=x)$. | Its probability distribution is described by a Probability Density Function (PDF), $f(x)$. |
| Probability at a Point | The probability at a specific point can be non-zero, $P(X=x) > 0$. | The probability at any single specific point is zero, $P(X=x) = 0$. Probability is calculated over an interval. |
| Examples | 1. The number of heads in three coin tosses (can be 0, 1, 2, 3). 2. The number of defective items in a batch of 20. 3. The number of cars passing a toll booth in an hour. | 1. The height of a student (can be 1.65 m, 1.651 m, etc.). 2. The temperature of a room. 3. The time it takes to complete a race. |

5. Write conditions for a probability mass function of a discrete random variable. [2] (80 Bh)

Answer:
A function $f(x) = P(X=x)$ can be considered a valid Probability Mass Function (PMF) for a discrete random variable $X$ if it satisfies the following two conditions:

  1. The probability of any specific outcome must be non-negative. $$f(x) \ge 0 \quad \text{for all possible values of } x$$
  2. The sum of the probabilities of all possible outcomes must be equal to 1. $$\sum_{\text{all } x} f(x) = 1$$
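Both conditions can be verified mechanically; here is a minimal check using the PMF of the number of heads in two fair coin tosses:

```python
# PMF of X = number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

# Condition 1: f(x) >= 0 for all x
non_negative = all(p >= 0 for p in pmf.values())

# Condition 2: sum of f(x) over all x equals 1
sums_to_one = abs(sum(pmf.values()) - 1.0) < 1e-12
```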

6. Under what conditions can the Binomial distribution be approximated by the Poisson distribution? [5] (78 Bh, 79 Ba)

Answer:
The Binomial distribution can be approximated by the Poisson distribution under the following specific conditions:

  1. The number of trials, $n$, is very large ($n \to \infty$).
  2. The probability of success, $p$, on any single trial is very small ($p \to 0$).
  3. The mean of the distribution, which is the product of $n$ and $p$ ($\lambda = np$), is a finite, constant, and moderate value.

Essentially, the Poisson distribution is a limiting case of the Binomial distribution for rare events occurring in a large number of trials.
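The quality of this limiting approximation can be seen numerically; the values of $n$ and $p$ below are illustrative choices of a large $n$ and small $p$:

```python
import math

# Compare the exact Binomial pmf with the Poisson pmf with lambda = np.
n, p = 1000, 0.003
lam = n * p  # lambda = np = 3

def binom_pmf(x):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x):
    return math.exp(-lam) * lam**x / math.factorial(x)

# The two pmfs agree closely at each point when n is large and p is small.
max_gap = max(abs(binom_pmf(x) - poisson_pmf(x)) for x in range(15))
```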

7. Write differences and similarities between Binomial and Negative Binomial distribution [2 + 3] (78 Ka, 79 Bh)

Answer:
Similarities:

  • Bernoulli Trials: Both distributions are based on a sequence of independent Bernoulli trials, where each trial has only two possible outcomes (success or failure).
  • Constant Probability: In both distributions, the probability of success ($p$) remains constant from trial to trial.
  • Discrete Nature: Both are discrete probability distributions.

Differences:

| Basis | Binomial Distribution | Negative Binomial Distribution |
| --- | --- | --- |
| Random Variable | The random variable $X$ is the number of successes in a fixed number of trials. | The random variable $X$ is the number of trials required to obtain a fixed number of successes. |
| Number of Trials | The number of trials ($n$) is fixed and finite. | The number of trials ($x$) is variable and can theoretically go to infinity. |
| Number of Successes | The number of successes ($x$) is variable. | The number of successes ($r$) is fixed. |
| Example | Find the probability of getting 3 heads in 5 coin tosses. | Find the probability that the 3rd head occurs on the 5th coin toss. |

8. Write differences and similarities between Binomial and Hypergeometric Binomial distribution [5] (81 Ba, 76 Ch)

Answer:
(Assuming “Hypergeometric Binomial distribution” means Hypergeometric Distribution).

Similarities:

  • Two Outcomes: Both distributions are used to model situations with two possible outcomes for each trial (e.g., success/failure, defective/non-defective).
  • Counting Successes: Both are concerned with the probability of obtaining a specific number of successes ($x$) in a certain number of trials ($n$).
  • Discrete Distributions: Both are discrete probability distributions.

Differences:

| Basis | Binomial Distribution | Hypergeometric Distribution |
| --- | --- | --- |
| Sampling Method | Sampling is done with replacement. | Sampling is done without replacement. |
| Population Size | Assumes an infinite (or very large) population. | Deals with a finite population. |
| Trial Independence | The trials are independent. The probability of success ($p$) is constant for each trial. | The trials are dependent. The probability of success changes with each draw. |
| Parameters | Defined by two parameters: number of trials ($n$) and probability of success ($p$). | Defined by three parameters: population size ($N$), number of successes in the population ($K$), and sample size ($n$). |
| Application | Quality control where items are replaced or when the population is large enough that non-replacement is negligible. | Quality control where items are not replaced (destructive testing), card games, surveys from a small population. |

9. Why does the Negative Binomial distribution tend to the Poisson distribution? [2] (79 Ba)

Answer:
The Negative Binomial distribution tends to the Poisson distribution under specific limiting conditions. The Negative Binomial distribution models the number of failures ($k$) before the $r$-th success. If we let the number of successes $r$ become very large ($r \to \infty$) and the probability of success $p$ approach 1 ($p \to 1$) in such a way that the mean number of failures $\lambda = r(1-p)$ remains a finite constant, the distribution of the number of failures approaches a Poisson distribution with mean $\lambda$. This is because the scenario models a large number of rare events (failures) over a very long sequence of trials.

10. Write the condition under which the Hypergeometric distribution can be approximated by the Binomial distribution [2] (76 Ch, 80 Ba)

Answer:
The Hypergeometric distribution can be approximated by the Binomial distribution when the population size ($N$) is very large compared to the sample size ($n$).
The common rule of thumb is that the approximation is good if the sample size is no more than 5% of the population size, i.e., $$\frac{n}{N} \le 0.05$$ Under this condition, the effect of sampling without replacement is negligible, and the probability of success on each trial is nearly constant, similar to the Binomial model.
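This rule of thumb can be checked numerically; the population and sample sizes below are illustrative values satisfying $n/N \le 0.05$:

```python
import math

# Compare the Hypergeometric pmf with the Binomial pmf with p = K/N.
N, K, n = 1000, 200, 20            # n/N = 0.02, within the 5% rule of thumb
p = K / N

def hyper_pmf(x):
    return math.comb(K, x) * math.comb(N - K, n - x) / math.comb(N, n)

def binom_pmf(x):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# With the population much larger than the sample, the pmfs nearly coincide.
max_gap = max(abs(hyper_pmf(x) - binom_pmf(x)) for x in range(n + 1))
```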

11. Write basic properties of Hypergeometric Distribution [4] (80 Ba)

Answer:
The basic properties of the Hypergeometric Distribution with parameters $N$ (population size), $K$ (number of success items in the population), and $n$ (sample size) are:

  1. Parameters: It is defined by the three parameters $N, K, n$.
  2. Mean (Expected Value): The mean number of successes is given by: $$E(X) = \mu = n\frac{K}{N}$$
  3. Variance: The variance is given by: $$Var(X) = \sigma^2 = n\frac{K}{N}\left(1-\frac{K}{N}\right)\left(\frac{N-n}{N-1}\right)$$ The term $\frac{N-n}{N-1}$ is called the finite population correction factor.
  4. Sampling: It models sampling without replacement from a finite population where the trials are dependent.
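The mean and variance formulas above can be evaluated directly; the parameters $N$, $K$, $n$ below are illustrative:

```python
# Mean and variance of a Hypergeometric distribution (illustrative parameters).
N, K, n = 50, 10, 5

mean = n * K / N                            # E(X) = nK/N
fpc = (N - n) / (N - 1)                     # finite population correction factor
variance = n * (K / N) * (1 - K / N) * fpc  # Var(X)
```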

12. Define

a. Binomial distribution [1] (79 Ba)

Answer:
The Binomial distribution is a discrete probability distribution that describes the probability of obtaining exactly $x$ successes in a fixed number of $n$ independent Bernoulli trials, where the probability of success, $p$, is constant for each trial.

b. Negative Binomial distribution [1] (80 Ba)

Answer:
The Negative Binomial distribution is a discrete probability distribution that describes the probability that the $r$-th success in a sequence of independent Bernoulli trials occurs on the $x$-th trial.

c. Hypergeometric Distribution [1] (76 Ch)

Answer:
The Hypergeometric distribution is a discrete probability distribution that describes the probability of obtaining exactly $x$ successes in a sample of size $n$, drawn without replacement from a finite population of size $N$ that contains exactly $K$ successes.

Chapter 3

13. Describe conditions for probability density function [2] (78 Bh)

Answer:
A function $f(x)$ is a valid Probability Density Function (PDF) for a continuous random variable $X$ if it satisfies the following two conditions:

  1. The function must be non-negative for all possible values of $x$. $$f(x) \ge 0 \quad \text{for all } x$$
  2. The total area under the curve of the function over its entire range must be equal to 1. $$\int_{-\infty}^{\infty} f(x) \,dx = 1$$
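A numeric sketch of both conditions, using the standard textbook density $f(x) = 2x$ on $[0, 1]$ (the integral is approximated by the trapezoidal rule):

```python
# Check the two PDF conditions for f(x) = 2x on [0, 1].
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

steps = 10_000
dx = 1.0 / steps

# Condition 2: trapezoidal approximation of the total area under f
area = sum((f(i * dx) + f((i + 1) * dx)) / 2 * dx for i in range(steps))

# Condition 1: f(x) >= 0 on the grid
non_negative = all(f(i * dx) >= 0 for i in range(steps + 1))
```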

14. Write area properties of normal distribution [4] (78 Ka)

Answer:
The area properties of the normal distribution, also known as the Empirical Rule (or the 68-95-99.7 rule), describe the percentage of data that falls within certain standard deviations ($\sigma$) from the mean ($\mu$).

  • Approximately 68.27% of the data lies within one standard deviation of the mean (i.e., in the range $[\mu - \sigma, \mu + \sigma]$).
  • Approximately 95.45% of the data lies within two standard deviations of the mean (i.e., in the range $[\mu - 2\sigma, \mu + 2\sigma]$).
  • Approximately 99.73% of the data lies within three standard deviations of the mean (i.e., in the range $[\mu - 3\sigma, \mu + 3\sigma]$).
  • The curve is symmetric about the mean $\mu$, so 50% of the area lies to the left of the mean and 50% lies to the right.
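These percentages follow directly from the standard normal CDF, $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$, as this short check shows:

```python
import math

# Standard normal CDF via the error function.
def phi(z):
    return (1 + math.erf(z / math.sqrt(2))) / 2

within_1sd = phi(1) - phi(-1)   # area within one standard deviation
within_2sd = phi(2) - phi(-2)   # area within two standard deviations
within_3sd = phi(3) - phi(-3)   # area within three standard deviations
```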

15. Define

a. Standard Normal Distribution [1] (78 Bh)

Answer:
The Standard Normal Distribution is a special case of the normal distribution where the mean is 0 ($\mu = 0$) and the standard deviation is 1 ($\sigma = 1$). Any normal random variable $X$ with mean $\mu$ and standard deviation $\sigma$ can be transformed into a standard normal variable $Z$ using the formula $Z = (X - \mu) / \sigma$.

b. Gamma Probability Distribution [1] (79 Ba)

Answer:
The Gamma distribution is a two-parameter family of continuous probability distributions often used to model waiting times. It is a generalization of the exponential distribution and can be thought of as the waiting time until the $\alpha$-th event occurs in a Poisson process. It is defined by a shape parameter $\alpha$ and a scale parameter $\beta$.

16. Write important properties of Gamma distribution [3] (80 Ba)

Answer:
Important properties of the Gamma distribution with shape parameter $\alpha$ and scale parameter $\beta$:

  1. Mean and Variance:
    • Mean: $E(X) = \alpha\beta$
    • Variance: $Var(X) = \alpha\beta^2$
  2. Additive Property: If $X_1, X_2, …, X_k$ are independent random variables, each following a Gamma distribution with the same scale parameter $\beta$ and shape parameters $\alpha_1, \alpha_2, …, \alpha_k$ respectively, then their sum also follows a Gamma distribution: $$\sum_{i=1}^{k} X_i \sim \text{Gamma}\left(\sum_{i=1}^{k} \alpha_i, \beta\right)$$
  3. Special Cases: The Gamma distribution generalizes other distributions. For example:
    • When $\alpha = 1$, it becomes the Exponential distribution.
    • When $\beta = 2$ and $\alpha = \nu/2$ (where $\nu$ is degrees of freedom), it becomes the Chi-squared ($\chi^2$) distribution.

17. Under what conditions can the Poisson distribution be approximated by the Normal distribution? [1] (76 Ch)

Answer:
The Poisson distribution can be approximated by the Normal distribution when its mean, $\lambda$, is sufficiently large. A common rule of thumb is that the approximation is good if $\lambda \ge 20$. The approximating normal distribution would have a mean $\mu = \lambda$ and a variance $\sigma^2 = \lambda$.

Chapter 4

18. Define (With example)

a. Parameter, Statistics

Answer:

  • Parameter: A numerical value that describes a characteristic of an entire population. It is a fixed value, but usually unknown.
    Example: The average height of all adult males in Nepal ($\mu$).
  • Statistic: A numerical value that describes a characteristic of a sample. It is calculated from sample data and is a random variable that varies from sample to sample.
    Example: The average height of a sample of 100 adult males randomly selected in Nepal ($\bar{x}$).

b. Sample

Answer:
Sample: A subset of individuals or objects selected from a larger group (the population). The sample should ideally be representative of the population to allow for generalization of results.
Example: To study the voting preferences in a city of 1 million voters, a survey of 1,000 voters is a sample.

c. Sampling distribution of mean

Answer:
Sampling Distribution of the Mean: The probability distribution of all possible sample means ($\bar{x}$) that could be computed from all possible samples of a fixed size ($n$) drawn from a given population.
Example: If we repeatedly take samples of size 30 from the population of student heights at a university and calculate the mean height for each sample, the distribution of all these sample means is the sampling distribution of the mean.

d. Standard error of mean

Answer:
Standard Error of the Mean (SEM): The standard deviation of the sampling distribution of the sample mean. It measures the variability or precision of the sample mean as an estimate of the population mean. It is calculated as $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population standard deviation and $n$ is the sample size.
Example: If the standard deviation of student heights at a university ($\sigma$) is 5 cm, and we take a sample of 100 students ($n=100$), the standard error of the mean is $\sigma_{\bar{x}} = \frac{5}{\sqrt{100}} = 0.5$ cm.

e. Statistics Population

Answer:
Population (in Statistics): The entire collection of individuals, objects, or measurements about which information is desired. It is the whole group from which a sample is drawn.
Example: All the light bulbs produced by a factory, or all the registered voters in a country.

19. Distinguish between population and sample [2] (81 Ba)

Answer:

| Basis | Population | Sample |
| --- | --- | --- |
| Definition | The entire group of individuals or items under study. | A subset or a part of the population selected for study. |
| Characteristic | A numerical characteristic is called a parameter (e.g., $\mu, \sigma$). | A numerical characteristic is called a statistic (e.g., $\bar{x}, s$). |
| Size | The number of items is denoted by $N$. It can be finite or infinite. | The number of items is denoted by $n$. It is always finite. |
| Data Collection | A complete enumeration of all items is called a census. | Data collection from a subset is called a survey or sampling. |
| Objective | To get information about its parameters. | To make inferences and draw conclusions about the population. |

20. Prove that sampling distribution of sample mean is unbiased estimator of the population mean. Also obtain the expression for standard error of sample mean when the population is infinitely large [4] (79 Ba)

Answer:
Part 1: Proof of Unbiasedness
An estimator is unbiased if its expected value is equal to the population parameter it is estimating. We need to prove that $E(\bar{x}) = \mu$.

Let $X_1, X_2, …, X_n$ be a random sample of size $n$ from a population with mean $\mu$ and variance $\sigma^2$. The sample mean $\bar{x}$ is defined as: $$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} X_i$$ Now, let’s find the expected value of $\bar{x}$: $$E(\bar{x}) = E\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right)$$ Using the property of expectation $E(aX) = aE(X)$: $$E(\bar{x}) = \frac{1}{n} E\left(\sum_{i=1}^{n} X_i\right)$$ Using the property $E(X+Y) = E(X) + E(Y)$: $$E(\bar{x}) = \frac{1}{n} \left(\sum_{i=1}^{n} E(X_i)\right)$$ Since each $X_i$ is drawn from the same population, the expected value of each observation is the population mean, i.e., $E(X_i) = \mu$ for all $i$. $$E(\bar{x}) = \frac{1}{n} \left(\sum_{i=1}^{n} \mu\right) = \frac{1}{n} (n\mu)$$ $$E(\bar{x}) = \mu$$ Since the expected value of the sample mean is equal to the population mean, the sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$.

Part 2: Expression for Standard Error
The standard error of the sample mean is the standard deviation of its sampling distribution, i.e., $SE(\bar{x}) = \sqrt{Var(\bar{x})}$.

First, we find the variance of $\bar{x}$: $$Var(\bar{x}) = Var\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right)$$ Using the property of variance $Var(aX) = a^2Var(X)$: $$Var(\bar{x}) = \frac{1}{n^2} Var\left(\sum_{i=1}^{n} X_i\right)$$ For an infinitely large population (or sampling with replacement), the random variables $X_i$ are independent. For independent variables, $Var(X+Y) = Var(X) + Var(Y)$. $$Var(\bar{x}) = \frac{1}{n^2} \left(\sum_{i=1}^{n} Var(X_i)\right)$$ Since each $X_i$ is from a population with variance $\sigma^2$, we have $Var(X_i) = \sigma^2$ for all $i$. $$Var(\bar{x}) = \frac{1}{n^2} \left(\sum_{i=1}^{n} \sigma^2\right) = \frac{1}{n^2} (n\sigma^2) = \frac{\sigma^2}{n}$$ The standard error of the sample mean is the square root of the variance: $$SE(\bar{x}) = \sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$$ This is the expression for the standard error of the sample mean for an infinite population.
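Both results can be checked empirically with a small simulation; the population parameters, sample size, and number of trials below are illustrative:

```python
import random
import statistics

# Simulate many sample means: their average should be close to mu
# (unbiasedness) and their spread close to sigma / sqrt(n) (standard error).
random.seed(42)
mu, sigma, n, trials = 10.0, 2.0, 25, 20_000

sample_means = [
    statistics.fmean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(trials)
]

est_mean = statistics.fmean(sample_means)  # should approximate mu = 10
est_se = statistics.stdev(sample_means)    # should approximate sigma/sqrt(n) = 0.4
```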

21. Define

a. Central Limit Theorem [1] (78 Bh, 78 Ka, 79 Ba, 80 Ba, 80Bh, 81 Ba)

Answer:
The Central Limit Theorem (CLT) states that if you take a sufficiently large random sample (usually $n \ge 30$) from any population with a finite mean $\mu$ and variance $\sigma^2$, the sampling distribution of the sample mean ($\bar{x}$) will be approximately normally distributed, regardless of the shape of the original population’s distribution.

b. Sampling distribution [1] (79 Ba)

Answer:
A sampling distribution is the probability distribution of a statistic (such as the sample mean or sample proportion) that is obtained from all possible samples of a particular size ($n$) drawn from a specific population.

22. Write importance/applications of central limit theorem [4] (81 Ba)

Answer:
The Central Limit Theorem (CLT) is one of the most important theorems in statistics due to its wide-ranging applications:

  1. Foundation for Inference: It is the foundation for many inferential statistical procedures, such as hypothesis testing and constructing confidence intervals for the population mean.
  2. Justifies Use of Normal Distribution: It allows us to use statistical methods based on the normal distribution (like Z-tests and t-tests) even when the original population is not normally distributed, as long as the sample size is large enough.
  3. Simplifies Problem Solving: It simplifies problems by allowing us to work with a well-understood and predictable normal distribution instead of a potentially unknown or complex population distribution.
  4. Practical Applications in Quality Control: In industrial quality control, the CLT is used to create control charts for monitoring the mean of a process. Even if the individual measurements are not normal, the distribution of sample means will be, allowing for standardized process control.

Chapter 5

23. Define

a. Correlation coefficients [1] (78 Bh)

Answer:
The correlation coefficient (usually denoted by $r$) is a statistical measure that quantifies the strength and direction of the linear relationship between two quantitative variables. Its value ranges from -1 to +1, where +1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.

b. Partial correlation (With eg) [2] (79 Bh)

Answer:
Partial correlation measures the degree of association between two variables after removing the effect of one or more other controlling variables.

Example: We might want to find the correlation between a child’s academic performance and their level of parental involvement. However, both of these might be influenced by the family’s socioeconomic status (SES). Partial correlation would allow us to measure the relationship between academic performance and parental involvement while holding SES constant, giving a clearer picture of their direct relationship.

24. What are two regression coefficient. State its important properties (and under what condition there exists only one regression line) [Write importance] [5] (76 Ch, 78 Bh, 78 Ka, 80 (+2) Ba, [+2] 79 Ba, [+2] 81 Ba)

Answer:
The two regression coefficients are:

  • Regression coefficient of Y on X ($b_{yx}$): It represents the change in the dependent variable Y for a unit change in the independent variable X.
  • Regression coefficient of X on Y ($b_{xy}$): It represents the change in the variable X for a unit change in the variable Y (when X is treated as dependent).

Important Properties:

  • The correlation coefficient ($r$) is the geometric mean of the two regression coefficients: $$r = \pm\sqrt{b_{yx} \cdot b_{xy}}$$
  • Both regression coefficients will always have the same sign (positive or negative). The sign of the correlation coefficient will be the same as their common sign.
  • If one of the regression coefficients is greater than 1 in absolute value, the other must be less than 1 in absolute value, since their product satisfies $b_{yx} \cdot b_{xy} = r^2 \le 1$.
  • The regression coefficients are independent of the change of origin but not of scale.
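The geometric-mean property can be verified on data; the small dataset below is illustrative:

```python
import math

# Check r^2 = b_yx * b_xy on a small illustrative dataset.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

b_yx = sxy / sxx                 # regression coefficient of Y on X
b_xy = sxy / syy                 # regression coefficient of X on Y
r = sxy / math.sqrt(sxx * syy)   # correlation coefficient

# r carries the common sign of the two coefficients, and r^2 equals their product.
```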

Condition for Only One Regression Line:
There exists only one regression line when the correlation between the two variables is perfect, i.e., when the correlation coefficient $r = +1 \text{ or } r = -1$. In this case, all data points lie on a single straight line, and the regression line of Y on X and the regression line of X on Y become identical.

Importance:
The primary importance of regression analysis is prediction and forecasting. It allows us to estimate or predict the value of a dependent variable based on the known value of an independent variable. It is widely used in economics, engineering, and social sciences to model relationships between variables.

25. Distinguish between correlation coefficient and regression coefficient and write its importance in field of engineering [5] (79 Ba)

Answer:
Distinction between Correlation and Regression Coefficients:

| Basis | Correlation Coefficient ($r$) | Regression Coefficient ($b_{yx}$ or $b_{xy}$) |
| --- | --- | --- |
| Meaning | Measures the degree and direction of the linear association between two variables. | Measures the average change in the dependent variable for a unit change in the independent variable. |
| Symmetry | Symmetric. The correlation between X and Y is the same as between Y and X ($r_{xy} = r_{yx}$). | Asymmetric. The regression of Y on X is not the same as X on Y ($b_{yx} \neq b_{xy}$), except in special cases. |
| Cause & Effect | Does not imply a cause-and-effect relationship. | Implies a dependency relationship, where one variable is assumed to influence the other. |
| Units | It is a pure number and has no units. | It has the units of the ratio of the dependent variable to the independent variable. |
| Range | Lies between -1 and +1 inclusive. | Can take any real value from $-\infty \text{ to } +\infty$. |
| Objective | To find the strength of the linear relationship. | To estimate the value of one variable based on the other (prediction). |

Importance in Engineering:

  • Correlation: Used in engineering to identify relationships between variables. For example, to determine if there is a relationship between the hardness of a material and its tensile strength, or between operating temperature and machine efficiency.
  • Regression: Crucial for modeling and prediction. For example, creating a model to predict the lifespan of a component based on its usage hours, predicting the compressive strength of concrete based on the curing time, or modeling the relationship between process parameters and the quality of the final product in manufacturing.

Chapter 6

26. Define Hypothesis [1] (79 Ba)

Answer:
A hypothesis (or statistical hypothesis) is a claim, assumption, or statement about a population parameter (such as the mean or proportion). This claim is tested on the basis of evidence obtained from a sample of data.

27. Explain major steps in testing hypothesis

a. Single mean for small sample [5] (78 Bh)

Answer:
The major steps for testing a hypothesis about a single population mean ($\mu$) for a small sample ($n < 30$) using a t-test are:

  1. State the Hypotheses.
    • Null Hypothesis ($H_0$): A statement of no effect or no difference, e.g., $H_0: \mu = \mu_0$.
    • Alternative Hypothesis ($H_a$ or $H_1$): The claim to be tested, e.g., $H_a: \mu \neq \mu_0$ (two-tailed), $H_a: \mu > \mu_0$ (right-tailed), or $H_a: \mu < \mu_0$ (left-tailed).
  2. Set the Level of Significance ($\alpha$).

    This is the probability of making a Type I error. Common values are 0.05, 0.01, or 0.10.

  3. Choose the Test Statistic.

    Since the sample is small ($n < 30$) and the population standard deviation ($\sigma$) is unknown, we use the t-statistic: $$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$ Here, $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, and the statistic follows a t-distribution with $df = n-1$ degrees of freedom.

  4. Determine the Critical Region.

    Based on $\alpha$ and the degrees of freedom, find the critical t-value(s) from the t-distribution table. This region defines the values of the test statistic that will lead to the rejection of $H_0$.

  5. Calculate the Test Statistic.

    Compute the value of the t-statistic using the sample data ($\bar{x}, s, n$).

  6. Make a Decision.

    If the calculated t-statistic falls into the critical region, we reject the null hypothesis ($H_0$). Otherwise, we fail to reject $H_0$.

  7. State the Conclusion.

    Interpret the decision in the context of the original problem.
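Steps 3 and 5 above can be sketched in Python; the sample data and hypothesized mean are illustrative:

```python
import math
import statistics

# One-sample t statistic for a small sample (illustrative data).
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
mu_0 = 12.0                      # hypothesized population mean under H0

n = len(sample)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)     # sample standard deviation (n - 1 divisor)

t = (x_bar - mu_0) / (s / math.sqrt(n))
df = n - 1                       # degrees of freedom for the t-table lookup
```

The computed $t$ is then compared against the critical value from the t-table at the chosen $\alpha$ and $df$, as in step 6.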

b. Difference of means for infinite population [5] (79 Bh, 80 Bh)

Answer:
The question implies large samples, as an infinite population allows for large sample sizes where the Central Limit Theorem applies. This uses a Z-test for the difference between two means.

  1. State the Hypotheses.
    • Null Hypothesis ($H_0$): $H_0: \mu_1 = \mu_2$ or $H_0: \mu_1 - \mu_2 = 0$.
    • Alternative Hypothesis ($H_a$): $H_a: \mu_1 \neq \mu_2$, $H_a: \mu_1 > \mu_2$, or $H_a: \mu_1 < \mu_2$.
  2. Set the Level of Significance ($\alpha$).

    Choose a value for $\alpha$, typically 0.05.

  3. Choose the Test Statistic.

    For large samples ($n_1 \ge 30$ and $n_2 \ge 30$), we use the Z-statistic: $$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$ If population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown, they can be replaced by sample variances $s_1^2$ and $s_2^2$. The hypothesized difference $(\mu_1 - \mu_2)_0$ is usually 0.

  4. Determine the Critical Region.

    Find the critical Z-value(s) from the standard normal distribution table based on $\alpha$. For example, for a two-tailed test at $\alpha=0.05$, the critical values are $Z = \pm 1.96$.

  5. Calculate the Test Statistic.

    Compute the Z-value using the sample data ($\bar{x}_1, \bar{x}_2, s_1, s_2, n_1, n_2$).

  6. Make a Decision.

    If the calculated $|Z|$ is greater than the critical $|Z|$, reject $H_0$. Otherwise, fail to reject $H_0$.

  7. State the Conclusion.

    Write a conclusion based on the decision, addressing the original research question.
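The test statistic from step 3 can be computed as in this sketch; the summary statistics (means, variances, sizes) are hypothetical:

```python
import math

# Two-sample Z test for the difference of means (illustrative summary statistics).
x1_bar, s1_sq, n1 = 52.0, 16.0, 100
x2_bar, s2_sq, n2 = 50.0, 25.0, 120

# Sample variances replace the unknown population variances (large samples).
se = math.sqrt(s1_sq / n1 + s2_sq / n2)
z = (x1_bar - x2_bar - 0) / se   # hypothesized difference under H0 is 0

# Two-tailed decision at alpha = 0.05: reject H0 if |Z| > 1.96
reject_h0 = abs(z) > 1.96
```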

28. What are errors of test of hypothesis? [2] (79 Bh)

Answer:
In hypothesis testing, there are two types of potential errors we can make:

  • Type I Error: This error occurs when we reject a true null hypothesis ($H_0$). The probability of committing a Type I error is denoted by $\alpha$, which is the level of significance of the test.
  • Type II Error: This error occurs when we fail to reject a false null hypothesis ($H_0$). The probability of committing a Type II error is denoted by $\beta$.

| Decision | $H_0$ is True | $H_0$ is False |
|---|---|---|
| Do Not Reject $H_0$ | Correct Decision (Prob: $1 - \alpha$) | Type II Error (Prob: $\beta$) |
| Reject $H_0$ | Type I Error (Prob: $\alpha$) | Correct Decision (Power) (Prob: $1 - \beta$) |

Chapter 7

29. Explain major steps in testing hypothesis

a. For two population proportions / Paired test. [5] (78 Bh)

Answer:
The question names two different tests; this answer covers testing the difference between two population proportions ($p_1$ and $p_2$) for large samples:

  1. State the Hypotheses.
    • Null Hypothesis ($H_0$): $H_0: p_1 = p_2$ or $H_0: p_1 - p_2 = 0$.
    • Alternative Hypothesis ($H_a$): $H_a: p_1 \neq p_2$, $H_a: p_1 > p_2$, or $H_a: p_1 < p_2$.
  2. Set the Level of Significance ($\alpha$).
  3. Choose the Test Statistic.

    For large samples, the Z-statistic is used. Under $H_0$, we assume $p_1=p_2$, so we use a pooled estimate of the proportion, $\hat{p}$: $$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$ The test statistic is: $$Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

  4. Determine the Critical Region.

    Find the critical Z-value(s) from the standard normal table corresponding to $\alpha$.

  5. Calculate the Test Statistic.

    Compute the value of Z using the sample data ($x_1, n_1, x_2, n_2$ or $\hat{p}_1, n_1, \hat{p}_2, n_2$).

  6. Make a Decision.

    Reject $H_0$ if the calculated Z-statistic falls in the critical region. Otherwise, fail to reject $H_0$.

  7. State the Conclusion.

    Interpret the result in the context of the problem.
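The pooled-proportion procedure above can be sketched in Python (illustrative counts, not from the source; `statistics.NormalDist` stands in for the Z-table):

```python
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alpha=0.05):
    """Two-tailed Z-test for H0: p1 = p2 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                # pooled estimate under H0
    se = (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1_hat - p2_hat) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > z_crit

# Hypothetical data: 45 successes out of 100 vs 30 out of 100
z, reject = two_prop_z_test(45, 100, 30, 100)
```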

b. Single proportion for large population [4] (78 Ka, 79 Ba, 80 Ba)

Answer:
This is a test for a single population proportion ($p$) using a large sample.

  1. State the Hypotheses.
    • Null Hypothesis ($H_0$): $H_0: p = p_0$ (where $p_0$ is the hypothesized proportion).
    • Alternative Hypothesis ($H_a$): $H_a: p \neq p_0$, $H_a: p > p_0$, or $H_a: p < p_0$.
  2. Set the Level of Significance ($\alpha$).
  3. Choose the Test Statistic.

    For a large sample (where $np_0 \ge 5$ and $n(1-p_0) \ge 5$), we use the Z-statistic: $$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$ where $\hat{p} = x/n$ is the sample proportion.

  4. Determine the Critical Region and Make a Decision.

    Find the critical Z-value from the standard normal table for the given $\alpha$. Calculate the Z-statistic and compare. If the calculated Z falls in the critical region, reject $H_0$; otherwise, do not reject $H_0$.

  5. State the Conclusion.

    State the final conclusion in simple, non-technical terms.

30. Write procedure for testing population proportion for large sample [5] (80 Ba)

Answer:
This is the detailed procedure for testing a hypothesis about a single population proportion ($p$) for a large sample:

  1. Formulate the Hypotheses:
    • State the null hypothesis, $H_0: p = p_0$, which represents the claim being tested.
    • State the alternative hypothesis, $H_a$, which represents what will be concluded if $H_0$ is rejected. This can be one of three forms:
      • Two-tailed: $H_a: p \neq p_0$
      • Right-tailed: $H_a: p > p_0$
      • Left-tailed: $H_a: p < p_0$
  2. Set the Significance Level:

    Choose a level of significance, $\alpha$, which is the maximum acceptable probability of rejecting a true null hypothesis. Common choices are 0.05 and 0.01.

  3. Identify the Test Statistic and Check Assumptions:

    The appropriate test statistic is the Z-statistic. The assumptions for this test are that the sample is random and the sample size is large enough such that $np_0 \ge 5$ and $n(1-p_0) \ge 5$.

  4. Define the Rejection Rule:

    Determine the critical value(s) from the Z-distribution table based on $\alpha$.

    • For a two-tailed test, the rejection region is $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$.
    • For a right-tailed test, the rejection region is $Z > Z_{\alpha}$.
    • For a left-tailed test, the rejection region is $Z < -Z_{\alpha}$.
  5. Compute the Test Statistic:

    Collect sample data and calculate the sample proportion, $\hat{p} = x/n$, where x is the number of successes in a sample of size n. Calculate the Z-statistic using the formula: $$Z_{\text{calc}} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$

  6. Make a Statistical Decision and Conclude:

    Compare the calculated Z-statistic ($Z_{\text{calc}}$) with the critical Z-value. If $Z_{\text{calc}}$ falls into the rejection region, reject the null hypothesis ($H_0$). If $Z_{\text{calc}}$ does not fall into the rejection region, fail to reject the null hypothesis ($H_0$). State the final conclusion in the context of the problem, explaining whether there is sufficient evidence to support the alternative hypothesis.
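The full procedure can be sketched in Python (a minimal illustration with hypothetical data; the rejection rules for all three tail options follow step 4 above):

```python
from statistics import NormalDist

def one_prop_z_test(x, n, p0, alpha=0.05, tail="two"):
    """Large-sample Z-test for H0: p = p0."""
    assert n * p0 >= 5 and n * (1 - p0) >= 5, "large-sample condition fails"
    p_hat = x / n                                     # step 5: sample proportion
    z = (p_hat - p0) / ((p0 * (1 - p0) / n) ** 0.5)   # step 5: test statistic
    if tail == "two":                                 # step 4: rejection rule
        reject = abs(z) > NormalDist().inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z > NormalDist().inv_cdf(1 - alpha)
    else:  # left-tailed
        reject = z < -NormalDist().inv_cdf(1 - alpha)
    return z, reject

# Hypothetical claim p0 = 0.5, with 60 successes in 100 trials
z, reject = one_prop_z_test(60, 100, 0.5)
```

Here $Z_{\text{calc}} = 2.0 > 1.96$, so $H_0: p = 0.5$ is rejected at the 5% level.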

Additional Material

Define Box-Plot

Answer:
A box plot, also known as a box-and-whisker plot, is a standardized way of displaying the distribution of data based on a five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum.

The plot consists of:

  • A box extending from the first quartile (Q1) to the third quartile (Q3), representing the middle 50% of the data (the Interquartile Range or IQR).
  • A line inside the box marking the median.
  • Whiskers that extend from the box to the smallest and largest observations within a certain range (typically 1.5 times the IQR).
  • Outliers are plotted as individual points beyond the whiskers.
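The five-number summary and the 1.5-IQR outlier rule can be sketched with the standard library (note that quartile conventions vary between textbooks and software; `statistics.quantiles` uses one common method):

```python
from statistics import quantiles

def five_number_summary(data):
    """Five-number summary plus 1.5 * IQR outliers."""
    q1, q2, q3 = quantiles(data, n=4)   # default "exclusive" method
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # whisker limits
    outliers = [x for x in data if x < lo or x > hi]
    return min(data), q1, q2, q3, max(data), outliers

data = [2, 4, 4, 5, 6, 7, 8, 9, 30]
mn, q1, med, q3, mx, out = five_number_summary(data)
```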

State BAYES Theorem

Answer:
Bayes’ Theorem describes the probability of an event based on prior knowledge of conditions that might be related to the event. It provides a way to update the probability of a hypothesis (H) given new evidence (E). The formula is: $$P(H|E) = \frac{P(E|H) \cdot P(H)}{P(E)}$$ Where:

  • $P(H|E)$ is the posterior probability: the probability of hypothesis H given the evidence E.
  • $P(E|H)$ is the likelihood: the probability of observing evidence E given that hypothesis H is true.
  • $P(H)$ is the prior probability: the initial probability of hypothesis H before observing the evidence.
  • $P(E)$ is the marginal probability: the total probability of observing the evidence E.
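A small numeric sketch of the theorem, with $P(E)$ expanded by the law of total probability (the diagnostic-test numbers are hypothetical, chosen only for illustration):

```python
def bayes_posterior(prior, likelihood, likelihood_if_not):
    """P(H|E) via Bayes' theorem; P(E) by total probability."""
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical diagnostic test: 1% prevalence, 99% sensitivity,
# 5% false-positive rate
post = bayes_posterior(prior=0.01, likelihood=0.99, likelihood_if_not=0.05)
```

Even with a highly accurate test, the posterior $P(\text{disease}\mid\text{positive})$ here is only about 1/6, because the prior is so small.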

How does the Negative Binomial Distribution differ from the Binomial Distribution?

Answer:
The key difference lies in what is being measured. Both deal with a sequence of independent Bernoulli trials (success/failure).

  • Binomial Distribution: Measures the number of successes in a fixed number of trials. For example, the probability of getting exactly 3 heads in 10 coin flips.
  • Negative Binomial Distribution: Measures the number of trials required to achieve a fixed number of successes. For example, the probability that the 3rd head occurs on the 10th coin flip.

| Feature | Binomial Distribution | Negative Binomial Distribution |
|---|---|---|
| Random Variable | Number of successes ($k$) | Number of trials ($n$) |
| Fixed Parameter | Number of trials ($n$) | Number of successes ($r$) |
| Objective | Probability of $k$ successes in $n$ trials | Probability that the $r$-th success occurs on the $n$-th trial |
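The coin-flip examples above can be computed directly from the two PMFs (a small sketch using `math.comb`):

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def neg_binomial_pmf(n, r, p):
    """P(the r-th success occurs on the n-th trial)."""
    return comb(n - 1, r - 1) * p**r * (1 - p)**(n - r)

b = binomial_pmf(3, 10, 0.5)       # exactly 3 heads in 10 flips
nb = neg_binomial_pmf(10, 3, 0.5)  # 3rd head on the 10th flip
```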

What do you mean by Probability Mass Function and Probability Density Function?

Answer:
Both describe the probability distribution of a random variable, but for different types of variables.

  • Probability Mass Function (PMF): This is used for discrete random variables. It gives the probability that a discrete random variable is exactly equal to some value. For any value x, $f(x) = P(X=x)$. The sum of all probabilities in a PMF is equal to 1.
  • Probability Density Function (PDF): This is used for continuous random variables. It describes the relative likelihood for a random variable to take on a given value. The probability of the variable falling within a particular range is found by integrating the PDF over that range. The total area under the curve of a PDF is equal to 1.
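Both normalization properties can be checked numerically. Below, a Poisson PMF (an arbitrary discrete example) is summed, and the standard normal PDF is integrated with a crude Riemann sum:

```python
from math import exp, factorial, pi, sqrt

# PMF check: Poisson(lam = 2) probabilities sum to 1 (truncated sum,
# since the tail beyond k = 49 is negligible)
lam = 2.0
pmf_total = sum(lam**k * exp(-lam) / factorial(k) for k in range(50))

# PDF check: standard normal density integrates to 1 over [-8, 8)
def normal_pdf(x):
    return exp(-x * x / 2) / sqrt(2 * pi)

dx = 0.001
pdf_total = sum(normal_pdf(-8 + i * dx) * dx for i in range(16000))
```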

Define Normal Distribution

Answer:
The Normal Distribution, also known as the Gaussian distribution, is a continuous probability distribution that is symmetric about its mean. It is graphically represented by a bell-shaped curve. It is one of the most important distributions in statistics because many natural phenomena tend to follow it.
The distribution is completely defined by its two parameters:

  • Mean ($\mu$): The center of the distribution.
  • Standard Deviation ($\sigma$): A measure of the spread or variability of the data.

Differentiate between point estimation and interval estimation

Answer:
Point Estimation and Interval Estimation are two ways to estimate an unknown population parameter using sample data.

  • Point Estimation: Provides a single value as the best guess for the population parameter. For example, using the sample mean ($\bar{x}$) to estimate the population mean ($\mu$). While simple, it’s very unlikely that the point estimate is exactly correct.
  • Interval Estimation: Provides a range of values within which the population parameter is likely to lie, along with a certain level of confidence. This range is called a confidence interval. For example, “We are 95% confident that the true population mean ($\mu$) lies between 15 and 25.” This is more informative as it accounts for the uncertainty in the estimation.

Describe the General procedure of the test of significance

Answer:
A test of significance (or hypothesis test) is a formal procedure for using sample data to evaluate a claim about a population. The general steps are:

  1. State the Hypotheses:
    • Null Hypothesis ($H_0$): A statement of no effect or no difference, which is assumed to be true until evidence suggests otherwise.
    • Alternative Hypothesis ($H_1$ or $H_a$): The claim that the researcher is trying to find evidence for.
  2. Set the Significance Level ($\alpha$): This is the probability of rejecting the null hypothesis when it is actually true (Type I error). Common values are 0.05, 0.01, and 0.10.
  3. Calculate the Test Statistic: A value computed from the sample data that is used to decide whether to reject the null hypothesis. The choice of test statistic (e.g., z-score, t-score, $\chi^2$-statistic) depends on the data and the hypothesis.
  4. Determine the P-value or Critical Region:
    • P-value Approach: The p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.
    • Critical Region Approach: This is the range of values for the test statistic that would lead to rejecting the null hypothesis. The boundary of this region is the critical value.
  5. Make a Decision:
    • If $p\text{-value} \le \alpha$, you reject the null hypothesis ($H_0$).
    • If $p\text{-value} > \alpha$, you fail to reject the null hypothesis ($H_0$).
    Alternatively, if the test statistic falls into the critical region, you reject $H_0$.
  6. State the Conclusion: Interpret the decision in the context of the original research question.

What do you mean by Chi-square test?

Answer:
The Chi-square ($\chi^2$) test is a statistical hypothesis test used to determine if there is a significant association between two categorical variables. It compares the observed frequencies in the categories with the expected frequencies that would occur if there were no relationship between the variables (i.e., if the null hypothesis were true).
It is commonly used for:

  • Test of Independence: To determine if two categorical variables are independent of each other (e.g., is there a relationship between gender and voting preference?).
  • Test of Goodness-of-Fit: To determine if a sample’s distribution fits a claimed population distribution (e.g., does a die roll follow a uniform distribution?).

A large $\chi^2$ value suggests a significant difference between observed and expected frequencies, leading to the rejection of the null hypothesis.
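The statistic itself is a simple sum; a goodness-of-fit sketch for a hypothetical die (the critical value 11.07 for 5 degrees of freedom at $\alpha = 0.05$ comes from a $\chi^2$ table):

```python
def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical goodness-of-fit: 60 die rolls, expected 10 per face if fair
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
chi2 = chi_square_stat(observed, expected)
reject_fairness = chi2 > 11.07   # critical value, 5 df, alpha = 0.05
```

Here $\chi^2 = 1.0$ is far below 11.07, so there is no evidence against fairness.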

Write down the objectives of Statistical Quality Control

Answer:
Statistical Quality Control (SQC) uses statistical methods to monitor and control the quality of products and services. Its main objectives are:

  1. To Improve Quality: To reduce variability in processes and products, leading to higher quality and consistency.
  2. To Reduce Costs: To minimize waste, rework, and inspection costs by identifying problems early in the process.
  3. To Ensure Specification Conformance: To verify that products meet design specifications and customer requirements.
  4. To Increase Productivity: To make processes more efficient by reducing defects and downtime.
  5. To Make Informed Decisions: To provide a data-driven basis for decisions regarding process improvements.

Define HISTOGRAM

Answer:
A histogram is a graphical representation of the distribution of numerical data. It’s a type of bar chart where the x-axis represents the data range divided into a series of intervals (or “bins”), and the y-axis represents the frequency (the number of data points) that fall into each bin.
The key features of a histogram are:

  • The bars are adjacent to each other, indicating that the data is continuous.
  • The area of each bar is proportional to the frequency of observations in that interval.
  • It provides a visual sense of the data’s central tendency, spread, and shape (e.g., symmetric, skewed).

What do you mean by A Priori and A Posteriori Probability?

Answer:
These terms relate to probabilities before and after considering new evidence, often used in the context of Bayes’ Theorem.

  • A Priori Probability (Prior Probability): This is the probability of an event determined before new information or evidence is taken into account. It is based on existing knowledge, logical reasoning, or past data. For example, the a priori probability of a fair coin landing on heads is 0.5.
  • A Posteriori Probability (Posterior Probability): This is the revised or updated probability of an event calculated after considering new evidence. It is calculated using Bayes’ theorem by combining the prior probability with the likelihood of the new evidence. For example, if we have a bag with two coins (one fair, one double-headed), the posterior probability that we picked the fair coin given that we flipped it and got heads would be different from the prior probability of 0.5.

Why is the Poisson distribution known as a Uniparametric distribution?

Answer:
The Poisson distribution is called a uniparametric distribution because it is completely described by a single parameter, denoted by lambda ($\lambda$).
This parameter $\lambda$ represents both the mean and the variance of the distribution.

  • Mean: $E(X) = \lambda$
  • Variance: $Var(X) = \lambda$
Once the value of $\lambda$ (the average rate of event occurrence) is known, the entire probability distribution is defined.
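The mean-variance identity can be verified numerically by truncating the infinite sums (the tail converges quickly, so summing to $k = 99$ suffices for $\lambda = 4$):

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with rate lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 4.0
ks = range(100)   # the tail beyond k = 99 is negligible
mu = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mu) ** 2 * poisson_pmf(k, lam) for k in ks)
```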

Define Normal curve and write down its properties. Also write the condition for normal approximation to the Binomial and Poisson Probability distribution

Answer:
Normal Curve
The Normal Curve is the graphical representation of the Normal Distribution. It is a symmetric, bell-shaped curve that describes the distribution of many types of data.

Properties of the Normal Curve

  • Bell-Shaped and Symmetric: The curve is perfectly symmetric around its center, which is the mean ($\mu$).
  • Unimodal: It has only one peak, which occurs at the mean.
  • Mean, Median, and Mode are Equal: The mean, median, and mode are all located at the center of the distribution.
  • Asymptotic: The curve approaches the horizontal axis but never touches it.
  • Area Under the Curve: The total area under the curve is equal to 1 (or 100%).
  • The Empirical Rule (68-95-99.7 Rule):
    • Approximately 68% of the data falls within 1 standard deviation ($\sigma$) of the mean ($\mu$).
    • Approximately 95% of the data falls within 2 standard deviations of the mean.
    • Approximately 99.7% of the data falls within 3 standard deviations of the mean.

Conditions for Normal Approximation

  • For Binomial Distribution: A binomial distribution with $n$ trials and success probability $p$ can be approximated by a normal distribution with mean $\mu = np$ and variance $\sigma^2 = np(1-p)$, provided that: $$np \ge 5 \text{ and } n(1-p) \ge 5$$ (Some statisticians use 10 as the threshold).
  • For Poisson Distribution: A Poisson distribution with parameter $\lambda$ can be approximated by a normal distribution with mean $\mu = \lambda$ and variance $\sigma^2 = \lambda$, provided that: $$\lambda \text{ is sufficiently large, typically } \lambda \ge 10.$$
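The empirical-rule percentages listed above can be checked numerically with `statistics.NormalDist` (a quick sketch, not an exam requirement):

```python
from statistics import NormalDist

nd = NormalDist()   # standard normal: mu = 0, sigma = 1

def within(k):
    """Probability of falling within k standard deviations of the mean."""
    return nd.cdf(k) - nd.cdf(-k)

p1, p2, p3 = within(1), within(2), within(3)   # ~0.683, ~0.954, ~0.997
```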

State Central Limit Theorem

Answer:
The Central Limit Theorem (CLT) is a fundamental principle in statistics. It states that, for a sufficiently large sample size, the sampling distribution of the sample mean ($\bar{x}$) will be approximately normally distributed, regardless of the shape of the original population distribution.

Key points:

  • The mean of the sampling distribution will be equal to the population mean ($\mu_{\bar{x}} = \mu$).
  • The standard deviation of the sampling distribution (called the standard error) will be equal to the population standard deviation divided by the square root of the sample size ($\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$).
  • A sample size of $n > 30$ is often considered sufficiently large.
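A short simulation illustrates the theorem: sample means of a decidedly non-normal population (uniform on [0, 1]) cluster around the population mean with standard error $\sigma/\sqrt{n}$ (the seed and sizes are arbitrary choices for reproducibility):

```python
import random
from statistics import mean, stdev

random.seed(42)
# Population: uniform on [0, 1], which is far from normal
# (mean 0.5, standard deviation 1 / sqrt(12) ~ 0.2887)
n, trials = 30, 2000
sample_means = [mean(random.random() for _ in range(n)) for _ in range(trials)]

m = mean(sample_means)     # should be near the population mean 0.5
se = stdev(sample_means)   # should be near 0.2887 / sqrt(30) ~ 0.0527
```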

Define Estimation. Write the characteristics of a good estimator.

Answer:
Estimation
In statistics, estimation is the process of using sample data to find an approximate value of an unknown population parameter (like the mean or proportion). The value obtained from the sample is called an estimate, and the formula or rule used to calculate it is called an estimator.

Characteristics of a Good Estimator
A good estimator should possess the following four properties:

  1. Unbiasedness: An estimator is unbiased if its expected value is equal to the true value of the population parameter it is trying to estimate. In other words, on average, it hits the target. $E(\hat{\theta}) = \theta$.
  2. Efficiency: The most efficient estimator is the one with the smallest variance among all unbiased estimators. A more efficient estimator is more likely to be close to the true parameter value.
  3. Consistency: An estimator is consistent if its accuracy increases as the sample size ($n$) increases. As $n$ approaches infinity, the estimate ($\hat{\theta}$) converges to the true parameter value ($\theta$).
  4. Sufficiency: An estimator is sufficient if it uses all the information available in the sample about the parameter being estimated. No other estimator can provide more information.

What do you mean by Non-Parametric test?

Answer:
A Non-Parametric test (or distribution-free test) is a type of hypothesis test that does not require the underlying population data to follow a specific distribution, such as the normal distribution.

These tests are used when:

  • The assumptions for parametric tests (like the t-test or ANOVA) are not met.
  • The data is ordinal (ranked) or nominal.
  • The sample size is very small.
  • There are significant outliers that cannot be removed.

Examples include the Mann-Whitney U test, Wilcoxon signed-rank test, Kruskal-Wallis test, and the Chi-square test.

What do you mean by Explained, Unexplained, and Total Variation?

Answer:
These terms are used in the context of regression analysis to describe how well a model fits the data.

  • Total Variation (SST, Total Sum of Squares): This measures the total variation of the dependent variable’s values around its mean. It represents the total amount of variability in the data that you are trying to explain. $$SST = \sum(y_i - \bar{y})^2$$
  • Explained Variation (SSR, Regression Sum of Squares): This measures the portion of the total variation in the dependent variable that is “explained” by the regression model (i.e., by the independent variable). $$SSR = \sum(\hat{y}_i - \bar{y})^2$$
  • Unexplained Variation (SSE, Error Sum of Squares): This is the portion of the total variation that is not explained by the model. It represents the random error or residuals—the difference between the observed values and the values predicted by the model. $$SSE = \sum(y_i - \hat{y}_i)^2$$

The fundamental relationship is:
Total Variation = Explained Variation + Unexplained Variation $$SST = SSR + SSE$$
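The decomposition can be verified on a tiny hypothetical dataset with a least-squares fit (the identity $SST = SSR + SSE$ holds exactly for least-squares regression):

```python
from statistics import mean

def least_squares_fit(xs, ys):
    """Slope and intercept of the least-squares regression line."""
    xbar, ybar = mean(xs), mean(ys)
    b = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
    return ybar - b * xbar, b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_fit(xs, ys)
preds = [a + b * x for x in xs]
ybar = mean(ys)
sst = sum((y - ybar) ** 2 for y in ys)              # total variation
ssr = sum((p - ybar) ** 2 for p in preds)           # explained variation
sse = sum((y - p) ** 2 for y, p in zip(ys, preds))  # unexplained variation
```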

Write axioms of probability

Answer:
The three axioms of probability, formulated by Andrey Kolmogorov, are the fundamental rules upon which all of probability theory is built. For any event A in a sample space S:

  1. Non-negativity Axiom: The probability of any event is a non-negative real number. $$P(A) \ge 0$$
  2. Unit Axiom: The probability of the entire sample space (a certain event) is 1. $$P(S) = 1$$
  3. Additivity Axiom: For any sequence of mutually exclusive (disjoint) events $A_1, A_2, A_3, \dots$ (meaning they cannot happen at the same time), the probability that one of them occurs is the sum of their individual probabilities. $$P(A_1 \cup A_2 \cup A_3 \cup \dots) = \sum P(A_i)$$

What are the parameters used in Negative binomial distribution?

Answer:
The Negative Binomial distribution is typically defined by two parameters:

  • The number of successes ($r$): This is a fixed, positive integer representing the number of successes you want to achieve.
  • The probability of success ($p$): This is the probability of a single success in each independent trial, where $0 < p < 1$.

Write conditions for normal approximation to binomial distribution?

Answer:
A binomial distribution with $n$ trials and probability of success $p$ can be approximated by a normal distribution when the sample size is large enough. The most common rule of thumb for this approximation to be valid is that both the expected number of successes and the expected number of failures are sufficiently large.
The conditions are: $$np \ge 5$$ $$n(1-p) \ge 5$$

Some textbooks may use a more stringent rule, such as $np \ge 10$ and $n(1-p) \ge 10$.

Define parameter and statistic

Answer:

  • Parameter: A parameter is a numerical value that describes a characteristic of an entire population. It is a fixed value, but it is usually unknown because it’s often impractical or impossible to measure the entire population.
    Examples: Population mean ($\mu$), population standard deviation ($\sigma$), population proportion ($p$).
  • Statistic: A statistic is a numerical value that describes a characteristic of a sample. It is calculated from sample data and is used to estimate the value of the corresponding population parameter.
    Examples: Sample mean ($\bar{x}$), sample standard deviation ($s$), sample proportion ($\hat{p}$).

A simple way to remember is: Parameter for Population and Statistic for Sample.

Explain clearly the major steps to be adopted by researchers in testing of a statistical hypothesis

Answer:
Testing a statistical hypothesis is a structured process that allows researchers to make data-driven conclusions about a population. Here are the major steps involved:

  1. Formulate the Research Question and Hypotheses:
    • Start with a clear research question.
    • Translate this question into a Null Hypothesis ($H_0$), which represents the default assumption (e.g., no effect, no difference), and an Alternative Hypothesis ($H_a$ or $H_1$), which is the claim the researcher wants to prove. The alternative can be one-tailed (specifying a direction, e.g., > or <) or two-tailed (specifying a difference, e.g., $\neq$).
  2. Determine the Appropriate Statistical Test:

    The choice of test depends on the research question, the type of data (e.g., categorical, continuous), the number of groups being compared, and whether the assumptions of a particular test are met (e.g., normality, independence). Examples include t-tests, ANOVA, chi-square tests, etc.

  3. Set the Significance Level ($\alpha$):

    The researcher must decide on a threshold for significance, known as the alpha level ($\alpha$). This is the maximum acceptable probability of making a Type I error (rejecting a true null hypothesis). Commonly used values are $\alpha=0.05$ (a 5% risk), $\alpha=0.01$, or $\alpha=0.10$. This should be chosen before collecting data.

  4. Define the Sampling Plan and Collect Data:
    • Determine the target population and the sampling method (e.g., random sampling).
    • Calculate the required sample size to ensure the study has enough statistical power to detect an effect if one truly exists.
    • Collect the data according to the plan.
  5. Calculate the Test Statistic and the P-value:

    Using the collected sample data, compute the value of the chosen test statistic (e.g., $t, z, \chi^2$). Based on this test statistic and its distribution, calculate the p-value. The p-value is the probability of observing data as extreme as, or more extreme than, what was actually collected, assuming the null hypothesis is true.

  6. Make a Statistical Decision:

    Compare the p-value to the pre-determined significance level ($\alpha$).

    • If $p\text{-value} \le \alpha$, the result is statistically significant. The researcher rejects the null hypothesis in favor of the alternative hypothesis.
    • If $p\text{-value} > \alpha$, the result is not statistically significant. The researcher fails to reject the null hypothesis. (Note: This does not mean $H_0$ is proven true, only that there is not enough evidence to reject it).
  7. Draw Conclusions and Interpret the Results:

    The final step is to interpret the statistical decision in the context of the original research question. Explain the practical significance of the findings. What do the results mean in the real world? Also, consider the effect size and confidence intervals to provide a more complete picture of the findings. Report any limitations of the study.
