This is a problem from the ISI MStat Examination, 2019. It tests familiarity with the simple and multiple linear regression models and the estimation of their parameters, and is based on the idea of an invariant regression coefficient.
Suppose \( \{ (x_i,y_i,z_i):i=1,2,…,n \} \) is a set of trivariate observations on three variables \(X,Y,Z \), where \(z_i=0 \) for \(i=1,2,…,n-1 \) and \(z_n=1 \). Suppose the least squares linear regression equation of \(Y \) on \(X\) based on the first \(n-1 \) observations is \( y=\hat{\alpha_0}+\hat{\alpha_1}x \), and the least squares linear regression equation of \(Y \) on \( X \) and \(Z \) based on all \( n \) observations is \(y=\hat{\beta_0}+\hat{\beta_1}x+\hat{\beta_2}z \). Show that \( \hat{\alpha_1}=\hat{\beta_1} \).
1. Knowing how to estimate the parameters of a linear regression model (in the least squares sense).
2. A brief idea of multiple linear regression.
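Since \(z_i=0 \) for each of the first \( n-1 \) observations, \( Z \) plays no role there, and we have an ordinary simple linear regression model of \( Y \) on \( X \).
Thus, the least squares estimate of the slope is given by \( \hat{\alpha_1}=\frac{\sum_{i=1}^{n-1} (x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n-1} (x_i-\bar{x})^2} \), where \( \bar{x} \) and \( \bar{y} \) are the means of the first \( n-1 \) observations.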
In the second case, using all \( n \) observations, we have:
\( y_1=\beta_0+\beta_1 x_1+\epsilon_1 \)
\( y_2=\beta_0+\beta_1 x_2+ \epsilon_2 \)
\( \vdots \)
\( y_n=\beta_{0}+\beta_1 x_n+\beta_2+ \epsilon_n \)
Thus, the error sum of squares for this model is given by:
\( SSE=\sum_{i=1}^{n-1} (y_i-\beta_0-\beta_1 x_i)^2+(y_n-\beta_0-\beta_1 x_n-\beta_2)^2 \), as \( z_i=0 \) for \( i \le n-1 \) and \( z_n=1 \).
Differentiating SSE with respect to \( \beta_2 \) and setting the derivative to zero gives \( -2(y_n-\beta_0-\beta_1 x_n-\beta_2)=0 \), so at the optimal values we must have:
\( \hat{\beta}_2 = y_n -\hat{\beta_1}x_n-\hat{\beta_0} \)
That is, the last term of SSE vanishes at the optimum.
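Since \( \beta_2 \) appears only in the last term, and that term can be made zero for any choice of \( \beta_0, \beta_1 \), minimizing SSE is equivalent to minimizing
\( \sum_{i=1}^{n-1} (y_i-\beta_0-\beta_1 x_i)^2 \) with respect to \( \beta_{0} ,\beta_{1} \).
This is nothing but the simple linear regression problem based on the first \( n-1 \) observations again, and thus \( \hat{\beta_1}=\hat{\alpha_1} \) and, furthermore, \( \hat{\beta_0}=\hat{\alpha_0} \).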
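The result can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the simulated data, the seed, and the variable names (`alpha_hat`, `beta_hat`) are illustrative choices and not part of the original problem.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed; the data are purely illustrative
n = 10
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)
z = np.zeros(n)
z[-1] = 1.0  # z_n = 1, all other z_i = 0

# Simple regression of Y on X using only the first n-1 observations.
A = np.column_stack([np.ones(n - 1), x[:-1]])
alpha_hat, *_ = np.linalg.lstsq(A, y[:-1], rcond=None)

# Multiple regression of Y on X and Z using all n observations.
B = np.column_stack([np.ones(n), x, z])
beta_hat, *_ = np.linalg.lstsq(B, y, rcond=None)

print("alpha_hat (intercept, slope):", alpha_hat)
print("beta_hat (intercept, slope, z-coefficient):", beta_hat)
```

The first two entries of `beta_hat` should agree with `alpha_hat` up to floating-point error, and the third entry should equal the residual of the last observation from the line fitted to the first \( n-1 \) points.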
Suppose you have two independent samples. Let them be \( \{ (y_1,x_1), \ldots, (y_{n_1},x_{n_1}) \} \) and \( \{ (y_{n_1 +1},x_{n_1 +1} ), \ldots, (y_{n_1 + n_2} ,x_{n_1 + n_2} ) \} \).
Now you want to fit two models to these samples:
\(y_i=\beta_0 + \beta_1 x_i + \epsilon_i \) for \( i=1,2,\ldots,n_1 \)
and
\(y_i=\gamma_0 + \gamma_1 x_i + \epsilon_i \) for \( i=n_1 +1 ,\ldots ,n_1 + n_2 \)
Can you write these two models as a single model?
After that, assuming all the assumptions of linear regression hold (if you are not aware of these assumptions, you may look them up in any regression book or on the internet), is it justifiable to infer \( \beta_1 = \gamma_1 \)?
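As a hint for the first question, one possible way (not the only one) to combine the two models is to introduce an indicator \( d_i \) that equals 1 for the second sample and 0 otherwise, and to fit \( y_i=\beta_0+\beta_1 x_i+(\gamma_0-\beta_0)d_i+(\gamma_1-\beta_1)d_i x_i+\epsilon_i \). The sketch below, assuming NumPy and using simulated data and a hypothetical helper `fit_line`, checks that this single model reproduces the two separate least squares fits. It does not address whether \( \beta_1=\gamma_1 \) can be inferred.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed; illustrative data only
n1, n2 = 8, 12
x = rng.normal(size=n1 + n2)
d = np.concatenate([np.zeros(n1), np.ones(n2)])  # d_i = 1 for the second sample
noise = rng.normal(scale=0.1, size=n1 + n2)
y = np.where(d == 0, 1.0 + 2.0 * x, -1.0 + 0.5 * x) + noise


def fit_line(xs, ys):
    """Least squares (intercept, slope) for a simple linear regression."""
    X = np.column_stack([np.ones(len(xs)), xs])
    coef, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return coef


b = fit_line(x[:n1], y[:n1])  # estimates of (beta_0, beta_1)
g = fit_line(x[n1:], y[n1:])  # estimates of (gamma_0, gamma_1)

# Single model with columns: 1, x, d, d*x.
X = np.column_stack([np.ones(n1 + n2), x, d, d * x])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("separate fits:", b, g)
print("combined fit :", theta[:2], theta[:2] + theta[2:])
```

Here `theta[:2]` should match the first fit and `theta[:2] + theta[2:]` the second, so the coefficient of \( d_i x_i \) estimates \( \gamma_1-\beta_1 \).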
