## jeudi, décembre 08, 2011

### Simple linear regression in Scala

Here is how to compute Simple linear regression in Scala.
Class LinearRegression takes in n measurements from a List[(x: Double, y: Double)] and computes the line that best fits the data according to the least squares metric.
This Scala program is the scala translation of the java program available at http://introcs.cs.princeton.edu/java/97data/LinearRegression.java.html .
```class LinearRegression(val pairs: List[(Double,Double)]) {
val size = pairs.size
println("pairs = " + pairs)

// first pass: read in data, compute xbar and ybar
val sums = pairs.foldLeft(new X_X2_Y(0D,0D,0D))(_ + new X_X2_Y(_))
val bars = (sums.x / size, sums.y / size)

// second pass: compute summary statistics
val sumstats = pairs.foldLeft(new X2_Y2_XY(0D,0D,0D))(_ + new X2_Y2_XY(_, bars))

val beta1 = sumstats.xy / sumstats.x2
val beta0 = bars._2 - (beta1 * bars._1)
val betas = (beta0, beta1)

println("y = " + ("%4.3f" format beta1) + " * x + " + ("%4.3f" format beta0))

// analyze results
val R2 = correlation.ssr / sumstats.y2
val svar = correlation.rss / (size - 2)
val svar1 = svar / sumstats.x2
val svar0 = ( svar / size ) + ( bars._1 * bars._1 * svar1)
val svar0bis = svar * sums.x2 / (size * sumstats.x2)
println("R^2                 = " + R2)
println("std error of beta_1 = " + Math.sqrt(svar1))
println("std error of beta_0 = " + Math.sqrt(svar0))
println("std error of beta_0 = " + Math.sqrt(svar0bis))
println("SSTO = " + sumstats.y2)
println("SSR  = " + correlation.ssr)
}

def build(p: (Double,Double), bars: (Double,Double), betas: (Double,Double)): RSS_SSR = {
val fit = (betas._2 * p._1) + betas._1
val rss = (fit-p._2) * (fit-p._2)
val ssr = (fit-bars._2) * (fit-bars._2)
}
}