Thursday, December 08, 2011

Simple linear regression in Scala

Here is how to compute a simple linear regression in Scala.
The class LinearRegression takes n measurements as a List[(Double, Double)] of (x, y) pairs and computes the line that best fits the data according to the least squares criterion.
This program is a Scala translation of the Java program available at http://introcs.cs.princeton.edu/java/97data/LinearRegression.java.html .
class LinearRegression(val pairs: List[(Double,Double)]) { 
 val size = pairs.size
 println("pairs = " + pairs)

 // first pass: read in data, compute xbar and ybar
 val sums = pairs.foldLeft(new X_X2_Y(0D,0D,0D))((acc, p) => acc + new X_X2_Y(p))
 val bars = (sums.x / size, sums.y / size)

 // second pass: compute summary statistics
 val sumstats = pairs.foldLeft(new X2_Y2_XY(0D,0D,0D))((acc, p) => acc + new X2_Y2_XY(p, bars))

 val beta1 = sumstats.xy / sumstats.x2
 val beta0 = bars._2 - (beta1 * bars._1)
 val betas = (beta0, beta1)

 println("y = " + ("%4.3f" format beta1) + " * x + " + ("%4.3f" format beta0))

 // analyze results
 val correlation = pairs.foldLeft(new RSS_SSR(0D,0D))((acc, p) => acc + RSS_SSR.build(p, bars, betas))
 val R2 = correlation.ssr / sumstats.y2
 val svar = correlation.rss / (size - 2)
 val svar1 = svar / sumstats.x2
 val svar0 = ( svar / size ) + ( bars._1 * bars._1 * svar1)
 val svar0bis = svar * sums.x2 / (size * sumstats.x2)
 println("R^2                 = " + R2)
 println("std error of beta_1 = " + math.sqrt(svar1))
 println("std error of beta_0 = " + math.sqrt(svar0))
 // same quantity, computed with the alternative formula svar0bis
 println("std error of beta_0 (alternative formula) = " + math.sqrt(svar0bis))
 println("SSTO = " + sumstats.y2)
 println("SSE  = " + correlation.rss)
 println("SSR  = " + correlation.ssr)
}

object RSS_SSR {
 def build(p: (Double,Double), bars: (Double,Double), betas: (Double,Double)): RSS_SSR = {
  val fit = (betas._2 * p._1) + betas._1
  val rss = (fit-p._2) * (fit-p._2)
  val ssr = (fit-bars._2) * (fit-bars._2)
  new RSS_SSR(rss, ssr)
 }
}

class RSS_SSR(val rss: Double, val ssr: Double) {
 def +(p: RSS_SSR): RSS_SSR = new RSS_SSR(rss+p.rss, ssr+p.ssr)
}

class X_X2_Y(val x: Double, val x2: Double, val y: Double) {
 def this(p: (Double,Double)) = this(p._1, p._1*p._1, p._2)
 def +(p: X_X2_Y): X_X2_Y = new X_X2_Y(x+p.x,x2+p.x2,y+p.y)
}

class X2_Y2_XY(val x2: Double, val y2: Double, val xy: Double) {
 def this(p: (Double,Double), bars: (Double,Double)) = this((p._1-bars._1)*(p._1-bars._1), (p._2-bars._2)*(p._2-bars._2),(p._1-bars._1)*(p._2-bars._2))
 def +(p: X2_Y2_XY): X2_Y2_XY = new X2_Y2_XY(x2+p.x2,y2+p.y2,xy+p.xy)
}
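
For readers who only need the fitted line and R^2, the same two-pass computation can be sketched more compactly. This is a minimal sketch, not the program above: the names LeastSquaresSketch, Fit and fit are hypothetical, and it assumes at least two points with distinct x values that do not all share the same y.

```scala
// Hypothetical helper names; a compact sketch of the two-pass least squares fit.
object LeastSquaresSketch {
  final case class Fit(beta0: Double, beta1: Double, r2: Double)

  def fit(pairs: List[(Double, Double)]): Fit = {
    val n = pairs.size
    // first pass: means of x and y
    val xbar = pairs.map(_._1).sum / n
    val ybar = pairs.map(_._2).sum / n
    // second pass: centered sums of squares and cross products
    val sxx = pairs.map { case (x, _) => (x - xbar) * (x - xbar) }.sum
    val syy = pairs.map { case (_, y) => (y - ybar) * (y - ybar) }.sum
    val sxy = pairs.map { case (x, y) => (x - xbar) * (y - ybar) }.sum
    val beta1 = sxy / sxx            // slope
    val beta0 = ybar - beta1 * xbar  // intercept
    // regression sum of squares divided by total sum of squares
    val ssr = pairs.map { case (x, _) =>
      val fitted = beta1 * x + beta0
      (fitted - ybar) * (fitted - ybar)
    }.sum
    Fit(beta0, beta1, ssr / syy)
  }
}
```

For example, LeastSquaresSketch.fit(List((0.0, 1.0), (1.0, 3.0), (2.0, 5.0))) describes points lying exactly on y = 2 * x + 1, so the slope is 2, the intercept is 1 and R^2 is 1.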

Tuesday, November 29, 2011

Concrete Scala Map and SortedMap example

This is a concrete example of implementing Scala's Map and SortedMap traits.
I wrote this post because I found it very difficult to find the right information when trying to write my own concrete Map and SortedMap.
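// Assumption: the immutable SortedMap of the collections library of that time (2011);
// more recent Scala versions require additional abstract members.
import scala.collection.immutable.SortedMap

class StringOrder extends Ordering[String] {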
 override def compare(s1: String, s2: String) = s1.compare(s2)
}
class MyParameter() {}


class ZeParameters(val pairs:List[(String,MyParameter)] = Nil) extends SortedMap[String,MyParameter] {
 /**** Minimal Map stuff begin ****/
 lazy val keyLookup = Map() ++ pairs
 override def get(key: String): Option[MyParameter] = keyLookup.get(key)
 // a SortedMap is expected to iterate in key order
 override def iterator: Iterator[(String, MyParameter)] = pairs.sortBy(_._1).iterator
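 override def + [B1 >: MyParameter](kv: (String, B1)) = {
  // note: this pattern fails with a MatchError if the value is not a MyParameter
  val (key:String, value:MyParameter) = kv
  // drop any previous binding for the key so that the new value wins in keyLookup
  new ZeParameters((key,value) :: pairs.filterNot(_._1 == key))
 }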
 override def -(key: String): ZeParameters  = new ZeParameters(pairs.filterNot(_._1 == key))
 /**** Minimal map stuff end ****/
 /**** Minimal SortedMap stuff begin ****/
 def rangeImpl (from: Option[String], until: Option[String]): ZeParameters = {
  // keep the keys k with from <= k < until; a missing bound leaves that side open
  val out = pairs.filter((p: (String, MyParameter)) => {
   val fromOk = from match {
    case Some(s) => p._1.compare(s) >= 0
    case _ => true
   }
   val untilOk = until match {
    case Some(s) => p._1.compare(s) < 0
    case _ => true
   }
   fromOk && untilOk
  })
  new ZeParameters(out)
 }
 
 def ordering: Ordering[String] = new StringOrder
 /**** Minimal SortedMap stuff end ****/
}
Do not forget that you can also transform your map into a list and then use sortBy:
class ListSort {
  println(List((1.0,"zob"),(1.2,"zab"),(0.9,"zub")).sortBy{_._1})
}
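
When a custom class is not required, the standard library already offers a concrete SortedMap: scala.collection.immutable.TreeMap. The following is a minimal sketch of that alternative, not the approach of this post; the object name SortedMapSketch and the sample keys and values are made up for illustration.

```scala
import scala.collection.immutable.TreeMap

// Hypothetical example data; TreeMap keeps its entries sorted by key.
object SortedMapSketch {
  val params: TreeMap[String, Double] =
    TreeMap("gamma" -> 0.9, "alpha" -> 1.0, "beta" -> 1.2)

  // keys come back in key order, whatever the insertion order was
  val keysInOrder: List[String] = params.keys.toList

  // range(from, until): lower bound inclusive, upper bound exclusive
  val sub: List[String] = params.range("alpha", "gamma").keys.toList

  // the list-based alternative: sort the entries by value with sortBy
  val byValue: List[String] = params.toList.sortBy(_._2).map(_._1)
}
```

Here keysInOrder is List("alpha", "beta", "gamma"), sub is List("alpha", "beta") because the upper bound is excluded, and byValue is List("gamma", "alpha", "beta").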

Thursday, November 24, 2011

SW development / Agile SCRUM quotes

Ziv’s Law: Software Development is Inherently Unpredictable

Humphrey’s Law: Users Do Not Know What They Want Until They See Working Software

Conway’s Law: The Structure of the Organization Will Be Embedded in the Code

Wegner’s lemma: an interactive system can never be fully specified nor can it ever be fully tested

Langdon’s lemma: software evolves more rapidly as it approaches chaotic regions (taking care not to spill over into chaos)

Thursday, September 01, 2011

When programming in Java or Scala, I miss the C preprocessor macros __FILE__ , __LINE__ and __FUNC__ . I use them to log where I am in my programs.

So I decided to get them in Scala by throwing an exception and parsing its stack trace. Personally, I don't mind that this takes some time to execute.

There is one advantage compared to the C macros: you can get any upper level in the calling stack, which I sometimes find handy.


object util {
 val MatchFileLine = """.+\((.+)\..+:(\d+)\)""".r
 val MatchFunc = """(.+)\(.+""".r
 def main(args: Array[String]): Unit = { 
  println(util.tag(1))
  println(util.func(1))
 }
 def tag(i_level: Int): String = {
  val s_rien = ""
  try {
   throw new Exception()
  } catch {
    case unknown: Throwable => unknown.getStackTrace.toList.apply(i_level).toString match {
    case MatchFileLine(file, line) => file+":"+line
    case _ => s_rien
   }
  }
 }

 def func(i_level: Int): String = {
  val s_rien = "functionNotFound"
  try {
   throw new Exception()
  } catch {
    case unknown: Throwable => unknown.getStackTrace.toList.apply(i_level).toString match {
    case MatchFunc(funcs) => funcs.split('.').toList.last
    case _ => s_rien
   }
  } 
 }
}

class util() { }
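
The same information can probably be obtained without throwing and without regular expressions, because a Throwable records its stack trace when it is created and each StackTraceElement exposes getFileName, getLineNumber and getMethodName. The following is a minimal sketch under that assumption; the names StackTags, pick and whoCalledMe are hypothetical. It keeps the level convention used above (0 is tag or func itself, 1 is its caller), but it returns the full file name including its extension and falls back to a default instead of failing when the level is deeper than the stack.

```scala
// Hypothetical variant of util: reads the stack trace of a Throwable that is never thrown.
object StackTags {
  // Safe lookup: None when the requested level is outside the stack.
  private def pick(trace: Array[StackTraceElement], level: Int): Option[StackTraceElement] =
    if (level >= 0 && level < trace.length) Some(trace(level)) else None

  def tag(level: Int): String = {
    val trace = new Throwable().getStackTrace // trace(0) is tag itself
    pick(trace, level)
      // getFileName can be null when no debug information is available
      .map(e => Option(e.getFileName).getOrElse("?") + ":" + e.getLineNumber)
      .getOrElse("")
  }

  def func(level: Int): String = {
    val trace = new Throwable().getStackTrace // trace(0) is func itself
    pick(trace, level).map(_.getMethodName).getOrElse("functionNotFound")
  }

  // usage: level 1 is the method that called func
  def whoCalledMe(): String = func(1)
}
```

With this sketch, StackTags.func(0) returns "func" and StackTags.whoCalledMe() returns "whoCalledMe".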

Friday, August 19, 2011

After 6 years, 5 specific reasons for getting as many contacts as possible on LinkedIn


Six years ago, I became very active in business social networking. I registered on LinkedIn, Viadeo and Xing.

I then sent a huge number of invitations, using some programming techniques to gather and guess email addresses. And I got a fairly high number of contacts...

I now use Viadeo for France (6000+ 1st level contacts), LinkedIn for the world (20000+ 1st level contacts) and Xing for German-speaking countries (1000+ 1st level contacts).
Read my 2 most popular previous posts about business social networking:

Here are 5 specific reasons why I feel I made the right choice in getting as many contacts as possible:
  1. Through a chain of “weak contacts(*)” on LinkedIn, I was able to hire a new employee for my team.
  2. Once, I received an email from someone who wrote: “I want to thank you for just having forwarded an introduction for me, and because of this simple action, I got a $200K/year job”
  3. Recently, someone from my company's HR department asked me to advertise a job opening on LinkedIn, knowing I had a lot of contacts.
  4. I regularly forward business opportunities to business development people in my company.
  5. Getting so many contacts demonstrates some ability to persuade. I connected with 50% of my contacts through carefully crafted invitations. The other 50% invited me.


As you can see, I did not gain any personal advantage from having so many contacts (except for the 4th reason). I consider networking primarily a service to others: helping them get a job, hire someone or simply do business together.

Here are the reasons why you should network with people like me who have huge networks:
  1. You will see more profiles.
  2. You will be seen by more people.
  3. I always respond positively to requests for help, even if the requester is a young student working in another part of the world.
  4. If a request comes through me, it will be forwarded quickly because:
    • I access my networking sites daily.
    • I know how to use the networking tools.
    • You will probably need fewer hops through the network, as I have access to more people.
  5. I don’t spam, because I value my huge network too much to annoy people with mass mailings.


Glossary:
A “weak contact” is someone you have never interacted with closely in real life.