Sunday, June 8, 2014

Scala patmat huffman coding solutions

Odersky PatternMatch Huffman Coding Examples


1) Abstract classes in Scala can have methods and val members like Scala classes.

  abstract class CodeTree{
    val name="nothing"
    def weight:Int 
  }
  case class Fork(left: CodeTree, right: CodeTree, chars: List[Char], weight: Int)    extends CodeTree
  case class Leaf(char: Char, weight: Int) extends CodeTree

Weight is duplicated in both case call constructors and we have code duplication in the case statements when using pattern matching:

def weight(tree: CodeTree): Int = tree match{
   case Fork( _, _, _, weight) => weight
   case Leaf(_, weight)       => weight
}

we can push up weight into the abstract class. If we are designing by classes using a def or val doesn't matter. For case classes we have to use a def. 

def weight(tree: CodeTree): Int = {    tree.weight}

The second function can be implemented via pattern match. The types are different; for Fork we have a List[Char] and for Leaf we have a Char. 

def chars(tree: CodeTree): List[Char] = tree match{
   case Fork(_, _, chars, _) => chars
   case Leaf(char, _)        => List(char) 
  }

Functions makeCodeTree and string2Chars is given, where 2 CodeTrees are combined using the Fork case class. 

Function times takes a list of characters or Chars and creates a list of pairs. 
def times(chars: List[Char]): List[(Char, Int)] = ??? 

The coding pattern is similar to Java, iterate through the list of chars and use a HashMap to keep track of Characters and counts/Character. Simplify the problem to first going through the List using the recursive calls with an accumulator as a function parameter to track state.  Print out the characters to make sure we have traversed the list correctly. 

def testMe2(chars:List[Char]):Unit={
  def testMe2Acc(chars:List[Char]) = chars match{
    case Nil => println("testMe2 match empty list")
    case x::xs => println("testMe2 not empty List x:"+x+",xs:"+xs);  
  }
  testMe2Acc(chars)
}

val charList = List('a','b','c','d')
testMap2(charList)


Add a Map to the function to count the number of chars in the list. Print to verify everything is correct. 

def testMe3(chars:List[Char]):Unit={
  def testMe3Acc(chars:List[Char],acc:Map[Char,Int]):Unit = chars match{
    case Nil => println("testMe3 match empty list")
    case x::xs => println("testMe3 not empty List x:"+x+",xs:"+xs);println("acc:"+acc); if (acc contains x) {val tmpacc=acc.updated(x,acc(x)+1);println("contains acc after add+"+tmpacc);testMe3Acc(xs,tmpacc)} else {println("not contains adding");val tmpacc=acc+(x->1);print("acc after add:"+tmpacc);testMe3Acc(xs,tmpacc)};
  }
  testMe3Acc(chars,Map())
}

testMe3(List('a','b','c','a'))

Now change the return value from a Unit to a Map for the accumulator function. Leave the testMe3 function as a Unit then change the testMe3 Unit to a Map as the last step. We still haven't done anything with pairs. 


//change return type from Unit to Map
def testMe4(chars:List[Char]):Map[Char,Int]={
  def testMe4Acc(chars:List[Char],acc:Map[Char,Int]):Map[Char,Int] = chars match{
    case Nil => println("testMe4 empty list");println("Nil acc:"+acc);acc;
    case x::xs => println("testMe4 char:"+x);println("acc before insert:"+acc); if (acc contains x) {val tmpacc=acc.updated(x,acc(x)+1);println("tmpacc+"+tmpacc);testMe4Acc(xs,tmpacc)} else {println("not contains adding");val tmpacc=acc+(x->1);print("acc after add:"+tmpacc);testMe4Acc(xs,tmpacc)};
  }
  testMe4Acc(chars,Map())
}

val a=testMe4(List('a','b','c','a'))
println("testMe4 a:"+a);


Create a version with Pairs or tuples to replace the map of char->Ints. I did it the dumbest possible way using a mapping of char->Pair which creates wasted space. 

//make pairs, w/o return type
def testMe5(chars:List[Char]):Unit={
  def testMe5Acc(chars:List[Char],acc:Map[Char,(Char,Int)]):Unit = chars match{
    case Nil => println("testMe5 empty list");
    case x::xs => println("testMe5 char:"+x);println("acc before insert:"+acc); if (acc contains x) {val tmpacc=acc.updated(x,(x,acc(x)._2+1));println("tmpacc+"+tmpacc);testMe5Acc(xs,tmpacc)} else {println("not contains adding");val tmpacc=acc+(x->(x,1));print("acc after add:"+tmpacc);testMe5Acc(xs,tmpacc)};
  }
  testMe5Acc(chars,Map())
}

Now one more iteration to add the List return type back. 

def testMe6(chars:List[Char]):List[(Char,Int)]={
  def testMe6Acc(chars:List[Char],acc:Map[Char,(Char,Int)]):Map[Char,(Char,Int)] = chars match{
    case Nil => println("testMe6 empty list");acc
    case x::xs => println("testMe6 char:"+x);println("testMe6 acc before insert:"+acc); if (acc contains x) {val tmpacc=acc.updated(x,(x,acc(x)._2+1));println("testMe6 tmpacc+"+tmpacc);testMe6Acc(xs,tmpacc)} else {println("testMe6 not contains adding");val tmpacc=acc+(x->(x,1));print("testMe6 acc after add:"+tmpacc);testMe6Acc(xs,tmpacc)};
  }
  testMe6Acc(chars,Map()).values.toList
}

Remove the print statement and copy into the times function for the hw. 

Function makeOrderedLeafList. Simplify the problem to first sorting a List of Pairs or List of Ints. First try sorting a List of Ints then a List of pairs using function types. 

Here is an example for sorting a List of Ints. 
  
println( List(5,3,4,100).sortBy(x=>x))


 def makeOrderedLeafList(freqs: List[(Char, Int)]): List[(Char,Int)] =  {
    freqs.sortBy(Tuple2 => Tuple2._2)
 }
 val testList1 = List(('t', 2), ('e', 1), ('x', 3))
 println(makeOrderedLeafList(testList1)


Now modify the above to return a list of Leafs instead of pairs or Tuple2s. 

def makeOrderedLeafList(freqs: List[(Char, Int)]): List[Leaf] = {
    freqs.sortBy(Tuple2 => Tuple2._2).map(Tuple2 => Leaf(Tuple2._1, Tuple2._2))
}





Singleton returns a Boolean True if there is only 1 tree in the List. We are given a List of CodeTrees so we can check the length of the List

 def singleton(trees:List[CodeTree]):Boolean ={
    if (trees.length == 1) true
    else
     false
 }

To compare vs. the pattern matching soln: 

 def singleton2(trees: List[CodeTree]): Boolean = trees match{
    case x::xs =>if (xs==Nil) return true else return false
    case _ =>false
 }

Function Combine takes the first 2 entries in a sorted list of CodeTrees, combines them into a single tree and then adds this to the list


Encode, create recursive inner function first to test for branches and loop through the code tree nodes

  def encodeMe(tree:CodeTree,charList:List[Char]):List[Int]={
    def encodeMeAcc(tree:CodeTree,testMe:Char,acc:List[Int]):List[Int]=tree match{
      case Leaf(c:Char,w:Int) =>println("Leaf char:"+c+" weight:"+w);println("leaf acc:"+acc);acc
      case Fork(left:CodeTree, right:CodeTree, charList:List[Char], weight:Int)=>{
        println("Fork testMe:"+testMe+" chars(left):"+chars(left)+" chars(right):"+chars(right))
        if(chars(left).contains(testMe)){
          println("left!!! add 0 to list");
         val addMe = acc++List(0)
          println("addMe:"+addMe);
          encodeMeAcc(left,testMe,addMe)
        }else //if(chars(right).contains(testMe)){
          println("rigth!!! add 1 to list");
          val addMe = acc++List(1)
          println("addMe:"+addMe);
          encodeMeAcc(right,testMe,addMe)
        }
       
   
    }  def encodeMe(tree:CodeTree,charList:List[Char]):Unit={
    def encodeMeAcc(tree:CodeTree,testMe:Char):Unit=tree match{
      case Leaf(c:Char,w:Int) =>println("Leaf char:"+c+" weight:"+w)
      case Fork(left:CodeTree, right:CodeTree, charList:List[Char], weight:Int)=>{
        println("Fork testMe:"+testMe+" chars(left):"+chars(left)+" chars(right):"+chars(right))
        if(chars(left).contains(testMe)){
          println("left!!!");
        }else if(chars(right).contains(testMe)){
          println("rigth!!!");
          encodeMeAcc(right,testMe)
        }
       
      for(i<-charlist p="">        println("i:"+i)
       encodeMeAcc(tree,i)
     }
   
    }

   Create a test code tree with a fork node in it and verify you can see a list  of chars get tested by the char function implemented earlier to test both the right and left branches for the character you are testing.

//  encodeMe(Leaf('a',1),List('a','b'))
 //append the a and b to a list
  val tree=Fork(Leaf('a',2),Leaf('b',3),List('a', 'b'),5)
 // encodeMe(tree,List('a'))
  println("=================================");
  val em = encodeMe(tree,List('a','b'))
  println("em:"+em);
  println("================================")


 Now add the accumulator to the code and see you get a list which accumulates the values 0 and 1 as they are being added to the accumulator

def encodeMe(tree:CodeTree,charList:List[Char]):List[Int]={
 def encodeMeAcc(tree:CodeTree,testMe:Char,acc:List[Int]):List[Int]=tree match{
      case Leaf(c:Char,w:Int) =>println("Leaf char:"+c+" weight:"+w);println("leaf acc:"+acc);acc
      case Fork(left:CodeTree, right:CodeTree, charList:List[Char], weight:Int)=>{
        println("Fork testMe:"+testMe+" chars(left):"+chars(left)+" chars(right):"+chars(right))
        if(chars(left).contains(testMe)){
          println("left!!! add 0 to list");
         val addMe = acc++List(0)
          println("addMe:"+addMe);
          encodeMeAcc(left,testMe,addMe)
        }else //if(chars(right).contains(testMe)){
          println("rigth!!! add 1 to list");
          val addMe = acc++List(1)
          println("addMe:"+addMe);
          encodeMeAcc(right,testMe,addMe)
        }
       
   
    }
//     for(i<-charlist p="">//       println("i:"+i)
//       encodeMeAcc(tree,i,List[Int]())
 //    }
    charList.flatMap(x=>encodeMeAcc(tree,x,List[Int]()))
  }

  Note: you will have to use a flatMap vs. map because a map will reinitialize the list each time on each iteration. FlatMap doesn't do this. The for loop reinits the list each time also.

Clean up and simplify, change the List[Int] to List[Bit], etc....

def encodeMe(tree:CodeTree,charList:List[Char]):List[Int]={
 def encodeMeAcc(tree:CodeTree,testMe:Char,acc:List[Int]):List[Int]=tree match{
      case Leaf(c:Char,w:Int) =>;acc
      case Fork(left:CodeTree, right:CodeTree, charList:List[Char], weight:Int)=>{
        if(chars(left).contains(testMe)){
         val addMe = acc++List(0)
          encodeMeAcc(left,testMe,addMe)
        }else //if(chars(right).contains(testMe)){
          val addMe = acc++List(1)
          encodeMeAcc(right,testMe,addMe)
        }

    }
    charList.flatMap(x=>encodeMeAcc(tree,x,List[Int]()))
  }

Decode:
 //decode, first step iterate through the list
  def decodeA(tree: CodeTree, bits: List[Bit]): Unit = {
    def decodeAcc(tree: CodeTree, bits: List[Bit]):Unit = tree match{
      case Leaf(c:Char,w:Int) => println("leaf")
      case Fork(left:CodeTree, right:CodeTree, _,_ )=> bits match{
        case Nil=>println("Nil list")
        case x::xs=>println("x:"+x+" xs:"+xs); decodeAcc(tree, xs) 
        }
      }
    decodeAcc(tree,bits)
    }
    println("decode")
    decodeA(Fork(Leaf('a',2),Leaf('b',3),List('a', 'b'),5),List(0,1,0,1,0,1)) 
  

Test for 0 and 1, add the letter encoding to the accumulator.


 
    //decode, iterate through list, add accumulator
   def decodeB(ptree: CodeTree, bits: List[Bit]): List[Char] = {
    def decodeAcc(tree: CodeTree, bits: List[Bit],acc:List[Char]):List[Char] = tree match{
      case Leaf(c:Char,w:Int) => println("leaf c:"+c+" acc:"+acc ); decodeAcc(ptree,bits,List(c)++acc)
      case Fork(left:CodeTree, right:CodeTree, list:List[Char], w:Int )=> bits match{
        case Nil=>println("Nil"); acc
        case x::xs=>println("x:"+x+" xs:"+xs);
        if(x==0) {
          println("found 0 acc:"+acc)
          decodeAcc(left,xs,acc)
        }
        else{
          println("found 1 acc:"+acc)
          decodeAcc(right,xs,acc)
          }
        }
      }
    decodeAcc(tree,bits,List[Char]())
    }
    println("decode")
    val b = decodeB(Fork(Leaf('a',2),Leaf('b',3),List('a', 'b'),5),List(0,1,0,1,0,1))
    println("b:"+b);








No comments:

Post a Comment