Introduction to Greedy Algorithms: Huffman Codes

Introduction to Greedy Algorithms: Huffman Codes. Yufei Tao, ITEE, University of Queensland.

In computer science, one interesting method to design algorithms is to go greedy, namely, to keep making the choice that brings the best benefit at the current moment. Of course, just as in real life, greediness does not always serve us right: after all, what seems best to do now may not really be the best from a global point of view. Nevertheless, there are problems where the greedy approach works well, sometimes even optimally! In this lecture, we will study one such problem, which is also a fundamental problem in coding theory. Greedy algorithms will be explored further in COMP4500, i.e., the advanced version of this course. This lecture also serves as a preview for that course.

Coding Suppose that we have an alphabet Σ (like the English alphabet). The goal of coding is to map each letter of Σ to a binary string, called a codeword, so that documents can be transmitted electronically. For example, suppose Σ = {a, b, c, d, e, f}. Assume that we agree on a = 000, b = 001, c = 010, d = 011, e = 100, and f = 101. Then, a word such as bed will be encoded as 001100011. We can, however, achieve better coding efficiency (i.e., produce shorter digital documents) if the frequencies of the letters are known. In general, more frequent letters should be encoded with fewer bits. The next slide shows an example.
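To make the fixed-length scheme concrete, here is a minimal Python sketch (the table and function names are our own, not part of the lecture):

# A minimal sketch of the fixed-length code above (names are our own).
CODE = {'a': '000', 'b': '001', 'c': '010',
        'd': '011', 'e': '100', 'f': '101'}

def encode(word):
    # Concatenate the codeword of each letter.
    return ''.join(CODE[ch] for ch in word)

print(encode('bed'))  # prints 001100011, as in the slide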

Example Suppose we know that the frequencies of a, b, c, d, e, f are 0.1, 0.2, 0.13, 0.09, 0.4, 0.08, respectively. If we encode each letter with 3 digits, then the average number of digits per letter is apparently 3. However, if we adopt the encoding a = 100, b = 111, c = 101, d = 1101, e = 0, f = 1100, the average number of digits per letter is: 3 · 0.1 + 3 · 0.2 + 3 · 0.13 + 4 · 0.09 + 1 · 0.4 + 4 · 0.08 = 2.37. So in the long run, the new encoding is expected to save 1 − 2.37/3 = 21% of the bits!
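As a quick sanity check, the following sketch (variable names are our own) recomputes the average length and the saving:

# Recomputing the average length of the variable-length code (a sketch).
freq = {'a': 0.1, 'b': 0.2, 'c': 0.13, 'd': 0.09, 'e': 0.4, 'f': 0.08}
code = {'a': '100', 'b': '111', 'c': '101', 'd': '1101', 'e': '0', 'f': '1100'}

avg = sum(freq[ch] * len(code[ch]) for ch in freq)
print(round(avg, 2))          # 2.37 bits per letter on average
print(round(1 - avg / 3, 2))  # 0.21, i.e., a 21% saving over the 3-bit code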

Example You would probably ask: why not just encode the letters as e = 0, b = 1, c = 00, a = 01, d = 10, f = 11, namely, encode the more frequent letters using as few bits as possible? The answer is: you cannot decode a document unambiguously! For example, consider the string 10: how do you know whether it is the two letters be, or just the one letter d? This issue arises because the codeword of one letter happens to be a prefix of the codeword of another letter. We, therefore, should prevent this, which has led to an important class of codes in coding theory: the prefix codes (actually, prefix-free codes would have been a more appropriate name, but prefix codes has become the standard).

Example Consider once again our earlier encoding: a = 100, b = 111, c = 101, d = 1101, e = 0, f = 1100. Observe that the encoding is prefix free, and hence, allows unambiguous decoding. For example, what does the following binary string say? 10011010100110011011001101
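The following sketch shows why prefix freeness makes decoding unambiguous: we can scan the bits left to right and emit a letter the moment the buffer matches a codeword, since no other codeword can begin with that match. (The function name decode is our own.) You can use it to check your answer:

# A sketch of unambiguous decoding for a prefix code.
code = {'a': '100', 'b': '111', 'c': '101', 'd': '1101', 'e': '0', 'f': '1100'}
letter_of = {w: ch for ch, w in code.items()}  # invert the code table

def decode(bits):
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in letter_of:   # a complete codeword: the only possible match
            out.append(letter_of[buf])
            buf = ''
    assert buf == '', 'the input is not a valid code sequence'
    return ''.join(out)

print(decode('10011010100110011011001101'))  # check your answer here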

The Prefix Coding Problem An encoding of the letters in an alphabet Σ is a prefix code if no codeword is a prefix of another codeword. For each letter σ ∈ Σ, let freq(σ) denote the frequency of σ. Also, denote by l(σ) the number of bits in the codeword of σ. Given an encoding, its average length is calculated as ∑_{σ ∈ Σ} freq(σ) · l(σ). The objective of the prefix coding problem is to find a prefix code for Σ that has the smallest average length.

A Binary Tree View Let us start to attack the prefix coding problem (which may seem pretty hard at this moment). The first observation is that every prefix code can be represented as a binary tree T. Specifically, at each internal node of T, the edge to its left child corresponds to 0, and the edge to its right child corresponds to 1. Every letter σ ∈ Σ corresponds to a unique leaf node z, such that the sequence of bits on the edges from the root to z spells out the codeword of σ.

Example Consider once again our earlier encoding: a = 100, b = 111, c = 101, d = 1101, e = 0, f = 1100. In the corresponding binary tree, e is a leaf at depth 1 (reached by the edge 0 from the root); a, c, b are leaves at depth 3 (paths 100, 101, 111); and f, d are leaves at depth 4 (paths 1100, 1101). Think: why must every letter be at a leaf? (Hint: prefix free.)
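One way to see the tree view in code: in the sketch below, a leaf is a letter and an internal node is a (left, right) pair, with left edges reading 0 and right edges reading 1. The nested-tuple representation is our own choice, not from the lecture. Walking the tree recovers the codewords:

# The example tree: a leaf is a letter; an internal node is a (left, right)
# pair. Left edges read 0, right edges read 1.
tree = ('e', (('a', 'c'), (('f', 'd'), 'b')))

def codewords(node, prefix=''):
    if isinstance(node, str):  # leaf: the root-to-leaf path is the codeword
        return {node: prefix}
    left, right = node
    result = codewords(left, prefix + '0')
    result.update(codewords(right, prefix + '1'))
    return result

print(codewords(tree))
# {'e': '0', 'a': '100', 'c': '101', 'f': '1100', 'd': '1101', 'b': '111'}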

Average Length from the Binary Tree Let T be the binary tree capturing the encoding. Given a letter σ ∈ Σ, let us denote by d(σ) the depth of σ, that is, the level of its leaf in T (i.e., how many edges the leaf is away from the root). Clearly, the average length of the encoding equals ∑_{σ ∈ Σ} d(σ) · freq(σ).

Example Consider again the tree of our earlier encoding. The depths of e, a, c, f, d, b are 1, 3, 3, 4, 4, 3, respectively. The average length of the encoding equals freq(e) · 1 + freq(a) · 3 + freq(c) · 3 + freq(f) · 4 + freq(d) · 4 + freq(b) · 3.

Huffman's Algorithm Next, we will present a surprisingly simple algorithm for solving the prefix coding problem. The algorithm constructs a binary tree (which gives the encoding) in a bottom-up manner. Let n = |Σ|. At the beginning, there are n separate nodes, each corresponding to a different letter in Σ. If letter σ corresponds to a node z, define the frequency of z to be freq(σ). Let S be the set of these n nodes.

Huffman's Algorithm Then, the algorithm repeats the following until S has a single node left: 1. Remove from S the two nodes u1, u2 with the smallest frequencies. 2. Create a node v that has u1, u2 as children. Set the frequency of v to be the sum of the frequencies of u1 and u2. 3. Insert v into S. When S has only one node left, we have obtained the target binary tree. The prefix code thus derived is known as a Huffman code.
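The loop above translates almost line by line into Python. The sketch below (function and variable names are our own) keeps S in a binary heap so that the two smallest frequencies can be removed efficiently; a counter breaks frequency ties so that the heap never tries to compare two subtrees:

import heapq
from itertools import count

def huffman_tree(freq):
    # Build a Huffman tree from a {letter: frequency} map. Leaves are
    # letters; internal nodes are (left, right) pairs, as before.
    tiebreak = count()  # unique sequence numbers; the heap never compares trees
    S = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(S)
    while len(S) > 1:
        f1, _, u1 = heapq.heappop(S)  # the two nodes with the
        f2, _, u2 = heapq.heappop(S)  # smallest frequencies
        heapq.heappush(S, (f1 + f2, next(tiebreak), (u1, u2)))
    return S[0][2]                    # the root of the final tree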

Example Consider our earlier example where the frequencies of a, b, c, d, e, f are 0.1, 0.2, 0.13, 0.09, 0.4, 0.08, respectively. At the beginning, S has 6 nodes: a(10), b(20), c(13), d(9), e(40), f(8), where the number in parentheses is the frequency of the node expressed as a percentage (e.g., 10 means 10%).

Example Merge the two nodes with the smallest frequencies, 8 and 9. Now S has 5 nodes {a, b, c, e, u1}, where the new node u1(17) has f(8) and d(9) as children.

Example Merge the two nodes with the smallest frequencies, 10 and 13. Now S has 4 nodes {b, e, u1, u2}, where the new node u2(23) has a(10) and c(13) as children.

Example Merge the two nodes with the smallest frequencies, 17 and 20. Now S has 3 nodes {e, u2, u3}, where the new node u3(37) has u1(17) and b(20) as children.

Example Merge the two nodes with the smallest frequencies, 23 and 37. Now S has 2 nodes {e, u4}, where the new node u4(60) has u2(23) and u3(37) as children.

Example Merge the two remaining nodes, e(40) and u4(60). Now S has a single node left: the root(100). This is the final binary tree, from which the encoding can now be derived.
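For illustration, combining the two sketches from earlier (huffman_tree and codewords, assuming both are in scope) reproduces exactly the encoding we have been using; each merge in the loop corresponds to one slide above (f+d first, then a+c, and so on). Note that with different tie-breaking, Huffman's algorithm could output a different, equally optimal tree:

# Assuming huffman_tree and codewords from the earlier sketches are in scope:
freq = {'a': 10, 'b': 20, 'c': 13, 'd': 9, 'e': 40, 'f': 8}
print(codewords(huffman_tree(freq)))
# {'e': '0', 'a': '100', 'c': '101', 'f': '1100', 'd': '1101', 'b': '111'}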

It should be fairly straightforward for you to implement the algorithm in O(n log n) time, where n = |Σ| (e.g., by storing S in a binary heap). Think: why do we say the algorithm is greedy? Next, we prove that the algorithm indeed gives an optimal prefix code, i.e., one that has the smallest average length among all possible prefix codes.

Crucial Property 1 Lemma: Let T be the binary tree corresponding to an optimal prefix code. Then, every internal node of T must have two children. Proof: Suppose that the lemma is not true. Then, there is an internal node u with only one child node v. Imagine removing u as follows: if u is the root, simply make v the new root; otherwise, make v a child node of the parent of u. The removal decreases the depth of every leaf below v by 1, and hence generates a new binary tree whose average length is smaller than that of T, which contradicts the fact that T is optimal.

Crucial Property 2 Lemma: Let σ1 and σ2 be two letters in Σ with the lowest frequencies. There exists an optimal prefix code whose binary tree has σ1 and σ2 as two sibling leaves at the deepest level. Proof: Take an arbitrary optimal prefix code with binary tree T. If σ1 and σ2 are indeed sibling leaves at the deepest level, then the claim already holds. Next, we assume that this is not the case. Suppose T has height h; in other words, the deepest leaves have depth h − 1. Take an arbitrary internal node p at level h − 2 that has a child at level h − 1. By the previous lemma, p must have two children, both of which are leaves (at level h − 1). Let σ′1 and σ′2 be the letters corresponding to those leaves.

Crucial Property 2 Proof (cont.): Now swap σ1 with σ′1, and σ2 with σ′2, which gives a new binary tree T′. Note that T′ has σ1 and σ2 as sibling leaves at the deepest level. How does the average length of T′ compare with that of T? As the frequency of σ1 is no higher than that of σ′1, swapping the two letters can only decrease the average length of the tree (i.e., we are assigning the shorter codeword to the more frequent letter); concretely, the swap changes the average length by (freq(σ′1) − freq(σ1)) · (d(σ1) − d(σ′1)) ≤ 0. Similarly, the other swap can only decrease the average length. It follows that the average length of T′ is no larger than that of T, meaning that T′ is optimal as well.

Optimality of Huffman Coding We are now ready to prove: Theorem: Huffman's algorithm produces an optimal prefix code. Proof: We prove by induction on the size n of the alphabet Σ. Base Case: n = 2. In this case, the algorithm encodes one letter with 0 and the other with 1, which is clearly optimal. General Case: Assuming that the theorem holds for n = k − 1 (k ≥ 3), next we show that it also holds for n = k.

Optimality of Huffman Coding Proof (cont.): Let σ1 and σ2 be two letters with the lowest frequencies. From Property 2, we know that there is an optimal prefix code whose binary tree T has σ1 and σ2 as two sibling leaves at the deepest level. Let p be the parent of σ1 and σ2. Construct a new alphabet Σ′ that includes all the letters in Σ except σ1 and σ2, plus a new letter p whose frequency equals freq(σ1) + freq(σ2). Let T′ be the tree obtained by removing the leaf nodes σ1 and σ2 from T (thus making p a leaf); T′ gives a prefix code for Σ′. Let T′′ be the binary tree obtained by Huffman's algorithm on Σ′. Since |Σ′| = k − 1, the inductive hypothesis tells us that T′′ is optimal, meaning that avg length of T′′ ≤ avg length of T′.

Optimality of Huffman Coding Proof (cont.): Now consider the binary tree T′′′ produced by Huffman's algorithm on Σ. Since σ1 and σ2 have the lowest frequencies, the algorithm merges them first; hence, T′′′ extends T′′ by simply putting σ1 and σ2 as child nodes of the leaf p. As σ1 and σ2 sit one level deeper than p did, the average length grows by exactly freq(σ1) + freq(σ2): avg length of T′′′ = avg length of T′′ + freq(σ1) + freq(σ2) ≤ avg length of T′ + freq(σ1) + freq(σ2) = avg length of T, where the last equality follows from the same one-level-deeper argument applied to T and T′. This indicates that T′′′ also gives an optimal prefix code.