ECE 515
Information Theory


Assignment 1 - Due September 28, 2018
  1. Consider two binary random variables X and Y with joint probability distribution p(x,y) given by
    p(0,0) = 1/2, p(0,1) = 1/4, p(1,0) = 0, and p(1,1) = 1/4.
    Find the values of H(X), H(Y), H(X|Y), H(Y|X), H(XY), and I(X;Y).
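All six quantities can be checked numerically. A short sketch (Python, with the joint distribution hard-coded from the problem; the marginal/chain-rule identities do the rest):

```python
import math

# Joint distribution p(x, y) from the problem statement.
p = {(0, 0): 1/2, (0, 1): 1/4, (1, 0): 0.0, (1, 1): 1/4}

def entropy(probs):
    """Entropy in bits; zero-probability terms contribute nothing."""
    return -sum(q * math.log2(q) for q in probs if q > 0)

# Marginals of X and Y.
px = [sum(v for (x, _), v in p.items() if x == xi) for xi in (0, 1)]
py = [sum(v for (_, y), v in p.items() if y == yi) for yi in (0, 1)]

HX, HY = entropy(px), entropy(py)
HXY = entropy(p.values())        # joint entropy H(XY)
HX_given_Y = HXY - HY            # chain rule: H(X|Y) = H(XY) - H(Y)
HY_given_X = HXY - HX
IXY = HX + HY - HXY              # mutual information I(X;Y)
```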
  2. A discrete memoryless source has symbols x1, ..., x6 with p(x1) = 1/2 and p(x2) = 1/4. Determine the upper and lower bounds on the entropy of this source.
  3. A jar contains 5 black balls and 10 white balls. Experiment X involves randomly drawing a ball out of the jar. Experiment Y involves randomly drawing a ball with the ball drawn in experiment X not replaced in the jar. One is interested in the colour of the drawn ball.
    (a) How much uncertainty does experiment X contain?
    (b) What is the uncertainty in experiment Y given that the first ball is black?
    (c) What is the uncertainty in experiment Y given that the first ball is white?
    (d) How much uncertainty does experiment Y contain?
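As a numerical check, the four uncertainties follow directly from the ball counts (5 black, 10 white, drawing without replacement):

```python
from math import log2

def H(*probs):
    """Entropy in bits of a finite distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# (a) first draw: 5 black, 10 white
HX = H(5/15, 10/15)
# (b) second draw given the first ball was black
HY_black = H(4/14, 10/14)
# (c) second draw given the first ball was white
HY_white = H(5/14, 9/14)
# H(Y|X) is the average of (b) and (c) weighted by the first draw
HY_given_X = (5/15) * HY_black + (10/15) * HY_white
# (d) marginally the second draw is black with probability 5/15,
# so H(Y) = H(X); conditioning reduces uncertainty: H(Y|X) <= H(Y)
HY = H(5/15, 10/15)
```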
  4. Let X be a random variable whose entropy H(X) is 8 bits. Suppose that Y(X) is a deterministic function that takes on a different value for each value of X.
    (a) What is H(Y)?
    (b) What is H(Y|X)?
    (c) What is H(X|Y)?
    (d) What is I(X;Y)?
    (e) Suppose now that the deterministic function Y(X) is not invertible, so that different values of X may correspond to the same value of Y(X). In this case, what can be said about H(Y), and also about H(X|Y)?
5. A Division Series is a five game baseball series held in October that ends as soon as a team wins 3 games. Define a random variable X that represents the outcome of the series between two teams A and B. Three possible values of X are BBB, ABAA, and BBAAA (many more outcomes can occur). Let Y be the number of games played, which ranges from 3 to 5. Assuming teams A and B are equally matched and that the games are independent, calculate H(X), H(Y), H(Y|X) and H(X|Y).
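The outcome space is small enough to enumerate. A sketch that builds the distribution of X and derives all four entropies (Y is a deterministic function of X, so H(Y|X) = 0 and H(XY) = H(X)):

```python
from itertools import product
from collections import Counter
from math import log2

def H(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

# Enumerate all series outcomes: play games until one team has 3 wins.
outcomes = set()
for seq in product("AB", repeat=5):
    a = b = 0
    for i, g in enumerate(seq):
        a += g == "A"
        b += g == "B"
        if a == 3 or b == 3:
            outcomes.add("".join(seq[: i + 1]))
            break

px = {s: 0.5 ** len(s) for s in outcomes}    # fair, independent games
HX = H(px.values())

py = Counter()
for s, p in px.items():
    py[len(s)] += p                          # Y = number of games played
HY = H(py.values())

HY_given_X = 0.0                             # Y is determined by X
HX_given_Y = HX - HY                         # since H(XY) = H(X)
```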
  6. In a country, 25% of the people are blond and 75% of all blond people have blue eyes. In addition, 50% of the people have blue eyes. How much information is received in each of the following cases:
    (a) if we know that a person is blond and we are told the colour (blue/not blue) of their eyes,
    (b) if we know that a person has blue eyes and we are told the colour (blond/not blond) of their hair,
    (c) if we are told both the colour of their hair and that of their eyes.
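Interpreting "information received" as the average uncertainty resolved, the three quantities can be computed from the stated percentages (a sketch; part (c) is taken as the entropy of the joint hair/eye distribution):

```python
from math import log2

def Hb(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

p_blond = 0.25
p_blue_given_blond = 0.75
p_blue = 0.50

p_blond_and_blue = p_blond * p_blue_given_blond          # 0.1875

# (a) eye colour revealed, person known to be blond
info_a = Hb(p_blue_given_blond)
# (b) hair colour revealed, person known to have blue eyes
info_b = Hb(p_blond_and_blue / p_blue)
# (c) both revealed: entropy of the joint (hair, eyes) distribution
joint = [p_blond_and_blue,
         p_blond - p_blond_and_blue,
         p_blue - p_blond_and_blue,
         1 - p_blond - p_blue + p_blond_and_blue]
info_c = -sum(p * log2(p) for p in joint if p > 0)
```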

Assignment 2 - Due October 12, 2018
  1. Let X and Y be random variables which are identically distributed but not necessarily independent. Define the correlation between X and Y as ρ = 1 - H(Y|X)/H(X).
    (a) Show that ρ = I(X;Y)/H(X).
    (b) Show that 0 ≤ ρ ≤ 1.
    (c) When is ρ = 0?
    (d) When is ρ = 1?
  2. Let XYZ be a discrete joint ensemble of three random variables. Prove the validity of the following inequalities and if true find conditions for equality:
    (a) I(XY;Z) ≥ I(Y;Z).
    (b) H(XY|Z) ≥ H(Y|Z).
    (c) I(X;Z|Y) ≥ I(Y;Z|X) + I(Z;X) - I(Y;Z).
    (d) H(XYZ) - H(YZ) ≤ H(XY) - H(Y).
  3. The following equiprobable messages are used to send data over a binary symmetric channel with crossover probability p:
    M1 = 0000 M2 = 0011 M3 = 0101 M4 = 0110
    M5 = 1001 M6 = 1010 M7 = 1100 M8 = 1111
    If the sequence Y=0000 is received,
    (a) how much information is provided about M1 by the first digit received?
    (b) how much additional information about M1 is provided by the second, then the third, and finally the fourth received digit?
  4. Course Notes Problem 1.3
  5. Course Notes Problem 1.7
  6. Relative entropy
    (a) Consider a random variable X with three possible outcomes {a, b, c}, and two distributions on X:


    Calculate H(p), H(q), D(p||q), and D(q||p), and verify in this case that D(p||q) ≠ D(q||p).
    (b) Part (a) shows that D(p||q) ≠ D(q||p) in general, but there could be distributions for which equality holds. Provide an example of distributions p(x) and q(x) on a binary alphabet such that D(p||q) = D(q||p) (other than the trivial case p(x) = q(x) for which D(p||q) = D(q||p) = 0).
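One family of examples for part (b): permuting the masses of a distribution, so that q is p with its two masses swapped. A quick check (the specific values 1/4 and 3/4 are an arbitrary choice):

```python
from math import log2

def D(p, q):
    """Relative entropy D(p||q) in bits; assumes q > 0 wherever p > 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = (1/4, 3/4)
q = (3/4, 1/4)       # q is p with the two masses swapped
# By symmetry of the construction, D(p||q) = D(q||p) = (1/2) log2 3,
# and both are strictly positive since p != q.
```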

Assignment 3 - Due November 2, 2018
  1. A source S has six symbols with probabilities p1 to p6. The probabilities pi are ordered such that p1 = p2 = p3 > p4 > p5 = p6. Construct a binary Huffman code for this source.
  2. A source has 9 different symbols u1 to u9 with respective probabilities 1/4, 1/4, 1/8, 1/8, 1/16, 1/16, 1/16, 1/32 and 1/32. This source must be coded using an alphabet of three symbols a, b and c.
(a) Determine the code and its efficiency using the methods of Fano, Shannon, Huffman and Tunstall. Compare these efficiencies.
    (b) Determine a code and the efficiency of the code for the case when the symbol c may never be followed by another c in the encoder output. This must be a prefix code, and have the highest possible efficiency.
  3. A source outputs symbols x1 to x7 with probabilities p(x1) = p(x2) = 1/3, p(x3) = p(x4) = 1/9, and p(x5) = p(x6) = p(x7) = 1/27.
    (a) Determine a binary Huffman code for this source and find the efficiency.
    (b) Determine a ternary Huffman code for this source and find the efficiency.
    (c) Which code is preferred based on the efficiency?
    (d) If it costs $1.80 to send a symbol over a binary channel and $2.70 to send a symbol over a ternary channel, which code is preferred based on cost?
  4. Is the length of the binary Huffman codeword for a symbol xk with probability p(xk) always less than or equal to lk = ⌈ -log2 p(xk) ⌉ ?
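One way to probe this question is to construct Huffman codes and compare against l_k. The sketch below breaks ties in favour of the most recently merged node (any tie-break yields a valid Huffman code); the distribution (1/3, 1/3, 1/4, 1/12) is a useful test case:

```python
import heapq
from itertools import count
from math import ceil, log2

def huffman_lengths(probs):
    """Codeword lengths of one valid binary Huffman code.
    Ties are broken in favour of the most recently merged node
    (decreasing counter), which is one legitimate choice."""
    tie = count(0, -1)
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    depth = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            depth[i] += 1       # every symbol under the merge goes one deeper
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return depth

probs = [1/3, 1/3, 1/4, 1/12]
lengths = huffman_lengths(probs)
shannon = [ceil(-log2(p)) for p in probs]   # the bound l_k from the question
```

With this tie-break the code has lengths (1, 2, 3, 3) and expected length 2, the same as the alternative (2, 2, 2, 2) code, so both are optimal; comparing `lengths` with `shannon` for the 1/4 symbol is instructive.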
5. A source generates zeros and ones with probabilities p(0) = 0.8 and p(1) = 0.2. The sequence of symbols is encoded with the code symbols a, b, c, d and e according to the following table.

message    code symbol

    (a) Is this code uniquely decodable? Is it a prefix code?
    (b) What is the average amount of information obtained per code symbol?
    (c) What is the efficiency of this code?
  6. Which of the following codes cannot be a binary Huffman code for any probability distribution?
    (a) C1 = {0,10,11}
    (b) C2 = {00,01,10,110}
    (c) C3 = {01,10}
    (d) C4 = {1,10,100,1000,0000}
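Two necessary conditions narrow this down quickly: a binary Huffman code is prefix-free and complete (its Kraft sum is exactly 1), and any prefix-free code with Kraft sum 1 is Huffman-optimal for the dyadic distribution matching its lengths. A quick check:

```python
def kraft_sum(code):
    """Kraft sum of a binary code; equals 1 iff the code tree is full."""
    return sum(2.0 ** -len(w) for w in code)

def is_prefix_free(code):
    """True iff no codeword is a prefix of another."""
    return not any(a != b and b.startswith(a) for a in code for b in code)

codes = {
    "C1": ["0", "10", "11"],
    "C2": ["00", "01", "10", "110"],
    "C3": ["01", "10"],
    "C4": ["1", "10", "100", "1000", "0000"],
}
results = {name: (kraft_sum(c), is_prefix_free(c)) for name, c in codes.items()}
```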
Assignment 4 - Due November 16, 2018
  1. For a codeword alphabet J, let p(X) be the true pdf with compact code C and q(X) be the estimated pdf with compact code Ĉ. Show that
(H(p(X)) + D(p(X)||q(X))) / log_b J ≤ L(Ĉ) < (H(p(X)) + D(p(X)||q(X))) / log_b J + 1.
2. A source has symbols x1, x2 and x3 with probabilities 0.7, 0.2 and 0.1, respectively. Encode this source using an arithmetic code with the intervals assigned so that the larger probabilities are on the left. Find the smallest possible binary codeword corresponding to the symbol sequence x1 x1 x2 x3 x2 x1 x3 x1. Compare this with the expected codeword length based on the entropy.
  3. A source has symbols x1, x2 and x3 with probabilities 0.2, 0.3 and 0.5, respectively. Decode the arithmetic code sequence .63215699 for these probabilities. Assume the symbols are arranged in numeric order (x1,x2,x3) in the range [0,1), and the number of encoded symbols is 16.
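A floating-point sketch of the decoder (adequate at this precision; a practical arithmetic coder would use integer interval arithmetic to avoid rounding):

```python
def arithmetic_decode(value, probs, n_symbols):
    """Decode n_symbols from a single number in [0, 1).
    Symbols are numbered 1..len(probs) in numeric order, with
    their sub-intervals laid out left to right."""
    cum = [0.0]
    for p in probs:
        cum.append(cum[-1] + p)
    low, width = 0.0, 1.0
    out = []
    for _ in range(n_symbols):
        target = (value - low) / width        # position inside current interval
        k = max(i for i in range(len(probs)) if cum[i] <= target)
        out.append(k + 1)                     # symbol x_{k+1}
        low += width * cum[k]                 # zoom into the chosen sub-interval
        width *= probs[k]
    return out

seq = arithmetic_decode(0.63215699, [0.2, 0.3, 0.5], 16)
```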
  4. Using the Lempel-Ziv algorithm in the course notes, encode the sequence 000101110010100101
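A sketch of LZ78-style phrase parsing (the variant in the course notes may number phrases or emit bits differently, so treat this only as a structural illustration):

```python
def lz78_parse(s):
    """Split s into phrases, each (index of longest previous phrase, new symbol)."""
    dictionary = {"": 0}        # phrase -> index; index 0 is the empty phrase
    phrases = []
    w = ""
    for c in s:
        if w + c in dictionary:
            w += c              # keep extending the current match
        else:
            phrases.append((dictionary[w], c))
            dictionary[w + c] = len(dictionary)
            w = ""
    if w:                       # a final phrase that matched the dictionary
        phrases.append((dictionary[w], ""))
    return phrases

phrases = lz78_parse("000101110010100101")
```

For this sequence the parsing is 0, 00, 1, 01, 11, 001, 010, 0101 — eight phrases.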
  5. A discrete memoryless channel (DMC) has transition probability matrix

    | .978 .002 |
    | .020 .020 |
    | .002 .978 |

    (a) Determine the capacity of this channel.
    (b) The middle row of the transition probability matrix can be considered the probabilities of an erasure. Suppose an erasure is equally likely to be correct or in error. What is the capacity of the corresponding channel without erasures?
    (c) Suppose that the errors are all erasures. What is the capacity of the corresponding channel?
    (d) Compare the capacities of the three channels.
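For part (a), the capacity can be found numerically with the Blahut-Arimoto iteration (a sketch; the matrix below is the problem's matrix transposed so that each row is p(y|x) for one input):

```python
from math import log2

# Transition probabilities p(y|x), one row per input.
P = [[0.978, 0.020, 0.002],
     [0.002, 0.020, 0.978]]

def blahut_arimoto(P, iters=200):
    """Blahut-Arimoto iteration for DMC capacity, in bits per channel use."""
    nx, ny = len(P), len(P[0])
    p = [1.0 / nx] * nx                      # start from the uniform input
    for _ in range(iters):
        # output distribution induced by the current input distribution
        q = [sum(p[x] * P[x][y] for x in range(nx)) for y in range(ny)]
        # c[x] = 2 ** D(P(.|x) || q)
        c = [2 ** sum(P[x][y] * log2(P[x][y] / q[y])
                      for y in range(ny) if P[x][y] > 0)
             for x in range(nx)]
        Z = sum(p[x] * c[x] for x in range(nx))
        p = [p[x] * c[x] / Z for x in range(nx)]
    return log2(Z)

C = blahut_arimoto(P)
```

Since this channel is symmetric, the uniform input is already optimal and the iteration converges immediately.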
  6. Course Notes Problem 3.2
Assignment 5 - Due November 30, 2018
  1. Consider the following channel.
    (a) Find the channel capacity.
(b) If p=1/2, determine an input distribution that allows error-free communications at capacity.
  2. A discrete memoryless channel is additive modulo L if it has input and output alphabets 0, 1, ..., L - 1, and the output y is given by (x + z) modulo L. The noise variable z also takes the values 0, 1, ..., L - 1, and is statistically independent of the input.
    (a) Show that I(X;Y) = H(Y) - H(Z).
    (b) Find the capacity in terms of H(Z) and the corresponding input probabilities.
  3. Consider the construction of a binary code of length n = 6 with M codewords which can correct all single errors.
    (a) How many binary vectors have a distance less than or equal to 1 from a given codeword cm?
    (b) What is the maximum possible number of codewords M?
    (c) Assuming that this maximum value of M can be achieved, what is the corresponding code rate R?
    (d) Show that a code with M = 8 exists.
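Parts (a), (b) and (d) can be checked mechanically. The generator matrix used below for (d) is a standard [6,3] shortened-Hamming choice, not necessarily the one intended by the course:

```python
from itertools import product
from math import comb

n = 6
# (a) a Hamming sphere of radius 1 around a codeword holds 1 + C(6,1) = 7 vectors
sphere = 1 + comb(n, 1)
# (b) sphere-packing bound: M * sphere <= 2^n
M_bound = 2 ** n // sphere

# (d) an explicit single-error-correcting code with M = 8 codewords
G = [(1, 0, 0, 1, 1, 0),
     (0, 1, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1)]

def xor(u, v):
    return tuple((a + b) % 2 for a, b in zip(u, v))

code = []
for bits in product((0, 1), repeat=3):
    cw = (0,) * n
    for b, row in zip(bits, G):
        if b:
            cw = xor(cw, row)
    code.append(cw)

# minimum distance over all pairs; dmin >= 3 corrects all single errors
dmin = min(sum(a != b for a, b in zip(u, v))
           for i, u in enumerate(code) for v in code[i + 1:])
```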
4. Consider the binary code C composed of the following four codewords: C = {(00100), (10010), (01001), (11111)}
    (a) Is this code linear? Prove your answer.
    (b) What is the minimum distance of this code?
    (c) What is the maximum number of errors this code can correct?
    (d) What is the maximum number of errors this code can detect?
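All four parts can be verified by brute force over the four codewords (a sketch; for a nonlinear code the guaranteed correction and detection radii still follow from the minimum distance):

```python
code = ["00100", "10010", "01001", "11111"]

def xor(u, v):
    return "".join(str(int(a) ^ int(b)) for a, b in zip(u, v))

# (a) linearity: the code must be closed under XOR; in particular it
# must contain the all-zero word, since u XOR u = 00000
is_linear = all(xor(u, v) in code for u in code for v in code)

# (b) minimum distance over all pairs
dmin = min(sum(a != b for a, b in zip(u, v))
           for i, u in enumerate(code) for v in code[i + 1:])

t_correct = (dmin - 1) // 2     # (c) guaranteed correctable errors
t_detect = dmin - 1             # (d) guaranteed detectable errors
```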
  5. Find the length, dimension, rate, and minimum distance for the code defined by the following generator matrix
    Determine a systematic generator matrix for this code and give a parity check matrix for this systematic generator matrix.
  6. The parity check matrix of a code with dmin=3 is given by
    (a) Determine a syndrome decoding table for this code and use it to decode the following received vectors
    (b) r = 110011
    (c) r = 111000