ECE 515
Information Theory


Assignment Submission


Assignments should be submitted on Brightspace: https://bright.uvic.ca/d2l/home

Assignments

Assignment 1 - Due October 4, 2022 (Solutions)
  1. A discrete memoryless source X has symbols x1, x2, …, x9 with p(x1) = 1/2 and p(x2) = p(x3) = 1/8. Determine the upper and lower bounds on its entropy and the corresponding symbol probabilities.
  2. Consider an alphabet of 9 symbols with probabilities as follows: p(A) = 1/2, p(B) = 1/4, p(C) = 1/8, p(D) = 1/16, p(E) = 1/32, p(F) = 1/64, p(G) = 1/128, p(H) = 1/256, and p(I) = 1/256.
    a) If someone has selected one of these symbols and you need to discover which symbol it is by asking yes/no questions that will be truthfully answered, what is the most efficient sequence of asking such questions in order to discover the selected symbol?
    b) Show that each of these proposed questions is maximally informative.
    c) On average, how many such questions will need to be asked before the selected symbol is discovered?
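A quick numerical check for Problem 2 (a minimal Python sketch, not part of the assignment): the distribution is dyadic, so a bisection-style questioning strategy resolves symbol s in log2(1/p(s)) questions and the average number of questions equals the entropy exactly.

```python
from math import log2

# Symbol probabilities from Problem 2 (all powers of 1/2, i.e. dyadic).
p = {'A': 1/2, 'B': 1/4, 'C': 1/8, 'D': 1/16, 'E': 1/32,
     'F': 1/64, 'G': 1/128, 'H': 1/256, 'I': 1/256}

# H(X) = sum p(x) log2(1/p(x)); for a dyadic distribution each optimal
# yes/no question splits the remaining probability mass exactly in half,
# so symbol s is found after log2(1/p(s)) questions.
H = sum(q * log2(1 / q) for q in p.values())
print(f"H(X) = average number of questions = {H} bits")
```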
  3. Course Notes Problem 1.1
  4. Consider two binary random variables X and Y with joint probability distribution p(xi,yj) given by
    p(0,0) = 1/3, p(0,1) = 1/3, p(1,0) = 0, and p(1,1) = 1/3.
    Find the value of H(X), H(Y), H(X|Y), H(Y|X), H(XY), and I(X;Y).
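The quantities in Problem 4 follow from the chain rule H(X|Y) = H(XY) - H(Y) and the identity I(X;Y) = H(X) + H(Y) - H(XY); a minimal numerical check:

```python
from math import log2
import numpy as np

# Joint distribution p(x, y) from Problem 4: rows index x, columns index y.
P = np.array([[1/3, 1/3],
              [0.0, 1/3]])

def H(probs):
    """Entropy in bits, skipping zero-probability entries."""
    return -sum(p * log2(p) for p in np.ravel(probs) if p > 0)

HX, HY, HXY = H(P.sum(axis=1)), H(P.sum(axis=0)), H(P)
print(f"H(X)   = {HX:.4f}")
print(f"H(Y)   = {HY:.4f}")
print(f"H(X|Y) = {HXY - HY:.4f}")        # chain rule
print(f"H(Y|X) = {HXY - HX:.4f}")
print(f"H(XY)  = {HXY:.4f}")
print(f"I(X;Y) = {HX + HY - HXY:.4f}")   # mutual information
```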
  5. Of a group of students, 25% are not suitable for the University of Victoria. As a result of an admission selection, however, only 75% of these unsuitable students are rejected, and 50% of all students are rejected.
    (a) How much information does one receive if a student, who knows that he is not suitable for UVic, hears the results of the selection?
    (b) Answer the previous question if the selection is determined by tossing a coin.
    (c) Compare the results in (a) and (b) and give an explanation for the difference.
  6. The record of the UVic weatherperson is given in the table below.

                Actual
    Prediction  Rain   No Rain
    Rain        1/8    3/16
    No Rain     1/16   10/16

    The values indicate the frequencies of the events. An engineering student notices that the weatherperson is right only 12/16 of the time, but could be right 13/16 of the time by always predicting no rain. The student explains the situation and applies to be the new weatherperson, but the university turns him down. Using information theory, explain why.
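A sketch of the information-theoretic view of Problem 6: the forecast is useful only insofar as it shares mutual information with the actual weather, and a constant prediction shares none, whatever its accuracy.

```python
from math import log2
import numpy as np

# Joint frequencies (prediction, actual) from the table above.
P = np.array([[2/16, 3/16],     # predict rain:    [actual rain, actual no rain]
              [1/16, 10/16]])   # predict no rain

def H(probs):
    return -sum(p * log2(p) for p in np.ravel(probs) if p > 0)

I = H(P.sum(axis=1)) + H(P.sum(axis=0)) - H(P)
print(f"I(prediction; actual) = {I:.4f} bits")   # positive for the weatherperson

# "Always predict no rain" is a constant and so carries zero mutual
# information about the weather, despite being right 13/16 of the time.
```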
  7. The inhabitants of a town can be divided into two groups, C and H. Half the people in group C tell the truth, three-tenths lie, and two-tenths refuse to answer. In group H, three-tenths of the people tell the truth, half lie, and two-tenths refuse to answer. Let p be the probability that a randomly selected person in the town belongs to group C. Let I = I(p) be the information conveyed about a person's truth-telling status by specifying the group they belong to. Find the maximum possible value of I and the percentage of people in group C corresponding to this maximum.
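The maximum in Problem 7 can be approximated by a brute-force search over p (a sketch only; the assignment presumably expects the analytical optimum):

```python
from math import log2
import numpy as np

def H(probs):
    return -sum(q * log2(q) for q in probs if q > 0)

pC = np.array([0.5, 0.3, 0.2])   # P(truth), P(lie), P(refuse) in group C
pH = np.array([0.3, 0.5, 0.2])   # the same probabilities in group H

def I(p):
    """I(status; group) = H(status) - H(status | group) when P(C) = p."""
    mix = p * pC + (1 - p) * pH
    return H(mix) - (p * H(pC) + (1 - p) * H(pH))

grid = np.linspace(0.001, 0.999, 9999)
best = max(grid, key=I)
print(f"I is maximized near p = {best:.3f}, I = {I(best):.6f} bits")
```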
Assignment 2 - Due October 17, 2022 (Solutions)
  1. a) Consider a fair coin. When it is flipped, what is the mutual information between the top side and the bottom side of the coin?
    b) Consider a 6-sided fair die. When it is rolled, what is the mutual information between the top side and the front face (the side facing you)?
  2. Course Notes Problem 1.2
  3. Let ABC be a discrete joint ensemble of three random variables. Determine whether each of the following inequalities is valid and, if so, find the conditions for equality. Venn diagrams are not sufficient.
    a) I(AB;C) ≥ I(A;C)
    b) H(AB|C) ≥ H(A|C)
    c) I(A;C|B) ≥ I(C;B|A) - I(C;B) + I(A;C)
    d) H(ABC) - H(AB) ≤ H(AC) - H(A)
  4. Let X and Y be random variables with values x1, x2, ..., xM and y1, y2, ..., yL, respectively. Let Z = X + Y.
    a) Show that H(Z|X) = H(Y|X).
    b) Show that if X and Y are independent, H(Y) ≤ H(Z) and H(X) ≤ H(Z).
    c) Give an example in which H(X) > H(Z) and H(Y) > H(Z).
  5. The entropy of a discrete random variable is viewed as a measure of uncertainty. Verify this interpretation by proving that any transfer of probability from one symbol to another which makes the distribution more uniform will increase the entropy of the random variable.
  6. Consider a three-sided coin with outcomes a, b, and c having unknown probabilities p(a), p(b), and p(c), respectively.
    (a) Devise a means of using two flips of this three-sided coin to simulate flipping a fair two-sided coin. This should maximize the number of bits generated per flip of the three-sided coin.
    (b) Can 3 flips of the three-sided coin be used to increase the number of bits generated per flip compared to 2 flips?
    (c) Can 4 flips of the three-sided coin be used to increase the number of bits generated per flip compared to 2 flips?
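A classical starting point for Problem 6(a) is a von Neumann-style extractor: for any biases, the two orderings of an unequal pair of flips are equally likely, so their order is a fair bit. The sketch below (with a hypothetical biased coin for testing) illustrates the idea; whether this maximizes the bits per flip, and whether 3 or 4 flips can beat it, is exactly what the problem asks you to analyze.

```python
import random

def fair_bit(flip):
    """One fair bit from a three-sided coin with unknown biases: (x, y) and
    (y, x) with x != y are equally likely, so the order is a fair bit.
    Equal pairs carry no order information and are discarded."""
    while True:
        a, b = flip(), flip()
        if a != b:
            return 0 if a < b else 1

# Hypothetical biased three-sided coin, used only to test the extractor.
def biased_flip():
    return random.choices('abc', weights=[0.6, 0.3, 0.1])[0]

bits = [fair_bit(biased_flip) for _ in range(100_000)]
print(f"fraction of ones: {sum(bits) / len(bits):.4f}")   # should be near 0.5
```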
  7. Consider the function φ(X,Y) = H(Y|X) + H(X|Y).
    (a) Determine whether this function satisfies the following properties of a metric. Venn diagrams are not sufficient.
            (i) φ(X,Y) ≥ 0
            (ii) φ(X,Y) = φ(Y,X)
            (iii) φ(X,Y) + φ(Y,Z) ≥ φ(X,Z)
    (b) Show that φ(X,Y) = 2H(XY) - H(X) - H(Y). Venn diagrams are not sufficient.
Assignment 3 - Due November 8, 2022 (Solutions)
  1. Let X, Y, Z be three random variables with a joint probability mass function p(x, y, z). The relative entropy between the joint distribution and the product of the individual distributions is D(p(x, y, z)||p(x)p(y)p(z)).
    a) Expand this in terms of entropies.
    b) When is this quantity zero?
  2. Consider a random variable X with four possible outcomes {x1, x2, x3, x4} and two probability distributions on X:
    p(x1) = .5, p(x2) = .25, p(x3) = .125, p(x4) = .125
    q(x1) = .5, q(x2) = a, q(x3) = b, q(x4) = c
    (a) Find the values of a, b, and c that minimize the cross entropy H(p,q) and the corresponding value of H(p,q). Show that this value is equal to H(p(X)) + D(p(X)||q(X)).
    (b) Suppose we know that c = 0.25. Find the values of a and b that minimize the cross entropy H(p,q) and the corresponding value of H(p,q). Show that this value is equal to H(p(X)) + D(p(X)||q(X)).
    (c) Compare the values of H(p,q) in (a) and (b) and provide an interpretation.
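The identity H(p,q) = H(p) + D(p||q) used in Problem 2 is easy to verify numerically for any candidate q; a minimal sketch (the chosen a, b, c are arbitrary, not the answer):

```python
from math import log2

p = [0.5, 0.25, 0.125, 0.125]
q = [0.5, 0.3, 0.1, 0.1]          # arbitrary candidate: a = .3, b = .1, c = .1

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i log2 q_i."""
    return -sum(pi * log2(qi) for pi, qi in zip(p, q) if pi > 0)

def kl(p, q):
    """D(p || q) = sum_i p_i log2(p_i / q_i)."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

Hp = cross_entropy(p, p)          # H(p): cross entropy of p with itself
print(f"H(p)     = {Hp:.4f}")
print(f"H(p, q)  = {cross_entropy(p, q):.4f}")
print(f"H(p) + D = {Hp + kl(p, q):.4f}")   # matches H(p, q)
```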
  3. Determine which of the following codes are uniquely decodable and which are prefix codes.
    (a) {0,10,11}
    (b) {0,01,11}
    (c) {0,01,10}
    (d) {0,01}
    (e) {110,01,10,11}
    (f) {00,01,10,11}
    (g) {0, 00, 000, 0000}
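Problem 3 can be cross-checked mechanically: prefix codes by direct inspection, unique decodability by the Sardinas-Patterson test. A minimal sketch:

```python
def is_prefix_code(code):
    """True if no codeword is a prefix of another codeword."""
    return not any(a != b and b.startswith(a) for a in code for b in code)

def is_uniquely_decodable(code):
    """Sardinas-Patterson: build dangling-suffix sets; the code fails to be
    uniquely decodable iff some dangling suffix is itself a codeword."""
    code = set(code)
    def dangling(A, B):
        return {b[len(a):] for a in A for b in B if a != b and b.startswith(a)}
    S, seen = dangling(code, code), set()
    while S:
        if S & code:
            return False
        seen |= S
        S = (dangling(S, code) | dangling(code, S)) - seen
    return True

for c in ["0 10 11", "0 01 11", "0 01 10", "0 01",
          "110 01 10 11", "00 01 10 11", "0 00 000 0000"]:
    cw = c.split()
    print(f"{cw}: prefix = {is_prefix_code(cw)}, UD = {is_uniquely_decodable(cw)}")
```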
  4. Let X be a random variable with N equiprobable outcomes.
    a) Describe the optimal binary prefix code for this source (i.e. determine the codeword lengths), and compute the average codeword length L(C).
    b) Provide an optimal code for N = 9.
    c) For what values of N does L(C) = H(X)?
  5. Find an optimal binary prefix code for the infinite source with probabilities
    p = {.9, .09, .009, .0009, … }
  6. Consider two discrete memoryless sources. Source 1 has six symbols with probabilities 0.3, 0.2, 0.15, 0.15, 0.1, and 0.1. Source 2 has seven symbols with probabilities 0.3, 0.25, 0.15, 0.1, 0.1, 0.05, and 0.05. Construct a binary Huffman code and a ternary Huffman code for each source. Find the efficiency of each code. Are these codes unique in terms of the codeword lengths?
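A sketch of the binary Huffman construction for Problem 6 (codeword lengths and efficiency only; the ternary construction requires padding with dummy symbols, which is omitted here):

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Binary Huffman codeword lengths: every merge of two subtrees adds one
    bit to the codeword of each symbol inside them."""
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths, counter = [0] * len(probs), len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

for probs in ([0.3, 0.2, 0.15, 0.15, 0.1, 0.1],
              [0.3, 0.25, 0.15, 0.1, 0.1, 0.05, 0.05]):
    L = huffman_lengths(probs)
    avg = sum(p * l for p, l in zip(probs, L))
    H = -sum(p * log2(p) for p in probs)
    print(f"lengths = {L}, L = {avg:.3f}, H = {H:.3f}, efficiency = {H/avg:.4f}")
```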
  7. For Source 1 in Problem 6, construct an optimal binary code (with minimum average codeword length) that satisfies the suffix condition. The suffix condition is that no codeword is the suffix of another codeword. Show that a code which satisfies the suffix condition is always uniquely decodable. Show that the minimum average codeword length for a code satisfying the suffix condition for a given source is the same as the average codeword length for the corresponding Huffman code.
Assignment 4 - Due November 21, 2022 (Solutions)
  1. Consider a discrete memoryless source X with four symbols {a, b, c, d} with probabilities p(a) = .1, p(b) = .3, p(c) = .5, p(d) = .1.
    (a) Determine the Tunstall code for this source with binary codewords of length 4. Assume a reserved codeword is not required. Compute the average bit rate and the code efficiency.
    (b) Determine the corresponding Huffman code for this source with sourcewords of length N=2 and compare the efficiency of the Huffman and Tunstall codes.
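A sketch of one common Tunstall construction for Problem 1(a), assuming the textbook rule of repeatedly expanding the most probable parse word until no further expansion fits in 2^L codewords (the course notes' variant may differ in details):

```python
import heapq
from math import log2

def tunstall_words(probs, symbols, L):
    """Grow the Tunstall parse tree: expand the most probable leaf into its
    |X| children while the leaf count stays within 2**L codewords."""
    max_words = 2 ** L
    heap = [(-p, s) for p, s in zip(probs, symbols)]   # max-heap via negation
    heapq.heapify(heap)
    n = len(symbols)
    while n + len(symbols) - 1 <= max_words:
        p, w = heapq.heappop(heap)                     # most probable leaf
        for q, s in zip(probs, symbols):
            heapq.heappush(heap, (p * q, w + s))       # p < 0 keeps the sign
        n += len(symbols) - 1
    return [(w, -p) for p, w in sorted(heap)]

probs = [0.1, 0.3, 0.5, 0.1]
words = tunstall_words(probs, "abcd", L=4)
avg_len = sum(p * len(w) for w, p in words)
H = -sum(p * log2(p) for p in probs)
print(f"{len(words)} sourcewords, E[sourceword length] = {avg_len:.4f}")
print(f"rate = {4 / avg_len:.4f} bits/symbol, efficiency = {H * avg_len / 4:.4f}")
```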
  2. A binary source produces statistically independent bits with p(x1) = 0.005 and p(x2) = 0.995, where x1 denotes a 1 and x2 denotes a 0. These bits are taken 100 at a time and a binary codeword is provided only for every sequence of 100 bits containing 3 or fewer 1's.
    (a) If the codewords all have the same length, find the smallest length required for the set of codewords.
    (b) What is the probability of obtaining a source sequence for which no codeword exists?
    (c) Using the Weak Law of Large Numbers, provide a bound on the probability of obtaining a sequence for which no codeword is available (Equation (2.6) in the course notes may be useful). Compare this result with that in part (b).
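Parts (a) and (b) of Problem 2 reduce to counting the sequences with three or fewer 1's and evaluating a binomial tail; a minimal check:

```python
from math import comb, ceil, log2

n, p1 = 100, 0.005
count = sum(comb(n, k) for k in range(4))   # sequences with <= 3 ones
length = ceil(log2(count))                  # smallest fixed codeword length

# Probability that a source sequence has no codeword (more than 3 ones).
p_fail = 1 - sum(comb(n, k) * p1**k * (1 - p1)**(n - k) for k in range(4))

print(f"{count} sequences covered, codeword length = {length} bits")
print(f"P(no codeword) = {p_fail:.6f}")
```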
  3. Course Notes Problem 2.2.
  4. A binary source produces statistically independent bits with p(x1) = 0.88 and p(x2) = 0.12.
    (a) Design a fixed length source compaction code for this source with codeword length L=5. The code should be chosen to provide good compression (high rate R), with a small block decoding failure probability.
    (b) What is the probability of an atypical sourceword?
    (c) What is the actual code rate R?
  5. A source has symbols x1, x2 and x3 with probabilities 0.6, 0.3 and 0.1, respectively. This source is to be encoded with an arithmetic code with the intervals assigned so that the larger probabilities are on the left (x1, x2, x3). Find the smallest possible binary codeword corresponding to the symbol sequence x1 x3 x2 x3 x1 x1 x2 x1. Compare the length of this codeword with that of the corresponding Huffman codewords and with the expected codeword length L(C).
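A sketch of the interval-narrowing step of arithmetic encoding for Problem 5, in exact rational arithmetic; choosing the shortest binary codeword inside the final interval (per the convention in the course notes) is left to the solution:

```python
from fractions import Fraction as F
from math import log2

# Intervals assigned left to right in order of decreasing probability.
probs = {'x1': F(6, 10), 'x2': F(3, 10), 'x3': F(1, 10)}
order = ['x1', 'x2', 'x3']

def encode_interval(seq):
    """Narrow [low, low + width) once per symbol."""
    low, width = F(0), F(1)
    for sym in seq:
        low += width * sum(probs[s] for s in order[:order.index(sym)])
        width *= probs[sym]
    return low, width

low, width = encode_interval(['x1','x3','x2','x3','x1','x1','x2','x1'])
print(f"interval = [{float(low):.8f}, {float(low + width):.8f})")
# Any binary fraction inside the interval identifies the sequence; roughly
# ceil(-log2 width) + 1 bits always suffice.
print(f"-log2(width) = {-log2(width):.3f} bits")
```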
  6. A source has symbols x1, x2 and x3 with probabilities 0.8, 0.02 and 0.18, respectively.
    (a) Decode the arithmetic code sequence 1000011 for these probabilities. Assume the symbols are arranged in numeric order (x1,x2,x3) in the range [0,1) and the number of encoded symbols is 9.
    (b) Compare this codeword length with the length suggested by the source entropy H(X) and with the expected codeword length.
  7. Let X1, X2, … be i.i.d. random variables with pdf p(X). Find the limit as N → ∞ of (p(X1, X2, …, XN))^(1/N).
Assignment 5 - Due December 5, 2022 (Solutions)
  1. Using the Lempel-Ziv algorithm in the Course Notes, encode the binary sequence 0001 1010 0010 1000 1100 0100 0011 1001 0000. Begin with 4-bit codewords and expand to 5 bits if necessary.
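For Problem 1, a sketch of LZ78-style parsing (the course-notes variant may differ in dictionary initialization and in how codewords are emitted, so this only illustrates the phrase-splitting step):

```python
def lz78_parse(bits):
    """LZ78 parsing: each phrase is the longest previously seen phrase plus
    one new bit; emit (dictionary index of the prefix, new bit)."""
    dictionary = {'': 0}
    phrases, cur = [], ''
    for b in bits:
        if cur + b in dictionary:
            cur += b                             # extend the current match
        else:
            phrases.append((dictionary[cur], b))
            dictionary[cur + b] = len(dictionary)
            cur = ''
    if cur:                                      # unfinished phrase at the end
        phrases.append((dictionary[cur], ''))
    return phrases

seq = "000110100010100011000100001110010000"    # the sequence above, unspaced
print(lz78_parse(seq))
```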
  2. Consider the Lempel-Ziv algorithm in the Course Notes with 5-bit codewords and S1 = 0 and S2 = 1, i.e. the source is binary. Decode the following codeword sequence
    00101 00100 00110 01011 01100 00111 01001 10001 01101 10000 10101
    What can you say about the probability distribution of this source?
  3. A discrete memoryless channel (DMC) has transition probability matrix

    | .85 .05 |
    | .10 .10 |
    | .05 .85 |

    a) Determine the capacity of this channel.
    b) The middle row of the transition probability matrix can be considered the probabilities of an erasure. Suppose an erasure is equally likely to be correct or in error. What is the capacity of the corresponding channel without erasures?
    c) Suppose that the errors are all erasures. What is the capacity of the corresponding channel?
    d) Compare the capacities of the three channels.
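The capacity in part (a) can be checked numerically with the Blahut-Arimoto algorithm. The sketch below reads the matrix columns as the two inputs (each column sums to 1) and the rows as the three outputs:

```python
import numpy as np

def blahut_arimoto(Q, iters=2000):
    """Capacity (bits/use) of a DMC with Q[x, y] = P(y | x); assumes Q > 0."""
    p = np.full(Q.shape[0], 1 / Q.shape[0])       # start from uniform inputs
    for _ in range(iters):
        q = p @ Q                                 # induced output distribution
        D = np.sum(Q * np.log2(Q / q), axis=1)    # D(Q[x,:] || q) per input x
        p *= 2.0 ** D                             # multiplicative update
        p /= p.sum()
    q = p @ Q
    D = np.sum(Q * np.log2(Q / q), axis=1)
    return p @ D, p

Q = np.array([[.85, .05],
              [.10, .10],
              [.05, .85]]).T                      # transpose to get Q[x, y]
C, p = blahut_arimoto(Q)
print(f"C = {C:.4f} bits/use at input distribution {p.round(4)}")
```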
  4. Course Notes Problem 3.2.
  5. Course Notes Problem 3.3, but use N=15.
  6. Course Notes Problem 3.4.
  7. Consider a quaternary code with codewords of length N = 13 and M = 1024 codewords. This code can correct three or fewer errors. Determine the code rate R in bits. Calculate the probability of error Pe if this code is used over a quaternary symmetric channel (QSC) with
    (a) p = 0.1
    (b) p = 0.01
    (c) p = 0.001
    (d) p = 0.0001
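For Problem 7, with bounded-distance decoding the block error probability is the binomial tail P(more than 3 of the 13 symbols in error). The sketch below takes p as the total probability that a quaternary symbol is received incorrectly; if p instead denotes the transition probability to each of the three wrong symbols, replace p by 3p.

```python
from math import comb

N, M, t = 13, 1024, 3
R = (M.bit_length() - 1) / N                 # log2(1024) / 13 bits per symbol
print(f"R = {R:.4f} bits/symbol")

for p in (0.1, 0.01, 0.001, 0.0001):
    ok = sum(comb(N, i) * p**i * (1 - p)**(N - i) for i in range(t + 1))
    print(f"p = {p}: Pe = {1 - ok:.3e}")     # P(more than t symbol errors)
```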