## Permuted

As you may have noticed, I have permuted the list of topics as described below. I have also made this change on the course website.

## Possible permutation

I am considering permuting the lecture plan a bit, so we cover the F_{2} algorithm and Johnson-Lindenstrauss next, and then return to the MEDIAN and SELECTION problems afterwards. Just a heads-up.

## Lecture notes

The first set of lecture notes, covering the basics and the first algorithm we saw (Misra-Gries; scanned PDF here) is now ready. This is also linked from the course website. I will add to these notes soon, to include the Flajolet-Martin and AMS algorithms for DISTINCT-ELEMENTS. Also, Radhika and Andrew will add to the notes to include Count-Min Sketch, CountSketch and the BJKST algorithm.

I’ll announce each significant update to the lecture notes on this blog.

Please notify me, or write a comment here, if you spot an error or a typo in the notes, or wish to say something related to all this. Let this blog be at least somewhat informal!

## The first homework set

It is ready at last! I have posted it to the course website (scroll down all the way). Future homework sets will be announced here as and when they get posted.

Problem 2 on this set refers to the finite field . If you are unfamiliar with this object, read this Wikipedia article. That already gives you enough information for now. If you want to learn more about finite fields, stop by my office sometime or pick up a suitable Mathematics textbook.

## Probability theory reference

By now, we have introduced the idea of randomized algorithms, and to analyse their performance, we have used some basic probability theory. To recap, the important concepts are:

- Probability spaces, events, union bound
- Random variables, expectation, linearity of expectation
- Independent events, independent random variables
- Variance, standard deviation

Most importantly, we have introduced the notion of *large deviation* inequalities, or *tail inequalities*. Briefly, these inequalities assert that a certain random variable is not too likely to deviate much from its expected value. The important inequalities (so far) are:

- Markov’s inequality: If $X \ge 0$ is a random variable and $t > 0$ a constant, then
$\Pr[X \ge t] \le \mathbb{E}[X]/t$.
- Chebyshev’s inequality: If $X$ is a random variable with $\mathbb{E}[X] = \mu$ and $\operatorname{Var}[X] = \sigma^2$, then
$\Pr[|X - \mu| \ge t] \le \sigma^2/t^2$.
- Chernoff bound: If $X_1, \ldots, X_n$ are independent and identically distributed indicator random variables, with $X = \sum_{i=1}^n X_i$ and $\mu = \mathbb{E}[X]$, then for $0 < \delta < 1$,
$\Pr[X \ge (1+\delta)\mu] \le e^{-\mu\delta^2/3}$; $\Pr[X \le (1-\delta)\mu] \le e^{-\mu\delta^2/2}$.

Going forward, it will be vital to have these inequalities at your fingertips, ready for use. An excellent reference covering all of this material is the early chapters of the book *Randomized Algorithms*, by Motwani and Raghavan.

## Hello, world

I shall be posting information and updates regarding the course “CS 85/185: Data Stream Algorithms” to this blog. Students registered for the course are required to keep up with it, because important announcements will be made on it from time to time. Apart from announcements, I will try to use this blog to supplement the content of the lectures. Use the comments to initiate discussions on course topics. All students (including auditors) are welcome to comment.

In case something (such as commenting) does not work right, please bear with me; I am new to WordPress. You can always contact me via good old email.
