## A quick survey of graph mining

We have now seen a number of algorithms for processing “graph streams.” This quick survey by McGregor is a great resource collecting together many state-of-the-art results (as of 2008). Thanks to Joe Cooley for bringing this to my attention.

## Next homework

The next homework set has been posted to the course website. Four problems as usual. This set builds on quite a few of the techniques we have learnt so far, so there’s going to be considerable work involved. Those taking the course for credit are urged to begin working on these problems right away.

## Notes further updated

The lecture notes have been updated some more. Thanks to all who have provided me timely scribe notes. Please read what other scribes have written and if this gives you ideas for improving your own notes, please send me a new version of your notes. It would be most welcome.

## Homework review

I plan to spend a good portion of the Nov 3 lecture discussing the homework, including some of HW1. Non-creditors might want to arrive mid-way into tomorrow’s class. We won’t start the planned material for the class (algorithms for matching) until 11:05am.

## LaTeX reference

No matter what your level of expertise with LaTeX, I think you will find this wikibook to be a useful reference or guide. Scribes, especially, take note!

## References galore

The last few lectures have covered algorithms for which the best reference remains the original research paper describing the algorithm. Here are the relevant pointers, collected. If I have forgotten to add a reference for a result recently covered in class, please notify me and I’ll update this post.

- The multi-pass exact median algorithm is due to Munro and Paterson.
- The approximate median algorithm for randomly ordered streams is due to Guha and McGregor.
- The quantiles algorithm is due to Greenwald and Khanna.
- The one-pass algorithm for empirical entropy is due to Chakrabarti, Cormode and McGregor.
- State-of-the-art results for spanner construction were obtained by Elkin.

For future work (including homework problems), do not be scared to dig into these papers, if you need to understand one of these algorithms at a deeper level than we managed to cover in class.

## New homework is up

Homework 2 is ready and is now available on the course website. This one is 100% about designing algorithms, and it is designed to get you thinking more deeply about what you have learned so far and coming up with creative solutions.

Start brainstorming ASAP, for these problems will not be solved in one sitting. You’ll need to sleep over them a couple of times and let the necessary ideas come to you over time.

## X-hour update

Just a heads-up that we will be using the X-hour, 3:00-4:10pm, next Wednesday (Oct 28). Also, Amit will be out of town next Tuesday (Oct 27), so we will not have regular class. Instead, you can use the opportunity to discuss HW2 with Chrisil.

## Reference for today’s lecture

Today’s lecture was more technically heavy than usual. I hope it didn’t bring back any “calculus nightmares” for anyone! In any case, here is Indyk’s paper, which includes the algorithms and ideas we discussed today, plus quite a bit more.

We’re not quite done discussing all the higher-level issues raised by this, and some other recent algorithms we have seen. Next Tuesday, I want to spend approximately the first half discussing these things, and then the second half talking about the MEDIAN problem. As you’ll note, we’re continuing to tweak the lecture plan slightly as we go. For instance, today we ended up discussing stable distributions earlier than planned.

## More lecture notes

I have added a new chapter to the lecture notes, which covers the Flajolet-Martin DISTINCT-ELEMENTS algorithm. I also made small tweaks to the earlier parts. As always, the link from the course website will get you to the latest version of the notes.

Please notify me if you find any typos or errors in the notes. With 20-odd eagle-eyed readers reading these, it would be a shame if something got away!

The LaTeX sources to these notes have been distributed (by email) to all registered students, who can use them while scribing. If you are a registered student and did not get my email about this, please notify me at once.

leave a comment