Current Forum: Homework 5 - Parts 1 and 2 |
Date: Thu Nov 8 2001 3:15 pm |
Author: White, David <white3@andrew.cmu.edu> |
Subject: part 2, detecting visited pages efficiently... |
|
|
So how concerned to we need to be with the efficiency of our code in making sure we don't visit the same site twice.
For instance, if you visit a page, you can store it in some vector list, then if from some other page you get linked to that same page, you can traverse that list to see if it has been visited already. The downside to this is that this is O(N^2). But if N is small ~100 it shouldn't be too bad. If N can be very big (ie. we're going public, major IPO with this websearch) we need to devise a better mechanism...hashtable?
If we can't use the O(N^2) approach, can we use the hashtable pre-defined in Java and the hash function pre-built for URL's?
Thanks
|
|