Discussion Board
Current Forum: Homework 5 - Part 3
Date: Sat Nov 17 2001 1:59 am
Author: Bortz, Andrew S. <abortz@andrew.cmu.edu>
Subject: Re: Saving and Restoring from disk takes very long

I get exactly 229,353 bytes for my index on a 100-page crawl starting at http://www.cmu.edu. That does not include a graph structure, although that wouldn't take up much space. The crawl does get the full 100 pages (as in, it doesn't end early). I personally don't see 229k as being small at all. In fact, there are still more ways I could optimize the data output.

5MB (or even 2MB) _does_ seem a bit excessive to me when you figure the _raw_ data being read in is 359,194 bytes on this particular crawl. We throw out a lot of the raw HTML, so a data structure that takes that much more space than the original pages seems grossly inefficient.
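
To put numbers on that comparison, here is a quick sketch of the arithmetic in Python. The byte counts are the ones quoted above; the helper name `ratio_to_raw` is just made up for illustration, and I'm assuming "MB" means 2**20 bytes, which nobody in the thread actually specified.

```python
# Sizes quoted in the post, in bytes.
RAW_BYTES = 359_194    # raw HTML read in during the 100-page crawl
INDEX_BYTES = 229_353  # size of the saved index for that crawl

# Assumption: "MB" means 2**20 bytes; the thread does not say which convention.
MB = 2**20


def ratio_to_raw(size_bytes: int) -> float:
    """Return the size of a saved file as a multiple of the raw crawl data."""
    return size_bytes / RAW_BYTES


if __name__ == "__main__":
    candidates = [
        ("229k index", INDEX_BYTES),
        ("2MB index", 2 * MB),
        ("5MB index", 5 * MB),
    ]
    for label, size in candidates:
        print(f"{label}: {ratio_to_raw(size):.2f}x the raw data")
```

Under those assumptions the 229k index comes out to roughly two thirds of the raw data, while a 2MB file is nearly 6 times the raw data and a 5MB file is over 14 times.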

Current Thread Detail:
Saving and Restoring from disk takes v...      Brands, Marc C.      Thu Nov 15 2001 7:25 pm       
Re: Saving and Restoring from disk ...      Liu, Limin Angela      Fri Nov 16 2001 4:45 pm       
Re: Saving and Restoring from di...      Brands, Marc C.      Fri Nov 16 2001 5:30 pm       
Re: Saving and Restoring from di...      Brands, Marc C.      Fri Nov 16 2001 6:21 pm       
Re: Saving and Restoring from disk ...      Bortz, Andrew S.      Fri Nov 16 2001 8:58 pm       
Re: Saving and Restoring from di...      Goodman, Brian J.      Sat Nov 17 2001 1:19 am       
Re: Saving and Restoring from...      Bortz, Andrew S.      Sat Nov 17 2001 1:59 am       
Re: Saving and Restoring f...      Goodman, Brian J.      Sat Nov 17 2001 2:35 am       
Re: Saving and Restorin...      Bortz, Andrew S.      Sat Nov 17 2001 2:42 am       
Re: Saving and Restoring from disk ...      Liu, Limin Angela      Sat Nov 17 2001 3:28 pm       
Re: Saving and Restoring from di...      Batra, Rohan      Mon Nov 19 2001 5:52 am       
Re: Saving and Restoring from...      Liu, Limin Angela      Mon Nov 19 2001 12:16 pm       
Re: Saving and Restoring f...      Batra, Rohan      Mon Nov 19 2001 8:43 pm       
Re: Saving and Restorin...      Liu, Limin Angela      Mon Nov 19 2001 8:54 pm       
Re: Saving and Restoring from di...      White, David      Mon Nov 19 2001 5:16 pm       
Re: Saving and Restoring from...      Liu, Limin Angela      Mon Nov 19 2001 8:01 pm       
Re: Saving and Restoring from...      Liu, Limin Angela      Mon Nov 19 2001 8:18 pm       
