ok, i'm really sorry if i'm bothering anybody by posting (sometimes overly sarcastic) messages to the message board so much. if the TAs get these new posts emailed to them... sorry.
i got rid of the out-of-memory errors. turns out i'm stupid. but i have another question about this.
here's the thing, though. right now i'm using a TreeSet to store the line numbers because it automatically sorts and removes duplicates. a TreeSet requires more memory than a Vector, and the only way for me to parse the entire file with no errors is to use Vectors instead of TreeSets.
i realize that i can write my own sorting code, but is it ok to assume that, since the parser reads line numbers starting from the beginning of the file, the words will be added to the trie with their line numbers already in order?
also, is it ok to have duplicate line numbers in the Vector? i think the homework writeup said no duplicate line numbers, but even if there are duplicates, it doesn't change the results of search().
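just to make it concrete, here's roughly the kind of thing i mean. this is a sketch under my own assumption, not the actual assignment code, and the names (LineNumberList, addLineNumber) are made up:

```java
import java.util.Vector;

// hypothetical sketch: if the parser scans the file from the top, line
// numbers arrive in nondecreasing order, so a plain Vector stays sorted
// if we just append -- and duplicates can be skipped by comparing against
// the last element, without paying TreeSet's per-node overhead.
public class LineNumberList {
    private final Vector<Integer> lines = new Vector<>();

    // called once per occurrence of a word, in file order
    public void addLineNumber(int line) {
        // same word appearing twice on one line: skip the duplicate
        if (!lines.isEmpty() && lines.lastElement() == line) {
            return;
        }
        lines.add(line); // stays sorted because input is nondecreasing
    }

    public Vector<Integer> getLines() {
        return lines;
    }
}
```

of course this only works if the assumption in my question holds, i.e. the parser really does hand out line numbers in file order.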
alternatively, can i just use TreeSets and assume that the files won't be as big as the test file you gave us?
this seems like a reasonable request: in a real application, it doesn't seem like anybody would write a program like this in java, given these memory (and speed) limitations. the course is called "fundamental algorithms and data structures" or something like that, so even if the file is small, a working implementation still shows our understanding of the algorithms involved.
thanks,
jason