I want to make a program that can search for a keyword and tell me what textfile has this keyword.
I have a bunch of docs:
doc1.txt, doc2.txt, doc3.txt
instead of inserting the whole content in the database as we usually do, i thought of this:
1. create a table 'keywords' such:
autonumber: id
char(20): keyword
2.write a program that extracts all the keywords(new keywords) from a given input and spit them in the keywords table
3.make another table 'textfiles'
autonumber: id
text: description
datetime: docdate
4. make another table 'textkeywords'
autonumber: id
number: textfile
number: keyword
5. make an interface to define textfiles (insert them into 'textfiles' tbl)
6.then parse all keywords (if any new keywords found, insert them into keywords table), otherwise add records into the textkeywords table that links a 'textfile' to a specific 'keyword'
now if i want to search for a keyword, I only select from the keywords table for the IDs of the matched keywords,
then query the textkeywords.keyword and pull all the textkeywords.textfile then query textfile.id to get full textfile information...
it might seem a hassle, but it is simple to implement...
but i was wondering if it is a good/fast way as i'm thinking.
all suggestions are welcome.