I have made the following observations as a to-do list for everyone. These are the things we need to work on.
-directory structures for unix/windows/dos
the program needs to create output directories when they do not exist yet
make this more bulletproof; can we find an example of this kind of parser?
-integration of preprocessing after file read
need to run the gcc preprocessor first with comments left in ("gcc -E -C"; -C is a preprocessor option, so it goes together with -E)
see: http://www.dis.com/gnu/gcc/gcc_14.html:
-C
Do not discard comments. All comments are passed through to the output file, except for comments in processed directives, which are deleted along with the directive.
You should be prepared for side effects when using `-C'; it causes the preprocessor to treat comments as tokens in their own right. For example, comments appearing at the start of what would be a directive line have the effect of turning that line into an ordinary source line, since the first token on the line is no longer a `#'.
-parser quirks
bulletproof read from bracket start to bracket end
data structure creation (vector of vectors?)
-data structure stop word extraction
methods for addition and deletion
-integration!
it will take one week, for sure!
Right now, we have minimal file parsing, very minimal function breakup, good file I/O, and WordNet functionality.
If you have any feedback, please post comments.
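For the "bulletproof read from bracket start to bracket end" item, one possible approach is a small scanner that counts braces while skipping comments, string literals, and character literals, so a '}' inside any of those does not end the function early. This is only a sketch in Java (assumed here to be the project's language). BraceMatcher and findClosingBrace are hypothetical names, not part of the existing code.

```java
// Hypothetical helper for illustration; not taken from the existing parser.
final class BraceMatcher {

    /**
     * Returns the index of the '}' that matches the '{' at openIndex,
     * or -1 if the text ends before the block is closed.
     */
    static int findClosingBrace(String src, int openIndex) {
        if (openIndex < 0 || openIndex >= src.length() || src.charAt(openIndex) != '{') {
            throw new IllegalArgumentException("openIndex must point at '{'");
        }
        int n = src.length();
        int depth = 0;
        int i = openIndex;
        while (i < n) {
            char c = src.charAt(i);
            if (c == '/' && i + 1 < n && src.charAt(i + 1) == '*') {
                // Block comment: jump past the closing "*/".
                int end = src.indexOf("*/", i + 2);
                if (end < 0) return -1;
                i = end + 2;
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '/') {
                // Line comment: jump past the end of the line.
                int end = src.indexOf('\n', i + 2);
                if (end < 0) return -1;
                i = end + 1;
            } else if (c == '"' || c == '\'') {
                // String or character literal: braces inside do not count.
                i = skipLiteral(src, i);
                if (i < 0) return -1;
            } else {
                if (c == '{') {
                    depth++;
                } else if (c == '}') {
                    depth--;
                    if (depth == 0) return i;
                }
                i++;
            }
        }
        return -1; // unbalanced input
    }

    /** Returns the index just after the closing quote, or -1 if unterminated. */
    private static int skipLiteral(String src, int start) {
        char quote = src.charAt(start);
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\\') {
                i += 2;          // skip the escaped character
            } else if (c == quote) {
                return i + 1;
            } else {
                i++;
            }
        }
        return -1;
    }
}
```

The sketch does not handle braces that are unbalanced by preprocessor conditionals (#ifdef branches) or line comments continued with a trailing backslash, so those cases would still need separate treatment.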
Hi there folks, well, I am done with the word lookup in WordNet. OpenUrl() takes in a string, which is the word to be looked up on WordNet, and returns true or false depending on whether it is a word or not. It is as simple as that! Test it for yourselves; just erase the main when integrating it with the rest of the code!
Well, here is the code that opens the URL from .net, enters a word, and retrieves the results into a file. I still need to do the file writing of the retrieved result and the parsing of it. I should be done tomorrow, so check for it then.
All you guys have to do is say OpenUrl("word to be retrieved"); and it does the rest. See ya.
This is everything. It loads files, it parses them, it adds the comments.
An explanation of the parser:
1. First parseFile(File, File) is called. This opens the files... but the input file is actually completely read into the string fileString. fileIndex is the pointer to where the simulated reader is (you'll have to trace the code to get a feel for it... there are functions readChar, prevewChar, readString to help you).
2. Next dataMine() is called. This first reads in any pre-comments the file may have (comments at the top of the code before any other code). Then it loops through all the functions and reads in comments. Look at the FunctionData class to see how this is stored. The comments before and inside the function are added to the FunctionData (the name of the function is discovered in the middle of this, and after that it uses braces to find the end of the function).
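To make the "simulated reader" idea concrete, here is a rough sketch of how a whole-file string plus an index can stand in for a stream, and how the leading comments might be collected before any code. The names fileString, fileIndex, readChar, prevewChar, and readString come from the description above, but their signatures and bodies here are guesses. SimulatedReader and readPreComments are hypothetical names, and the real parser may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction for illustration; the real signatures may differ.
final class SimulatedReader {
    private final String fileString; // whole input file held in memory
    private int fileIndex = 0;       // position of the next unread character

    SimulatedReader(String contents) {
        this.fileString = contents;
    }

    boolean atEnd() {
        return fileIndex >= fileString.length();
    }

    /** Looks at the next character without consuming it; -1 at end of input. */
    int prevewChar() { // spelled as in the notes above
        return atEnd() ? -1 : (int) fileString.charAt(fileIndex);
    }

    /** Consumes and returns the next character; -1 at end of input. */
    int readChar() {
        return atEnd() ? -1 : (int) fileString.charAt(fileIndex++);
    }

    /** Assumed behaviour: consumes up to count characters and returns them. */
    String readString(int count) {
        int n = Math.min(Math.max(count, 0), fileString.length() - fileIndex);
        String s = fileString.substring(fileIndex, fileIndex + n);
        fileIndex += n;
        return s;
    }

    /** Collects the comments at the top of the file, stopping at the first code. */
    List<String> readPreComments() {
        List<String> comments = new ArrayList<>();
        while (true) {
            while (!atEnd() && Character.isWhitespace(fileString.charAt(fileIndex))) {
                fileIndex++;
            }
            int end;
            if (fileString.startsWith("/*", fileIndex)) {
                end = fileString.indexOf("*/", fileIndex + 2);
                end = (end < 0) ? fileString.length() : end + 2;
            } else if (fileString.startsWith("//", fileIndex)) {
                end = fileString.indexOf('\n', fileIndex);
                if (end < 0) end = fileString.length();
            } else {
                return comments; // first non-comment text reached
            }
            comments.add(fileString.substring(fileIndex, end));
            fileIndex = end;
        }
    }
}
```

Keeping the whole file in one string makes lookahead and backtracking cheap, since moving the reader is just a change to fileIndex.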
Please trace through the code to see what is happening. Included is fork.c, from the Linux kernel, which I used to test it out.
This is the newest source code for loading files. It works very well now. Right now the code loads all of the files from a directory AND its subdirectories, prints out the first line of text in the file, and opens an output file in the output directory. The program only reads '.c' files. It also hangs up when trying to open the output file in a SUB-directory, but I'll get to that later. It works fine if everything is just in one directory.
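One possible way to handle output files in subdirectories is to mirror the input tree under the output directory and create any missing parent directories before opening the file. The notes do not say why the program hangs, so a missing directory is only a guess at the cause. The sketch below uses java.nio.file, which may be newer than what the project uses, and SourceTree, listCFiles, and prepareOutput are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration; not the project's actual loader.
final class SourceTree {

    /** Lists every regular file ending in ".c" under inputDir, including subdirectories. */
    static List<Path> listCFiles(Path inputDir) throws IOException {
        try (Stream<Path> paths = Files.walk(inputDir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".c"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /**
     * Maps a source file to the same relative location under outputDir and
     * creates the parent directories there if they do not exist yet.
     */
    static Path prepareOutput(Path inputDir, Path outputDir, Path source) throws IOException {
        Path target = outputDir.resolve(inputDir.relativize(source));
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // no error if it already exists
        }
        return target;
    }
}
```

Because the paths are built with Path.resolve and relativize rather than by joining strings with a fixed separator, the same code should work on both Unix and Windows directory layouts.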
The Parser class contains only static methods. The driver class (which holds the options from the command-line arguments) is also static and is accessed through SCDM.driver.