How this glossary works internally


First, the work done offline, away from the Web:

Years back I was a developer on a DOS program called Q&A, an easy-to-use database program that can store large fields of text. The glossary's content is kept there, where it can be easily edited, searched, sorted, and so forth. There are fields in that database for term, definition, source [of definition], context, and a couple of others. In a definition I can put HTML codes for underlining, italics, tables, or hyperlinks, though Q&A shows them as raw codes rather than as actual underlining or italics. A definition can also contain my own special codes for particular kinds of hyperlinks, such as links to other entries in the glossary, to a bibliography, or to lists generated at runtime (like the list you see if you enter 'finance' into the search box).
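
To make the idea of a special code concrete, here is a minimal sketch in C of how a cross-reference code might be turned into an ordinary hyperlink. The `{see:term}` syntax, the function name `expand_xrefs`, and the `glossary.cgi?term=` URL pattern are all invented for illustration; the glossary's actual codes and URLs are not shown on this page.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical marker syntax: {see:TERM}. The real codes may look different. */
#define XREF_OPEN "{see:"

/* Copies `in` to `out`, replacing each {see:TERM} with an HTML link.
   Returns 0 on success, -1 if `out` is too small.
   Note: TERM is not URL-encoded here, so this sketch only suits
   single-word terms. */
int expand_xrefs(const char *in, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0) return -1;
    out[0] = '\0';

    while (*in) {
        const char *close;
        if (strncmp(in, XREF_OPEN, strlen(XREF_OPEN)) == 0 &&
            (close = strchr(in, '}')) != NULL) {
            const char *term = in + strlen(XREF_OPEN);
            int len = (int)(close - term);
            /* "glossary.cgi?term=" is an assumed URL pattern. */
            int n = snprintf(out + used, outsize - used,
                             "<a href=\"glossary.cgi?term=%.*s\">%.*s</a>",
                             len, term, len, term);
            if (n < 0 || (size_t)n >= outsize - used) return -1;
            used += (size_t)n;
            in = close + 1;              /* continue after the '}' */
        } else {
            if (used + 1 >= outsize) return -1;
            out[used++] = *in++;         /* ordinary character: copy as-is */
            out[used] = '\0';
        }
    }
    return 0;
}
```

Called on a definition containing `{see:elasticity}`, this would write the same text with the code replaced by an `<a href=...>` link to that term.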

The records in the database can be printed out to a file, econtent.dat. That output file is the one the live (CGI) program consults when it receives a query, and one could create such a file without involving Q&A at all. The file is in something like HTML, but it also contains those special codes for bibliographic references and so forth.

I have another program that reads that file and strips out my special codes to make a straight HTML version of the glossary, which is what one sees on pressing the button labeled 'Content' on the first screen. (I have the idea that I'll someday make a printed version from that.) That program also alerts me if I have made a syntax error in the HTML, which I often do, since I edit it in a straight DOS environment.
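
The kind of HTML syntax check mentioned above could be sketched roughly as follows. This is only an illustration, assuming the check looks for tags that are left open or closed in the wrong order; the function name `check_tags` and the size limits are made up, and the real program may check different things.

```c
#include <ctype.h>
#include <string.h>

#define MAX_DEPTH 32   /* assumed limit on nesting depth */
#define MAX_TAG   16   /* assumed limit on tag-name length */

/* Returns 0 if every opening tag in `html` is closed in the right order,
   otherwise -1. A rough check only: attributes are skipped, comments and
   declarations (<!...>) are ignored, and <br>, <hr>, <img> need no close. */
int check_tags(const char *html)
{
    char stack[MAX_DEPTH][MAX_TAG];
    int depth = 0;
    const char *p = html;

    while ((p = strchr(p, '<')) != NULL) {
        char name[MAX_TAG];
        size_t n = 0;
        int closing = 0;
        const char *end = strchr(p, '>');

        if (end == NULL) return -1;          /* '<' with no matching '>' */
        p++;
        if (*p == '!') { p = end + 1; continue; }
        if (*p == '/') { closing = 1; p++; }

        /* Collect the lowercased tag name. */
        while (p < end && isalnum((unsigned char)*p) && n < MAX_TAG - 1)
            name[n++] = (char)tolower((unsigned char)*p++);
        name[n] = '\0';
        if (n == 0) return -1;               /* e.g. "<>" or "< i>" */

        if (closing) {
            /* A closing tag must match the most recently opened tag. */
            if (depth == 0 || strcmp(stack[depth - 1], name) != 0) return -1;
            depth--;
        } else if (strcmp(name, "br") != 0 && strcmp(name, "hr") != 0 &&
                   strcmp(name, "img") != 0) {
            if (depth == MAX_DEPTH) return -1;
            strcpy(stack[depth++], name);
        }
        p = end + 1;
    }
    return depth == 0 ? 0 : -1;              /* leftover open tags are errors */
}
```

A stack is used here because italics, underlining, and links nest; a simple count of opening and closing tags would miss mistakes such as `<u><i>x</u></i>`.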

Okay, that covers the work done offline, which is basically to make econtent.dat and an analogous file for the bibliography. Those programs run on DOS. Both the online and offline programs are written in C. People often use Perl for this sort of thing, but I know C and have done little in Perl.

The online (CGI) program searches the econtent.dat file line by line for definitions matching the user's request, handling along the way any wildcards the user entered and any complications from multiple matches. Having found the right definition, it detects any special code signalling a special hyperlink (like a reference to another term) and substitutes the right HTML for it. Then it "prints" the HTML out to the user and finishes. It runs for only about a second, then exits. The Web server starts it again when another query arrives.
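
The line-by-line search with wildcards can be sketched in C as below. This is an illustration under stated assumptions, not the actual program: the one-line-per-entry `term|definition` layout, the `*` and `?` wildcard characters, and the names `wild_match` and `search_glossary` are all hypothetical, since the real format of econtent.dat and the real wildcard syntax are not described here.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Case-insensitive match of `text` against `pat`, where '*' matches any
   run of characters (including none) and '?' matches exactly one. */
int wild_match(const char *pat, const char *text)
{
    if (*pat == '\0') return *text == '\0';
    if (*pat == '*') {
        /* Let '*' absorb 0, 1, 2, ... characters of text in turn. */
        for (;;) {
            if (wild_match(pat + 1, text)) return 1;
            if (*text == '\0') return 0;
            text++;
        }
    }
    if (*text == '\0') return 0;
    if (*pat != '?' &&
        tolower((unsigned char)*pat) != tolower((unsigned char)*text))
        return 0;
    return wild_match(pat + 1, text + 1);
}

/* Scans `fp` line by line. Each line is assumed to look like
   "term|definition" (a made-up layout). Calls `emit`, if not NULL, for
   each line whose term matches `pat`; returns the number of matches. */
int search_glossary(FILE *fp, const char *pat,
                    void (*emit)(const char *term, const char *def))
{
    char line[4096];
    int hits = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        char *def;
        char *bar = strchr(line, '|');
        if (bar == NULL) continue;           /* not an entry line: skip */
        *bar = '\0';                         /* line now holds the term */
        def = bar + 1;
        def[strcspn(def, "\r\n")] = '\0';    /* drop the line ending */
        if (wild_match(pat, line)) {
            hits++;
            if (emit != NULL) emit(line, def);
        }
    }
    return hits;
}
```

Returning the number of matches lets the caller treat the cases differently: print the single definition when there is one hit, or a list of candidate terms when there are several.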

That all may seem complicated, but there aren't very many pieces, and each piece has one job. I thought a long time about each one and how to approach it simply. I'm happy now to have most of the features a one-person glossary project needs. For a large-scale version, one would probably want different database software to store the content (e.g. so that many remote 'experts' could edit it online simultaneously), and would want searches to go through a serious database program rather than line by line. But it turns out that for 1000 definitions or so, the line-by-line search is fast enough. I see in principle how to upgrade this to a large-scale, multi-person, multi-glossary project. If anyone wants to get into this business with me, please let me know!


Feedback to econterms@econterms.com