ERL2DB(1)                                                        ERL2DB(1)

NAME
       erl2db - ERL download format to DBase import format conversion
       program.

SYNOPSIS
       erl2db [-ahdDeirRsSvV] [-E editor-path] [-o output-file] [--]
       [file...]

DESCRIPTION
       erl2db is a program to convert an Electronic Reference Library
       (ERL) download file to a DBase ASCII import file.

       The download file should be created with WinSPIRS and contain all
       fields, each field on a single line (see subsection Example ERL
       datarecord). Abbreviated field identifiers must be used (TI:, AU:
       etc.).

       The output file contains all fields of one record on a single
       line. It is suited to be read by DBase.

       This section is divided into the following subsections:
       Initialization, Options, Processing, Example ERL datarecord,
       Example erl2db output record, Example erl2db log-messages,
       Example profile, Syntax of profile, Semantics of profile and
       Program exit status.

   Initialization
       When erl2db is run, it first scans the command-line parameters.
       Then erl2db looks for a profile in the current directory. If no
       profile is found there, erl2db looks for the system-wide profile
       in the directory where the program resides. If no profile has
       been found, erl2db issues a warning message: without a profile
       that contains an output definition, no output records are
       generated. erl2db then processes the datafile(s) specified and
       outputs the converted records and statistics. When no files are
       specified, erl2db behaves as a filter program.
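The profile lookup order described under Initialization can be sketched as follows. This is a minimal illustration in Python, not the program's actual source: the function name find_profile and its parameters are hypothetical, and only the file name erl2db.pro and the search order (current directory first, then the program directory) are taken from this manual page.

```python
import os
import sys

# Profile file name as listed under FILES in this manual page.
PROFILE_NAME = "erl2db.pro"


def find_profile(current_dir, program_dir):
    """Return the path of the first profile found, or None.

    Hypothetical helper: checks the current directory first and the
    directory of the program second, as described under Initialization.
    """
    for directory in (current_dir, program_dir):
        candidate = os.path.join(directory, PROFILE_NAME)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    program_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
    profile = find_profile(os.getcwd(), program_dir)
    if profile is None:
        # Without an output definition no output records are generated.
        print("warning: no profile found", file=sys.stderr)
    else:
        print("using profile:", profile)
```

A profile in the current directory takes precedence, so a per-project profile can override the system-wide one without changing the installation.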
   Options
       erl2db can be executed with the following options:

       -a     author,
       -h     overview of options,
       -D     print debug information on stderr,
       -d     print debug information on stdout,
       -e     edit address when it contains too many fields,
       -ee    force edit of address fields,
       -i     select files specified on command line interactively,
       -R     print number of record last processed (stderr),
       -r     print number of record last processed (stdout),
       -S     print statistic message (grand total, on stderr),
       -s     print statistic message (grand total, on stdout),
       -ss    print also statistic message for each record,
       -V     print informative messages (filename, on stderr),
       -v     print informative messages (filename, on stdout),
       -vv    print also number of record ([d]) for each record
              processed,
       -vvv   print also ERL fieldname ([ll]) for each field processed,
       -vvvv  print also contents ([contents]) for each field processed,
       --     end option section,
       -E editor-path
              specify name or path of editor (implies -e),
       -o output-file
              specify name of output file.

       A %s in the argument for option -E can be used to specify the
       position of the name of the file with the address that is to be
       edited.

23 Aug 1996        Department of Biophysics, Huygens Laboratorium

   Processing
       erl2db processes one record at a time. First it scans the
       various fields and does field-specific input processing, like
       case conversion and word substitution, as specified by the
       [Capitalize], [Title] etc. sections in the profile. Then it
       writes the output record as specified by the profile [Output]
       section and the record statistics as specified by the profile
       [Statistics] section.

       Following are the most notable input processing steps.

       Title
       The title line is split into separate words. The words are
       looked up in the dictionary filled with entries from the profile
       [Title] section and replaced when found.
       Finally, words not present in the dictionary are capitalized if
       so specified in the profile [Capitalize] section. Words are
       separated by the following characters: space, tab and
       ".,:;/'!?*()[]{}<>.

       When the title line is too long according to the corresponding
       output definition, it is truncated and ends with an ellipsis
       (...).

       Authors
       The format of the author fields is changed, e.g. from BOSCH-MK
       to Bosch,M.K. The names are capitalized as specified in the
       profile [Capitalize] section. The author names can be retrieved
       in an alternate output format with the Authors_lt [Output]
       definition as: MK Bosch.

       In some instances, -(Reprint-Author) is appended to the name of
       an author. This name can be retrieved with the ReprintAuthor
       [Output] definition as: MK Bosch. If there is no such
       indication, ReprintAuthor yields the name of the first author.

       Address
       The address field is scanned up to the phrase (Reprint Address).
       The words are looked up in the dictionary filled with entries
       from the profile [Address] section and replaced when found.
       Finally, words not present in the dictionary that are longer
       than two characters are capitalized if so specified in the
       profile [Capitalize] section. Words are separated by, and do not
       contain, the following characters: space, comma and semicolon.

       Journal
       The journal name is obtained from the ERL SO: field. It is
       changed, e.g. from BIOCHEMISTRY-MOSCOW to Biochemistry Moscow.
       The words are looked up in the dictionary filled with entries
       from the profile [Journal] section and replaced when found.
       Finally, words not present in the dictionary are capitalized if
       so specified in the profile [Capitalize] section. Words are
       separated by, and do not contain, the following characters: dash
       and dot.

       Keywords
       The keywords are obtained from the ERL KW:, KA: and KP: fields.
       The words are looked up in the dictionary filled with entries
       from the profile [Keywords] section and replaced when found.
       Finally, words not present in the dictionary are capitalized if
       so specified in the profile [Capitalize] section. Words are
       separated by, and do not contain, the following characters:
       space and semicolon.

       Abstract
       The abstract line is split into words, separated by a space.
       When the abstract is too long according to the corresponding
       output definition, it is truncated and ends with an ellipsis
       (...).

       CC-Edition
       The format of the Current Contents edition is changed, e.g. from
       CC-Life-Sciences to Life Sciences. The words of the Current
       Contents edition are capitalized if so specified in the profile
       [Capitalize] section. Words are separated by, and do not
       contain, the following characters: space and dash.

   Example ERL datarecord
       An example of an ERL datarecord is shown below.

       AN: RX893-16 See Table of Contents
       TI: LACK OF BINDING COMPETITION BETWEEN DIURON AND PERFLUOROISOPROPYLDINITROBENZENE DERIVATIVES, NOVEL...
       AU: ZHARMUKHAMEDOV-SK; KLIMOV-VV; ALLAKHVERDIEV-SI
       AD: RUSSIAN ACAD SCI, INST SOIL SCI & PHOTOSYNTH, PUSHCHINO 142292 RUSSIA (Reprint Address)
       SO: BIOCHEMISTRY-MOSCOW. JUN 1995; 60 (6) : 723-728.
       PT: Article-Citation
       PY: 1995
       IS: 0006-2979
       LA: ENGLISH
       KA: PHOTOSYSTEM II; LIGHT INDUCED ELECTRON TRANSFER; ELECTRON TRANSFER INHIBITORS; DIURON; COMPETITIVE...
       KP: THYLAKOID MEMBRANE; HERBICIDE BINDING; REACTION CENTERS; FLUORESCENCE; PLASTOQUINONE; CHLOROPLASTS;...
       AB: Binding competition between Diuron and perfluoroisopropyldinitrobenzene (PFIPDNB)(3) derivatives, novel...
       JS: BIOCHEMISTRY-AND-BIOPHYSICS
       CC: CC-Life-Sciences
       RF: 32 REFS
       GA: RX893
       UD: 9603

   Example erl2db output record
       Here is the erl2db output record for the ERL datarecord shown
       above, when erl2db is used with the profile as shown in
       subsection Example profile below. Note that all information is
       contained in one line; the record is wrapped here for display
       only.

       "Zharmukhamedov","S.K.","Klimov","V.V.","Allakhverdiev","S.I.","","","","","","",
       "PHOTOSYSTEM II","LIGHT INDUCED ELECTRON TRANSFE","ELECTRON TRANSFER INHIBITORS",
       "DIURON","COMPETITIVE BINDING","THYLAKOID MEMBRANE","HERBICIDE BINDING",
       "REACTION CENTERS","FLUORESCENCE","PLASTOQUINONE",
       "Lack Of Binding Competition Between Diuron And Perfluoroisopropyldinitrobenzene Derivatives, Novel Inhibitors Of Electron Transfer In Photosystem II",
       "Biochemistry Moscow","English","J","60","723","728",
       "SK Zharmukhamedov",
       "Russian Acad Sci","Inst Soil Sci & Photosynth","Pushchino 142292 Russia","","","","",
       "1995",
       "Binding competition between Diuron and perfluoroisopropyldinitrobenzene (PFIPDNB)(3) derivatives, novel inhibitors of photosystem II (PS II), was investigated by studying their effect on the electron transfer reactions in PS II. The inhibition of PS ",
       "II reactions by the PFIPDNB derivatives was insensitive to Diuron at concentrations that exceed that of the PFIPDNB derivatives by two orders of magnitude. The lack of the functional competition between these substances indicates that the binding sit",
       "e for the PFIPDNB derivatives is different from that for Diuron, a known inhibitor of electron transfer in PS II.",
       "","","","","","","","9603"

   Example erl2db log-messages
       Here is an example of the erl2db log-messages, obtained with
       option -vvv.

       [AN][TI][AU][AD][SO][PT][PY][IS][LA][KA][KP][AB][JS][CC][RF][GA][UD][2]
       Record #2:
          title complete
          abstract complete
          journal complete
          volume complete
          publication year complete
          begin page complete
          end page complete
          language complete
          3 authors, 0 truncated, 0 skipped
          15 keyword fields, 1 truncated, 5 skipped
          3 address fields, 0 truncated, 0 skipped, not edited
       ...
       Grand total of:
          1 file processed
          72 records processed
          0 titles truncated
          2 abstracts truncated
          0 journal names truncated
          0 volume fields truncated
          0 publication years truncated
          0 begin pages truncated
          0 end pages truncated
          1 language field truncated
          273 authors, 0 truncated, 3 skipped
          810 keyword fields, 26 truncated, 162 skipped
          305 address fields, 6 truncated, 0 skipped, 0 addresses edited

   Example profile
       The profile contains information on which fields must undergo
       case conversion (be capitalized), the specific spelling and case
       of words in title, keyword, journal and address fields, the
       specification of the format of the output record and the
       specification of the statistic information that is to be printed
       (see also subsection Syntax of profile).

       Here is an example erl2db profile.

       #
       # bin/erl2db.pro - system wide profile for ERL to Dbase conversion program.
       #

       #
       # Capitalization of the following fields:
       #
       [Capitalize]
           Authors
           Title
           Journal
           Language
           Address

       #
       # Word spelling and capitalization, as used in title translation:
       #
       [Title]
           "II"  = "II"     # as in Photosystem II
           "EPR" = "EPR"    # abbreviation

       #
       # Word spelling and capitalization, as used in journal translation:
       #
       [Journal]
           "AND" = "and"
           "ET"  = "et"
           "THE" = "the"
           "OF"  = "of"

       #
       # Word spelling and capitalization, as used in address translation:
       #
       [Address]
           "POB" = "POB"    # as in POB 9504
           "USA" = "USA"    # as in NY 10032 USA

       #
       # output field format:
       #
       [Output]
           Authors         = "30, 10, 10"  # nameWidth, initWidth, #
           Number          = " "           # authorArtn = "5"
           Keywords        = "30, 10"      # width, #
           Title           = "250, 1"      # width, #
           Journal         = "100"         # width
           Language        = "7"           # width ( 3? )
           String          = "J"           # default Journal
           Volume          = "8"           # width
           BeginPage       = "6"           # width
           EndPage         = "6"           # width
           ReprintAuthor   = "30"          # width
           Address         = "30, 7"       # width, #
           PublicationYear = "4"           # width
           Abstract        = "250, 10"     # width, #
           UpdateCode      = "4"           # width

       #
       # statistic fields to print:
       #
       [Statistics]
           File
           Record
           Title
           Abstract
           Journal
           Volume
           PublicationYear
           BeginPage
           EndPage
           Language
           Authors
           Keywords
           Address

       #
       # End of file
       #

   Syntax of profile
       The syntax of the profile is shown below. The contents of the
       profile are divided into sections. The definitions of these
       sections are case-sensitive. Comments are introduced by a # and
       extend to the end of the line. Comments and whitespace are
       ignored.

       To simplify the syntax of the profile, a single set of reserved
       words is used for all sections, though not all reserved words
       are meaningful in each section (e.g. PublicationYear in section
       [Capitalize]). In these circumstances, the use of a reserved
       word is silently ignored. Use of an invalid word in a section is
       signalled as an error though.

       definition         ::= section*

       section            ::= capitalize-section
                            | title-section
                            | keywords-section
                            | journal-section
                            | address-section
                            | output-section
                            | statistics-section

       capitalize-section ::= '[Capitalize]' enumeration-entry*
       title-section      ::= '[Title]'      dictionary-entry*
       keywords-section   ::= '[Keywords]'   dictionary-entry*
       journal-section    ::= '[Journal]'    dictionary-entry*
       address-section    ::= '[Address]'    dictionary-entry*
       output-section     ::= '[Output]'     definition-entry*
       statistics-section ::= '[Statistics]' enumeration-entry*

       enumeration-entry  ::= reserved-word
       dictionary-entry   ::= string '=' string
       definition-entry   ::= reserved-word '=' string

       string             ::= '"' [printable]* '"'

       reserved-word      ::= 'Abstract'
                            | 'Address'
                            | 'Authors'
                            | 'Authors_lt'
                            | 'BeginPage'
                            | 'CC-Edition'
                            | 'EndPage'
                            | 'File'
                            | 'GenuineArticle'
                            | 'ISSN'
                            | 'Journal'
                            | 'JournalSubject'
                            | 'Keywords'
                            | 'Language'
                            | 'Number'
                            | 'PublicationType'
                            | 'PublicationYear'
                            | 'RecordType'
                            | 'Record'
                            | 'References'
                            | 'ReprintAuthor'
                            | 'String'
                            | 'Title'
                            | 'UpdateCode'
                            | 'Volume'

   Semantics of profile
       The strings at the left side of the assignment in dictionary
       entries should be single words in accordance with the
       word-splitting mechanism for that dictionary section.

       The output format definition strings in the [Output] section
       come in three versions:

       "fieldwidth, fieldwidth, number-of-fields"
              Authors (name, initials).
       "fieldwidth, number-of-fields"
              Abstract, Address, Authors_lt, Keywords, Title.
       "fieldwidth"
              all other output fields.

       The Number and String output definitions can be used to insert
       static numerical and string fields into the output record (see
       subsection Example profile).

       The File and Record reserved words are normally used in the
       [Statistics] section only. However, they can also be used in the
       [Output] section to provide the name of the file being processed
       and the number of the record respectively.
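The three versions of the output format definition string can be read as sketched below. This is an illustration in Python under stated assumptions, not the program's own parser: the function name parse_output_definition and the dictionary keys are hypothetical. Address is treated as a two-value definition because the example profile defines it as "30, 7"; Number and String are returned unchanged because they hold static values rather than widths.

```python
# Hypothetical grouping of [Output] reserved words by definition form.
THREE_VALUE = ("Authors",)
TWO_VALUE = ("Abstract", "Address", "Authors_lt", "Keywords", "Title")
STATIC = ("Number", "String")


def parse_output_definition(name, value):
    """Split an [Output] definition string into named parts.

    Returns a dict; raises ValueError when the number of values does
    not match the form expected for the given reserved word.
    """
    if name in STATIC:
        # Static fields are copied into the output record as given.
        return {"static": value}

    if name in THREE_VALUE:
        keys = ("name_width", "initials_width", "number_of_fields")
    elif name in TWO_VALUE:
        keys = ("width", "number_of_fields")
    else:
        keys = ("width",)

    parts = [int(part) for part in value.split(",")]
    if len(parts) != len(keys):
        raise ValueError(
            "%s expects %d value(s), got %d" % (name, len(keys), len(parts))
        )
    return dict(zip(keys, parts))
```

For instance, parse_output_definition("Keywords", "30, 10") describes ten keyword fields of at most 30 characters each, which corresponds to the truncated keyword LIGHT INDUCED ELECTRON TRANSFE in the example output record.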

       Meaningful and correct statistic information can only be
       gathered for fields that are included in the [Output] section,
       except for the fields File and Record. Because erl2db checks
       whether a field for which statistic information is requested is
       included in the [Output] section, the [Output] section should
       precede the [Statistics] section.

   Program exit status
       When a file cannot be found, or the file cannot be properly
       processed, the program stops and issues an error message. The
       failure to process a file is reflected in the program's exit
       status (see DIAGNOSTICS below).

ENVIRONMENT
       COMSPEC
              command interpreter used to run the editor for
              address-editing.

FILES
       erl2db.pro
              profile in current directory,
       bin\erl2db.pro
              system-wide profile, in same directory as erl2db.exe.

DIAGNOSTICS
       erl2db can return the following exit values:

       0      success: program execution has been successfully
              completed,
       1      command-line error: an invalid option is specified,
       2      processing error: a file could not be opened or closed,
              or an error occurred while writing to an output file,
       3      interruption: the user interrupted the program,
       4      internal error: an unexpected situation in program
              behaviour occurred.

SEE ALSO
       mkdb(1), mkdbfix(1)

EXAMPLE
       erl2db -RssvvvE "c:\bin\e %s 2" -o erl2db.out erl2db.inp > erl2db.log

BUGS
       (to be determined.)

AUTHOR
       M.J. Moene