Documentation on Tagset of Collins College Bilingual: French - Italian

  • Harper Collins publishes a line of bilingual dictionaries in several languages. These dictionaries are available for research use. Collins requires a fee for their use; please contact CLR for additional information.
  • This is a document from Collins describing the tagging in the French - Italian College edition bilingual dictionary.
  • The tagging is similar to SGML, but proprietary to Collins. To understand the tags you can use an SGML handbook, along with the document below from Harper Collins, which discusses the Collins tagset. This document was created for the offset printing of the dictionary, not for linguistic analysis.

  • A sample of the French - Italian College bilingual using these tags is available from the menu that got you here.

  • Tagset Documentation for the French - Italian College Bilingual

     
    
    
    
    TAGGING DOCUMENTATION UPDATED BY VM: 10/9/93
    
    UPDATED IN RESPONSE TO FIRST SAMPLE, DATED 3/9/93
    
    ALL UPDATES APPEAR IN CAPS, MARKED **
    
    ALL DATA TO APPEAR IN BLACK UNLESS OTHERWISE INSTRUCTED
    
                  B FORMAT FRENCH-ITALIAN DICTIONARY  08-07-93
    (bfitag)
    
    Tagged to Typeset Description
    
    <HWME>    Main entry headword positioned full out in HEADWORD
              BOLD. [see C, c] 
    
              **TO BE OUTPUT IN COLOUR
    
    <HWAD>    Headword add on ending to be positioned after the
              headword or alternative form, separated by a comma and
              a character space.  To be set in HEADWORD BOLD. [see
              abandonné]
    
              **ALL DATA (INCLUDING COMMA) TO BE OUTPUT IN COLOUR
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. SAMPLE SHOWS THIS DATA
    APPEARING IN BRACKETS CLOSED UP TO HEADWORD
    
    <HWSB>    Headword substitute ending to be positioned after the
              headword or alternative form, separated by a comma and
              a character space. The contents of the tag should be
              preceded by a hyphen.  To be set in HEADWORD BOLD.
              [see cacciatore]
    
              **ALL DATA (INCLUDING COMMA) TO BE OUTPUT IN COLOUR
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  SAMPLE SHOWS THIS DATA
    APPEARING CLOSED UP TO HEADWORD WITH NO INTERVENING COMMA AND
    SPACE
    
    <HWCF>    Headword complete form to be positioned after the
              headword or alternative form, separated by a comma and
              a character space.  To be set in HEADWORD BOLD. [see
              amer]
    
              **ALL DATA (INCLUDING COMMA) TO BE OUTPUT IN COLOUR
    
    <HWHN>    Headword homonym number positioned close up against
              the headword, before any ending, as a superior digit
              and set in HEADWORD BOLD. [see abcès*]
    
              **ALL DATA TO BE OUTPUT IN COLOUR
    
    <PRON>    Phonetics surrounded by square brackets, following the
              headword string, <HW..>, to which they belong preceded
              by a character space. [see any entry]
    
    <LBTM>    Registered trademark contents of this field to be
              output as the (R) symbol, appearing close up after the
              headword and before the phonetics or as an element in
              the indicator label where it would be contained within
              the round brackets. To be set in the same typeface as
              the preceding data. [see cellofan]
    
              **PLEASE REDUCE SIZE OF REGISTERED TRADEMARK OUTPUT BY
              THIS TAG - TOO LARGE
    
    <HWAF>    Headword alternate form follows headword or phonetics
              or part of speech of headword, separated by a comma 
              and a character space from what preceded and set in
              HEADWORD BOLD. [see C, c]  
    
              **ALL DATA (INCLUDING COMMA) TO BE OUTPUT IN COLOUR
    
    <MISC>    Miscellaneous usually coming after the headword or
              phonetics or part of speech. The contents of this tag
              should be output in SECONDARY BOLD within roman round
              brackets. It should be separated from the preceding
              data by a character space and followed by a character
              space.  [see colla, colui]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY BOLD
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT AT HEADWORD con. TYPEFACES
    WRONG THROUGHOUT AND CLOSING BRACKET OMITTED.  CORRECTED SAMPLE
    PAGE ATTACHED.
    
    <HWIF>    Headword inflection used mainly in conjunction with
              following tag <IFGR>.  The contents of this tag should
              be set in SECONDARY BOLD within round brackets and
              should follow the contents of the next tag, where it
              exists, separated by a character space.  Where the tag
              <HWIF> is immediately followed by another <HWIF>, an
              italic "ou" for Fr-It, and an italic "o" for It-Fr,
              should be generated to separate the contents of the
              two tags. [see cassapanca] 
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY BOLD
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT - ORDER OF OUTPUT IS WRONG.
    DATA IN THIS TAG SHOULD BE PRECEDED BY DATA IN FOLLOWING TAG
    <IFGR> WITH NO INTERVENING COMMA. COMMA SHOULD ONLY BE OUTPUT IF
    <IFGR> IS IMMEDIATELY FOLLOWED BY ANOTHER <IFGR> (see below).
    DATA FROM <HWIF> AND <IFGR> SHOULD APPEAR IN THE SAME SET OF
    ROMAN BRACKETS.
    
    <IFGR>    Inflections grammar ALWAYS used in conjunction with
              previous tag <HWIF>.  Follows headword or phonetics of
              headword or part of speech, separated by a character
              space from what preceded, and set in ITALIC within the
              same round brackets as the contents of the <HWIF> tag. 
              Where the tag <IFGR> is immediately followed by
              another <IFGR> then the punctuation between these
              should be a comma. [see cassapanca]
    
              NB:  There may be one or more occurrences of the above
                   two tags in any given entry.  Where an <HWIF>,
                   <IFGR> combination is followed by another <IFGR>,
                   then the intervening punctuation should be a
                   comma.
    
                   Occasionally there will be instances of phonetics
                   which follow the irregular form in the printed
                   dictionary.  In the tagged text these <PRON> will
                   occur after the <HWIF>, <IFGR> combination but
                   should appear inside the round brackets, after
                   the contents of <HWIF>, set in square brackets
                   and preceded by a character space, for printed
                   dictionary output.
    
    **NB: TAG NOT INCLUDED IN THIS SAMPLE
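
    The ordering rules for <HWIF> and <IFGR> can be sketched in
    Python as below. This is only an illustration under stated
    assumptions: the function name, the (tag, text) input format and
    the comma between separate groups are not part of the Collins
    specification, and typefaces (italic, bold) are ignored.

```python
def render_inflections(tags, direction="fr-it"):
    """Render <HWIF>/<IFGR> data inside one set of round brackets.

    `tags` is a list of (tag_name, text) pairs in tagged order
    (a hypothetical input format). Grammar (<IFGR>) is output
    before the inflected forms (<HWIF>) it belongs to.
    """
    # Italic "ou" (Fr-It) or "o" (It-Fr) separates successive <HWIF>.
    joiner = " ou " if direction == "fr-it" else " o "
    groups = []
    for name, text in tags:
        if name == "HWIF":
            # An <HWIF> straight after another <HWIF> joins that group.
            if groups and not groups[-1]["grammar"]:
                groups[-1]["forms"].append(text)
            else:
                groups.append({"forms": [text], "grammar": []})
        elif name == "IFGR":
            if not groups:
                raise ValueError("<IFGR> must follow an <HWIF>")
            groups[-1]["grammar"].append(text)
        else:
            raise ValueError(f"unexpected tag: {name}")

    rendered = []
    for group in groups:
        grammar = ", ".join(group["grammar"])  # comma between <IFGR>
        forms = joiner.join(group["forms"])
        rendered.append(" ".join(p for p in (grammar, forms) if p))
    # Assumption: separate groups are also separated by a comma.
    return "(" + ", ".join(rendered) + ")"


# Illustrative data only, not taken from the dictionary.
print(render_inflections([("HWIF", "formes"), ("IFGR", "pl")]))
```

    The tagged order is <HWIF> then <IFGR>, while the printed order
    is reversed, which is why the forms are collected first and
    output last.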
    
    <HWXT>    Headword extension may follow on directly from the
              headword, phonetics, the part of speech marker, or an
              indicator tag.  The contents of this field should be
              output in SECONDARY BOLD and preceded by an arrow
              symbol and a character space. [see cacciatora]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY AND USE
              ARROWHEAD SYMBOL INSTEAD OF ARROW
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  ALL DATA GOVERNED BY THIS
    TAG IS MISSING IN SAMPLE.
    
    <POSP>    Part of speech marker should be output in ITALIC,
              generally preceded and followed by a character space. 
              Where there is more than one occurrence of this tag in
              succession, the intervening punctuation should be a
              comma. [see cabaret]
    
    <LBIN>    Indicator - general to be output in SLOPED ROMAN
              within round brackets, preceded and followed by a
              character space, unless following a <T...> or <X...>
              tag where it would be preceded by a semi-colon and
              character space. [see cabina, chiusa]
    
    **NB: TYPESETTER PLEASE CHECK TYPE SPECIFICATION.  ALL DATA
    GOVERNED BY THIS TAG IS APPEARING IN ITALIC INSTEAD OF SLOPED
    ROMAN AS SPECIFIED.
    
    <LBRR>    Indicator - register to be output in SLOPED ROMAN
              within round brackets.  Punctuation as for <LBIN>.
              [see cacca]
    
    **NB: TYPESETTER PLEASE CHECK TYPE SPECIFICATION.  ALL DATA
    GOVERNED BY THIS TAG IS APPEARING IN ITALIC INSTEAD OF SLOPED
    ROMAN AS SPECIFIED.
    
    <LBSF>    Indicator - subject field to be output in SLOPED ROMAN
              within round brackets.  Punctuation as for <LBIN>. NB
              see note under Further Information re small caps.
              [see caccia, cadetto, cambio]
    
    <LBFF>    Indicator - fuller form to be output in SECONDARY BOLD
              within roman round brackets.  The tag should generate
              an ITALIC "aussi: " for Fr-It, and an ITALIC "anche:
              " for It-Fr, which should precede the contents of the
              tag, within the brackets.  Punctuation as for <LBIN>.
              [see calcolatore]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY BOLD
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. ON FRENCH-ITALIAN aussi:
    DOES NOT APPEAR IN ITALIC AND ON ITALIAN-FRENCH, aussi: IS OUTPUT
    INSTEAD OF anche: AND DOES NOT APPEAR IN ITALIC.
    
              NB:  Where any of the above are immediately followed
                   by any other <LB..> tag, they should be grouped
                   within the same round brackets and separated by
                   a colon and a character space. [see chiudere,
                   chiuso]
    
                   Where any of the above contain upper case
                   lettering, this should be output in small italic
                   caps. 
    
                   Where an <LB..> tag has contents "000", this
                   indicates that a previous label is being
                   suppressed and that a colon should be generated
                   within the brackets of the following <LB..>
                   label.  The contents of the tag are for
                   formatting purposes only and should not be
                   output. [see cancelleria]
    
                   Where an <LB..> tag has contents "XXX", this
                   indicates that a semi-colon rather than a comma
                   should be the separating punctuation.  The
                   contents of the tag are for formatting purposes
                   only and should not be output. [see candore]
    
                   **EDITORS: PLEASE CHECK WORDING OF THIS
                   INSTRUCTION
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  CONTENTS OF THIS TAG
    SHOULD NOT BE OUTPUT.
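
    The grouping of successive <LB..> tags can be sketched in Python
    as below. The function name and the (tag, text) input format are
    assumptions made for illustration. The sketch only groups labels
    and withholds the formatting-only contents "000" and "XXX"; it
    does not attempt the extra colon or semi-colon those values call
    for, nor the generated text of <LBFF> and <LBTM>.

```python
# Contents used for formatting purposes only, never output.
FORMAT_ONLY = {"000", "XXX"}


def group_labels(tags):
    """Group runs of <LB..> tags inside shared round brackets.

    `tags` is a list of (tag_name, text) pairs (hypothetical format).
    Labels in one run are separated by a colon and a character space.
    """
    output = []
    run = []

    def flush():
        # Close the current run of labels, if there is one.
        if run:
            output.append("(" + ": ".join(run) + ")")
            run.clear()

    for name, text in tags:
        if name.startswith("LB"):
            if text not in FORMAT_ONLY:
                run.append(text)
        else:
            flush()
            output.append(text)
    flush()
    return " ".join(output)


# Illustrative data only.
print(group_labels([("LBSF", "MED"), ("LBRR", "fam"), ("TRAN", "x")]))
```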
    
    <LLEX>    Plural form of headword to be output in SECONDARY BOLD
              and separated from the preceding item by a semi-colon
              and a character space. [see calcinaccio]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY BOLD
    
    <LLXT>    Plural form extension to be output in SECONDARY BOLD
              and separated from what precedes by an arrow symbol
              and a character space. [see arrières]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY AND
              USE ARROWHEAD SYMBOL INSTEAD OF ARROW
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. ALL DATA GOVERNED BY THIS
    TAG HAS BEEN OMITTED.
    
    <PHRS>    Phrase to be output in SECONDARY BOLD and separated
              from the preceding item by a semi-colon and a
              character space. [see C]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY
    
    <PHXT>    Phrase extension to be output in SECONDARY BOLD and
              separated from what precedes by an arrow symbol and a
              character space. [see allégéance*]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY AND USE
              ARROWHEAD SYMBOL INSTEAD OF ARROW
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  ALL DATA GOVERNED BY THIS
    TAG HAS BEEN OMITTED.
    
    <RFVB>    Reflexive verb to be output in SECONDARY BOLD and
              separated from the preceding item by a semi-colon and
              a character space, except when the preceding item is
              <POSP>, in which case it should be separated by a
              comma and a character space. [see cacciarsi,
              comportare*]
    
              **PLEASE OUTPUT IN COLOUR (DO NOT INCLUDE THE PRECEDING
              PUNCTUATION)
    
    <RFXT>    Reflexive extension to be output in SECONDARY BOLD and
              separated from what precedes by an arrow symbol and a
              character space. [see cambiarsi]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY AND USE
              ARROWHEAD SYMBOL NOT ARROW
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  ALL DATA GOVERNED BY THIS
    TAG HAS BEEN OMITTED.
    
    <CCPD>    Compound (Romance languages) to be output in SECONDARY
              BOLD and separated from the preceding item by a space
              then a black triangle followed by a thin space. When
              preceded by <POSP> the contents of this tag should be
              preceded by an arrow symbol and a character space.
              [see cambio]
    
              **TO BE OUTPUT IN COLOUR. PLEASE DROP THE BLACK
              TRIANGLE AND INTERVENING THIN SPACE. PLEASE USE THE
              ARROWHEAD SYMBOL INSTEAD OF THE ARROW.
    
    <CCXT>    Compound extension to be output in SECONDARY BOLD and
              separated from what precedes by an arrow symbol and a
              character space. [see allemand*]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY AND USE
              THE ARROWHEAD SYMBOL INSTEAD OF THE ARROW
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. ALL DATA GOVERNED BY THIS
    TAG HAS BEEN OMITTED. THERE IS ALSO COMPUTER JUNK AT allemand,
    WHICH IS THE ONLY ENTRY IN THE SAMPLE TO CONTAIN THIS TAG -
    PLEASE CHECK.
    
    <MAIN>    Main entry for grouping purposes NOT FOR OUTPUT.  The
              contents of this tag should not be output.
    
    <MNHN>    Main entry homonym number NOT FOR OUTPUT.  The
              contents of this tag should not be output.
    
    <BFORMAT> This tag is used for grouping purposes and is NOT FOR
              OUTPUT.
    
    <COMMON>  This tag is used for grouping purposes and is NOT FOR
              OUTPUT.
    
    <TRAN>    Translation to be output in ROMAN.  Will normally be
              preceded by a character space and followed by either
              a comma, if the following tag, excluding any
              <TRAD>/<TRSB>/<TRCF>, <TGGR> or <TL..> tags, is
              <TRAN>/<TREQ>/<TRGL>, a semi-colon, if an indicator
              <LB..> tag follows or full stop if it is the end of
              the entry. [see C, c]
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  FULL STOP HAS BEEN OMITTED
    AT THE END OF THE ENTRY.  PLEASE NOTE: A FULL STOP SHOULD ONLY
    BE OUTPUT IF THERE IS NO PUNCTUATION PRESENT IN THE LAST TAG OF
    THE ENTRY.  IF, HOWEVER, AN ELLIPSIS IS THE CLOSING PUNCTUATION
    IN THE LAST TAG OF THE ENTRY, THEN PLEASE OUTPUT A FULL STOP
    PRECEDED BY A SPACE.
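
    The look-ahead that decides the punctuation after a <TRAN> can be
    sketched in Python as below. The function name and the list of
    tag names as input are assumptions for illustration; the sketch
    returns an empty string where the description above names no
    punctuation, and it leaves aside the check for punctuation
    already present in the last tag.

```python
# Tags that are skipped when looking for the following tag.
SKIPPED = {"TRAD", "TRSB", "TRCF", "TGGR"}
TRANSLATIONS = {"TRAN", "TREQ", "TRGL"}


def punctuation_after(tag_names, index):
    """Return the punctuation generated after tag_names[index].

    `tag_names` is the sequence of tag names in one entry
    (a hypothetical input format).
    """
    for name in tag_names[index + 1:]:
        if name in SKIPPED or name.startswith("TL"):
            continue  # excluded from the look-ahead
        if name in TRANSLATIONS:
            return ","
        if name.startswith("LB"):
            return ";"
        return ""  # no rule stated for other following tags
    return "."  # nothing follows: end of the entry


print(punctuation_after(["TRAN", "TGGR", "TRAN"], 0))
```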
    
    
    <TREQ>    Translation equivalent to be output in ROMAN and
              preceded by the cultural equivalent symbol, ≈, and a
              character space.  Punctuation should follow <TRAN>
              rules. [see C, c]
    
    **NB: SEE NOTE ABOVE RE: PUNCTUATION
    
    <TRGL>    Translation gloss to be output in GLOSS ITALIC.  Will
              normally be preceded by a character space.  Following
              punctuation should follow <TRAN> rules. [see camorra,
              canone]
    
    **NB: SEE NOTE ABOVE RE: PUNCTUATION
    
    <TRAD>    Translation ending to be used in conjunction with
              <TRAN> etc.  To be output in ROMAN within round
              brackets.  Its position in the translation is marked
              by the * symbol.  There should be no space preceding
              the opening round bracket but a character space should
              follow the closing bracket if it occurs in the middle
              of text.  There should be no preceding space but a
              following comma, if followed by another translation
              with no intervening <LB..> tag, no preceding space but
              a following semi-colon, if followed by an <LB..> tag
              or a bold item and no preceding space but a following
              full stop if it is the end of the entry.  One or more
              may be present, accompanying any given <TRAN>. [see
              cadente]
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. DATA SHOULD BE OUTPUT IN
    ROUND BRACKETS CLOSED UP TO TRANSLATION. ALSO PLEASE NOTE: euse
    IS BEING OUTPUT SYSTEMATICALLY THROUGHOUT THE ITALIAN-FRENCH
    SAMPLE (TAG DOES NOT APPEAR ON FRENCH-ITALIAN) REGARDLESS OF THE
    DATA WE SUPPLIED.  SEE NOTE ABOVE RE: PUNCTUATION.
    
    <TRSB>    Translation ending to be used in conjunction with
              <TRAN> etc.  To be output in ROMAN preceded by a
              hyphen and contained within round brackets.  Its
              position in the translation is marked by the * symbol. 
              There should be no space preceding the opening round
              bracket but a character space should follow the end
              bracket if it occurs in the middle of text.  There
              should be no preceding space but a following comma, if
              followed by another translation with no intervening
              <LB..> tag, no preceding space but a following semi-
              colon, if followed by an <LB..> tag or a bold item and
              no preceding space but a following full stop if it is
              the end of the entry.  One or more may be present,
              accompanying any given <TRAN>. [see cacciatore]
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  DATA SHOULD BE OUTPUT
    PRECEDED BY A HYPHEN IN ROUND BRACKETS CLOSED UP TO TRANSLATION.
    ALSO PLEASE NOTE: euse IS BEING OUTPUT SYSTEMATICALLY THROUGHOUT
    THE ITALIAN-FRENCH AND a THROUGHOUT THE FRENCH-ITALIAN REGARDLESS
    OF THE DATA WE SUPPLIED.  SEE NOTE ABOVE RE: PUNCTUATION.
    
    <TRCF>    Translation complete form to be used in conjunction
              with <TRAN> etc.  To be output in ROMAN within round
              brackets. Its position in the translation is marked by
              the * symbol.  There should be no space preceding the
              opening round bracket but a character space should
              follow the closing round bracket if it occurs in the
              middle of text.  There should be no preceding space
              but a following comma, if followed by another
              translation with no intervening <LB..> tag, no
              preceding space but a following semi-colon, if
              followed by an <LB..> tag or a bold item and no
              preceding space but a following full stop if it is the
              end of the entry.  One or more may be present,
              accompanying any given <TRAN>. [see canuto] 
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT.  DATA SHOULD BE OUTPUT IN
    ROUND BRACKETS CLOSED UP TO TRANSLATION.  ALSO PLEASE NOTE: euse
    IS BEING OUTPUT SYSTEMATICALLY THROUGHOUT THE ITALIAN-FRENCH AND
    a THROUGHOUT THE FRENCH-ITALIAN REGARDLESS OF THE DATA WE
    SUPPLIED.  SEE NOTE ABOVE RE: PUNCTUATION.
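
    The placement of <TRAD>, <TRSB> and <TRCF> data at the * marker
    can be sketched in Python as below. The function name and input
    format are assumptions, as is the pairing of successive *
    markers with successive ending tags in order. Any character
    space after the bracket is assumed to be present in the
    translation data already.

```python
def insert_endings(translation, endings):
    """Replace each * in `translation` with a bracketed ending.

    `endings` is a list of (tag_name, text) pairs (hypothetical
    format), one per * marker, in order of appearance.
    """
    pieces = translation.split("*")
    if len(pieces) - 1 != len(endings):
        raise ValueError("number of * markers and endings differ")

    result = pieces[0]
    for (name, text), rest in zip(endings, pieces[1:]):
        # <TRSB> contents are preceded by a hyphen inside the brackets.
        body = "-" + text if name == "TRSB" else text
        # Brackets are closed up to the preceding translation text.
        result += "(" + body + ")" + rest
    return result


# Illustrative data only.
print(insert_endings("chasseur*", [("TRSB", "euse")]))
```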
    
    <TGGR>    Translation grammar used in conjunction with <TRAN> 
              etc.  To be output in ITALIC.  Its position in the
              translation is marked by the $ symbol and it is thus
              both preceded and followed by a character space if it
              occurs in the middle of text otherwise preceded by a
              space and followed by a comma, semi-colon or full stop
              as above for translation endings.  One or more may be
              present, accompanying any given <TRAN>. [see cabaret]
    
    **NB: SEE NOTE ABOVE RE: PUNCTUATION
    
    <TLIN>    Translation indicator - general used in conjunction 
              with <TRAN> etc.  To be output in SLOPED ROMAN within
              round brackets.  Its position in the translation is
              marked by the ">" symbol and it is thus both preceded
              and followed by a character space if it occurs in the
              middle of text, preceded by a space and followed by a
              comma, if followed by another translation with no
              intervening <LB..> tag, followed by a semi-colon, if
              followed by an <LB..> tag or a bold item and followed
              by a full stop if it is the end of the category or
              entry.  One or more may be present, accompanying any
              given <TRAN>. [see abribus]
    
    **NB: SEE NOTE ABOVE RE: PUNCTUATION
    
    <TLRR>    Translation indicator - register used in conjunction
              with <TRAN> etc.  To be output in SLOPED ROMAN within
              round brackets.  In all other respects, mirrors
              <TLIN>. [see cannonata]
    
    **NB: SEE NOTE ABOVE RE: PUNCTUATION
    
    <TLTM>    Translation trademark used in conjunction with <TRAN>
              etc. Contents of this field to be output as the (R)
              symbol appearing immediately after the translation or
              if found in conjunction with any of the other
              translation indicators, then within the same round
              brackets.  The position of this information should be
              indicated within the <TRAN> or equivalent field by the
              ">" symbol and should be punctuated in the same way as
              the <TLIN> tag. It should be set in the same typeface
              as the preceding data. [see cellofan]
    
              NB:  Where any of the above are immediately followed
                   by any other <TL..> tag, they should be grouped
                   within the same round brackets and separated by
                   a colon and a character space.
    
              **PLEASE REDUCE SIZE OF REGISTERED TRADEMARK SYMBOL IN
              THIS TAG - TOO LARGE
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT - DATA OMITTED AND SYMBOL
    REPEATED. SEE CORRECTED PAGE ATTACHED. SEE NOTE ABOVE RE:
    PUNCTUATION.
    
    <XROF>    Cross reference using "de " (F-I) or "di " (I-F)
              should be output in SECONDARY BOLD preceded by an
              ITALIC "de " or "di ". [see capovolto]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. ON ITALIAN-FRENCH de IS
    BEING OUTPUT INSTEAD OF di.  OK ON FRENCH-ITALIAN.  de/di SHOULD
    APPEAR IN ITALIC, NOT ROMAN.
    
    <XREQ>    Cross reference using "= " to be output in SECONDARY
              BOLD preceded by the equals sign and a character
              space. [see ciste]
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY
    
    <XRSE>    Cross reference using "voir " (F-I) or "vedi " (I-F)
              to be output in SECONDARY BOLD preceded by an ITALIC
              "voir " or "vedi ". [see caddi] 
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. ON ITALIAN-FRENCH voir IS
    BEING OUTPUT INSTEAD OF vedi.  OK ON FRENCH-ITALIAN. voir/vedi
    SHOULD APPEAR IN ITALIC, NOT ROMAN.
    
    <XRSA>    Cross reference using "voir aussi " (F-I) or "vedi
              anche " (I-F) to be output in SECONDARY BOLD preceded
              by an ITALIC "voir aussi " or "vedi anche ". [see
              cento] 
    
              **PLEASE OUTPUT IN PHRASE BOLD, NOT SECONDARY
    
    **NB: TYPESETTER PLEASE CHECK OUTPUT. ON ITALIAN-FRENCH voir
    aussi IS BEING OUTPUT INSTEAD OF vedi anche. OK ON
    FRENCH-ITALIAN.  voir aussi/vedi anche SHOULD APPEAR IN ITALIC,
    NOT ROMAN.
    
    <XRHN>    Cross reference homonym number positioned close up
              against the cross reference as a superior digit. 
              Where appropriate, this will immediately follow one of
              the above <XR..> tags. It should be set in the same
              typeface as the preceding data. [see come]
    
              NB:  Where two or more cross references of the same
                   kind occur in succession, with the possible
                   intervention of an <XRHN> tag, the preceding
                   equals sign (<XREQ>) or italic "voir/vedi"
                   (<XRSE>), "de/di" (<XROF>) or "voir aussi/vedi
                   anche" (<XRSA>) should be output for the first
                   occurrence of these only.  Where there is a
                   mixture of cross reference tags then the
                   preceding italic information should be output as
                   appropriate.  A semi-colon should be used to
                   separate multiple occurrences of cross references.
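
    The cross reference rules can be sketched in Python as below.
    The function name, the (tag, text) input format and the
    direction keys "fr-it" and "it-fr" are assumptions made for
    illustration. Typefaces are ignored, so the superior homonym
    digit is simply closed up to the cross reference.

```python
# Generated text preceding each kind of cross reference.
PREFIXES = {
    "fr-it": {"XROF": "de ", "XREQ": "= ",
              "XRSE": "voir ", "XRSA": "voir aussi "},
    "it-fr": {"XROF": "di ", "XREQ": "= ",
              "XRSE": "vedi ", "XRSA": "vedi anche "},
}


def render_xrefs(tags, direction="fr-it"):
    """Render a run of <XR..> tags given as (tag_name, text) pairs."""
    prefixes = PREFIXES[direction]
    items = []
    previous = None
    for name, text in tags:
        if name == "XRHN":
            if not items:
                raise ValueError("<XRHN> must follow a cross reference")
            items[-1] += text  # closed up against the cross reference
            continue
        # The generated prefix is output for the first of a run only.
        prefix = "" if name == previous else prefixes[name]
        items.append(prefix + text)
        previous = name
    # A semi-colon separates multiple cross references.
    return "; ".join(items)


# Illustrative data only.
print(render_xrefs([("XRSE", "a"), ("XRHN", "2"), ("XRSE", "b")], "it-fr"))
```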
    
                         ********************
    
    
    The following tags are other types of main entries.  The tags
    which follow these are specific to that type of entry, but that
    apart the above descriptions apply.
    
    <HWAE>    Headword abbreviation positioned full out in HEADWORD
              BOLD.
    
    <HWXP>    Headword expansion to be output in SLOPED ROMAN,
              preceded by the equals sign and a character space and
              contained within round roman brackets. This should be
              separated from the preceding item by a character
              space.
    
              **PLEASE DO NOT OUTPUT BRACKETS ROUND DATA GOVERNED BY
              THIS TAG IF IT IS NOT FOLLOWED BY A <TR..> TAG
    
    <TRXP>    Translation expansion to be output in SLOPED ROMAN,
              preceded by the equals sign and a character space and
              contained within round roman brackets.  This should be
              separated from the preceding item by a character
              space.
    
    **NB: SEE NOTE ABOVE RE: PUNCTUATION. TAG NOT INCLUDED IN THIS
    SAMPLE.
    
    <HWKE>    Headword keyword positioned full out in HEADWORD BOLD
              preceded by MOT-CLÉ for Fr-It and PAROLA CHIAVE for
              It-Fr as KEYWORD OPENING MARKER. The whole entry
              should be followed by KEYWORD CLOSING MARKER. [see à,
              avoir, chi]  
    
              **SEE TYPE SPECIFICATION FOR COLOUR LAYOUT OF KEYWORDS
    
    <CAT2>    Category marker meaning level to be output as CATEGORY
              MARKER. The first occurrence of this tag under <HWKE>
              should be preceded and followed by a character space.
              Where the following tag is one of the <..XT> group,
              the data should be followed by an arrow symbol with a
              character space on either side. Second and subsequent
              occurrences of <CAT2> should form a new paragraph
              aligned to the main text. [see keywords]
    
              **PLEASE USE ARROWHEAD SYMBOL INSTEAD OF ARROW.  SEE
              TYPE SPECIFICATION FOR COLOUR LAYOUT OF CATEGORY 
              MARKERS
    
                         ********************
    
    Further Information
    
    {}             The presence of these within a field where the
                   default typeface is other than italic, indicates
                   that the contents of the {} brackets should be
                   output in italic.
    
    ****           A single row of asterisks represents the end of
                   a particular entry and the contents of the last
                   tag should be immediately followed by a full
                   stop, with no intervening space.
    
    Punctuation    All punctuation which is not explicit but is
                   generated by the tag sequence or other markers,
                   should be in ROMAN and closed up to the preceding
                   item and followed by a character space.
    
    **NB: EVERY ENTRY SHOULD CLOSE WITH A FULL STOP UNLESS CLOSING
    PUNCTUATION IS EXPLICIT IN THE LAST TAG IN THE ENTRY.  IF THE
    CLOSING PUNCTUATION IN THE LAST TAG IS AN ELLIPSIS, PLEASE
    GENERATE A SPACE AND A FULL STOP.
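
    The closing rule can be sketched in Python as below. The
    function name is hypothetical, and treating only a full stop,
    exclamation mark or question mark as explicit closing
    punctuation is an assumption; the specification does not list
    the characters.

```python
def close_entry(text):
    """Add the closing full stop to the text of an entry's last tag."""
    text = text.rstrip()
    # An ellipsis is followed by a space and a full stop.
    if text.endswith(("...", "\u2026")):
        return text + " ."
    # Explicit closing punctuation is left as it is (assumed set).
    if text.endswith((".", "!", "?")):
        return text
    # Otherwise the full stop is closed up to the last item.
    return text + "."


print(close_entry("casa"))
```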
    
    NB:            For all indicators, ie. most <LB..> and <TL..>
                   type tags, all characters which appear in upper
                   case should be output in small caps. It is common
                   to find a mixture of lower case and upper case
                   characters within these fields, eg, 
                   
                   <LBSF> guérison, décision, PHOTO
    
                   but the upper case characters should always be  
                   output in small caps.  
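
    The split between lower case and upper case runs can be
    sketched in Python as below. The function name and the <sc>
    markers are hypothetical stand-ins for whatever small caps
    command the typesetting system uses.

```python
from itertools import groupby


def mark_small_caps(label):
    """Wrap each run of upper case characters in <sc>...</sc>."""
    pieces = []
    # groupby yields consecutive runs that share the same isupper().
    for is_upper, run in groupby(label, key=str.isupper):
        text = "".join(run)
        pieces.append("<sc>" + text + "</sc>" if is_upper else text)
    return "".join(pieces)


print(mark_small_caps("decision, PHOTO"))
```

    Spaces and punctuation are not upper case, so they stay outside
    the markers and each upper case word is wrapped separately.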
    
    
    Accents        See attached sheet for a list of ASCII substitute
                   characters and how they should be output in the
                   text. [see A, a*]
    
    Swung dash     Swung dash replacement has been implemented on
                   this text. Wherever a swung dash occurs it is
                   shown by ~ in the data and should be output in the
                   same typeface as the data surrounding it.
    
    Substitute characters have been used for those characters which
    have no ASCII value. Please replace the substitute characters
    with the characters in the right hand column.
    
    
    ASCII (substitute)       ASCII value         printed character
    character
    
    û                             251                 A
    
    ë                             235                 A
    
    Æ                             198                 A
    
    è                             232                 E
    
    î                             238                 E
    
    â                             226                 E
    
    ô                             244                 I
    
    õ                             245                 I
    
    Ñ                             209                 I
    
    ´                             180                 O
    
    æ                             230                 U
    
    Á                             193                 U
    
    ö                             246                 
    
    á                             225
    
    ¯                             175
    
    Ð                             208
    
    
    
    \(oe) represents the lower case oe ligature (œ) and \(OE)
    represents the upper case OE ligature (Œ)