NOMENCLATURE
Chemical and Biochemical Nomenclature
The recognized authority for the names of chemical compounds is Chemical Abstracts (CAS; https://www.cas.org/) and its indexes. The Merck Index Online (https://www.rsc.org/merck-index) is also an excellent source. For guidelines on the use of biochemical terminology, consult Biochemical Nomenclature and Related Documents (Portland Press, London, United Kingdom, 1992), available at https://www.qmul.ac.uk/sbcs/iupac/bibliog/white.html, and the instructions for authors of the Journal of Biological Chemistry.
For enzymes, use the recommended (trivial) name assigned by the Nomenclature Committee of the International Union of Biochemistry (IUB) as described in Enzyme Nomenclature (Academic Press, Inc., New York, NY, 1992) and its supplements and at https://www.qmul.ac.uk/sbcs/iubmb/enzyme/. If a nonrecommended name is used, place the proper (trivial) name in parentheses at first use in the abstract and text. Use the EC number when one has been assigned. Authors of papers describing enzymological studies should review the standards of the STRENDA Commission for information required for adequate description of experimental conditions and for reporting enzyme activity data at https://www.beilstein-institut.de/en/projects/strenda/guidelines.
Nomenclature of Microorganisms
Binary names, consisting of a generic name and a specific epithet (e.g., Escherichia coli), must be used for all microorganisms. Names of categories at or above the genus level may be used alone, but specific and subspecific epithets may not. A specific epithet must be preceded by a generic name, written out in full the first time it is used in a paper. Thereafter, the generic name should be abbreviated to the initial capital letter (e.g., E. coli), provided there can be no confusion with other genera used in the paper. Names of all bacterial taxa (kingdoms, phyla, classes, orders, families, genera, species, and subspecies) are printed in italics and should be italicized in the manuscript; strain designations and numbers are not. Vernacular (common) names should be in lowercase roman type (e.g., streptococcus, brucella).
For Salmonella, genus, species, and subspecies names should be rendered in standard form: Salmonella enterica at first use, S. enterica thereafter; Salmonella enterica subsp. arizonae at first use, S. enterica subsp. arizonae thereafter. Names of serovars should be in roman type with the first letter capitalized: Salmonella enterica serovar Typhimurium. After the first use, the serovar may also be given without a species name: Salmonella Typhimurium, S. Typhimurium, or Salmonella serovar Typhimurium. For other information regarding serovar designations, see Antigenic Formulae of the Salmonella Serovars, 9th ed. (P. A. D. Grimont and F.-X. Weill, WHO Collaborating Centre for Reference and Research on Salmonella, Institut Pasteur, Paris, France, 2007; see http://www.scacm.org/free/Antigenic%20Formulae%20of%20the%20Salmonella%20Serovars%202007%209th%20edition.pdf). For a summary of the current standards for Salmonella nomenclature and the Kauffmann-White criteria, see the article by Brenner et al. (J Clin Microbiol 38:2465–2467, 2000), the opinion of the Judicial Commission of the International Committee on Systematics of Prokaryotes (Int J Syst Evol Microbiol 55:519–520, 2005), and the article by Tindall et al. (Int J Syst Evol Microbiol 55:521–524, 2005).
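As an informal illustration of the first-use rule for generic names described above, the following Python sketch leaves the first occurrence of a binary name written out in full and abbreviates the genus in later occurrences. The function name abbreviate_genus and the sample sentence are hypothetical and are not part of these instructions. The sketch does not check for possible confusion with other genera that share the same initial, which authors must still verify by hand.

```python
import re


def abbreviate_genus(text, binomial):
    """Keep the first occurrence of `binomial` in full; abbreviate the genus afterwards.

    Hypothetical helper for illustration only. It assumes `binomial` is a
    plain "Genus epithet" string and does not detect clashes between genera
    that start with the same letter.
    """
    genus, epithet = binomial.split(" ", 1)
    short_form = f"{genus[0]}. {epithet}"
    seen = 0

    def replace(match):
        nonlocal seen
        seen += 1
        # First use stays spelled out; every later use is abbreviated.
        return match.group(0) if seen == 1 else short_form

    return re.sub(re.escape(binomial), replace, text)


sample = (
    "Escherichia coli was grown overnight. "
    "Escherichia coli cells were then harvested."
)
print(abbreviate_genus(sample, "Escherichia coli"))
```

Italicization is a typesetting matter and is outside the scope of this sketch.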
The spelling of bacterial names should follow the Approved Lists of Bacterial Names (Amended) & Index of the Bacterial and Yeast Nomenclatural Changes (V. B. D. Skerman et al., ed., American Society for Microbiology, Washington, DC, 1989) and the validation lists and notification lists published in the International Journal of Systematic and Evolutionary Microbiology (formerly the International Journal of Systematic Bacteriology) since January 1989. In addition, two sites on the World Wide Web list current approved bacterial names: Prokaryotic Nomenclature Up-to-Date (http://www.dsmz.de/bacterial-diversity/prokaryotic-nomenclature-up-to-date.html) and List of Prokaryotic Names with Standing in Nomenclature (http://www.bacterio.net/). If there is reason to use a name that does not have standing in nomenclature, the name should be enclosed in quotation marks in the title and at its first use in the abstract and the text and an appropriate statement concerning the nomenclatural status of the name should be made in the text. "Candidatus" species should always be set in quotation marks.
Since the classification of fungi is not complete, it is the responsibility of the author to determine the accepted binomial for a given organism. Sources for these names include The Yeasts: a Taxonomic Study, 5th ed. (C. P. Kurtzman, J. W. Fell, and T. Boekhout, ed., Elsevier Science, Amsterdam, Netherlands, 2011), and Dictionary of the Fungi, 10th ed. (P. M. Kirk, P. F. Cannon, D. W. Minter, and J. A. Stalpers, ed., CABI International, Wallingford, Oxfordshire, United Kingdom, 2008); see also http://www.speciesfungorum.org/Names/Fundic.asp.
Names used for viruses should be those approved by the International Committee on Taxonomy of Viruses (ICTV) and reported on the ICTV Virus Taxonomy website (https://talk.ictvonline.org/). In addition, the recommendations of the ICTV regarding the use of species names should generally be followed: when the entire species is discussed as a taxonomic entity, the species name, as with other taxa, is italic and has the first letter and any proper nouns capitalized (e.g., Tobacco mosaic virus, Murray Valley encephalitis virus). When the behavior or manipulation of individual viruses is discussed, the vernacular (e.g., tobacco mosaic virus, Murray Valley encephalitis virus) should be used. If desired, synonyms may be added parenthetically when the name is first mentioned. Approved generic (or group) and family names may also be used.
Microorganisms, viruses, and plasmids should be given designations consisting of letters and serial numbers. It is generally advisable to include a worker's initials or a descriptive symbol of locale or laboratory, etc., in the designation. Each new strain, mutant, isolate, or derivative should be given a new (serial) designation. This designation should be distinct from those of the genotype and phenotype, and genotypic and phenotypic symbols should not be included. Plasmids are named with a lowercase "p" followed by the designation in uppercase letters and numbers. To avoid the use of the same designation as that of a widely used strain or plasmid, check the designation against a publication database such as Medline.
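The plasmid naming pattern above (a lowercase "p" followed by uppercase letters and numbers) can be checked mechanically. The following Python sketch is one possible reading of that rule, not an official validator: it assumes that the letters come first and the serial number follows, as in pSC101, and the function name is_plasmid_designation is hypothetical. It cannot tell whether a designation is already in use, so the database check described above is still required.

```python
import re

# Assumed reading of the rule: "p", then one or more uppercase letters,
# then one or more digits (e.g., pSC101). Other layouts are rejected.
PLASMID_RE = re.compile(r"p[A-Z]+[0-9]+")


def is_plasmid_designation(name):
    """Return True if `name` matches the assumed plasmid designation pattern."""
    return PLASMID_RE.fullmatch(name) is not None


for candidate in ["pSC101", "PSC101", "psc101", "pSC"]:
    print(candidate, is_plasmid_designation(candidate))
```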
Genetic Nomenclature
To facilitate accurate communication, it is important that standard genetic nomenclature be used whenever possible and that deviations or proposals for new naming systems be endorsed by an appropriate authoritative body. Review and/or publication of submitted manuscripts that contain new or nonstandard nomenclature may be delayed by the editor or the Journals Department so that the nomenclature can be reviewed.
Bacteria. The genetic properties of bacteria are described in terms of phenotypes and genotypes. The phenotype describes the observable properties of an organism. The genotype refers to the genetic constitution of an organism, usually in reference to some standard wild type. The guidelines that follow are based on the recommendations of Demerec et al. (Genetics 54:61–76, 1966).
(i) Phenotype designations must be used when mutant loci have not been identified or mapped. They can also be used to identify the protein product of a gene, e.g., the OmpA protein. Phenotype designations generally consist of three-letter symbols; these are not italicized, and the first letter of the symbol is capitalized. It is preferable to use Roman or Arabic numerals (instead of letters) to identify a series of related phenotypes. Thus, a series of nucleic acid polymerase mutants might be designated Pol1, Pol2, and Pol3, etc. Wild-type characteristics can be designated with a superscript plus (Pol⁺), and, when necessary for clarity, negative superscripts (Pol⁻) can be used to designate mutant characteristics. Lowercase superscript letters may be used to further delineate phenotypes (e.g., Strʳ for streptomycin resistance). Phenotype designations should be defined.
(ii) Genotype designations are also indicated by three-letter locus symbols. In contrast to phenotype designations, these are lowercase italic (e.g., ara his rps). If several loci govern related functions, these are distinguished by italicized capital letters following the locus symbol (e.g., araA araB araC). Promoter, terminator, and operator sites should be indicated as described by Bachmann and Low (Microbiol Rev 44:1–56, 1980), e.g., lacZp, lacAt, and lacZo.
(iii) Wild-type alleles are indicated with a superscript plus (ara⁺ his⁺). A superscript minus is not used to indicate a mutant locus; thus, one refers to an ara mutant rather than an ara⁻ strain.
(iv) Mutation sites are designated by placing serial isolation numbers (allele numbers) after the locus symbol (e.g., araA1 araA2). If only a single such locus exists or if it is not known in which of several related loci the mutation has occurred, a hyphen is used instead of the capital letter (e.g., ara-23). It is essential in papers reporting the isolation of new mutants that allele numbers be given to the mutations. For Escherichia coli, there is a registry of such numbers: the Coli Genetic Stock Center (http://cgsc2.biology.yale.edu/). For the genus Salmonella, the registry is the Salmonella Genetic Stock Centre (http://people.ucalgary.ca/~kesander/). For the genus Bacillus, the registry is the Bacillus Genetic Stock Center (http://www.bgsc.org/).
(v) The use of superscripts with genotypes (other than + to indicate wild-type alleles) should be avoided. Designations indicating amber mutations (Am), temperature-sensitive mutations (Ts), constitutive mutations (Con), cold-sensitive mutations (Cs), production of a hybrid protein (Hyb), and other important phenotypic properties should follow the allele number [e.g., araA230(Am) hisD21(Ts)]. All other such designations of phenotype must be defined at the first occurrence. If superscripts must be used, they must be approved by the editor and defined at the first occurrence in the text.
Subscripts may be used in two situations. First, subscripts may be used to distinguish between genes (having the same name) from different organisms or strains; e.g., the his gene of E. coli or of strain K-12 may be written as his with the subscript "E. coli" or "K-12," respectively, to distinguish it from the his gene in another species or strain. An abbreviation may also be used if it is explained. Second, a subscript may be used to distinguish between genetic elements that have the same name. For example, the promoters of the gln operon can be designated glnAp₁ and glnAp₂. This form departs slightly from that recommended by Bachmann and Low (e.g., desC1p).
(vi) Deletions are indicated by the symbol Δ placed before the deleted gene or region, e.g., ΔtrpA432, Δ(aroP-aceE)419, or Δ(hisQ-hisJo)1256. Similarly, other symbols can be used (with appropriate definition). Thus, a fusion of the ara and lac operons can be shown as Φ(ara-lac)95. Likewise, Φ(araB'-lacZ⁺)96 indicates that the fusion results in a truncated araB gene fused to an intact lacZ gene, and Φ(malE-lacZ)97(Hyb) shows that a hybrid protein is synthesized. An inversion is shown as IN(rrnD-rrnE)1. An insertion of an E. coli his gene into plasmid pSC101 at zero kilobases (0 kb) is shown as pSC101 Ω(0kb::K-12hisB)4. An alternative designation of an insertion can be used in simple cases, e.g., galT236::Tn5. The number 236 refers to the locus of the insertion, and if the strain carries an additional gal mutation, it is listed separately. Additional examples, which utilize a slightly different format, can be found in the papers by Campbell et al. and Novick et al. cited below. It is important in reporting the construction of strains in which a mobile element was inserted and subsequently deleted that this fact be noted in the strain table. This can be done by listing the genotype of the strain used as an intermediate in a table footnote or by making a direct or parenthetical remark in the genotype, e.g., (F⁻), ΔMu cts, or mal::ΔMu cts::lac. In setting parenthetical remarks within the genotype or dividing the genotype into constituent elements, parentheses and brackets are used without special meaning; brackets are used outside parentheses. To indicate the presence of an episome, parentheses (or brackets) are used (λ, F⁺). Reference to an integrated episome is indicated as described above for inserted elements, and an exogenote is shown as, for example, W3110/F'8(gal⁺).
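The simpler genotype forms in items ii to vi above (a three-letter lowercase locus symbol, a capital letter or hyphen, an allele number, an optional phenotype tag in parentheses, and an optional leading Δ) can be sketched as a small Python parser. This is only an illustration under stated assumptions: the pattern ALLELE_RE and the function parse_allele are hypothetical, and the sketch deliberately does not handle parenthesized regions, fusions, inversions, or insertions such as galT236::Tn5.

```python
import re

# Assumed pattern for single-gene designations only, e.g. araA1, ara-23,
# araA230(Am), and ΔtrpA432. More complex forms are out of scope.
ALLELE_RE = re.compile(
    r"(?P<deletion>Δ)?"            # optional deletion symbol
    r"(?P<locus>[a-z]{3})"         # three-letter lowercase locus symbol
    r"(?P<gene>[A-Z]|-)"           # capital letter, or hyphen if unknown
    r"(?P<allele>[0-9]+)"          # serial isolation (allele) number
    r"(?:\((?P<tag>[A-Za-z]+)\))?" # optional tag such as (Am) or (Ts)
)


def parse_allele(designation):
    """Split a simple allele designation into its parts, or return None."""
    match = ALLELE_RE.fullmatch(designation)
    if match is None:
        return None
    fields = match.groupdict()
    fields["deletion"] = fields["deletion"] is not None
    fields["allele"] = int(fields["allele"])
    if fields["gene"] == "-":
        # A hyphen means the specific gene within the locus is not known.
        fields["gene"] = None
    return fields


for example in ["araA230(Am)", "ara-23", "ΔtrpA432", "galT236::Tn5"]:
    print(example, parse_allele(example))
```

A return value of None means only that the designation falls outside this simplified pattern, not that it is invalid.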
For information about the symbols in current use, consult Berlyn (Microbiol Mol Biol Rev 62:814–984, 1998) for E. coli K-12, Sanderson and Roth (Microbiol Rev 52:485–532, 1988) for Salmonella serovar Typhimurium, Holloway et al. (Microbiol Rev 43:73–102, 1979) for the genus Pseudomonas, Piggot and Hoch (Microbiol Rev 49:158–179, 1985) for Bacillus subtilis, Perkins et al. (Microbiol Rev 46:426–570, 1982) for Neurospora crassa, and Mortimer and Schild (Microbiol Rev 49:181–213, 1985) for Saccharomyces cerevisiae. For yeasts, Chlamydomonas spp., and several fungal species, symbols such as those given in the Handbook of Microbiology, 2nd ed. (A. I. Laskin and H. A. Lechevalier, ed., CRC Press, Inc., Cleveland, OH, 1988), should be used.
Conventions for naming genes. It is recommended that (entirely) new genes be given names that are mnemonics of their function, avoiding names that are already assigned and earlier or alternative gene names, irrespective of the bacterium for which such assignments have been made. Similarly, it is recommended that, whenever possible, orthologous genes present in different organisms receive the same name. When homology is not apparent or the function of a new gene has not been established, a provisional name may be given by one of the following methods. (i) The gene may be named on the basis of its map location in the style yaaA, analogous to the style used for recording transposon insertions (zef) as discussed below. A list of such names in use for E. coli has been published by Rudd (Microbiol Mol Biol Rev 62:985–1019, 1998). (ii) A provisional name may be given in the style described by Demerec et al. (e.g., usg, gene upstream of folC). Such names should be unique, and names such as orf or genX should not be used. For reference, the Coli Genetic Stock Center's database includes an updated listing of E. coli gene names and gene products. It is accessible on the Internet (http://cgsc.biology.yale.edu/index.php). A list can also be found in the work of Riley (Microbiol Rev 57:862–952, 1993). For the genes of other bacteria, consult the references given above.
For prokaryotes, gene names should not begin with prefixes indicating the genus and species from which the gene is derived. (However, subscripts may be used where necessary to distinguish between genes from different organisms or strains, as described in section v of "Bacteria" above.) For eukaryotes, such prefixes may be used for clarity when discussing genes with the same name from two different organisms (e.g., ScURA3 versus CaURA3); the prefixes are not considered part of the gene name proper and are not italicized.
Locus tags. Locus tags are systematic, unique identifiers that are assigned to each gene in GenBank. All genes mentioned in a manuscript should be traceable to their sequences by the reader, and locus tags may be used for this purpose in manuscripts to identify uncharacterized genes. In addition, authors should check GenBank to make sure that they are using the correct, up-to-date format for locus tags (e.g., uppercase versus lowercase letters and the presence or absence of an underscore, etc.). Locus tag formats vary between different organisms and also may be updated for a given organism, so it is important to check GenBank at the time of manuscript preparation.
"Mutant" versus "mutation." Keep in mind the distinction between a mutation (an alteration of the primary sequence of the genetic material) and a mutant (a strain carrying one or more mutations). One may speak about the mapping of a mutation, but one cannot map a mutant. Likewise, a mutant has no genetic locus, only a phenotype.
"Homology" versus "similarity." For use of terms that describe relationships between genes, consult the articles by Theissen (Nature 415:741, 2002) and Fitch (Trends Genet 16:227–231, 2000). "Homology" implies a relationship between genes that have a common evolutionary origin; partial homology is not recognized. When sequence comparisons are discussed, it is more appropriate to use the term "percent sequence similarity" or "percent sequence identity," as appropriate.
Strain designations. Do not use a genotype as a name (e.g., "subsequent use of leuC6 for transduction"). If a strain designation has not been chosen, select an appropriate word combination (e.g., "another strain containing the leuC6 mutation").
Viruses. The genetic nomenclature for viruses differs from that for bacteria. In most instances, viruses have no phenotype, since they have no metabolism outside host cells. Therefore, distinctions between phenotype and genotype cannot be made. Superscripts are used to indicate hybrid genomes. Genetic symbols may be one, two, or three letters. For example, a mutant strain of λ might be designated λ Aam11 int2 red114 cI857; this strain carries mutations in genes cI, int, and red and an amber-suppressible (Am) mutation in gene A. A strain designated λ att⁴³⁴ imm²¹ would represent a hybrid of phage λ that carries the immunity region (imm) of phage 21 and the attachment (att) region of phage 434. Host DNA insertions into viruses should be delineated by square brackets, and the genetic symbols and designations for such inserted DNA should conform to those used for the host genome. Genetic symbols for phage λ can be found in reports by Szybalski and Szybalski (Gene 7:217–270, 1979) and Echols and Murialdo (Microbiol Rev 42:577–591, 1978).
Eukaryotes. FlyBase (http://flybase.org/) is the genetic nomenclature authority for Drosophila melanogaster. WormBase (https://www.wormbase.org/#01-23-6) is the genetic nomenclature authority for Caenorhabditis elegans. When naming genes for Aspergillus species, the nomenclature guidelines posted at http://www.aspergillusgenome.org/Nomenclature.shtml should be followed, and the Aspergillus Genome Database (http://www.aspgd.org/) should be searched to ensure that any new name is not already in use. The Saccharomyces Genome Database (https://www.yeastgenome.org/) and the Candida Genome Database (http://www.candidagenome.org/) are authorities for Saccharomyces cerevisiae and Candida albicans genetic nomenclature, respectively. For information about the genetic nomenclature of other eukaryotes, see the Instructions to Authors for Molecular and Cellular Biology.
Transposable elements, plasmids, and restriction enzymes. Nomenclature of transposable elements (insertion sequences, transposons, and phage Mu, etc.) should follow the recommendations of Campbell et al. (Gene 5:197–206, 1979), with the modifications given in section vi of "Bacteria" above. The Internet site where insertion sequences of eubacteria and archaea are described and new sequences can be recorded is https://www-is.biotoul.fr.
The system of designating transposon insertions at sites where there are no known loci, e.g., zef-123::Tn5, has been described by Chumley et al. (Genetics 91:639–655, 1979). The nomenclature recommendations of Novick et al. (Bacteriol Rev 40:168–189, 1976) for plasmids and plasmid-specified activities, of Low (Bacteriol Rev 36:587–607, 1972) for F' factors, and of Roberts et al. (Nucleic Acids Res 31:1805–1812, 2003) for restriction enzymes, DNA methyltransferases, homing endonucleases, and their genes should be used whenever possible. The nomenclature for recombinant DNA molecules constructed in vitro follows the nomenclature for insertions in general. DNA inserted into recombinant DNA molecules should be described by using the gene symbols and conventions for the organism from which the DNA was obtained.
Tetracycline resistance determinants. The nomenclature for tetracycline resistance determinants is based on the proposal of Levy et al. (Antimicrob Agents Chemother 43:1523–1524, 1999). The style for such determinants is, e.g., Tet B; the space helps distinguish the determinant designation from that for phenotypes and proteins (TetB). The above-referenced article also gives the correct format for genes, proteins, and determinants in this family.