VINAY'S MOLECULAR BIOLOGY GLOSSARY




VINAY KUMAR

If you would like to contact or support me, e-mail me at: winay32_1988@yahoo.co.in or winay32_1988@rediffmail.com

HELLO,

THIS DICTIONARY MAY HELP PEOPLE WHO ARE INTERESTED IN THE FIELD OF GENETICS, BIOTECHNOLOGY, BIOINFORMATICS, BIOLOGY OR BIOCHEMISTRY etc.

LET'S START FROM HERE

A


Adenine (A): A purine base; one of the four molecules containing nitrogen present in DNA and RNA. It binds to thymine (T) in DNA or uracil (U) in RNA by two hydrogen bonds. Designated by the letter A.
Adenosine Triphosphate (ATP): A nucleotide triphosphate that upon hydrolysis results in energy available for such processes as muscle contraction and synthesis of macromolecules, including proteins and carbohydrates. (A-P-P-P)
Aerobic: Requiring oxygen for growth.
Affinity. The binding power of an antibody with an antigen.
AFLP: Amplified fragment length polymorphism. A highly sensitive method for detecting polymorphisms in DNA. DNA first undergoes restriction enzyme digestion, and a subset of DNA fragments is then selected for PCR amplification and visualization.

Agarose. The neutral gelling fraction of agar (a polysaccharide extracted from certain seaweed) commonly used in gel electrophoresis.
Algorithm: A step-by-step process for solving a problem.
Allele. From “allelomorph,” one of a series of alternative forms of a gene (or VNTR) at a specific locus in a genome; one of the alternative forms of a gene that can exist at a single locus.
Allogamy: Transfer of pollen (i.e. pollination) from the anther of the flower of one plant to the stigma of the flower of a genetically different plant. Also called crossbreeding, outcrossing, and xenogamy. See also Outbreeding.
Allele frequency: The proportion of a particular allele among the chromosomes carried by individuals in a population.
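As a small worked example of this definition, the sketch below counts alleles from diploid genotype counts. The function name `allele_frequencies` and the sample counts are hypothetical, chosen only for illustration.

```python
from collections import Counter


def allele_frequencies(genotype_counts):
    """Return allele -> proportion among all chromosomes in the sample.

    genotype_counts maps a pair of alleles, e.g. ("A", "a"), to the
    number of diploid individuals carrying that genotype.
    """
    allele_counts = Counter()
    for (allele_1, allele_2), n_individuals in genotype_counts.items():
        # Each diploid individual contributes two chromosomes.
        allele_counts[allele_1] += n_individuals
        allele_counts[allele_2] += n_individuals
    total = sum(allele_counts.values())
    return {allele: count / total for allele, count in allele_counts.items()}


# Hypothetical sample: 30 AA, 50 Aa and 20 aa individuals (200 chromosomes).
sample = {("A", "A"): 30, ("A", "a"): 50, ("a", "a"): 20}
frequencies = allele_frequencies(sample)
```

With these made-up counts, allele A accounts for 110 of the 200 chromosomes and allele a for the remaining 90.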
1Copyright © 2005 by author, who wishes to thank Dr. Raymond F. Gesteland for his editorial assistance.
Allele-specific oligonucleotide. A short DNA sequence, usually 18-20 nucleotides, that can hybridize with either disease-causing or normal DNA sequences but not both.
Allogenic. Of the same species but with a different genotype.
Alu. A major group of dispersed repetitive DNA sequences; a family of repeat DNA sequences, cleaved by the restriction enzyme Alu I, dispersed throughout the genomes of many animal species. A human being has approximately 500,000 copies at 300 base pairs each.
Amino acid. The major building blocks of polypeptides; any of a class of 20 molecules combined to form proteins in living things; the building blocks of proteins coded by triplets of bases in the DNA blueprint: alanine (Ala), arginine (Arg), asparagine (Asn), aspartic acid (Asp), cysteine (Cys), glutamine (Gln), glutamic acid (Glu), glycine (Gly), histidine (His), isoleucine (Ile), leucine (Leu), lysine (Lys), methionine (Met), phenylalanine (Phe), proline (Pro), serine (Ser), threonine (Thr), tryptophan (Trp), tyrosine (Tyr), valine (Val). For example, the mRNA codon AUG codes for the amino acid methionine, whereas CGU, CGC, CGA, CGG, AGA and AGG code for arginine, UUG codes for leucine and GGG codes for glycine. In certain situations, UGA codes for selenocysteine, the 21st amino acid.
The sequence of amino acids in a protein and hence protein function are determined by the genetic code.
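The codon assignments quoted in the Amino acid entry can be turned into a small lookup table. The sketch below is deliberately partial: it holds only the codons named in that entry, and the names `CODON_TABLE` and `translate` are hypothetical.

```python
# Partial codon table: only the codons named in the Amino acid entry.
CODON_TABLE = {
    "AUG": "Met",
    "CGU": "Arg", "CGC": "Arg", "CGA": "Arg",
    "CGG": "Arg", "AGA": "Arg", "AGG": "Arg",
    "UUG": "Leu",
    "GGG": "Gly",
}


def translate(mrna):
    """Translate an mRNA string codon by codon.

    Trailing bases that do not fill a whole codon are ignored, and
    codons missing from the partial table are reported as "???".
    """
    usable_length = len(mrna) - len(mrna) % 3
    codons = [mrna[i:i + 3] for i in range(0, usable_length, 3)]
    return [CODON_TABLE.get(codon, "???") for codon in codons]


peptide = translate("AUGCGUUUGGGGAGA")
```

Here `peptide` is the list Met, Arg, Leu, Gly, Arg. A full table would cover all 64 codons.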
AMOVA: The analysis of molecular variance is a method for studying molecular variation within a species.

Amplification. An increase in the number of copies of a specific DNA sequence, usually by PCR.
Amplified fragment length polymorphism (AMP-FLP). PCR-amplified restriction fragment lengths consisting of a variable number of tandem repeats.
Anaerobic. Growing in the absence of oxygen.
Antibody. A protein produced for body defense in response to an antigen.
Anticodon. A three-nucleotide sequence in a tRNA molecule that undergoes complementary base pairing with an mRNA codon.
Antigen. From “antibody generator,” a molecule, usually a protein, capable of stimulating an antibody response for body defense.
© 2005 Susan A. Ehrlich
Antisense. The strand of DNA that would produce a mirror image (antisense) messenger RNA that is opposite in sequence to one directing protein synthesis. Antisense technology is used to selectively turn off production of certain proteins.
Antiserum. Blood serum containing specific antibodies against an antigen used to confer immunity to a disease.
Autorad. From “autoradiogram,” the resultant x-ray film after having been exposed to a radioactive source. A DNA probe tagged with a radioactive isotope such as radioactive phosphorus (32P) will expose an x-ray film where the probe hybridizes to complementary sequences on the blot in contact with the film.
Autosome. A chromosome not involved in sex determination. The diploid human genome consists of 46 chromosomes: 22 pairs of autosomes and one pair of sex chromosomes (X and Y).

Asexual reproduction: The formation of new individuals from the cell(s) of a single parent. It does not entail recombination or mixing of parental forms.
Autogamy:

• Transfer of pollen (pollination) from the anther of a flower to the stigma of the same flower or sometimes to that of a genetically identical flower (as of the same plant or clone).
• The ability of many plant species to naturally and successfully fertilize within one individual. Also called self-pollination.





B



Bacteriophage or “phage”. A virus that reproduces in bacteria.
Bacterium. Any of a large group of microscopic organisms with a simple cell structure, some of which manufacture their own food, some of which live as parasites on other organisms and some of which live on decaying matter.
Bands. Visibly darkened areas on autorads that represent the location of DNA fragments on a gel; alternating dark and light areas visible on chromosomes after certain types of stains are used.
Band shifting. The phenomenon in which DNA fragments in one lane of an electrophoresis gel migrate more rapidly than fragments in a second lane.
Base. One of four DNA nucleotides, adenine (A), cytosine (C), guanine (G) or thymine (T). In RNA, it is uracil (U) instead of thymine.
Base analog. A molecule that can mimic the chemical behavior of one of the DNA bases. Base analogs are a type of mutagen, some of which are used in chemotherapy.
Base pair (bp). Two complementary nucleotides joined by hydrogen bonds binding DNA or RNA complementary strands; base-pairing occurs between A and T (DNA) or U (RNA) and between C and G.
Base-pair substitution. The replacement of one base pair by another, a type of mutation.
Base sequence. The order of bases in a DNA or RNA molecule.
Bin. In VNTR profiling, a range of base pairs (DNA fragment lengths). When a database is divided into fixed bins, the proportion of bands within each bin is determined and the relevant proportions are used in estimating the profile frequency. In a floating bin method of estimating a profile frequency, the bin is centered on the base-pair length of the allele in question and the width of the bin can be defined by the laboratory’s matching rule, e.g., ±5% of band size. Binning is grouping VNTR alleles into sets of similar sizes because the alleles’ lengths are too similar to differentiate.
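The floating bin idea can be sketched in a few lines. The example assumes the matching window is plus or minus 5% of the band size, as the entry's example appears to intend. The function names and the database values are hypothetical, and a real laboratory's matching rule may differ.

```python
def floating_bin(band_size_bp, window=0.05):
    """Return the (low, high) base-pair range centered on band_size_bp."""
    half_width = band_size_bp * window
    return band_size_bp - half_width, band_size_bp + half_width


def bin_frequency(band_size_bp, database_sizes_bp, window=0.05):
    """Proportion of database bands that fall inside the floating bin."""
    low, high = floating_bin(band_size_bp, window)
    hits = sum(1 for size in database_sizes_bp if low <= size <= high)
    return hits / len(database_sizes_bp)


# Hypothetical database of measured band sizes, in base pairs.
database = [960, 980, 1000, 1020, 1049, 1100, 1500, 2000]
frequency = bin_frequency(1000, database)
```

For a 1000 bp band the bin spans roughly 950 to 1050 bp, so five of the eight made-up database bands fall inside it.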
Biochemical marker or biomarker. A substance whose detection indicates a biochemical (chemical, molecular or physical) activity or change in the body, its tissues or cells; it may be monitored as a quantifiable indicator in the assessment of a disorder or condition.
Blastocyst. The 4-5-day-old ball of undifferentiated cells from which a prospective embryo develops.
Biodiversity: The totality of genes, species, and ecosystems in a given region, be it a microhabitat or the world. Also called biological diversity.
Breeding: The propagation and genetic manipulation by hybridization or deliberate self-crossing of plants, for the purpose of selecting improved offspring.
Breeding system: The system by which a species reproduces. There are several natural systems in plants, for example, see also Outbreeding and Inbreeding.





C

Characterization: Assessment of plant traits that are highly heritable, easily seen by the eye, equally expressed in all environments, and usable for distinguishing phenotypes.
Collection (of plant genetic resources):
• The gathering together of domesticates (landraces, old and modern cultivars and breeding lines), and related wild or weedy species.
• The material gathered by the act of collecting, is termed a collection.
Conservation: The management of human use of the biosphere so that it may yield the greatest sustainable benefit to current generations while maintaining its potential to meet the needs and aspirations of future generations. Thus, conservation is positive, embracing preservation, maintenance, sustainable use, restoration, and enhancement of the natural environment.







D

Ångström unit (Å). Named after nineteenth-century Swedish physicist Anders Ångström, 1 Å equals one ten-billionth of a meter; there are 254 million Å per inch.

D-loop. A portion of the mitochondrial genome known as the “control region” or “displacement loop” instrumental in the regulation and initiation of mtDNA gene products.
Daughter cells. Cells resulting from the division of a parent cell.
Degenerate genetic code. More than one RNA nucleotide triplet of bases coding for the same amino acid.
Degradation. The breaking down of macromolecules by chemical, enzymatic or physical means.
Deletion. The loss of one or more nucleotides from a DNA strand; may result in a gene mutation. A deletion map is a description of a specific chromosome using defined mutations as markers.
Denaturation. The process of separating the complementary double strands of DNA, as by heating, to form single strands in preparation for hybridization with biological probes.
Deoxyribonucleic acid (DNA). The molecule of heredity, DNA is composed of deoxyribonucleotide building blocks, each containing a base (adenine (A), thymine (T), cytosine (C) or guanine (G)), a deoxyribose sugar (S), and a phosphate group (P). The DNA molecule forms a double helix with the nucleotides of each strand held together through phosphate molecules at the 3' and 5' carbons of the sugar and the antiparallel complementary strands held by hydrogen (H) bonds between the base pairs, A=T and C=G. The two chains twine coaxially in a right-handed screw, one up and the other down; the helix is 20 Å in diameter, and a complete turn of the screw is 34 Å, with each flat base pair separated from the next by 3.4 Å, a tenth of a revolution. During replication, the hydrogen bonds break; the complementary strands separate, and each acts as a template for the production of its complement.
3'    5'
|     |
P     P
|     |
S-A=T-S
|     |
P     P
|     |
S-C=G-S
|     |
P     P
|     |
5'    3'
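The helix dimensions in the DNA entry (3.4 ångströms between base pairs, ten base pairs per turn) allow a quick back-of-the-envelope calculation. The sketch below uses those two figures; the constant and function names are hypothetical, and real DNA deviates somewhat from these idealized values.

```python
RISE_PER_BASE_PAIR_ANGSTROM = 3.4  # axial distance between adjacent base pairs
BASE_PAIRS_PER_TURN = 10           # each pair is a tenth of a revolution
ANGSTROM_IN_METERS = 1e-10         # one ten-billionth of a meter


def helix_length_angstrom(n_base_pairs):
    """Idealized end-to-end length of a straight double helix."""
    return n_base_pairs * RISE_PER_BASE_PAIR_ANGSTROM


def helix_turns(n_base_pairs):
    """Number of complete and partial turns of the helix."""
    return n_base_pairs / BASE_PAIRS_PER_TURN


# A 1000 bp fragment: about 3400 ångströms long with about 100 turns.
length_of_1kb = helix_length_angstrom(1000)
turns_in_1kb = helix_turns(1000)
length_in_micrometers = length_of_1kb * ANGSTROM_IN_METERS * 1e6
```

On these assumptions a 1000 bp fragment is about a third of a micrometer long.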
DNase (deoxyribonuclease). An enzyme capable of cleaving DNA into small fragments.
DNA band. A DNA fragment or allele on a Southern-blot autorad. With reference to an identity profile, a band is a tandem repeat DNA sequence (allele) produced by cleaving a genome into fragments with a restriction enzyme having recognition sites flanking the allele and usually at millions of other genome locations.
DNA blot. A membrane (usually nylon) with covalently bound single-stranded DNA.
DNA fingerprinting. The use of restriction enzymes to measure the genetic variations among individuals.
DNA haplotype. A pattern of DNA polymorphisms.
DNA hybridization. The formation of a double-stranded nucleic acid molecule from two separate strands; also applies to a molecular technique that uses one nucleic acid strand to locate another.
DNA identification analysis. The characterization of one or more features of an individual’s genome by developing a DNA fragment band (allele) pattern. If a sufficient number of different-size bands are analyzed, the resultant bar-code profile will be unique for each individual except identical twins.
DNA looping. The formation of looped structures in DNA; sometimes permits the interaction of various regulatory elements.
DNA mismatch repair. A type of DNA repair in which nucleotide mismatches, i.e., violations of the A-T, C-G complementary base-pairing rule, are corrected by specialized repair enzymes.
DNA polymerase. An enzyme that catalyzes the linking of deoxyribonucleotide triphosphates using DNA as a template.
DNA probe. A piece of nucleic acid labeled with a radioactive isotope, dye or enzyme used to locate a particular nucleotide sequence or gene on a DNA molecule.
DNA profile. The alleles at each locus revealing the population variation in genome sequences; a series of DNA polymorphisms, usually VNTRs or microsatellites, typed in an individual. Because these polymorphisms are highly variable, the combined genotypes are useful in identifying individuals for forensic purposes.
DNA repair. A process in which mistakes in the DNA sequence are altered to recreate the original sequence.
DNA replication. The use of existing DNA as a template for the synthesis of new DNA strands.
DNA sequence. The order of DNA bases along a DNA molecule.
Depurination. The process of partial DNA hydrolysis by acid at purine (adenine and guanine) sites, resulting in the cleavage of large DNA fragments into smaller pieces. This process improves Southern transfer.
Derivative chromosome. A chromosome that is the result of a translocation of one part of a chromosome to another.
Dialysis. The process of separating different-size molecules in solution by means of their differential transfer across a porous membrane; commonly used to remove salt from solutions of macromolecules.
Dideoxy method. A technique for sequencing DNA in which dideoxynucleotides that terminate replication are incorporated into the replicated DNA strands.
Differentiation. The process of biochemical and structural changes by which cells become specialized in form and function.
Digested DNA. DNA cleaved by the action of restriction enzymes or other DNases.
Diploid. Having two copies of each chromosome. See haploid.
Discordant. When two individuals do not share the same trait.
Disomy. The presence in a cell of two chromosomes derived from a single parent and none from the other parent.
Dispersed repetitive DNA. A class of repeated DNA sequences in which single repeats are scattered throughout the genome.
Dizygotic. Twins produced from two separate zygotes (fraternal twins).
Domain. A discrete portion of a protein with its own function.
Dominant. Allele the effect of which is the same in single copy (heterozygotes) as in double copy (homozygotes).
Dominant negative. A type of mutation in which the altered protein product in a heterozygote forms a complex with the normal protein product produced by the homologous normal gene, thus disabling it.
Dot blot. A DNA analysis system in which sample DNA is directly pipetted onto a membrane, as opposed to the Southern-blot procedure of enzymatic digestion, electrophoresis and Southern transfer.
Double helix. A term often used to describe the “twisted ladder” shape that two linear strands of DNA assume when bonded together.
Double-stranded DNA breaks. A type of DNA breakage in which both strands are broken at a specific location.
Duplication. The presence of an extra copy of chromosome material.

Diploid: A full set of genetic material, consisting of paired chromosomes, with one chromosome from each parental set. Most animal cells, except the gametes, have a diploid set of chromosomes. (cf. Haploid)
Domestication: The evolution of plants or animals, either naturally or through artificial selection, to forms more useful to man, e.g. nonshattering seed.
Ex situ conservation:
• A conservation method that entails removing germplasm resources (seeds, pollen, sperm, individual organisms) from their original habitat or natural environment.
• Keeping components of biodiversity alive outside their original habitat or natural environment. (cf. in situ conservation)

E

Electrophoresis. The process of separating charged molecules, e.g., negatively charged DNA fragments in a porous medium such as agarose, by the application of an electric field. The DNA migrates through the medium at different rates according to length.
Electroporation. The use of an electric field to create reversible small holes in a cell wall or membrane through which foreign DNA can pass; this DNA can then integrate into the cell’s genome.
Embryonic stem cells. Cells that can give rise to any type of differentiated cell.
Endogamy. Reproduction by the fusion of gametes of similar ancestry.
Endonuclease. An enzyme that cleaves the phosphodiester bond within a nucleotide chain.
Enhancer. Regulatory DNA sequence that interacts with specific transcription factors to increase the transcription of genes.
Enzyme. A protein that catalyzes (speeds up) a specific chemical reaction without being changed or consumed in the process.
Ethidium bromide. A molecule that can intercalate into DNA double helices used to identify the presence of DNA in a sample by its fluorescence under ultraviolet light.
Eukaryote. An organism, single-celled or multicellular, whose cells have true membrane-bound nuclei containing chromosomes that undergo mitosis.
Euploid. Cells whose chromosome number is an exact multiple of the haploid number (in human beings, 23).
Exogenous DNA. DNA originating outside an organism.
Exon. Portion of a gene that encodes amino-acid sequences and is retained after the primary mRNA transcript is spliced. Exons consist of the code signals for (1) the initiation of RNA transcription and ribosomal accommodation, (2) termination of translation and addition of poly-A tail.
Exon trapping. Method for isolating exons in a fragment of genomic DNA by using an in vitro cell system to artificially splice out the introns.
Exonuclease. An enzyme that digests DNA strands starting at their termini.
Expanded repeat. A type of mutation in which a tandem dinucleotide, trinucleotide, tetranucleotide ... repeat increases in number, e.g., Huntington disease.
Expression. The manifestation of a gene.
Extra-chromosomal inheritance. Cytoplasmic inheritance of DNA via organelles such as mitochondria or plasmids in female gametes. The human egg, for example, transmits approximately 10kbp mitochondria.
Extra-nuclear DNA. DNA located in organelles such as mitochondria and plasmids; also referred to as cytoplasmic DNA and its inheritance as maternal or cytoplasmic since the organelles are transmitted only from the female via gamete cytoplasm.





F

False match. Two samples of DNA that have different profiles could be declared to match if, instead of measuring the distinct DNA in each sample, there is an error in handling or preparing samples such that the DNA from a single sample is analyzed twice or the alleles are too similar to be distinguished.
Founder Effect. Arises when a new and isolated environment is invaded by only a few members of a species, which then multiply rapidly, the result of which is that there is a sharp loss of genetic variation compared with the parent population. The new population then may be distinctively different, genetically and phenotypically. In extreme cases, founder effect may lead to the speciation and subsequent evolution of new species.
Frameshift mutation. An alteration of DNA in which a duplication or deletion occurs in a coding region that is not a multiple of three base pairs, thus altering the frame of readout.
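The "multiple of three" rule can be checked directly. In the sketch below the helper names `is_frameshift` and `codons` and the example sequence are hypothetical; the sequence is written as mRNA for readability.

```python
def is_frameshift(indel_length_bp):
    """True if an insertion or deletion of this length shifts the reading frame."""
    if indel_length_bp <= 0:
        raise ValueError("indel length must be a positive number of base pairs")
    return indel_length_bp % 3 != 0


def codons(sequence):
    """Split a sequence into whole codons, ignoring leftover bases."""
    usable_length = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable_length, 3)]


original = "AUGGGGCGUUUG"                 # AUG GGG CGU UUG
one_base_deleted = original[:3] + original[4:]    # frame shifts after AUG
three_bases_deleted = original[:3] + original[6:]  # one codon lost, frame kept
```

Deleting a single base changes every downstream codon, whereas deleting three bases removes one codon and leaves the rest of the frame intact.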
Fusion. The joinder of the membrane of two cells, thereby creating a daughter cell that contains the components and hence some of the same properties from each parent.
Fusion gene. A gene that results from a combination of two genes or parts of two genes.







G

Gamete. The haploid germ cell or reproductive cell, ova or sperm.
Gel. Semisolid porous matrix, usually agarose or acrylamide, used in electrophoresis to separate molecules based on its sieving properties.
Gel electrophoresis. The process of sorting DNA fragments by size by applying an electric current to a gel. The different-sized fragments move at different rates through the gel.
Gene. The fundamental unit of heredity; a set of nucleotide base pairs or sequence of DNA nucleotides on a chromosome containing the sequence encoding a specific function.
Gene amplification. The increase within a cell of the number of copies of a given gene.
Gene expression. The process through which a gene’s encoded information is converted into cell-operating structures. Expressed genes include both those that are transcribed into mRNA and then protein and those that are not translated into protein, e.g., transfer and ribosomal RNA.
Gene family. A group of genes that are similar in DNA sequence and have evolved from a single common ancestral gene; may or may not be located in the same chromosome region.
Gene flow. The exchange of genes between different organisms usually through generational inheritance.
Gene frequency. The relative frequency of an occurrence of a particular allele in a population.
Gene mapping. Determination of the relative positions of genes on a DNA molecule (chromosome or plasmid) and of the distance in linkage units or physical units between them.
Gene product. The biochemical material resulting from the expression of a gene used to measure how active a gene is.
Gene sequencing. The determination of the sequence of nucleotide bases in a strand of DNA.
Gene therapy. The replacement of a defective gene in an organism suffering from a genetic disease.
Genetic code. The set of 64 mRNA codons, 61 of which specify the 20 amino acids and 3 of which signal the end of translation.
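The figure of 64 follows from four possible bases at each of three codon positions. The sketch below enumerates them; the variable names are hypothetical, and the three stop codons listed are those of the standard genetic code.

```python
from itertools import product

RNA_BASES = "UCAG"

# Every ordered triplet of the four RNA bases: 4 * 4 * 4 codons.
all_codons = ["".join(triplet) for triplet in product(RNA_BASES, repeat=3)]

# Stop codons of the standard genetic code; they specify no amino acid.
STOP_CODONS = {"UAA", "UAG", "UGA"}
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]
```

Because 61 sense codons are shared among 20 amino acids, most amino acids are specified by more than one codon.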
Genetic drift. An evolutionary process in which gene frequencies change as a result of random fluctuations in the transmission of genes from one generation to the next. Drift is greater in smaller populations.
Genetic mapping. The ordering of genes on chromosomes according to recombination frequency.
Genetics. The study of the patterns, processes and mechanisms of inheritance of biological characteristics.
Genome. All of the genetic materials in the chromosomes of a particular organism. Its haploid size is generally given as the total number of base pairs.
Genome scan. A gene-mapping approach in which many markers from the human genome are tested for linkage with a disease phenotype.
Genomic sequence. The order of bases that constitute a particular fragment of DNA in a genome.
Genomics. The study of genes and their functions.
Genotype. The genetic makeup of an organism; the particular forms (alleles) of a set of genes possessed by an organism; an individual’s allelic constitution at a locus.
Genotype frequency. The proportion of individuals in a population that carries a specific genotype.
Genotype, single-locus. The alleles an organism possesses at a particular site in its genome.
Genotype, multi-locus. The alleles an organism possesses at several sites in its genome.
Germ cell. Sex cell, the haploid ova or sperm.
Germ-line. Cells responsible for the production of gametes.
Guanine (G). A purine base; one of the four molecules containing nitrogen present in DNA and RNA. Designated by the letter G, it binds to C.

Gene: The basic physical and functional unit of heredity, which passes information from one generation to the next. It is a segment of DNA that includes a transcribed section and a regulatory element, which allows its transcription.
Gene flow: The exchange of genetic material between populations. This may be used in the sense of plant reproduction (i.e. due to the dispersal of gametes and zygotes) or due to human influences, such as the introduction of new crop varieties by farmers.
Genetic distance: The degree of relatedness between subgroups or populations as measured by various statistics.
Genetic diversity: Variation in the genetic composition of individuals within or among species; the heritable genetic variation within and among populations.
Genetic drift: The unpredictable changes in allele frequency that occur in small populations.
Genetic erosion: Loss of genetic diversity between and within populations of the same species over time, or reduction of the genetic base of a species.
Genetic marker: An allele, a band in a gel or trait that serves experimentally as a probe to identify an individual or one of its characteristics.
Genome: The entire complement of genetic material in an organism.

H

Hae III. A particular restriction enzyme, derived from Haemophilus aegyptius.
Haploid. A single set of chromosomes present in the sperm and egg cells, 23 being the haploid number in a human being. When a sperm cell fertilizes an egg cell, the number of chromosomes doubles (the diploid number).
Haplotype. From haploid genotype, the allelic constitution of multiple loci on a single chromosome.
Hardy-Weinberg Principle. Specifies equilibrium relationship between gene frequencies and genotype frequencies in a population, e.g., in a large random intrabreeding population not subjected to excessive selection or mutation the gene and genotype frequencies will remain constant over time; a condition in which the allele frequencies within a large random intrabreeding population are unrelated to patterns of mating. In this condition, the occurrence of alleles from each parent will be independent and have a joint frequency estimated by the product rule.
Helper cells. See packaging cells.
Hemizygous. A gene present in only a single copy, most commonly referring to genes on the single male X chromosome but can refer to other genes in the haploid state.
Heredity. The transmission of characteristics from one generation to the next.
Heritability. The proportion of population variance in a trait that can be ascribed to genetic factors.
Heterogeneity, allelic. Describes conditions in which different alleles at a locus can produce variable expression of a disease. Depending on phenotype definition, allelic heterogeneity may cause two distinct diseases as in Duchenne and Becker muscular dystrophy.
Heterogeneity, locus. Describes diseases in which mutations at distinct loci can produce the same disease phenotype, e.g., retinitis pigmentosa, osteogenesis imperfecta.
Heterologous. Refers to segments of DNA derived from different sources.
Heteroplasmy. The condition in which some copies of mitochondrial DNA in the same individual have different base pairs at certain points.
Heterozygosity. The presence of different alleles at a locus or loci on homologous chromosomes.
Heterozygote. An individual who has two different alleles at a locus.
High-resolution banding. Chromosome banding using prophase or prometaphase chromosomes which are more extended than metaphase chromosomes and thus yield more bands and greater resolution.
Histone. The protein core around which DNA is wound in a chromosome.
Homologies. Similarities in DNA or protein sequences between molecules.
Homologous. DNA or amino-acid sequences that are recognizably similar to one another; the chromosome pairs found in diploid organisms. The human being has 22 homologous pairs of autosomes plus two sex chromosomes per nucleus. The members of each pair have an identical sequence of genes; however, the alleles at corresponding loci may be identical (homozygous) or different (heterozygous).
Homologs. Chromosomes that are homologous.
Homozygote. An individual in whom the two alleles at a locus are the same.
Hormone. A chemical or protein that acts as a messenger or stimulatory signal, relaying instructions to modulate certain physiological activities.
Housekeeping genes. Genes whose protein products are required for general cellular maintenance or metabolism and therefore expressed in all cells.
Human artificial chromosome. A synthetic chromosome consisting of a centromere and telomeres and an insert of human DNA that can be 5-10 Mb in size.
Hybridization. The pairing of complementary strands of DNA, or DNA and RNA, by matching at base-pair sites. For example, a primer with the sequence AGGTCT would bond with the complementary sequence TCCAGA on a DNA fragment.
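The primer example in this entry can be reproduced with a simple base-by-base lookup. The names `DNA_COMPLEMENT`, `complement` and `reverse_complement` are hypothetical. Note that the entry writes the partner sequence in the same left-to-right order as the primer; because the strands are antiparallel, the same partner read in its own 5'-to-3' direction is the reverse complement.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(sequence):
    """Base-by-base complement, kept in the same left-to-right order."""
    return "".join(DNA_COMPLEMENT[base] for base in sequence)


def reverse_complement(sequence):
    """Complementary strand written in its own 5'-to-3' direction."""
    return complement(sequence)[::-1]


primer = "AGGTCT"
paired_bases = complement(primer)             # "TCCAGA", as in the entry
target_5_to_3 = reverse_complement(primer)    # "AGACCT"
```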
Hydrogen bond. A relatively weak bond between a hydrogen (H) atom, covalently bound to a nitrogen (N) or oxygen (O) atom and another atom. These bonds can be broken by increasing temperature.
Hypervariable region. A segment of a chromosome characterized by considerable variation in the number of alleles at a locus or loci.

Haploid: A single set of chromosomes (half the full set of genetic material), present in each egg and sperm cell of animals and in each egg and pollen cell of plants (Gk. haploos, single). (cf. Diploid)
Haplotype: A specific allelic constitution at a number of loci within a defined linkage block.
Hardy-Weinberg equilibrium: The stable frequency distribution of genotypes, AA, Aa, and aa, in the proportions p², 2pq, and q², respectively (where p and q are the frequencies of the alleles, A and a), that is a consequence of random mating in the absence of mutation, migration, natural selection or random drift.
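The proportions p², 2pq and q² are easy to compute for a given allele frequency. In the sketch below the function name `hardy_weinberg` and the value p = 0.7 are hypothetical, and a locus with exactly two alleles is assumed.

```python
def hardy_weinberg(p):
    """Expected genotype proportions for allele A at frequency p.

    Assumes two alleles (A and a), so q = 1 - p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Hypothetical allele frequency: p = 0.7, hence q = 0.3.
expected = hardy_weinberg(0.7)
```

With p = 0.7 the expected proportions are about 0.49 AA, 0.42 Aa and 0.09 aa, which sum to 1.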
Heterozygote: A diploid individual that has different alleles at one or more genetic loci (Gk. heteros, other). (cf. Homozygote)
Homozygote: A diploid individual that has identical alleles at one or more genetic loci (Gk. homos, same). (cf. Heterozygote)

I

Imprinting, genomic. Process in which genetic material is expressed differently when inherited from the mother than when inherited from the father, e.g., Angelman’s syndrome (mother) versus Prader-Willi syndrome (father).
In situ hybridization. Molecular gene-mapping technique in which labeled probes are hybridized to stained metaphase chromosomes and then imaged to reveal the position of the probe.
In vitro. Means “in glass” and refers to a biological process performed outside a living organism, e.g., in the laboratory.
In vivo. Refers to a biological process within a living organism.
Independent assortment. One of Mendel’s fundamental principles: Alleles at different loci are transmitted independently of one another, although this has proven to be true only if they are sufficiently distant from one another.
Inducer. A molecule or substance that increases the rate of expression of a specific gene.
Insertion. Addition of one or more nucleotides into a DNA strand. This may result in a gene mutation.
Intergenic. Nucleotide sequences located between genes.
Interphase. Portion of the cell cycle that alternates with meiosis or mitosis. DNA is replicated and repaired during this phase.
Intron. A sequence of DNA between two exons within a gene that is transcribed into the primary RNA transcript but excised and degraded before the mature mRNA is translated, and that does not code for a protein. A number of introns of variable length may separate the exons of a gene.
Inversion. A structural rearrangement of a chromosome in which two breaks occur followed by the reinsertion of the chromosome segment but in reversed order. Paracentric does not include the centromere. Pericentric includes the centromere.
Isochromosome. A structural chromosome rearrangement caused by the division of a chromosome along an axis perpendicular to the usual axis of division resulting in chromosomes with either two short arms or two long arms.

Inbreeding: The mating of genetically related individuals (relatives). Breeding through a succession of parents belonging to the same stock. Also called endogamy or self-breeding. (cf. Outbreeding)
In situ conservation: A conservation method that attempts to preserve the genetic integrity of gene resources by conserving them within the evolutionary dynamic ecosystems of the original habitat or natural environment. (cf. ex situ conservation)
Isozyme: Multiple forms of an enzyme whose synthesis is controlled by more than one gene.


J



K

Karyotype. A display of chromosomes ordered according to length and banding pattern.
Kilobase (kb). Unit of length for DNA fragments equal to 1000 nucleotides or base pairs.

L

Lamarckism. A (discredited) theory that adaptations to the environment will cause heritable changes.
Ligase. An enzyme used to join DNA or RNA segments.
Linkage. The proximity of two or more markers on a chromosome; the closer the markers, the lower the probability that they will be separated by crossing-over during meiosis and the greater the probability that they will be inherited together.
Linkage disequilibrium. A specific allele of one locus being associated or linked to a specific allele or marker of another locus on the same chromosome with a greater frequency than expected by chance.
Linkage equilibrium. A condition in which the occurrence of alleles at different loci is independent.
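As a rough illustration of the two entries above, departure from linkage equilibrium is commonly summarized by the coefficient D = p(AB) - p(A)p(B), which is zero when alleles at the two loci occur independently. The sketch below assumes hypothetical haplotype counts; the function name and the sample data are illustrative only.

```python
from collections import Counter

def linkage_disequilibrium(haplotypes, allele_a="A", allele_b="B"):
    """Return D = p(AB) - p(A) * p(B) for a list of two-locus haplotypes.

    Each haplotype is an (allele_at_locus_1, allele_at_locus_2) tuple.
    D close to 0 corresponds to linkage equilibrium.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_ab = counts[(allele_a, allele_b)] / n
    p_a = sum(c for (a, _), c in counts.items() if a == allele_a) / n
    p_b = sum(c for (_, b), c in counts.items() if b == allele_b) / n
    return p_ab - p_a * p_b

# Hypothetical sample: alleles A/a at locus 1 and B/b at locus 2.
sample = ([("A", "B")] * 40 + [("A", "b")] * 10 +
          [("a", "B")] * 10 + [("a", "b")] * 40)
print(linkage_disequilibrium(sample))
```

A positive value means the A and B alleles are found together more often than chance alone would predict.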
Linkage map. A map of the relative positions of loci on a chromosome determined on the basis of how often the loci are inherited together and measured in cM.
Localize. Determination of the original locus of a gene or other chromosome marker on a chromosome.
Locus (plural: loci). The specific physical location of a gene or other chromosome marker on a chromosome; the DNA at that position.
Long interspersed elements (LINEs). A class of dispersed repetitive DNA in which each repeat is relatively long, up to 7 kb.
Long-range restriction map. Depiction of the positions on a chromosome of restriction enzyme cutting sites used as markers of specific areas along the chromosome; details the positions on the DNA molecule that are cut by particular restriction enzymes.

Locus (pl. loci): The specific place on a chromosome where a gene or particular piece of DNA is located.

M

Macrorestriction map. Map depicting the order of and distance between the sites at which restriction enzymes cleave chromosomes.
Major gene. A single locus responsible for a trait.
Manifesting heterozygote. An individual who is heterozygous for a recessive trait but displays the trait. Most commonly used to describe females heterozygous for an X-linked trait who display the trait due to X inactivation.
Marker. A DNA sequence or gene of known location on a chromosome and phenotype that is used as a point of reference in mapping other loci.
Mass spectroscopy. The separation of molecules according to their molecular mass. In the version for analyzing DNA, small quantities of PCR-amplified fragments are irradiated with a laser to form gaseous ions that in an electric field traverse a fixed distance. Heavier ions have longer times of flight, and the process is known as “matrix-assisted laser desorption-ionization time-of-flight mass spectroscopy” or MALDI-TOF-MS.
Match. The presence of the same allele or alleles in two samples. Two DNA profiles are declared to match when they are indistinguishable in genetic type. For loci with discrete alleles, two samples match when they display the same set of alleles. For RFLP testing of VNTRs, two samples match when the pattern of the bands is similar and the positions of the corresponding bands at each locus fall within a preset distance (the match window).
Megabase (Mb). Unit of length for DNA equal to one million nucleotides or base pairs; in the human genome, on average, roughly equal to one cM.
Meiosis. Cell-division process in which haploid gametes are formed from diploid germ cells.
Meiotic failure. Aberrant meiosis in which a diploid gamete is produced rather than the normal haploid gamete.
Melt. The process of disrupting the hydrogen bonds linking complementary DNA strands producing the two single strands.
Mendelian. Referring to Gregor Mendel, describes a trait that is attributable to a single gene, assorting according to Mendel’s laws: Segregation: during meiosis only one member of each homologous chromosome pair is transferred to a specific gamete; independent assortment: during meiosis the members of the different homologous chromosome pairs sort independently when transferred to a specific gamete, for example, AA’ and BB’ homologous chromosome pairs could give rise to AB, AB’, A’B or A’B’ possible gametes in a ratio of 1:1:1:1.
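The independent-assortment example in the Mendelian entry can be sketched in a few lines of Python. This is a minimal illustration, assuming each homologous pair is written as a tuple of two labels; the function name is hypothetical.

```python
from itertools import product
from collections import Counter

def gametes(*chromosome_pairs):
    """List every gamete formed by taking one member from each homologous pair."""
    return ["".join(combo) for combo in product(*chromosome_pairs)]

# The AA' and BB' pairs from the entry above.
result = gametes(("A", "A'"), ("B", "B'"))
print(result)
print(Counter(result))  # each combination occurs once, i.e. equal proportions
```

Each combination appears exactly once in the list, which corresponds to the 1:1:1:1 ratio when the pairs assort independently.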
Messenger RNA (mRNA). RNA molecule formed from the transcription of DNA. Prior to intron splicing, mRNA is termed a primary transcript; after splicing, the mature transcript proceeds to the cytoplasm where it is translated into an amino acid sequence.
Metabolism. All of the biochemical activities that are carried out by an organism to maintain life.
Metacentric. A chromosome in which the centromere is located approximately in the middle, so that the two arms are of roughly equal length.
Metaphase. A stage of mitosis and meiosis in which chromosomes (in meiosis I, pairs of homologous chromosomes) are arranged along the equatorial plane or metaphase plate of the cell; the stage at which chromosomes are maximally condensed and most easily visualized.
Methylation (me). The addition of a methyl group to DNA. The most common form in mammals is the conversion of cytosine to 5-methylcytosine. Methylation can prevent cleavage of DNA at a restriction site, e.g., HpaII cleaves at CCGG but not at CmeCGG. Methylation of DNA alters its transcriptional potential.
Microdeletion. A chromosome deletion too small to be visible under a microscope, e.g., Prader-Willi syndrome, DiGeorge syndrome.
Microsatellite. An STR; a type of satellite DNA that consists of small repeat units (usually 2-5 bp) occurring in tandem.
Microsatellite repeat polymorphism. A type of genetic variation in populations consisting of differing numbers of microsatellite repeat units at a locus.
Minisatellites. A VNTR; a type of satellite DNA that consists of tandem repeat units that are each 20-70 bp in length. A variation in the number of minisatellite repeats is the basis of VNTR polymorphisms.
Mismatch. Also known as mispairing. Bases that do not match in “complementary” DNA strands. Depending on the blot wash stringency conditions, some mismatches can be tolerated between hybridized sample and probe DNA complementary regions. For example:G-ACCTTTG-TGGAAACT
Missense mutation. A codon change resulting in a different amino acid in a protein.
Mitochondrion (plural: mitochondria). A DNA-containing cytoplasmic structure (organelle) of nucleated (eukaryotic) cells that is the site of the energy-producing reactions within the cell, the sites of ATP production. Mitochondria contain their own DNA (mtDNA) inherited only from the mother.
Mitosis. The process whereby a somatic cell nucleus after chromosomal replication divides to form two identical nuclei.
Mobile elements. DNA sequences capable of inserting themselves into other locations in the genome; transposons.
Modifier gene. A gene that alters the expression or activity of a gene at another locus.
Monoclonal. A group of cells consisting of a single clone, i.e., all cells are derived from the same single ancestral cell.
Monogenic. Describing a single-gene or Mendelian trait.
Monomorphic. A gene or DNA characteristic almost always found in only one form in a population.
Monomorphic bands. Each different-size monomorphic fragment is detected by cleaving genomic DNA with a specific restriction enzyme and hybridizing with a specific monomorphic probe. These fragments provide markers for use in quality control, especially as related to band shifts.
Monosomy. An aneuploid condition in which a specific chromosome is present in only a single copy, giving a human being a total of 45 chromosomes.
Monozygotic. Twins produced from a single zygote which later splits; the genomes are therefore identical.
Morphogenesis. The developmental process or formation of a cell, organ or organism.
Mosaic. Two or more genetically different cell lines in a single individual.
Multi-allele. Refers to a number of different possible alleles at a specific locus.
Multi-factorial trait or disease. A trait or disease resulting from the interaction of multiple genetic and environmental factors.
Multi-locus. Refers to a number of different loci or positions in the genome.
Multi-locus probe. A (rarely used) probe that marks multiple sites (loci). RFLP analysis using a multi-locus probe will yield an autorad showing a pattern of 30 or more bands.
Multi-plexing. Typing several loci simultaneously, thereby increasing the speed of analysis.
Multi-point mapping. A type of genetic mapping in which the recombination frequencies among three or more loci are estimated simultaneously.
Mutation. Any heritable change in DNA sequence.

Marker: An identifiable physical location on a chromosome whose inheritance
can be monitored (e.g. gene, restriction enzyme site or RFLP marker).
Mating system: The pattern of mating between individuals of a population,
including such factors as extent of inbreeding, pair-bonding, and number of
simultaneous mates. The mating system is of major importance in determining
both the genetic structure and evolutionary potential of natural populations.
Microsatellite DNA: A type of repetitive DNA based on very short repeats such
as dinucleotides, trinucleotides or tetranucleotides. Also called simple sequence
repeats (SSRs).
Migration: Movement of individuals between otherwise reproductively isolated
populations.
Multiple alleles: The existence of several known alleles of a gene.
Mutation: The term to describe an abrupt change of phenotype that is inherited.
Any permanent and heritable change in DNA sequence. Types of mutations
include point mutations, deletions, insertions, and changes in number and
structure of chromosomes.

N

Natural selection. An evolutionary process in which individuals with favorable genotypes produce relatively greater numbers of surviving offspring that compete well.
Neutral mutation. Any change in the sequence of genomic DNA that has no detectable effect on an individual organism.
Noncoding DNA. DNA that does not encode a product or protein.
Nonsense mutation. A type of mutation in which an mRNA stop codon is produced resulting in the premature termination of translation.
Northern blotting. A gene expression assay in which mRNA on a blot is hybridized with a labeled probe.
Nuclease. An enzyme that breaks nucleic acids into their constituent nucleotides by cleaving the chemical bonds.
Nucleic acid. A nucleotide polymer of which DNA and RNA are the major types. Nucleic acid has three constituents: a sugar, ribose, built from a pentagonal ring of carbon atoms; a phosphate or phosphorus atom surrounded by four oxygen atoms; and one of the five bases – guanine, adenine, cytosine or thymine/uracil. In DNA, as opposed to RNA, the ribose lacks a fringe oxygen atom – thus de-oxy-ribose nucleic acid. The phosphates link and space the sugars, the third carbon of one sugar ring to the fifth carbon of the ring beyond, etcetera, and the base is linked to the sugar.
Nucleoside. A unit of nucleic acid composed of ribose or deoxyribose and a purine or pyrimidine base.
Nucleosome. A structural unit of chromatin in which 140-150 bp of DNA are wrapped around a core unit of eight histone molecules.
Nucleotide. A unit of nucleic acid composed of phosphate, ribose or deoxyribose, and a purine or pyrimidine base.
Nucleotide excision repair. A type of DNA repair in which altered groups of nucleotides are removed and replaced with properly pairing nucleotides.
Nucleus. The cellular organelle in eukaryotes that contains the genetic material.

O

Obligate carrier. An individual who necessarily possesses a disease-causing gene but may or may not be affected with the disease phenotype.
Oligonucleotide. A DNA sequence of fewer than 100 nucleotides often used as a primer or a probe in PCR.
Oncogene. A gene that can transform cells into a highly proliferative state causing cancer.
Oogenesis. The process in which ova are produced.
Operator. The region of a chromosome, adjacent to the operon, where a repressor or inducer protein binds to modulate transcription of the operon.
Operon. A sequence of genes responsible for synthesizing a set of enzymes needed for a pathway of biosynthesis of a molecule. An operon is controlled by an operator.

Outbreeding: An allogamous mating system in which mating is between
individuals that are less closely related than are average pairs chosen from the
population at random. Also called exogamy or cross-breeding.
(cf. Inbreeding)
Outcrossing: see Allogamy.


P

pH. A measure of the acidity or alkalinity of a solution.
Packaging cells and helper cells. Packaging cells are those in which replication-deficient viruses are placed so that the replication machinery of the cells can encapsulate foreign genes. Helper cells provide missing functions of defective viruses so that the cells can make viral copies.
Palindrome. A DNA sequence whose complementary sequence reads the same when read backwards, e.g.,
5' AATGCGCATT 3'
3' TTACGCGTAA 5'
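One way to test whether a sequence is a palindrome in this sense is to compare it with its reverse complement. The sketch below is illustrative only; the helper names are hypothetical and only the four unambiguous DNA bases are handled.

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand read in the opposite direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindrome(seq):
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)

print(is_palindrome("AATGCGCATT"))  # the example from the entry above
print(is_palindrome("AATG"))
```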
Panmixia. Describes a population in which individuals mate at random with respect to a specific genotype.
Pathogen. A disease-causing agent such as a virus or bacterium.
Pattern formation. The spatial arrangement of differentiated cells to form tissues and organs during embryonic development.
Penetrance. In a population, the proportion of individuals possessing a disease-causing genotype who express the disease phenotype. When this proportion is less than 100%, the disease genotype is said to have reduced or incomplete penetrance.
Peptide. Two or more amino acids joined by a peptide bond.
Pharmacogenetics. The study of the hereditary basis for differences in a population’s response to a drug. Pharmacogenomics describes the science of using genomic technology to identify drug targets and study genetic variations that have an impact on the safety and efficacy of drugs.
Phenocopy. A phenotype that resembles the phenotype produced by a specific gene but is due instead to a different, typically non-genetic, factor.
Phenotype. The physical make-up of an individual as defined by genetic and non-genetic factors.
Physical mapping. The determination of physical distances between genes using cytogenetic and molecular techniques; the mapping of the locations as separated by base pairs of identifiable DNA landmarks, e.g., genes, restriction enzyme cutting sites, regardless of inheritance. For the human genome, the lowest-resolution physical map is the banding patterns on the different chromosomes.
Phylogeny: Evolutionary history of a species. A diagram illustrating the deduced evolutionary history of populations of related organisms.

Plasmid. An extra-chromosomal circular double-stranded DNA element native to certain bacteria, capable of replication. Plasmids are used as vehicles to replicate cloned (recombinant) DNA sequences.
Pleiotropy. Describes genes that have multiple phenotypic effects, e.g., Marfan syndrome, cystic fibrosis.
Pluripotent cells. Having the capacity to become any kind of cell or tissue in the body. Embryonic stem cells and cells of the inner cell mass are pluripotent.
Polarity. Direction, e.g., definition of anterior versus posterior in axis specification or 3' versus 5' of nucleic acids.
Poly-A tail. The adenine (A) nucleotide polymer often attached to the 3' end of primary-mRNA.
Polyclonal. Derived from different types of cells.
Polygenic. Describes a trait caused by the combined additive effects of alleles of multiple genes, e.g., certain types of heart disease, some cancers, diabetes. Although a polygenic disorder is inherited, it depends on the simultaneous presence of several alleles and therefore the hereditary patterns usually are more complex than those of single-gene disorders.
Polymerase. The general term for enzymes that carry out the synthesis of nucleic acids.
Polymerase Chain Reaction (PCR). A technique for amplifying a large number of copies of a specific DNA sequence flanked by two oligonucleotide primers. The DNA is alternately heated and cooled in the presence of DNA polymerase and free nucleotides so that the specified DNA segment is denatured, hybridized with primers and extended by DNA polymerase.
Polymerase, DNA or RNA. Enzymes that catalyze the synthesis of nucleic acids on preexisting nucleic acid templates, assembling RNA from ribonucleotides or DNA from deoxyribonucleotides.
Polymorphism. A difference in DNA sequence among individuals. A locus in which two or more alleles have gene frequencies greater than 0.01 in a population is considered a useful polymorphism for genetic linkage analysis. When this criterion is not fulfilled, the locus is considered to be monomorphic.
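The 0.01 frequency criterion in the entry above can be checked directly from a list of observed alleles. This is a minimal sketch with hypothetical allele labels and counts; the function names are illustrative.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Map each allele to its proportion among all observed alleles."""
    total = len(alleles)
    return {allele: n / total for allele, n in Counter(alleles).items()}

def is_polymorphic(alleles, threshold=0.01):
    """True when at least two alleles each exceed the frequency threshold."""
    freqs = allele_frequencies(alleles)
    return sum(f > threshold for f in freqs.values()) >= 2

# Hypothetical sample of 1000 chromosomes typed at one locus.
observed = ["A1"] * 950 + ["A2"] * 50
print(allele_frequencies(observed))
print(is_polymorphic(observed))
```

A locus that fails this test would be treated as monomorphic under the definition given in the entry.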
Polypeptide. A series of amino acids linked together by peptide bonds.
Polyploidy. Chromosome abnormality in which the number of chromosomes in a human cell is a multiple of 23 greater than two, the diploid number, e.g., triploidy in which the individual has three copies of each chromosome.
Positional cloning. A technique to identify genes based on their location on a chromosome.
Post-translational modification. Various types of additions and alterations of a polypeptide that take place after mRNA is translated into a polypeptide.
Prehybridization. The process of incubating a DNA blot with a hybridization solution containing complex DNA to in part block cross-hybridizing sites. This precedes the addition of the labeled probe.
Primary transcript. The pre-mRNA molecule directly after its transcription from DNA. A mature mRNA transcript is formed from the primary transcript when the introns are spliced out and poly-A is added.
Primer. An oligonucleotide that flanks either side of a DNA fragment and provides a point for complementary nucleotides to attach and replicate the DNA strand in the PCR process.
Primer extension. Part of the PCR process in which DNA polymerase extends the DNA sequence beginning at an oligonucleotide primer.
Probe. Single-stranded DNA or RNA of a specific base sequence, labeled either radioactively (usually used for RFLP analysis) or biochemically (usually used for PCR-based analysis) that is used to detect the complementary base sequence by hybridization.
Prokaryote. A unicellular organism lacking a membrane-bound nucleus, e.g., bacteria.
Promoter. A DNA site to which RNA polymerase will bind and initiate transcription.
Protein. A large molecule composed of one or more chains of amino acids in a specific order, the order determined by the base sequence of nucleotides in the gene coding for the protein. Proteins are required for the structure, function and regulation of the body cells, tissues and organs, and each protein has unique functions, e.g., hormones, enzymes and antibodies.
Protein electrophoresis. A technique in which amino acid variations are identified on the basis of charge differences that cause differential mobility of polypeptides in an electrical field.
Proteomics. Each cell produces thousands of proteins, each of which has a specific function. The collection of proteins in a cell is the proteome, and, unlike the genome, which is constant irrespective of cell type, the proteome varies from one cell type to the next. The science of proteomics seeks to identify the protein profile of each cell type, assess protein differences and uncover not only each protein’s specific function but how each protein interacts with other proteins.
Pseudogene. A gene that is highly similar in sequence to another gene or genes but has been rendered transcriptionally or translationally inactive by mutations.
Pulsed-field gel electrophoresis. A type of electrophoresis suitable for large DNA fragments. The fragment is moved through a gel by alternating pulses of electricity across fields that are 90 degrees in orientation from one another.
Punnett Square. A grid developed by English geneticist Reginald Crundall Punnett (1875-1967) specifying the genotypes that can arise from the gametes contributed by a mating pair of individuals, i.e., a table that shows the possible genetic outcomes of a mating.
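A Punnett square for a single gene can be generated as a small grid. The sketch below assumes each gamete is written as a one-letter allele (uppercase for the dominant allele, by convention); the function name is hypothetical.

```python
def punnett_square(parent1_gametes, parent2_gametes):
    """Return a grid (list of rows) of offspring genotypes.

    Rows follow the first parent's gametes and columns the second's.
    Sorting the letters places the uppercase allele first, so "aA"
    and "Aa" are written the same way.
    """
    return [
        ["".join(sorted(g1 + g2)) for g2 in parent2_gametes]
        for g1 in parent1_gametes
    ]

# Cross between two heterozygotes (Aa x Aa).
for row in punnett_square(["A", "a"], ["A", "a"]):
    print(" ".join(row))
```

For the heterozygote cross shown, the four cells contain AA, Aa, Aa and aa.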
Purine. The two DNA and RNA bases, adenine and guanine, that consist of two carbon-nitrogen rings.
Pyrimidine. The bases cytosine and thymine in DNA and cytosine and uracil in RNA that consist of single carbon-nitrogen rings.

Polymorphism: The appearance of different forms associated with various
alleles of one gene or homologues of one chromosome.
Population: In genetics, a group of individuals who share a common gene pool
and have the potential to interbreed.
Population genetics: The quantitative study and measurement of populations in
statistical terms, e.g. the study of genetic phenomena in terms of standard
statistical parameters such as frequency tables and distributions, means, variance
and standard deviations.
Purine: A nitrogen-containing, double-ring, basic compound that occurs in nucleic
acids. The purines in DNA and RNA are adenine and guanine.
Pyrimidine: A nitrogen-containing, single-ring, basic compound that occurs in nucleic acids. The pyrimidines in DNA are cytosine and thymine. The pyrimidines in RNA are cytosine and uracil.

Q

R

Random match. A match in the DNA profiles of two samples of DNA when one is drawn at random from the population.
Random-match probability. The chance of a random match. As often used, this refers to the probability of a true match when DNA being compared to evidentiary DNA comes from a person drawn at random from the population. A random-true-match probability reveals the probability of a true match when samples of DNA come from unrelated persons.
Random mating. Like panmixia, the members of a population are said to mate randomly with respect to particular genes when the choice of mates is independent of the alleles.
Receptor. A cell-surface component that binds to extra-cellular molecules.
Recessive. A trait phenotypically expressed only when an allele is present in the homozygous state. The recessive allele is masked by a dominant allele when the two occur together in a heterozygote.
Recombinant DNA. DNA formed by the union of two heterologous DNA molecules, e.g., the ligation of a human growth hormone gene into a plasmid.
Recombinant DNA technology. Procedure used to join DNA sequences in an in vitro cell-free system. Under appropriate conditions, a recombinant DNA molecule can enter a cell and replicate there either autonomously or after it has become integrated into a cellular chromosome.
Recombination. Combinations of genes in offspring that differ from those in the parents, arising from independent assortment and crossing-over.
Recombination frequency. The proportion of meioses in which recombinants between two loci are observed; used to estimate genetic distances between loci.
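The recombination frequency is a simple proportion. The sketch below uses hypothetical counts and a hypothetical function name; the conversion to cM relies on the usual convention that, for closely spaced loci, 1% recombination corresponds to about 1 cM, an approximation that breaks down over larger distances.

```python
def recombination_frequency(recombinants, total_meioses):
    """Proportion of scored meioses that show a recombinant between two loci."""
    if total_meioses <= 0:
        raise ValueError("total_meioses must be positive")
    return recombinants / total_meioses

# Hypothetical data: 12 recombinants among 100 informative meioses.
rf = recombination_frequency(12, 100)
approx_cm = rf * 100  # rough map distance, valid only for small rf
print(rf, approx_cm)
```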
Reduction division. The first stage of meiosis (meiosis I) in which the chromosome number is reduced from diploid to haploid.
Redundancy, genetic. The existence of alternate genetic mechanisms or pathways that can compensate when another mechanism or pathway is disabled.
Regulatory gene. A gene that acts to control the expression of other genes.
Regulatory sequence. A DNA base sequence that controls gene expression.
Repetitive DNA. DNA sequences that are found in multiple copies in the genome, dispersed or repeated in tandem.
Repetitive sequence. A repeated series of bases in a DNA molecule.
Replication. The process in which the double-stranded DNA molecule is duplicated.
Replication bubble. Replication structures that occur in multiple locations on a chromosome allowing replication to proceed more rapidly.
Replication origin. The point at which replication begins on a DNA strand. In eukaryotes, each chromosome has numerous replication origins.
Replicon. A segment of DNA that can replicate independently.
Repressor. A protein that binds to an operator adjacent to a structural gene, inhibiting transcription of that gene.
Restriction digest. Process in which DNA is exposed to a restriction enzyme causing it to be cleaved into restriction fragments.
Restriction enzyme. An enzyme that breaks DNA in highly sequence-specific locations.
Restriction enzyme cutting site. A specific nucleotide sequence of DNA at which a particular restriction enzyme cuts the DNA.
Restriction enzyme, endonuclease. A protein that recognizes specific, short nucleotide sequences and cuts DNA uniquely at those sites.
Restriction fragment. A fragment of DNA resulting from digestion with a restriction enzyme.
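The restriction entries above can be illustrated with a toy in-silico digest. This is only a sketch: it models a single strand of linear DNA, the function name and test sequence are hypothetical, and the example uses EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(dna, site, cut_offset):
    """Cut dna at every occurrence of site and return the fragments in order.

    cut_offset is the position within the recognition site at which
    the strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # remainder after the last cut
    return fragments

print(digest("TTGAATTCAAGGGAATTCCC", "GAATTC", 1))
```

If a mutation destroys one of the two sites, the digest yields fewer and longer fragments, which is the kind of length difference that RFLP analysis detects.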
Restriction Fragment Length Polymorphism (RFLP). Variations in DNA sequence in populations detected by digesting DNA with a restriction endonuclease, electrophoresing the resulting restriction fragments, transferring the fragments to a solid medium (blot) and hybridizing the DNA on the blot with a labeled probe. The polymorphism arises from the loss or creation of a restriction enzyme site.
Restriction site. A DNA sequence marking the location at which a specific restriction endonuclease cuts DNA into fragments; also known as a recognition site.
Restriction site polymorphism (RSP). A variation in DNA sequence that is due to the presence or absence of a restriction site. This type of polymorphism is the basis for most traditional RFLPs.
Retrovirus. A type of RNA virus that can reverse transcribe its RNA into DNA for insertion into the genome of a host cell, useful as a vector for gene therapy.
Reverse banding (R-banding). A chromosome banding technique in which chromosomes are heated in a phosphate buffer; produces dark and light bands in patterns that are the reverse of those produced by G-banding.
Reverse dot blot. A detection method used to identify SNPs in which DNA probes are affixed to a membrane and amplified DNA is passed over the probes to see if it contains the complementary sequence.
Reverse transcriptase (RNA-dependent DNA polymerase). An enzyme that transcribes RNA into DNA (hence reverse).
Ribonucleic acid (RNA). Single-stranded nucleic acid molecules composed of a sugar (ribose), phosphate group and a series of bases: adenine, guanine, cytosine and uracil. The types of RNA include mRNA, rRNA and tRNA.
Ribosomal RNA (rRNA). RNA molecules that together with specific proteins form the subunits of ribosomes.
Ribosome. The site of translation of mature mRNA into amino-acid sequences.
RNAase. An enzyme capable of degrading RNA.
RNA interference. A process that uses RNA sequence to block gene expression.
RNA polymerase. Enzyme that binds to a promoter site and synthesizes mRNA from a DNA template.
Ribosome. A cellular component containing protein and RNA that is involved in protein synthesis.
Ribozyme. An RNA molecule that has catalytic activity.
Ring chromosome. A structural chromosome abnormality formed when both ends of a chromosome are lost and the new ends fuse together.

RAPD: Random amplified polymorphic DNA. A technique for amplifying
anonymous stretches of DNA, using PCR with arbitrary primers.
Recombination: The production of a DNA molecule with segments derived from
more than one parental DNA molecule. In eukaryotes, this is achieved by the
reciprocal exchange of DNA between non-sister chromatids within a homologous
pair of chromosomes during prophase of the first meiotic division. Recombination
allows the chromosomes to rearrange their genetic material, thereby increasing
the potential of genetic diversity. Also known as crossing-over.
Regeneration (of genetic resources collections):
• The process of restoring a whole plant from individual cells by manipulating
an in vitro culture.
• The growing of a sample of seeds from an accession to replenish the viability
of the original accession. It is usually done when the viability of the original
material drops to less than 85%.
Restriction enzyme: An endonuclease that will recognize a specific sequence
and cut the DNA chain at that point.
RFLP: Restriction fragment length polymorphism. Variation between individuals
as detected by differences in DNA fragment sizes after restriction digestion.


S

Satellite DNA. Repetitive DNA located in telomeres and centromeres.
Sensitivity. The proportion of affected individuals who are correctly identified by a test; true positives.
Sequence-specific oligonucleotide (SSO) probe also known as an allele-specific oligonucleotide probe. Oligonucleotide probes used in a PCR-associated detection technique to identify the presence or absence of certain base-pair sequences identifying different alleles. The probes are visualized by an array of dots rather than the electrophoretograms associated with RFLP analysis.
Sequence tagged site (STS). DNA sequences of several hundred bp in size flanked by PCR primers. Because the chromosome location has been established, the site is useful as an indicator of a physical position on the genome.
Sequencing. Determination of the order of base pairs in a DNA or RNA molecule or the order of amino acids in a protein.
Sex chromosomes (X and Y chromosomes). Chromosomes that are different in the two sexes and involved in sex determination. A female has two X chromosomes in diploid cells, and a male has an X chromosome and a Y chromosome. The sex chromosomes comprise the 23rd chromosome pair in a karyotype.
Sex-influenced. A trait the expression of which is modified by the gender of the individual possessing the trait.
Sex-linked. A trait determined by a gene located on a sex chromosome (usually the X chromosome). A trait that is expressed in only one gender is termed sex-limited.
Short interspersed elements (SINEs). A class of dispersed repetitive DNA in which each repeat is relatively short.
Short tandem repeats (STR). Multiple copies of an identical DNA sequence arranged in direct succession in a particular region of a chromosome.
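Counting how many copies of a repeat unit sit back to back is the core of STR typing. The sketch below uses Python's standard `re` module on a hypothetical sequence with a hypothetical GATA repeat unit; the function name is illustrative.

```python
import re

def longest_tandem_run(dna, unit):
    """Return the largest number of back-to-back copies of unit found in dna."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna)
    return max((len(run) // len(unit) for run in runs), default=0)

# Hypothetical allele carrying four GATA repeats between flanking bases.
print(longest_tandem_run("TTGATAGATAGATAGATACC", "GATA"))
```

Two alleles at the same locus would differ in this count, which is what makes STRs useful as polymorphic markers.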
Shotgun method. A method involving randomly sequencing tiny cloned pieces of the genome with no prior knowledge of from where on a chromosome the piece originated. In directed strategies, the sequenced pieces of DNA come from adjacent stretches of a chromosome.
Signal transduction. Process in which biochemical messages are transmitted from the cell surface to the nucleus.
Silencer. A DNA sequence that binds to specific transcription factors to decrease or repress the activity of certain genes.
Silent substitution. DNA sequence change that does not change the amino-acid sequence because of the degeneracy of the genetic code.
Single-copy DNA. DNA sequences that occur only once in the haploid genome.
Single-gene disorder or trait. A feature or disease that is caused by a mutant allele of a single gene, e.g., Duchenne muscular dystrophy, retinoblastoma, sickle-cell disease.
Single-locus probe. A probe that only marks a specific site (locus). RFLP analysis using a single-locus probe will yield an autorad showing one band if the individual is homozygous, two bands if heterozygous.
Single nucleotide polymorphism (SNP). A substitution, insertion or deletion of a single base pair at a given point in the genome.
Single-strand conformation polymorphism (SSCP). A technique for detecting variation in DNA sequence by running single-stranded DNA fragments through a non-denaturing gel. Fragments with differing secondary structure (conformation) caused by sequence variation will migrate at different rates.
Sister chromatids. The two identical strands of a duplicated chromosome joined by a single centromere.
Solenoid. A structure of coiled DNA consisting of approximately six nucleosomes.
Somatic cells. All cells of eukaryotes excluding gametes and their precursors; in other words, cells other than sex or germ cells.
Somatic cell gene therapy. Therapy involving the insertion of genes into somatic cells for therapeutic purposes.
Somatic cell nuclear transfer. The transfer of a nucleus from a fully differentiated cell into an egg that has had its nucleus removed.
Southern transfer or blot. Transfer by absorption of DNA fragments separated in electrophoretic gels to a solid membrane such as nitrocellulose for detection of specific base sequences by radiolabeled complementary probes.
Specificity. The proportion of unaffected individuals who are correctly identified by a test; true negatives.
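Sensitivity and specificity, as defined in this glossary, are both simple proportions taken from a table of test results. The sketch below uses hypothetical counts; the parameter names follow the usual true/false positive/negative convention.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of affected individuals correctly identified by the test."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of unaffected individuals correctly identified by the test."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical screening results for 100 affected and 100 unaffected people.
print(sensitivity(true_positives=90, false_negatives=10))
print(specificity(true_negatives=80, false_positives=20))
```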
Spliced mRNA. mRNA after removal of the intron regions from the primary messenger and linking of the exon (coding) portions.
Splice site mutation. DNA sequence alterations in donor or acceptor sites or in the consensus sites near them that produce altered exon splicing such that portions of exons are deleted or portions of introns are included in the mature mRNA transcript.
Splicing. The removal of introns and joining of exons to form a continuous coding sequence in messenger RNA.
Stem cells. Undifferentiated cells that can renew themselves and give rise to specialized cell types; they occur in embryos and in some adult tissues.
Stop codon. mRNA base triplets UGA, UAG or UAA signalling the end of mRNA translation into protein.
Stringency. The buffer salt concentration and temperature used in the DNA blot wash post-hybridization process. As these parameters are changed, the degree of the binding of the probe to target DNA changes.
Strip. Removal by melting of hydrogen bonds of hybridized probe from DNA blots.
Structural gene. Genes that encode protein products.
Submetacentric. A chromosome in which the centromere is located closer to one end of the chromosome than the other, giving one short arm and one long arm.
Substrate. Material acted on by an enzyme.
Syndrome. A pattern of multiple primary malformations or defects due to a single underlying cause, e.g., Down syndrome.

Selfing: To fertilize by means of pollen from the same plant.
Self-pollination: see Autogamy.
Sexual reproduction: The production of new individuals, following the mixing in a single cell of the genes of two different cells, usually gametes and usually from different parents.
Species diversity: A function of the distribution and abundance of species, similar in meaning to ‘species richness’. In more technical literature, includes considerations of the evenness of species abundances. An ecosystem is said to be more diverse, according to the more technical definition, if species present have equal population sizes and less diverse if many species are rare and some are very common.
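
The idea of evenness can be made concrete with a diversity index. The glossary does not name a particular measure, so this Python sketch assumes the Shannon index and Pielou's evenness, two commonly used choices; the example communities are invented.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over species with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)

def evenness(counts):
    """Pielou evenness: H divided by its maximum, ln(number of species present)."""
    present = sum(1 for c in counts if c > 0)
    if present < 2:
        return 0.0  # convention chosen here for a single-species community
    return shannon_index(counts) / math.log(present)

even_community = [25, 25, 25, 25]   # four species, equal population sizes
skewed_community = [97, 1, 1, 1]    # one very common species, three rare ones
print(evenness(even_community), evenness(skewed_community))
```

Both communities have the same species richness (four), but only the first scores as fully even.
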
SSR: See Microsatellite DNA.
Sympatric: Occurring in the same geographic area.
Weed:
• In agriculture, an individual plant or species growing where it is not wanted.
• In ecology, a plant that is adapted to grow in disturbed or open habitats, e.g. after fire or human disturbance.

T

Tandem repeat sequences. Multiple copies of the same base sequence located directly next to each other on a chromosome, used as a marker in physical mapping.
Taq polymerase. A DNA polymerase isolated from the bacterium Thermus aquaticus, which lives in hot springs. This enzyme withstands high temperatures and is therefore very useful in the PCR process.
Target DNA. The DNA sequence to be hybridized to a specific probe.
Targeted disruption. The disabling of a specific gene so that it is not expressed.
Telomerase. An enzyme that restores the repeated DNA sequences at telomeres, which would otherwise shorten with each round of DNA replication.
Telomere. The tip of a chromosome.
Template. The single-stranded DNA blueprint for complementary strand assembly; the DNA strand from which mRNA is transcribed.
Termination sequence. The DNA sequence that signals the cessation of transcription.
Terminator. Sequence of DNA bases that tells the RNA polymerase to stop synthesizing RNA.
Thymine (T). A pyrimidine base, one of the four molecules containing nitrogen present in DNA. Designated by the letter T, it always binds with A.
Totipotent. The state of a cell that can give rise to any and all adult cell types.
Transcription. The process in which an mRNA sequence is synthesized from a DNA template.
Transcription factor. Protein that binds to DNA to influence and regulate transcription. General: class required for transcription of all structural genes. Specific: class that activates only specific genes at specific times.
Transdifferentiation. The process whereby a specialized cell de-differentiates and re-differentiates into a different cell type.
Transduction. The transfer of genetic material from one cell to another by means of a virus or phage vector.
Transfection. The transfer of a DNA sequence into a cell.
Transfer RNA (tRNA). RNA molecules that bring amino acids to ribosomes for protein production during mRNA translation. The anticodon portion of the tRNA binds to a complementary mRNA codon, and the 3' end of the tRNA molecule attaches to a specific amino acid.
Transformation. The process by which the genetic material carried by an individual cell is altered by the incorporation of exogenous DNA into its genome.
Transgenic. Refers to an organism into which a gene has been introduced from an organism of another species.
Translation. The process by which the information on a messenger RNA molecule is used to direct the synthesis of a protein; the amino-acid sequence is assembled according to the sequence specified by the mature mRNA transcript.
Translocation. The exchange of genetic material between non-homologous chromosomes.
Translocation, reciprocal. A translocation resulting from breaks on two different chromosomes and a subsequent exchange of material. Carriers of reciprocal translocations maintain the normal number of chromosomes and normal amount of chromosome material.
Translocation, Robertsonian. A translocation in which the long arms of two acrocentric chromosomes are fused at the centromere; the short arms of each chromosome are lost. The carrier has 45 chromosomes instead of 46 but is phenotypically normal because the short arms contain no essential genetic material.
Transposon. See mobile element.
Triplet code. A code in which a series of three successive DNA or RNA nucleotide bases specifies a particular amino acid.
Trisomy. An aneuploid condition in which the individual has an extra copy of one chromosome, for a total of 47 chromosomes in each human cell.
Partial trisomy. Chromosomal abnormality in which a portion of a chromosome is present in three copies; it may be produced by reciprocal translocation or unequal crossover.
True match. Two samples of DNA that have the same profile should match when tested. If there is no error in the labeling, handling and analysis of the samples and in the report of the results, a match is a true match.

U

Unequal crossover. Crossing over between improperly aligned DNA sequences producing deletions or duplications of genetic material.
Uniparental disomy. Condition in which two copies of one chromosome are derived from a single parent, and no copies are derived from the other parent.
Uracil (U). The pyrimidine base in RNA that appears in place of thymine in DNA.

V

Variable expression. A trait in which the same genotype may produce phenotypes of varying severity or expression, e.g., neurofibromatosis type I.
Variable number tandem repeats (VNTRs). A type of polymorphism created by variations in the number of minisatellite repeats in a defined region. The number of repeats varies from individual to individual, thus providing a basis for individual recognition.
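
Counting repeat copies is the core of a VNTR comparison. The Python sketch below is only an illustration: the 5-base motif and both allele sequences are invented (real minisatellite repeat units are longer), and max_tandem_repeats is a hypothetical helper name.

```python
import re

def max_tandem_repeats(sequence, motif):
    """Return the largest number of back-to-back copies of motif found in sequence."""
    pattern = "(?:" + re.escape(motif) + ")+"
    runs = re.findall(pattern, sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

# Two made-up alleles that differ only in the number of repeat units.
allele_1 = "GGCT" + "AGGTC" * 4 + "TTAC"
allele_2 = "GGCT" + "AGGTC" * 7 + "TTAC"
print(max_tandem_repeats(allele_1, "AGGTC"), max_tandem_repeats(allele_2, "AGGTC"))
```
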
Vector. An engineered DNA molecule into which any other DNA molecule can be inserted that then can be returned to an organism and replicated in order to amplify the DNA. The vector needs an origin for replication and usually has a gene, such as for antibiotic resistance, that permits selection for organisms carrying the vector and DNA of interest. This is the key technology for cloning DNA molecules.
Virus. A non-cellular biological entity that can reproduce only within a host cell; it consists of nucleic acid covered by protein.

W

Wash. The process of removing non-bound or loosely bound probe from blots after hybridization, used to reduce background interference.

Wild relative: A relative of a crop species that grows in the wild and is not used for agricultural purposes.


X

X chromosome. A chromosome responsible for sex determination. Two copies are present in the genome of the homogametic sex and one copy in the heterogametic sex. The human female has two X chromosomes and the male has one X chromosome and a Y chromosome.
X inactivation. Process in which genes from one X chromosome in each cell of the female embryo are rendered transcriptionally inactive.
X-linked. Describes genes located on the X chromosome and the traits they determine.

Y

Y chromosome. A chromosome responsible for sex determination in the heterogametic sex.

Z

Zygosity. Twin development from one or two zygotes. If one, the twins are identical (monozygotic); if two, they are fraternal (dizygotic).
Zygote. The diploid cell resulting from the union of a haploid egg and sperm, i.e., the diploid fertilized ovum.



Numeric words

3'. The end of a nucleic acid strand defined by the number 3 carbon atom of the ribose or deoxyribose sugar, as opposed to the 5' end defined by the number 5 carbon atom. The nucleoside - phosphate - nucleoside - phosphate - nucleoside - phosphate ... chain of DNA or RNA thus has polarity; one end is the 3' end and one end is the 5' end. DNA synthesis proceeds from 5' to 3' on the growing strand, so the template is read from 3' to 5'.
5' cap. A chemically modified guanine nucleotide added to the 5' end of a growing mRNA molecule.





# Users are requested to send me any mistakes they find in this dictionary.




Glossary for Molecular biology

3' end/5' end: A nucleic acid strand is inherently directional, and the "5 prime end" has a free hydroxyl (or phosphate) on a 5' carbon and the "3 prime end" has a free hydroxyl (or phosphate) on a 3' carbon (carbon atoms in the sugar ring are numbered from 1' to 5'; see Figure 1). That's simple enough for an RNA strand or for single-stranded (ss) DNA. However, for double-stranded (ds) DNA it's not so obvious - each strand has a 5' end and a 3' end, and the 5' end of one strand is paired with the 3' end of the other strand (it is "antiparallel"; Figure 2). One would talk about the 5' end of ds DNA only if there was some reason to emphasize one strand over the other - for example if one strand is the sense strand of a gene. In that case, the orientation of the sense strand establishes the direction (see Figures 3 and 4).
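
The antiparallel pairing described above is why the partner of a strand is written as its reverse complement. A minimal Python sketch (the function name and example sequence are illustrative only):

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(strand):
    """Return the partner strand of a DNA sequence, written 5' to 3'."""
    return strand.translate(COMPLEMENT)[::-1]

top = "ATGGCC"  # written 5' -> 3'
print(reverse_complement(top))  # GGCCAT, also written 5' -> 3'
```

Reversing is needed because the 5' end of one strand pairs with the 3' end of the other.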

3' flanking region: A region of DNA which is NOT copied into the mature mRNA, but which is present adjacent to the 3' end of the gene (see Figure 4). It was originally thought that the 3' flanking DNA was not transcribed at all, but it was later found to be transcribed into RNA and then quickly removed during processing of the primary transcript to form the mature mRNA. The 3' flanking region often contains sequences which affect the formation of the 3' end of the message. It may also contain enhancers or other sites to which proteins may bind.

3' untranslated region: A region of the DNA which IS transcribed into mRNA and becomes the 3' end of the message, but which does not contain protein coding sequence. Everything between the stop codon and the polyA tail is considered to be 3' untranslated (see Figure 4). The 3' untranslated region may affect the translation efficiency of the mRNA or the stability of the mRNA. It also has sequences which are required for the addition of the poly(A) tail to the message (including one known as the "hexanucleotide", AAUAAA).

5' flanking region: A region of DNA which is NOT transcribed into RNA, but rather is adjacent to 5' end of the gene (see Figure 4). The 5'-flanking region contains the promoter, and may also contain enhancers or other protein binding sites.

5' untranslated region: A region of a gene which IS transcribed into mRNA, becoming the 5' end of the message, but which does not contain protein coding sequence. The 5'-untranslated region is the portion of the DNA starting from the cap site and extending to the base just before the ATG translation initiation codon (see Figure 4). While not itself translated, this region may have sequences which alter the translation efficiency of the mRNA, or which affect the stability of the mRNA.

Ablation experiment: An experiment designed to produce an animal deficient in one or a few cell types, in order to study cell lineage or cell function. The idea is to make a transgenic mouse with a toxin gene (often diphtheria toxin) under control of a specialized promoter which activates only in the target cell type. When embryo development progresses to the point where it starts to form the target tissue, the toxin gene is activated, and that specific tissue dies. Other tissues are unaffected.

Acrylamide gels: A polymer gel used for electrophoresis of DNA or protein to measure their sizes (in daltons for proteins, or in base pairs for DNA). See "Gel Electrophoresis". Acrylamide gels are especially useful for high resolution separations of DNA in the range of tens to hundreds of nucleotides in length.

Agarose gels: A polysaccharide gel used to measure the size of nucleic acids (in bases or base pairs). See "Gel Electrophoresis". This is the gel of choice for DNA or RNA in the range of thousands of bases in length, or even up to 1 megabase if you are using pulsed field gel electrophoresis.

Amp resistance: See "Antibiotic resistance".

Anneal: Generally synonymous with "hybridize".

Antibiotic resistance: Plasmids generally contain genes which confer on the host bacterium the ability to survive a given antibiotic. If the plasmid pBR322 is present in a host, that host will not be killed by (moderate levels of) ampicillin or tetracycline. By using plasmids containing antibiotic resistance genes, the researcher can kill off all the bacteria which have not taken up his plasmid, thus ensuring that the plasmid will be propagated as the surviving cells divide.

Anti-sense strand: See discussion under "Sense strand".

AP-1 site: The binding site on DNA at which the transcription "factor" AP-1 binds, thereby altering the rate of transcription for the adjacent gene. AP-1 is actually a complex between c-fos protein and c-jun protein, or sometimes is just c-jun dimers. The AP-1 site consensus sequence is (C/G)TGACT(C/A)A. Also known as the TPA-response element (TRE). [TPA is a phorbol ester, tetradecanoyl phorbol acetate, which is a chemical tumor promoter]

ATG or AUG: The codon for methionine; the translation initiation codon. Usually, protein translation can only start at a methionine codon (although this codon may be found elsewhere within the protein sequence as well). In eukaryotic DNA, the sequence is ATG; in RNA it is AUG. Usually, the first AUG in the mRNA is the point at which translation starts, and an open reading frame follows - i.e. the nucleotides taken three at a time will code for the amino acids of the protein, and a stop codon will be found only when the protein coding region is complete.
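
The "first AUG, then read three at a time until a stop codon" rule can be sketched in a few lines of Python. This is a simplified illustration, not a gene-finding method; the function name first_orf and the example mRNA are invented.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def first_orf(mrna):
    """Return the codons from the first AUG up to and including the first in-frame stop.

    Returns an empty list if there is no AUG or no in-frame stop codon.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return []
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            return codons
    return []

print(first_orf("GGAUGGCCUUUUAAGC"))
```

Here the two leading bases play the role of a (very short) 5' untranslated region, and the bases after UAA a 3' untranslated region.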

BAC: Bacterial Artificial Chromosome — a cloning vector capable of carrying between 100 and 300 kilobases of target sequence. They are propagated as a mini-chromosome in a bacterial host. The size of the typical BAC is ideal for use as an intermediate in large-scale genome sequencing projects. Entire genomes can be cloned into BAC libraries, and entire BAC clones can be shotgun-sequenced fairly rapidly.

Band shift assay: see Gel shift assay.

Bacteriophage lambda: A virus which infects E. coli, and which is often used in molecular genetics experiments as a vector, or cloning vehicle. Recombinant phages can be made in which certain non-essential λ DNA is removed and replaced with the DNA of interest. The phage can accommodate a DNA "insert" of about 15-20 kb. Replication of that virus will thus replicate the investigator's DNA. One would use phage λ rather than a plasmid if the desired piece of DNA is rather large.

Binding site: A place on cellular DNA to which a protein (such as a transcription factor) can bind. Typically, binding sites might be found in the vicinity of genes, and would be involved in activating transcription of that gene (promoter elements), in enhancing the transcription of that gene (enhancer elements), or in reducing the transcription of that gene (silencers). NOTE that whether the protein in fact performs these functions may depend on some condition, such as the presence of a hormone, or the tissue in which the gene is being examined. Binding sites could also be involved in the regulation of chromosome structure or of DNA replication.

Blotting: A technique for detecting one RNA within a mixture of RNAs (a Northern blot) or one type of DNA within a mixture of DNAs (a Southern blot). A blot can prove whether that one species of RNA or DNA is present, how much is there, and its approximate size. Basically, blotting involves gel electrophoresis, transfer to a blotting membrane (typically nitrocellulose or activated nylon), and incubating with a radioactive probe. Exposing the membrane to X-ray film produces darkening at a spot correlating with the position of the DNA or RNA of interest. The darker the spot, the more nucleic acid was present there. (see figure, below)







The DNA is first transferred from the gel to a membrane by capillary action. Fluid wicks from the gel through the blotting membrane to several layers of absorbent paper, but the nucleic acids stick to the membrane. Baking the filter fixes the DNA or RNA to the filter.



Specific bands are detected by hybridization. The filter membrane is incubated with radioactive probe, which hybridizes to some bands. After the filter is washed (to remove unused probe), an X-ray film exposed to the filter will show which bands have hybridized.

BP: Abbreviation for base pair(s). Double stranded DNA is usually measured in bp rather than nucleotides (nt).

Cap: All eukaryotes have at the 5' end of their messages a structure called a "cap", consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of the mRNA. It is added post-transcriptionally, and is not encoded in the DNA.

Cap site: Two usages: In eukaryotes, the cap site is the position in the gene at which transcription starts, and really should be called the "transcription initiation site". The first nucleotide is transcribed from this site to start the nascent RNA chain. That nucleotide becomes the 5' end of the chain, and thus the nucleotide to which the cap structure is attached (see "Cap"). In bacteria, the CAP site (note the capital letters) is a site on the DNA to which a protein factor (the Catabolite Activator Protein) binds.

CAT assay: An enzyme assay. CAT stands for chloramphenicol acetyl transferase, a bacterial enzyme which inactivates chloramphenicol by acetylating it. CAT assays are often performed to test the function of a promoter. The gene coding for CAT is linked onto a promoter (transcription control region) from another gene, and the construct is "transfected" into cultured cells. The amount of CAT enzyme produced is taken to indicate the transcriptional activity of the promoter (relative to other promoters which must be tested in parallel). It is easier to perform a CAT assay than it is to do a Northern blot, so CAT assays were a common method for testing the effects of sequence changes on promoter function. Largely supplanted by the reporter gene luciferase.

CCAAT box: (CAT box, CAAT box, other variants) A sequence found in the 5' flanking region of certain genes which is necessary for efficient expression. A transcription factor (CCAAT-binding protein, CBP) binds to this site.

cDNA clone: "complementary DNA"; a piece of DNA copied from an mRNA. The term "clone" indicates that this cDNA has been spliced into a plasmid or other vector in order to propagate it. A cDNA clone may contain DNA copies of such typical mRNA regions as coding sequence, 5'-untranslated region, 3' untranslated region or poly(A) tail. No introns will be present, nor any promoter sequences (or other 5' or 3' flanking regions). A "full-length" cDNA clone is one which contains all of the mRNA sequence from nucleotide #1 through to the poly(A) tail.

ChIP: See Chromatin Immunoprecipitation (below).

Chromatin Immunoprecipitation: A method for isolating and characterizing, out of an entire genome, the specific pieces of DNA to which a protein of interest is bound. The protein of interest could for example be a transcription factor, or a specific modified histone, or any other DNA binding protein. This procedure requires an antibody to that protein of interest.

One isolates chromosomal material with all the proteins still bound to the genomic DNA. After fragmenting the DNA, you use the antibody to immunoprecipitate all chunks that contain your protein of interest. Isolate the DNA from those chunks, and you can characterize the specific DNA sites to which your protein was bound.

There are two common ways to characterize the DNA so isolated: ChIP-chip (or "ChIP-on-chip") or ChIP-seq.

* ChIP-chip: In this variant, the DNA isolated from a ChIP experiment is characterized by labeling it with a fluorescent dye, then hybridizing it to a DNA array (an oligonucleotide array, for example). Array spots that "light up" are taken as evidence that their specific sequence is present in your ChIP product. Unfortunately, designing these arrays requires that you have some idea what to expect in your ChIP isolates.

* ChIP-seq: A newer variant for characterizing ChIP results, in which one simply sequences everything that immunoprecipitated with the antibody. Unlike ChIP-chip, it requires no foreknowledge of the expected products.

Chromosome walking: A technique for cloning everything in the genome around a known piece of DNA (the starting probe). You screen a genomic library for all clones hybridizing with the probe, and then figure out which one extends furthest into the surrounding DNA. The most distal piece of this most distal clone is then used as a probe, so that ever more distal regions can be cloned. This has been used to move as much as 200 kb away from a given starting point (an immense undertaking). Typically used to "walk" from a starting point towards some nearby gene in order to clone that gene. Also used to obtain the remainder of a gene when you have isolated a part of it.

Clone (verb): To "clone" something is to produce copies of it. To clone a piece of DNA, one would insert it into some type of vector (say, a plasmid) and put the resultant construct into a host (usually a bacterium) so that the plasmid and insert replicate with the host. An individual bacterium is isolated and grown and the plasmid containing the "cloned" DNA is re-isolated from the bacteria, at which point there will be many millions of copies of the DNA - essentially an unlimited supply. Actually, an investigator wishing to clone some gene or cDNA rarely has that DNA in a purified form, so practically speaking, to "clone" something involves screening a cDNA or genomic library for the desired clone. See also "Probe" for a description of how one might start a cloning project, and "Screening" for how the probe is used.

One can also clone more complex organisms, with considerable difficulty. The much-publicized Scottish research that resulted in the sheep ‘Dolly’ exemplifies this approach.

Clone (noun): The term "clone" can refer either to a bacterium carrying a cloned DNA, or to the cloned DNA itself. If you receive a clone from a collaborator, you should first figure out if they sent you DNA or bacteria. If it is DNA, your first job is to introduce it ("transform" it) into bacteria [see "Transformation (with respect to bacteria)"]. Occasionally, someone might send just the "insert", rather than the whole plasmid. "Your assignment, Jim, if you decide to accept it", is to splice that DNA into a convenient vector, and only then can you transform it into bacteria.

Coding sequence: The portion of a gene or an mRNA which actually codes for a protein. Introns are not coding sequences; nor are the 5' or 3' untranslated regions (or the flanking regions, for that matter - they are not even transcribed into mRNA). The coding sequence in a cDNA or mature mRNA includes everything from the AUG (or ATG) initiation codon through to the stop codon, inclusive.

Coding strand: an ambiguous term intended to refer to one specific strand in a double-stranded gene. See "Sense strand".

Codon: In an mRNA, a codon is a sequence of three nucleotides which codes for the incorporation of a specific amino acid into the growing protein. The sequence of codons in the mRNA unambiguously defines the primary structure of the final protein. Of course, the codons in the mRNA were also present in the genomic DNA, but the sequence may be interrupted by introns.

Consensus sequence: A ‘nominal’ sequence inferred from multiple, imperfect examples. Multiple lanes of shotgun sequence can be merged to show a consensus sequence. The optimal sequence of nucleotides recognized by some factor. A DNA binding site for a protein may vary substantially, but one can infer the consensus sequence for the binding site by comparing numerous examples. For example, the (fictitious) transcription factor ZQ1 usually binds to the sequences AAAGTT, AAGGTT or AAGATT. The consensus sequence for that factor is said to be AARRTT (where R is any purine, i.e. A or G). ZQ1 may also be able to weakly bind to ACAGTT (which differs by one base from the consensus).
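
The ZQ1 example can be reproduced with a short Python sketch. It assumes the simplest possible rule, namely that each position is written as the IUPAC code covering every base seen there; practical tools usually weight by frequency instead. The function names are hypothetical.

```python
# IUPAC nucleotide codes, keyed by the set of bases each code stands for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}
MATCHES = {code: bases for bases, code in IUPAC.items()}

def consensus(sites):
    """Collapse equal-length binding sites into one IUPAC-coded consensus string."""
    return "".join(IUPAC[frozenset(column)] for column in zip(*sites))

def mismatches(site, consensus_seq):
    """Count positions where a site's base is not allowed by the consensus code."""
    return sum(base not in MATCHES[code] for base, code in zip(site, consensus_seq))

zq1_sites = ["AAAGTT", "AAGGTT", "AAGATT"]
zq1_consensus = consensus(zq1_sites)
print(zq1_consensus, mismatches("ACAGTT", zq1_consensus))
```

This gives AARRTT for the three ZQ1 sites and one mismatch for the weak site ACAGTT, matching the text.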

Contig: Several uses, all nouns. The term comes from a shortening of the word ‘contiguous’. A ‘contig’ may refer to a map showing placement of a set of clones that completely, contiguously cover some segment of DNA in which you are interested. Also called the ‘minimal tiling path’. More often, the term ‘contig’ is used to refer to the final product of a shotgun sequencing project. When individual lanes of sequence information are merged to infer the sequence of the larger DNA piece, the product consensus sequence is called a ‘contig’.
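
Merging overlapping reads into a contig can be illustrated for the simplest case of two error-free reads. This Python sketch is a toy under that assumption; real assemblers must handle sequencing errors, both strands and many reads. The function name merge_reads and the reads are invented.

```python
def merge_reads(left, right, min_overlap=3):
    """Join two reads if a suffix of left equals a prefix of right.

    Tries the longest overlap first; returns None when no overlap of at least
    min_overlap bases exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None

print(merge_reads("ATGCGT", "CGTTAG"))  # the reads share the three bases CGT
```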

Cosmid: A type of vector used for cloning 35-45 kb of DNA. These are plasmids carrying a phage λ cos site (which allows packaging into λ capsids), an origin of replication and an antibiotic resistance gene. A plasmid of 40 kb is very difficult to put into bacteria, but can replicate once there. Cosmids, however, have a cos site, and thus can be packaged into λ phage heads (a reaction which can be performed in vitro) to allow efficient introduction into bacteria (you'll have to look up the cos site elsewhere).

DNase: Deoxyribonuclease, a class of enzymes which digest DNA. The most common is DNase I, an endonuclease which digests both single and double-stranded DNA.

Dot blot: A technique for measuring the amount of one specific DNA or RNA in a complex mixture. The samples are spotted onto a hybridization membrane (such as nitrocellulose or activated nylon, etc.), fixed and hybridized with a radioactive probe. The extent of labeling (as determined by autoradiography and densitometry) is proportional to the concentration of the target molecule in the sample. Standards provide a means of calibrating the results.
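
Calibration against standards amounts to fitting a line and reading unknowns off it. The Python sketch below assumes signal is strictly proportional to amount (a line through the origin), as the entry states; the numbers and function names are invented for illustration.

```python
def fit_proportional(standards):
    """Least-squares slope for signal = slope * amount, forced through the origin.

    standards is a list of (known_amount, measured_signal) pairs.
    """
    numerator = sum(amount * signal for amount, signal in standards)
    denominator = sum(amount * amount for amount, _ in standards)
    return numerator / denominator

def estimate_amount(signal, slope):
    """Convert a sample's densitometry signal to an amount using the fitted slope."""
    return signal / slope

# Made-up standards: (amount spotted, densitometry reading).
standards = [(1.0, 200.0), (2.0, 400.0), (4.0, 800.0)]
slope = fit_proportional(standards)
print(estimate_amount(600.0, slope))
```

In practice the proportional range is limited, so unknowns should fall between the lowest and highest standards.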

Downstream: See "Upstream/Downstream".

E. coli: A common Gram-negative bacterium useful for cloning experiments. Present in human intestinal tract. Hundreds of strains of E. coli exist. One strain, K-12, has been completely sequenced.

Electrophoresis: See "Gel electrophoresis".

Endonuclease: An enzyme which digests nucleic acids starting in the middle of the strand (as opposed to an exonuclease, which must start at an end). Examples include the restriction enzymes, DNase I and RNase A.

Enhancer: An enhancer is a nucleotide sequence to which transcription factor(s) bind, and which increases the transcription of a gene. It is NOT part of a promoter; the basic difference being that an enhancer can be moved around anywhere in the general vicinity of the gene (within several thousand nucleotides on either side or even within an intron), and it will still function. It can even be clipped out and spliced back in backwards, and will still operate. A promoter, on the other hand, is position- and orientation-dependent. Some enhancers are "conditional" - in other words, they enhance transcription only under certain conditions, for example in the presence of a hormone.

ERE: Estrogen Response Element. A binding site in a promoter to which the activated estrogen receptor can bind. The estrogen receptor is essentially a transcription factor which is activated only in the presence of estrogens. The activated receptor will bind to an ERE, and transcription of the adjacent gene will be altered. See also "Response element".

Evolutionary Footprinting: One can infer which portions of a gene are important by comparing the sequence of that gene with its cognates from other species. A plot showing the regions of high conservation will presumably reflect the regions that are functional in all the test species. In theory, the more species involved in the comparison, the more stringent the result can be (i.e. the more the conserved regions will reflect truly important sequences). Care must be taken, however, to use species in which the function of the gene has not diverged excessively, or the outcome will be uninformative.
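
A conservation plot of this kind starts from a per-position score. The Python sketch below assumes the sequences are already aligned and of equal length, and uses the simplest score (the share of species carrying the most common base); the sequences and the function name are invented.

```python
from collections import Counter

def conservation_profile(aligned):
    """For each alignment column, return the fraction of sequences sharing the most common base."""
    profile = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        profile.append(most_common_count / len(column))
    return profile

# Three made-up, pre-aligned sequences from different species.
species_alignment = ["ACGT", "ACGA", "ACTT"]
print(conservation_profile(species_alignment))
```

Positions scoring 1.0 are identical in every species and are the candidates for functional importance.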

Exon: Those portions of a genomic DNA sequence which WILL be represented in the final, mature mRNA. The term "exon" can also be used for the equivalent segments in the final RNA. Exons may include coding sequences, the 5' untranslated region or the 3' untranslated region.

Exonuclease: An enzyme which digests nucleic acids starting at one end. An example is Exonuclease III, which digests only double-stranded DNA starting from the 3' end.

Expression: To "express" a gene is to cause it to function. A gene which encodes a protein will, when expressed, be transcribed and translated to produce that protein. A gene which encodes an RNA rather than a protein (for example, a rRNA gene) will produce that RNA when expressed.

Expression clone: This is a clone (a plasmid in a bacterium, or maybe a λ phage in bacteria) which is designed to produce a protein from the DNA insert. Mammalian genes do not function in bacteria, so to get bacterial expression from your mammalian cDNA, you would place its coding region (i.e. no introns) immediately adjacent to bacterial transcription/translation control sequences. That artificial construct (the "expression clone") will produce a pseudo-mammalian protein if put back into bacteria. Often, that protein can be recognized by antibodies raised against the authentic mammalian protein, and vice versa.

Footprinting: A technique by which one identifies a protein binding site on cellular DNA. The presence of a bound protein prevents DNase from "nicking" that region, which can be detected by an appropriately designed gel.

Gel electrophoresis: A method to analyze the size of DNA (or RNA) fragments. In the presence of an electric field, larger fragments of DNA move through a gel slower than smaller ones. If a sample contains fragments at four different discrete sizes, those four size classes will, when subjected to electrophoresis, all migrate in groups, producing four migrating "bands". Usually, these are visualized by soaking the gel in a dye (ethidium bromide) which makes the DNA fluoresce under UV light.

Gel shift assay: (aka gel mobility shift assay (GMSA), band shift assay (BSA), electrophoretic mobility shift assay (EMSA)) A method by which one can determine whether a particular protein preparation contains factors which bind to a particular DNA fragment. When a radiolabeled DNA fragment is run on a gel, it shows a characteristic mobility. If it is first incubated with a cellular extract of proteins (or with purified protein), any protein-DNA complexes will migrate slower than the naked DNA - a shifted band.

Gene: A unit of DNA which performs one function. Usually, this is equated with the production of one RNA or one protein. A gene contains coding regions, introns, untranslated regions and control regions.

Genome: The total DNA contained in each cell of an organism. Mammalian genomic DNA (including that of humans) contains about 6 x 10^9 base pairs of DNA per diploid cell. Early estimates put the number of genes on the order of a hundred thousand; current estimates for humans are closer to 20,000 protein-coding genes. Genes include coding regions, 5' and 3' untranslated regions, introns, and 5' and 3' flanking DNA. Also present in the genome are structural segments such as telomeric and centromeric DNAs and replication origins, and intergenic DNA.

Genomic blot: A type of Southern blot specifically used to analyze a mixture of DNA fragments derived from total genomic DNA. Because genomic DNA is very complicated, when it has been digested with restriction enzymes, it produces a complex set of fragments ranging from tens of bp to tens of thousands of bp. However, any specific gene will be reproducibly found on only one or a few specific fragments. A million identical cells will produce a million identical restriction fragments for any given gene, so probing a genomic Southern with a gene-specific probe will produce a pattern of perhaps one or just a few bands.

Genomic clone: A piece of DNA taken from the genome of a cell or animal, and spliced into a bacteriophage or other cloning vector. A genomic clone may contain coding regions, exons, introns, 5' flanking regions, 5' untranslated regions, 3' flanking regions, 3' untranslated regions, or it may contain none of these...it may only contain intergenic DNA (usually not a desired outcome of a cloning experiment!).

Genotype: Two uses: one is a verb, the other a noun. To 'genotype' (verb) is to examine polymorphisms (e.g. RFLPs, microsatellites, SNPs) present in a sample of DNA. You might be looking for linkage between a microsatellite marker and an unknown disease gene. With such information, you can infer the chromosomal location of the unknown gene, and can sometimes identify the gene. As a noun, a 'genotype' is the result of a genotyping experiment, be it a SNP or microsat or whatever.

GRE: Glucocorticoid Response Element: A binding site in a promoter to which the activated glucocorticoid receptor can bind. The glucocorticoid receptor is essentially a transcription factor which is activated only in the presence of glucocorticoids. The activated receptor will bind to a GRE, and transcription of the adjacent gene will be altered. See also "Response element".

Helix-loop-helix: A protein structural motif characteristic of certain DNA-binding proteins.

hnRNA: Heterogeneous nuclear RNA; refers collectively to the variety of RNAs found in the nucleus, including primary transcripts, partially processed RNAs and snRNA. The term hnRNA is often used just for the unprocessed primary transcripts, however.

Host strain (bacterial): The bacterium used to harbor a plasmid. Typical host strains include HB101 (general purpose E. coli strain), DH5α (ditto), JM101 and JM109 (suitable for growing M13 phages), XL1-Blue (general-purpose, good for blue/white lacZ screening). Note that the host strain is available in a form with no plasmids (hence you can put one of your own into it), or it may have plasmids present (especially if you put them there). Hundreds, perhaps thousands, of host strains are available.

Hybridization: The reaction by which the pairing of complementary strands of nucleic acid occurs. DNA is usually double-stranded, and when the strands are separated they will re-hybridize under the appropriate conditions. Hybrids can form between DNA-DNA, DNA-RNA or RNA-RNA. They can form between a short strand and a long strand containing a region complementary to the short one. Imperfect hybrids can also form, but the more imperfect they are, the less stable they will be (and the less likely to form). To "anneal" two strands is the same as to "hybridize" them.

Insert: In a complete plasmid clone, there are two types of DNA - the "vector" sequences and the "insert". The vector sequences are those regions necessary for propagation, antibiotic resistance, and all those mundane functions necessary for useful cloning. In contrast, however, the insert is the piece of DNA in which you are really interested.

Intergenic: Between two genes; e.g. intergenic DNA is the DNA found between two genes. The term is often used to mean non-functional DNA (or at least DNA with no known importance to the two genes flanking it). Alternatively, one might speak of the "intergenic distance" between two genes as the number of base pairs from the polyA site of the first gene to the cap site of the second. This usage might therefore include the promoter region of the second gene.

Intron: Introns are portions of genomic DNA which ARE transcribed (and thus present in the primary transcript) but which are later spliced out. They thus are not present in the mature mRNA. Note that although the 3' flanking region is often transcribed, it is removed by endonucleolytic cleavage and not by splicing. It is not an intron.

KB: abbreviation for kilobase, one thousand bases.

Kinase: A kinase is in general an enzyme that catalyzes the transfer of a phosphate group from ATP to something else. In molecular biology, it has acquired the more specific verbal usage for the transfer onto DNA of a radiolabeled phosphate group. This would be done in order to use the resultant "hot" DNA as a probe.

Knock-out experiment: A technique for deleting, mutating or otherwise inactivating a gene in a mouse. This laborious method involves transfecting a crippled gene into cultured embryonic stem cells, searching through the thousands of resulting clones for one in which the crippled gene exactly replaced the normal one (by homologous recombination), and inserting that cell back into a mouse blastocyst. The resulting mouse will be chimaeric but, if you are lucky (and if you've gotten this far, you obviously are), its germ cells will carry the deleted gene. A few rounds of careful breeding can then produce progeny in which both copies of the gene are inactivated.

Lambda: see Bacteriophage Lambda.

Leucine zipper: A motif found in certain proteins in which Leu residues are evenly spaced through an α-helical region, such that they would end up on the same face of the helix. Dimers can form between two such proteins. The Leu zipper is important in the function of transcription factors such as Fos and Jun and related proteins.

Library: A library might be either a genomic library, or a cDNA library. In either case, the library is just a tube carrying a mixture of thousands of different clones - bacteria or λ phages. Each clone carries an "insert" - the cloned DNA.

A cDNA library is usually just a mixture of bacteria, where each bacterium carries a different plasmid. Inserted into the plasmids (one per plasmid) are thousands of different pieces of cDNA (each typically 500-5000 bp) copied from some source of mRNA, for example, total liver mRNA. The basic idea is that if you have a large enough number of different liver-derived cDNAs carried in those bacteria, there is a 99% probability that a cDNA copy of any given liver mRNA exists somewhere in the tube. The real trick is to find the one you want out of that mess - a process called screening (see "Screening").

A genomic library is similar in concept to a cDNA library, but differs in three major ways - 1) the library carries pieces of genomic DNA (and so contains introns and flanking regions, as well as coding and untranslated regions); 2) you need bacteriophage λ or cosmids, rather than plasmids, because... 3) the inserts are usually 5-15 kb long (in a λ library) or 20-40 kb (in a cosmid library). Therefore, a genomic library is most commonly a tube containing a mixture of λ phages. Enough different phages must be present in the library so that any given piece of DNA from the source genome has a 99% probability of being present.
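
The 99% figure can be turned into a rough estimate of how many clones a library needs. What follows is only a minimal Python sketch, assuming the commonly used Clarke-Carbon approximation N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size. The function name and the example numbers (15 kb inserts, a 3x10^9 bp haploid genome) are illustrative assumptions, not values taken from any particular library.

```python
import math

def clones_needed(insert_bp, genome_bp, probability=0.99):
    """Estimate how many independent clones give `probability` of
    containing any one given sequence.

    Assumes inserts are random, independent and equally likely
    (the Clarke-Carbon approximation); real libraries are less ideal.
    """
    fraction = insert_bp / genome_bp  # share of the genome carried by one insert
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))

# Illustrative numbers only: 15 kb lambda inserts, 3x10^9 bp haploid genome
print(clones_needed(15_000, 3_000_000_000))
```

With these assumed numbers the answer comes out at roughly nine hundred thousand phages, which is why a genomic library is a mixture of a very large number of clones.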

Ligase: An enzyme, T4 DNA ligase, which can link pieces of DNA together. The pieces must have compatible ends (both of them blunt, or else mutually compatible sticky ends), and the ligation reaction requires ATP.

Ligation: The process of splicing two pieces of DNA together. In practice, a pool of DNA fragments is treated with ligase (see "Ligase") in the presence of ATP, and all possible splicing products are produced, including circularized forms and end-to-end ligation of 2, 3 or more pieces. Usually, only some of these products are useful, and the investigator must have some way of selecting the desirable ones.

Linker: A small piece of synthetic double-stranded DNA which contains something useful, such as a restriction site. A linker might be ligated onto the end of another piece of DNA to provide a desired restriction site.

Marker: Two typical usages:

Molecular weight size marker: a piece of DNA of known size, or a mixture of pieces of known size, used on electrophoresis gels to determine the size of unknown DNAs by comparison.

Genetic marker: A known site on the chromosome. It might for example be the site of a locus with some recognizable phenotype, or it may be the site of a polymorphism that can be experimentally discerned. See 'Microsatellite', 'SNP', 'Genotyping'.
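
For the first usage (the size marker), the "comparison" can be sketched in a few lines of Python. This is a hedged illustration only: it assumes that, over the useful range of a gel, migration distance is roughly linear in the logarithm of fragment size, and it interpolates between the two flanking marker bands. The function name and the ladder values are hypothetical, invented for the example rather than measured.

```python
import math

def estimate_size(distance, ladder):
    """Estimate fragment size (bp) from migration distance.

    `ladder` is a list of (size_bp, distance) pairs for the marker bands,
    each with a distinct distance. Interpolates log10(size) linearly
    between the two bands that flank `distance`.
    """
    points = sorted(ladder, key=lambda p: p[1])  # order bands by distance run
    for (s1, d1), (s2, d2) in zip(points, points[1:]):
        if d1 <= distance <= d2:
            t = (distance - d1) / (d2 - d1)
            log_size = math.log10(s1) + t * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the range covered by the ladder")

# Hypothetical ladder: (size in bp, distance migrated in mm)
ladder = [(2000, 22.0), (1000, 30.0), (500, 40.0)]
print(round(estimate_size(35.0, ladder)))  # a band halfway between 1000 and 500 bp
```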

Message: see mRNA.

Microsatellite: A microsatellite is a simple sequence repeat (SSR). It might be a homopolymer ('...TTTTTTT...'), a dinucleotide repeat ('....CACACACACACACA.....'), trinucleotide repeat ('....AGTAGTAGTAGTAGT...') etc. Due to polymerase slip (a.k.a. polymerase chatter), during DNA replication there is a slight chance these repeat sequences may become altered; copies of the repeat unit can be created or removed. Consequently, the exact number of repeat units may differ between unrelated individuals. Considering all the known microsatellite markers, no two individuals are identical. This is the basis for forensic DNA identification and for testing of familial relationships (e.g. paternity testing).
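
To make the idea of "number of repeat units" concrete, here is a small Python sketch that counts the longest uninterrupted run of a repeat unit in a sequence. The function name and the two example alleles are hypothetical, made up for illustration; real genotyping measures fragment lengths rather than reading sequence this way.

```python
import re

def longest_repeat_run(seq, unit):
    """Return the largest number of consecutive copies of `unit` in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq)]
    return max(runs, default=0)

# Two made-up alleles of the same CA-repeat marker
allele_1 = "GGCACACACACATT"      # GG + five CA units + TT
allele_2 = "GGCACACACACACATT"    # GG + six CA units + TT
print(longest_repeat_run(allele_1, "CA"), longest_repeat_run(allele_2, "CA"))
```

Two individuals carrying these alleles would be distinguishable at this marker because the repeat counts differ.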

mRNA: "messenger RNA" or sometimes just "message"; an RNA which contains sequences coding for a protein. The term mRNA is used only for a mature transcript with polyA tail and with all introns removed, rather than the primary transcript in the nucleus. As such, an mRNA will have a 5' untranslated region, a coding region, a 3' untranslated region and (almost always) a poly(A) tail. Typically about 2% of the total cellular RNA is mRNA.

M13: A bacteriophage which infects certain strains of E. coli. The salient feature of this phage is that it packages only a single strand of DNA into its capsid. If the investigator has inserted some heterologous DNA into the M13 genome, copious quantities of single-stranded DNA can subsequently be isolated from the phage capsids. M13 is often used to generate templates for DNA sequencing.

Nick translation: A method for incorporating radioactive isotopes (typically 32P) into a piece of DNA. The DNA is randomly nicked by DNase I, and then starting from those nicks DNA polymerase I digests and then replaces a stretch of DNA. Radiolabeled precursor nucleotide triphosphates can thus be incorporated.

Non-coding strand: Anti-sense strand. See "Sense strand" for a discussion of sense strand vs. anti-sense strand.

Northern blot: A technique for analyzing mixtures of RNA, whereby the presence and rough size of one particular type of RNA (usually an mRNA) can be ascertained. See "Blotting" for more information. After Dr. E. M. Southern invented the Southern blot, it was adapted to RNA and named the "Northern" blot.

NT: Abbreviation for nucleotide; i.e. the monomeric unit from which DNA or RNA are built. One can express the size of a nucleic acid strand in terms of the number of nucleotides in its chain; hence ‘nt’ can be a measure of chain length.

Nuclear run-on: A method used to estimate the relative rate of transcription of a given gene, as opposed to the steady-state level of the mRNA transcript (which is influenced not just by transcription rates, but by the stability of the RNA). This technique is based on the assumption that a highly-transcribed gene should have more molecules of RNA polymerase bound to it than will the same gene in a less-active state. If properly prepared, isolated nuclei will continue to transcribe genes and incorporate 32P into RNA, but only in those transcripts that were in progress at the time the nuclei were isolated. Once the polymerase molecules complete the transcript they have in progress, they should not be able to re-initiate transcription. If that is true, then the amount of radiolabel incorporated into a specific type of mRNA is theoretically proportional to the number of RNA polymerase complexes present on that gene at the time of isolation. A very difficult technique, rarely applied appropriately from what I understand.

Nuclease: An enzyme which degrades nucleic acids. A nuclease can be DNA-specific (a DNase), RNA-specific (RNase) or non-specific. It may act only on single stranded nucleic acids, or only on double-stranded nucleic acids, or it may be non-specific with respect to strandedness. A nuclease may degrade only from an end (an exonuclease), or may be able to start in the middle of a strand (an endonuclease). To further complicate matters, many enzymes have multiple functions; for example, Bal31 has a 3'-exonuclease activity on double-stranded DNA, and an endonuclease activity specific for single-stranded DNA or RNA.

Nuclease protection assay: See "RNase protection assay".

Oncogene: A gene in a tumor virus or in cancerous cells which, when transferred into other cells, can cause transformation (note that only certain cells are susceptible to transformation by any one oncogene). Functional oncogenes are not present in normal cells. A normal cell has many "proto-oncogenes" which serve normal functions, and which under the right circumstances can be activated to become oncogenes. The prefix "v-" indicates that a gene is derived from a virus, and is generally an oncogene (like v-src, v-ras, v-myb, etc). See also "Transformation (with respect to cultured cells)".

Open reading frame: Any region of DNA or RNA where a protein could be encoded. In other words, there must be a string of nucleotides (possibly starting with a Met codon) in which one of the three reading frames has no stop codons. See "Reading frame" for a simple example.
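
A hedged Python sketch of how one might scan for open reading frames. It takes the stricter view that an ORF starts at an ATG and runs to the first in-frame stop codon, which is an assumption (the definition above only says "possibly starting with a Met codon"). It looks at the given strand only, and the function name and test sequence are made up for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=1):
    """Return (start, end) index pairs of ATG...stop ORFs on one strand.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    codons before the stop. All three reading frames are scanned.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first Met codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF in this frame
    return sorted(orfs)

seq = "CCATGAAATTTTAGGG"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```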

Origin of replication: Nucleotide sequences present in a plasmid which are necessary for that plasmid to replicate in the bacterial host. (Abbr. "ori")

pBR322: A common plasmid. Along with the obligatory origin of replication, this plasmid has genes which make the E. coli host resistant to ampicillin and tetracycline. It also has several restriction sites (BamHI, PstI, EcoRI, HindIII etc.) into which DNA fragments could be spliced in order to clone them.

PCR: see Polymerase Chain Reaction.

Phagemid: A type of plasmid which carries within its sequence a bacteriophage replication origin. When the host bacterium is infected with "helper" phage, the phagemid is replicated along with the phage DNA and packaged into phage capsids.

Plasmid: A circular piece of DNA present in bacteria or isolated from bacteria. Escherichia coli, the usual bacterium in molecular genetics experiments, has a large circular genome, but it will also replicate smaller circular DNAs as long as they have an "origin of replication". Plasmids may also have other DNA inserted by the investigator. A bacterium carrying a plasmid and replicating a million-fold will produce a million identical copies of that plasmid. Common plasmids are pBR322, pGEM, pUC18.

PolyA tail: After an mRNA is transcribed from a gene, the cell adds a stretch of A residues (typically 50-200) to its 3' end. It is thought that the presence of this "polyA tail" increases the stability of the mRNA (possibly by protecting it from nucleases). Note that not all mRNAs have a polyA tail; the histone mRNAs in particular do not.

Polymerase: An enzyme which links individual nucleotides together into a long strand, using another strand as a template. There are two general types of polymerase: DNA polymerases (which synthesize DNA) and RNA polymerases (which make RNA). Within these two classes, there are numerous sub-types of polymerase, depending on what type of nucleic acid can function as template and what type of nucleic acid is formed. A DNA-dependent DNA polymerase will copy one DNA strand starting from a primer, and the product will be the complementary DNA strand. A DNA-dependent RNA polymerase will use DNA as a template to synthesize an RNA strand.

Polymerase chain reaction: A technique for replicating a specific piece of DNA in vitro, even in the presence of excess non-specific DNA. Primers are added (which initiate the copying of each strand) along with nucleotides and Taq polymerase. By cycling the temperature, the target DNA is repetitively denatured and copied. A single copy of the target DNA, even if mixed in with other undesirable DNA, can be amplified to obtain billions of replicates. PCR can be used to amplify RNA sequences if they are first converted to DNA via reverse transcriptase. This two-phase procedure is known as ‘RT-PCR’.

Polymerase Chain Reaction (PCR) is the basis for a number of extremely important methods in molecular biology. It can be used to detect and measure vanishingly small amounts of DNA and to create customized pieces of DNA. It has been applied to clinical diagnosis and therapy, to forensics and to vast numbers of research applications. It would be difficult to overstate the importance of PCR to science.
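
The "billions of replicates" claim follows from simple doubling arithmetic. A minimal Python sketch, assuming the idealized model in which each cycle multiplies the target by (1 + efficiency); the function name and the efficiency parameter are illustrative assumptions, and real reactions plateau well before the ideal figure.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized PCR yield: each cycle multiplies copies by (1 + efficiency).

    efficiency=1.0 means perfect doubling; lower values model
    incomplete copying in each cycle.
    """
    return initial_copies * (1 + efficiency) ** cycles

print(pcr_copies(1, 30))        # perfect doubling for 30 cycles: about 10^9 copies
print(pcr_copies(1, 30, 0.9))   # assumed 90% efficiency gives noticeably fewer
```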

Post-transcriptional regulation: Any process occurring after transcription which affects the amount of protein a gene produces. Includes RNA processing efficiency, RNA stability, translation efficiency, protein stability. For example, the rapid degradation of an mRNA will reduce the amount of protein arising from it. Increasing the rate at which an mRNA is translated will increase the amount of protein product.

Post-translational processing: The reactions which alter a protein's covalent structure, such as phosphorylation, glycosylation or proteolytic cleavage.

Post-translational regulation: Any process which affects the amount of protein produced from a gene, and which occurs AFTER translation in the grand scheme of genetic expression. Actually, this is often just a buzz-word for regulation of the stability of the protein. The more stable a protein is, the more it will accumulate.

PRE: Progesterone Response Element: A binding site in a promoter to which the activated progesterone receptor can bind. The progesterone receptor is essentially a transcription factor which is activated only in the presence of progesterone. The activated receptor will bind to a PRE, and transcription of the adjacent gene will be altered. See also "Response element".

Primary transcript: When a gene is transcribed in the nucleus, the initial product is the primary transcript, an RNA containing copies of all exons and introns. This primary transcript is then processed by the cell to remove the introns, to cleave off unwanted 3' sequence, and to polyadenylate the 3' end. The mature message thus formed is then exported to the cytoplasm for translation.

Primer: A small oligonucleotide (anywhere from 6 to 50 nt long) used to prime DNA synthesis. The DNA polymerases are only able to extend a pre-existing strand along a template; they are not able to take a naked single strand and produce a complementary copy of it de-novo. A primer which sticks to the template is therefore used to initiate the replication. Primers are necessary for DNA sequencing and PCR.

Primer extension: This is a method used to figure out how far upstream from a fixed site the start of an mRNA is. For example, perhaps you have isolated a cDNA clone, but you don't think that the clone has all of the 5' untranslated region. To find out how much is missing, you would first sequence the part you have, and figure out which strand is coding strand (usually the coding strand will have a large open reading frame). Next, you ask the DNA Synthesis Facility to make an oligonucleotide complementary to the 5'-most region of the coding strand (and thus complementary to the mRNA). This "primer" is hybridized to mRNA (say, a mixture of mRNA containing the one in which you are interested), and reverse transcriptase is added to copy the mRNA from the primer out to the 5' end. The size of the resulting DNA fragment shows how far away from the 5' end your primer is.

Probe: A fragment of DNA or RNA which is labeled in some way (often incorporating 32P or 35S), and which is used to hybridize with the nucleic acid in which you are interested. For example, if you want to quantitate the levels of alpha subunit mRNA in a preparation of pituitary RNA, you might make a radiolabeled RNA in-vitro which is complementary to the mRNA, and then use it to probe a Northern blot of the pit RNA. A probe can be radiolabeled, or tagged with another functional group such as biotin. A probe can be cloned DNA, or might be a synthetic DNA strand. As an example of the latter, perhaps you have isolated a protein for which you wish to obtain a cDNA or genomic clone. You might (pay to) microsequence a portion of the protein, deduce the nucleic acid sequence, (pay to) synthesize an oligonucleotide carrying that sequence, radiolabel it and use it as a probe to screen a cDNA library or genomic library. A better way is to call up someone who already has the clone.

Processing: The reactions occurring in the nucleus which convert the primary RNA transcript to a mature mRNA. Processing reactions include capping, splicing and polyadenylation. The term can also refer to the processing of the protein product, including proteolytic cleavages, glycosylation, etc.

Promoter: The first few hundred nucleotides of DNA "upstream" (on the 5' side) of a gene, which control the transcription of that gene. The promoter is part of the 5' flanking DNA, i.e. it is not transcribed into RNA, but without the promoter, the gene is not functional. Note that the definition is a bit hazy as far as the size of the region encompassed, but the "promoter" of a gene starts with the nucleotide immediately upstream from the cap site, and includes binding sites for one or more transcription factors which can not work if moved farther away from the gene.

Proto-oncogene: A gene present in a normal cell which carries out a normal cellular function, but which can become an oncogene under certain circumstances. The prefix "c-" indicates a cellular gene, and is generally used for proto-oncogenes (examples: c-myb, c-myc, c-fos, c-jun, etc).

Pulsed field gel electrophoresis: (PFGE) A gel technique which allows size-separation of very large fragments of DNA, in the range of hundreds of kb to thousands of kb. As in other gel electrophoresis techniques, populations of molecules migrate through the gel at a speed related to their size, producing discrete bands. In normal electrophoresis, DNA fragments greater than a certain size limit all migrate at the same rate through the gel. In PFGE, the electrophoretic voltage is applied alternately along two perpendicular axes, which forces even the larger DNA fragments to separate by size.

Random primed synthesis: If you have a DNA clone and you want to produce radioactive copies of it, one way is to denature it (separate the strands), then hybridize to that template a mixture of all possible 6-mer oligonucleotides. Those oligos will act as primers for the synthesis of labeled strands by DNA polymerase (in the presence of radiolabeled precursors).

Reading frame: When mRNA is translated by the cell, the nucleotides are read three at a time. By starting at different positions, the groupings of three that are produced can be entirely different. The following example shows a DNA sequence and the three reading frames in which it could be read. Not only is an entirely different amino acid sequence specified by the different reading frames, but two of the three frames have stop codons, and thus are not open reading frames (asterisks indicate a stop codon).

A DNA open reading frame:

...ATG ACA TGT AAA GAT AGA CTA ACC TTT TGG...
...Met Thr Cys Lys Asp Arg Leu Thr Phe Trp...

Same bases, different 'frame':

...A TGA CAT GTA AAG ATA GAC TAA CCT TTT GG...
     *** His Val Lys Ile Asp *** Pro Phe Gly...

Same sequence, the last of the 3 possible frames:

...AT GAC ATG TAA AGA TAG ACT AAC CTT TTG G...
      Asp Met *** Arg *** Thr Asn Leu Leu...


If we shift the grouping again, we will just get the first reading frame again. The reading frame that is actually used is determined by the first methionine codon (the initiation codon). Once that first AUG is recognized, the pattern of triplet groupings follows unambiguously.
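
The three readings above can be reproduced with a short Python sketch. It builds the standard genetic code table from its conventional TCAG ordering and prints one-letter amino acid codes, with '*' for a stop codon. The names used here are my own for illustration, and incomplete codons at the end of a frame are simply dropped.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_frame(dna, frame):
    """Translate `dna` starting at offset `frame` (0, 1 or 2).

    Returns one-letter amino acid codes; '*' marks a stop codon.
    A trailing partial codon is ignored.
    """
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

dna = "ATGACATGTAAAGATAGACTAACCTTTTGG"
for frame in range(3):
    print(frame, translate_frame(dna, frame))
```

Only the first frame comes out free of '*' characters, matching the example: it is the only open reading frame of the three.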

Repetitive DNA: A surprising portion of any genome consists not of genes or structural elements, but of frequently repeated simple sequences. These may be short repeats just a few nt long, like CACACA etc. They can also range up to a few hundred nt long. Examples of the latter include Alu repeats, LINEs, SINEs. The function of these elements is often unknown. In shorter repeats like di- and tri-nucleotide repeats, the number of repeating units can occasionally change during evolution and descent. They are thus useful markers for familial relationships and have been used in paternity testing, forensic science and in the identification of human remains.

Response element: By definition, a "response element" is a portion of a gene which must be present in order for that gene to respond to some hormone or other stimulus. Response elements are binding sites for transcription factors. Certain transcription factors are activated by stimuli such as hormones or heat shock. A gene may respond to the presence of that hormone because the gene has in its promoter region a binding site for hormone-activated transcription factor. Example: the glucocorticoid response element (GRE).

Restriction: To "restrict" DNA means to cut it with a restriction enzyme. See "Restriction Enzyme".

Restriction enzyme: A class of enzymes ("restriction endonucleases") generally isolated from bacteria, which are able to recognize and cut specific sequences ("restriction sites") in DNA. For example, the restriction enzyme BamHI locates and cuts any occurrence of:

5'-GGATCC-3'
   ||||||
3'-CCTAGG-5'

Note that both strands contain the sequence GGATCC, but in antiparallel orientation. The recognition site is thus said to be palindromic, which is typical of restriction sites. Every copy of a plasmid is identical in sequence, so if BamHI cuts a particular circular plasmid at three sites producing three "restriction fragments", then a million copies of that plasmid will produce those same restriction fragments a million times over. There are more than six hundred known restriction enzymes.

Bacteria produce restriction enzymes for protection against invasion by foreign DNA such as phages. The bacteria's own DNA is modified in such a way as to prevent it from being clipped.
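
As a rough illustration of the BamHI example, the Python sketch below checks that the recognition site is palindromic (equal to its own reverse complement), finds every occurrence in a sequence, and works out the fragment lengths for a circular plasmid. The function names and the toy plasmid are hypothetical. As simplifications, the start of each site is used as the cut position (the real cut falls one base in, which shifts every cut equally and leaves the lengths unchanged), and a site spanning the arbitrary start/end of the circular sequence would be missed.

```python
def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq, site="GGATCC"):
    """Return the start index of every occurrence of `site` in `seq`."""
    positions, start = [], seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def circular_fragment_lengths(seq_len, positions):
    """Fragment lengths after cutting a circle of `seq_len` bp at `positions`."""
    if not positions:
        return [seq_len]               # uncut circle
    cuts = sorted(positions)
    lengths = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        lengths.append((nxt - cut) % seq_len or seq_len)  # a single cut gives one full-length linear piece
    return lengths

# Toy plasmid with three BamHI sites (made-up sequence)
plasmid = "GGATCC" + "A" * 10 + "GGATCC" + "T" * 20 + "GGATCC" + "C" * 5
sites = find_sites(plasmid)
print(reverse_complement("GGATCC") == "GGATCC")
print(sites, circular_fragment_lengths(len(plasmid), sites))
```

Three sites in a circle give three fragments whose lengths add up to the size of the plasmid, and every copy of the plasmid gives the same three.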

Restriction fragment: The piece of DNA released after restriction digestion of plasmids or genomic DNA. See "Restriction enzyme". One can digest a plasmid and isolate one particular restriction fragment (actually a set of identical fragments). The term also describes the fragments detected on a genomic blot which carry the gene of interest.

Restriction fragment length polymorphism: See "RFLP".

Restriction map: A "cartoon" depiction of the locations within a stretch of known DNA where restriction enzymes will cut.

The map usually indicates the approximate length of the entire piece (scale on the bottom), as well as the position within the piece at which designated enzymes will cut. This map happens to be of a plasmid, and the two ends are joined together with about 25 nt between the EcoRI and HindIII sites.

Restriction site: See Restriction enzyme.

Reverse transcriptase: An enzyme which will make a DNA copy of an RNA template - an RNA-dependent DNA polymerase. RT is used to make cDNA; one begins by isolating polyadenylated mRNA, providing oligo-dT as a primer, and adding nucleotide triphosphates and RT to copy the RNA into cDNA.

RFLP: Restriction fragment length polymorphism; the acronym is pronounced "riflip". Although two individuals of the same species have almost identical genomes, they will always differ at a few nucleotides. Some of these differences will produce new restriction sites (or remove them), and the banding pattern seen on a genomic Southern will thus be affected. For any given probe (or gene), it is often possible to test different restriction enzymes until you find one which gives a pattern difference between two individuals - an RFLP. The less related the individuals, the more divergent their DNA sequences are and the more likely you are to find an RFLP.

Ribonuclease: see "RNAse".

Riboprobe: A strand of RNA synthesized in-vitro (usually radiolabeled) and used as a probe for hybridization reactions. An RNA probe can be synthesized at very high specific activity, is single stranded (and therefore will not self anneal), and can be used for very sensitive detection of DNA or RNA.

Ribosome: A cellular particle which is involved in the translation of mRNAs to make proteins. Ribosomes are a complex consisting of ribosomal RNAs (rRNA) and several proteins.

RNAi: 'RNA interference' (a.k.a. 'RNA silencing') is the mechanism by which small double-stranded RNAs can interfere with expression of any mRNA having a similar sequence. Those small RNAs are known as 'siRNA', for short interfering RNAs. The mode of action for siRNA appears to be via dissociation of its strands, hybridization to the target RNA, extension of those fragments by an RNA-dependent RNA polymerase, then fragmentation of the target. Importantly, the remnants of the target molecule appear to then act as siRNAs themselves; thus the effect of a small amount of starting siRNA is effectively amplified and can have long-lasting effects on the recipient cell.

The RNAi effect has been exploited in numerous research programs to deplete the cell of specific messages, thus examining the role of those messages by their absence.

RNase: Ribonuclease; an enzyme which degrades RNA. It is ubiquitous in living organisms and is exceptionally stable. The prevention of RNase activity is the primary problem in handling RNA.

RNase protection assay: This is a sensitive method to determine (1) the amount of a specific mRNA present in a complex mixture of mRNA and/or (2) the sizes of exons which comprise the mRNA of interest. A radioactive DNA or RNA probe (in excess) is allowed to hybridize with a sample of mRNA (for example, total mRNA isolated from tissue), after which the mixture is digested with single-strand specific nuclease. Only the probe which is hybridized to the specific mRNA will escape the nuclease treatment, and can be detected on a gel. The amount of radioactivity which was protected from nuclease is proportional to the amount of mRNA to which it hybridized. If the probe included both introns and exons, only the exons will be protected from nuclease and their sizes can be ascertained on the gel.

rRNA: "ribosomal RNA"; any of several RNAs which become part of the ribosome, and thus are involved in translating mRNA and synthesizing proteins. They are the most abundant RNA in the cell (on a mass basis).

RT-PCR: See ‘Polymerase Chain Reaction’.

Run-off: see Nuclear run-on.

Run-on: see Nuclear run-on.

S1 end mapping: A technique to determine where the end of an RNA transcript lies with respect to its template DNA (the gene). Can't be described in a short paragraph. See "RNAse Protection assay" for a closely related technique.

S1 nuclease: An enzyme which digests only single-stranded nucleic acids.

Screening: To screen a library (see "Library") is to select and isolate individual clones out of the mixture of clones. For example, if you needed a cDNA clone of the pituitary glycoprotein hormone alpha subunit, you would need to make (or buy) a pituitary cDNA library, then screen that library in order to detect and isolate those few bacteria carrying alpha subunit cDNA.

There are two methods of screening which are particularly worth describing: screening by hybridization, and screening by antibody.

Screening by hybridization involves spreading the mixture of bacteria out on a dozen or so agar plates to grow several tens of thousands of isolated colonies. Membranes are laid onto each plate, and some of the bacteria from each colony stick, producing replicas of each colony in their original growth position. The membranes are lifted and the adherent bacteria are lysed, then hybridized to a radioactive piece of alpha DNA (the source of which is a story in itself - see "Probe"). When X-ray film is laid on the filter, only colonies carrying alpha sequences will "light up". Their positions on the membranes show where they grew on the original plates, so you now can go back to the original plate (where the remnants of the colonies are still alive), pick the colony off the plate and grow it up. You now have an unlimited source of alpha cDNA.

Screening by antibody is an option if the bacteria and plasmid are designed to express proteins from the cDNA inserts (see "Expression clones"). The principle is similar to hybridization, in that you lift replica filters from bacterial plates, but then you use the antibody (perhaps generated after olde tyme protein purification rituals) to show which colony expresses the desired protein.

Sense strand: A gene has two strands: the sense strand and the anti-sense strand. The sense strand is, by definition, the same 'sense' as the mRNA; that is, it can be translated exactly as the mRNA sequence can. Given a sense strand with the following sequence:

5' - ATG GGG CCA CGG CTG TGA - 3'
     Met Gly Pro Arg Leu stop

The anti-sense strand will read as follows (note that the strand has been reversed and complemented):

5' - TCA CAG CCG TGG CCC CAT - 3'

The duplex DNA will pair as follows:

5' - ATGGGGCCACGGCTGTGA - 3'
     ||||||||||||||||||
3' - TACCCCGGTGCCGACACT - 5'


Note however that when the RNA is transcribed from this sequence, the ANTI-SENSE strand is used as the template for RNA polymerization. After all, the RNA must base-pair with its template strand (see Figure 3), so the process of transcription produces the complement of the anti-sense strand. This introduces some confusion about terminology:

Some people use the terms ‘coding strand’ and ‘non-coding strand’ to refer to the sense and antisense strands, respectively. Unfortunately, many people interpret these terms in exactly the opposite way. I consider the terms ‘coding strand’ and ‘non-coding strand’ to be too ambiguous. Some people also use the exact opposite definition for ‘sense’ and ‘anti-sense’ from the one I have given here. Be aware of the possibility of a discrepancy. Textbooks I have consulted generally agree with the nomenclature given herein, although some avoid defining these terms at all.
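The strand relationships in the "Sense strand" entry can be checked with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration and do not come from any library.

```python
# Illustrative helpers; the names are hypothetical, not from a library.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def transcribe(template: str) -> str:
    """Build the RNA that base-pairs with a template strand (given 5' to 3')."""
    return reverse_complement(template).replace("T", "U")

sense = "ATGGGGCCACGGCTGTGA"
antisense = reverse_complement(sense)
mrna = transcribe(antisense)  # the anti-sense strand is the template

print(antisense)  # TCACAGCCGTGGCCCCAT
print(mrna)       # AUGGGGCCACGGCUGUGA
```

The mRNA comes out identical to the sense strand (with U in place of T), which is the point of the definition.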

Sequence: As a noun, the sequence of a DNA is a buzz word for the structure of a DNA molecule, in terms of the sequence of bases it contains. As a verb, "to sequence" is to determine the structure of a piece of DNA; i.e. the sequence of nucleotides it contains.

Shotgun cloning: The practice of randomly clipping a larger DNA fragment into various smaller pieces, cloning everything, and then studying the resulting individual clones to figure out what happened. For example, if one was studying a 50 kb gene, it "may" be a bit difficult to figure out the restriction map. By randomly breaking it into smaller fragments and mapping those, a master restriction map could be deduced. See also Shotgun sequencing.

Shotgun sequencing: A way of determining the sequence of a large DNA fragment which requires little brainpower but lots of late nights. The large fragment is shotgun cloned (see above), and then each of the resulting smaller clones ("subclones") is sequenced. By finding out where the subclones overlap, the sequence of the larger piece becomes apparent. Note that some of the regions will get sequenced several times just by chance.
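The "finding out where the subclones overlap" step can be illustrated with a toy greedy merge in Python. This is a sketch under generous assumptions (error-free reads, all from the same strand, no repeats); the function names and the minimum overlap of 3 bases are invented for the example, and real assembly software is far more careful.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]

subclones = ["ACGGCTGTGA", "ATGGGGCCA", "GGCCACGGC"]
print(greedy_assemble(subclones))  # ['ATGGGGCCACGGCTGTGA']
```

Note that the bases in the overlaps were "sequenced" twice, just as the entry says.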

siRNA: Small Interfering RNA (also called Small Inhibitory RNA); the short double-stranded RNA that triggers RNA interference. See 'RNAi'.

Slot blot: Similar to a dot blot, but the analyte is put onto the membrane using a slot-shaped template. The template produces a consistently shaped spot, thus decreasing errors and improving the accuracy of the analysis. See Dot blot.

snRNA: Small nuclear RNA; forms complexes with proteins to form snRNPs; involved in RNA splicing, polyadenylation reactions, other unknown functions (probably).

snRNP: "snerps", Small Nuclear RiboNucleoProtein particles, which are complexes between small nuclear RNAs and proteins, and which are involved in RNA splicing and polyadenylation reactions.

SNP: Single Nucleotide Polymorphism (SNP) - a position in a genomic DNA sequence that varies from one individual to another. It is thought that the primary source of genetic difference between any two humans is due to the presence of single nucleotide polymorphisms in their DNA. Furthermore, these SNPs can be extremely useful in genetic mapping (see 'Genetic Mapping') to follow inheritance of specific segments of DNA in a lineage. SNP-typing is the process of determining the exact nucleotide at positions known to be polymorphic.
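As a rough illustration of SNP-typing, the Python sketch below compares two already-aligned sequences and reports the base found at each variable position. The two sequences are invented for the example, and the function names are hypothetical; real SNP-typing works from laboratory assays or sequencing reads, not from tidy strings.

```python
def snp_positions(seq_a: str, seq_b: str):
    """0-based positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

def snp_type(seq: str, positions):
    """Report the nucleotide present at each known polymorphic position."""
    return {p: seq[p] for p in positions}

# Made-up sequences from two hypothetical individuals.
individual_1 = "ATGGGGCCACGGCTGTGA"
individual_2 = "ATGGGACCACGGTTGTGA"

sites = snp_positions(individual_1, individual_2)
print(sites)                          # [5, 12]
print(snp_type(individual_1, sites))  # {5: 'G', 12: 'C'}
print(snp_type(individual_2, sites))  # {5: 'A', 12: 'T'}
```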

Solution hybridization: A method closely related to RNase protection (see "RNase protection assay"). Solution hybridization is designed to measure the levels of a specific mRNA species in a complex population of RNA. An excess of radioactive probe is allowed to hybridize to the RNA, then single-strand specific nuclease is used to destroy the remaining unhybridized probe and RNA. The "protected" probe is separated from the degraded fragments, and the amount of radioactivity in it is proportional to the amount of mRNA in the sample which was capable of hybridization. This can be a very sensitive detection method.

Southern blot: A technique for analyzing mixtures of DNA, whereby the presence and rough size of one particular fragment of DNA can be ascertained. See "Blotting". Named for its inventor, Dr E. M. Southern.

SSR: Simple Sequence Repeat. See 'Microsatellite'.

Stable transfection: A form of transfection experiment designed to produce permanent lines of cultured cells with a new gene inserted into their genome. Usually this is done by linking the desired gene with a "selectable" gene, i.e. a gene which confers resistance to a toxin (like G418, aka Geneticin). Upon putting the toxin into the culture medium, only those cells which incorporate the resistance gene will survive, and essentially all of those will also have incorporated the experimenter's gene.

Sticky ends: After digestion of a DNA with certain restriction enzymes, the ends left have one strand overhanging the other to form a short (typically 4 nt) single-stranded segment. These overhangs will easily re-attach to other ends like them, and are thus known as "sticky ends". For example, the enzyme BamHI recognizes the sequence GGATCC, and clips after the first G in each strand:

5' - ...G          GATCC... - 3'
3' - ...CCTAG          G... - 5'

The overhangs thus produced can still hybridize ("anneal") with each other, even if they came from different parent DNA molecules, and the enzyme ligase will then covalently link the strands. Sticky ends therefore facilitate the ligation of diverse segments of DNA, and allow the formation of novel DNA constructs.
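The BamHI example in the "Sticky ends" entry can be mimicked in Python. This is a sketch only: it follows the top strand alone, assumes a palindromic site that leaves a 5' overhang, and uses function names invented for the illustration.

```python
SITE = "GGATCC"  # BamHI recognition sequence
CUT = 1          # BamHI clips after the first G of the site on each strand

def digest_top_strand(dna: str, site: str = SITE, cut: int = CUT):
    """Split the top strand (5' to 3') at every occurrence of the site."""
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut])
        start = pos + cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments

def overhang(site: str = SITE, cut: int = CUT) -> str:
    """Single-stranded 5' overhang left on each end (palindromic site assumed)."""
    return site[cut:len(site) - cut]

print(digest_top_strand("AAAGGATCCTTTT"))  # ['AAAG', 'GATCCTTTT']
print(overhang())                          # GATC
```

Every BamHI end carries the same GATC overhang, which is why fragments from different parent molecules can anneal to one another.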

Stringency: A term used to describe the conditions of hybridization. By varying the conditions (especially salt concentration and temperature) a given probe sequence may be allowed to hybridize only with its exact complement (high stringency), or with any somewhat related sequences (relaxed or low stringency). Increasing the temperature or decreasing the salt concentration will tend to increase the selectivity of a hybridization reaction, and thus will raise the stringency.

Sub-cloning: If you have a cloned piece of DNA (say, inserted into a plasmid) and you need unlimited copies of only a part of it, you might "sub-clone" it. This involves starting with several million copies of the original plasmid, cutting with restriction enzymes, and purifying the desired fragment out of the mixture. That fragment can then be inserted into a new plasmid for replication. It has now been subcloned.

Taq polymerase: A DNA polymerase isolated from the bacterium Thermus aquaticus, which is very stable at high temperatures. It is used in PCR procedures and high temperature sequencing.

TATA box: A sequence found in the promoter (part of the 5' flanking region) of many genes. Deletion of this site (the binding site of transcription factor TFIID) causes a marked reduction in transcription, and gives rise to heterogeneous transcription initiation sites.

Tet resistance: See "Antibiotic resistance".

Tissue-specific expression: Gene function which is restricted to a particular tissue or cell type. For example, the glycoprotein hormone alpha subunit is produced only in certain cell types of the anterior pituitary and placenta, not in lungs or skin; thus expression of the glycoprotein hormone alpha-chain gene is said to be tissue-specific. Tissue-specific expression is usually the result of an enhancer which is activated only in the proper cell type.

Tm: The melting point for a double-stranded nucleic acid. Technically, this is defined as the temperature at which 50% of the strands are in double-stranded form and 50% are single-stranded, i.e. midway in the melting curve. A primer has a specific Tm because it is assumed that it will find an opposite strand of appropriate character.
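For short primers, a common rule of thumb (often called the Wallace rule; it is not part of the definition above) estimates Tm as 2 degrees C for each A or T plus 4 degrees C for each G or C. The Python sketch below applies it; treat the result as a rough guide only, since the true Tm also depends on salt concentration, primer concentration and sequence context. The function name is made up for the example.

```python
def wallace_tm(primer: str) -> int:
    """Rough Tm estimate in degrees C: 2 per A/T plus 4 per G/C."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# 6 A/T and 12 G/C in this 18-mer: 2*6 + 4*12
print(wallace_tm("ATGGGGCCACGGCTGTGA"))  # 60
```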

Transcription factor: A protein which is involved in the transcription of genes. These usually bind to DNA as part of their function (but not necessarily). A transcription factor may be general (i.e. acting on many or all genes in all tissues), or tissue-specific (i.e. present only in a particular cell type, and activating the genes restricted to that cell type). Its activity may be constitutive, or may depend on the presence of some stimulus; for example, the glucocorticoid receptor is a transcription factor which is active only when glucocorticoids are present.

Transcription: The process of copying DNA to produce an RNA transcript. This is the first step in the expression of any gene. The resulting RNA, if it codes for a protein, will be spliced, polyadenylated, transported to the cytoplasm, and by the process of translation will produce the desired protein molecule.

Transfection: A method by which experimental DNA may be put into a cultured mammalian cell. Such experiments are usually performed using cloned DNA containing coding sequences and control regions (promoters, etc) in order to test whether the DNA will be expressed. Since the cloned DNA may have been extensively modified (for example, protein binding sites on the promoter may have been altered or removed), this procedure is often used to test whether a particular modification affects the function of a gene.

Transformation (with respect to bacteria): The process by which a bacterium acquires a plasmid and becomes antibiotic resistant. This term most commonly refers to a bench procedure performed by the investigator which introduces experimental plasmids into bacteria.

Transformation (with respect to cultured cells): A change in cell morphology and behavior which is generally related to carcinogenesis. Transformed cells tend to exhibit characteristics known collectively as the "transformed phenotype" (rounded cell bodies, reduced attachment dependence, increased growth rate, loss of contact inhibition, etc). There are different "degrees" of transformation, and cells may exhibit only a subset of these characteristics. Not well understood, the process of transformation is the subject of intense research.

Transgenic mouse: A mouse which carries experimentally introduced DNA. The procedure by which one makes a transgenic mouse involves the injection of DNA into a fertilized egg at the pronuclear stage. The DNA is generally cloned, and may be experimentally altered. It will become incorporated into the genome of the embryo. That embryo is implanted into a foster mother, who gives birth to an animal carrying the new gene. Various experiments are then carried out to test the functionality of the inserted DNA.

Transient transfection: When DNA is transfected into cultured cells, it is able to stay in those cells for about 2-3 days, but then will be lost (unless steps are taken to ensure that it is retained - see Stable transfection). During those 2-3 days, the DNA is functional, and any functional genes it contains will be expressed. Investigators take advantage of this transient expression period to test gene function.

Translation: The process of decoding a strand of mRNA, thereby producing a protein based on the code. This process requires ribosomes (which are composed of rRNA along with various proteins) to perform the synthesis, and tRNA to bring in the amino acids. Sometimes, however, people speak of "translating" the DNA or RNA when they are merely reading the nucleotide sequence and predicting from it the sequence of the encoded protein. This might be more accurately termed "conceptual translation".
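"Conceptual translation" is easy to sketch in Python. The example below assumes the standard genetic code, reads a single frame starting at the first base, and stops at the first stop codon; the function name is invented for the illustration.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def conceptual_translation(seq: str) -> str:
    """Predict the protein (one-letter code) from a sense-strand DNA or mRNA."""
    dna = seq.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)

print(conceptual_translation("ATGGGGCCACGGCTGTGA"))  # MGPRL
```

MGPRL is Met-Gly-Pro-Arg-Leu in one-letter code, matching the example under "Sense strand".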

Tumor suppressor: A gene whose product inhibits progression towards neoplastic transformation. The best-known examples of tumor suppressors are the genes encoding the proteins p53 and Rb.

tRNA: "transfer RNA"; one of a class of rather small RNAs used by the cell to carry amino acids to the enzyme complex (the ribosome) which builds proteins, using an mRNA as a guide. Fairly abundant.

Upstream activator sequence: A binding site for transcription factors, generally part of a promoter region. A UAS may be found upstream of the TATA sequence (if there is one), and its function is (like an enhancer) to increase transcription. Unlike an enhancer, it cannot be positioned just anywhere or in any orientation.

Upstream/Downstream: In an RNA, anything towards the 5' end of a reference point is "upstream" of that point, and anything towards the 3' end is "downstream". This orientation reflects the direction of both the synthesis of mRNA, and its translation - from the 5' end to the 3' end. In DNA, the situation is a bit more complicated. In the vicinity of a gene (or in a cDNA), the DNA has two strands, but one strand is virtually a duplicate of the RNA, so its 5' and 3' ends determine upstream and downstream, respectively. NOTE that in genomic DNA, two adjacent genes may be on different strands and thus oriented in opposite directions. Upstream or downstream is only used in conjunction with a given gene.

Vector: The DNA "vehicle" used to carry experimental DNA and to clone it. The vector provides all sequences essential for replicating the test DNA. Typical vectors include plasmids, cosmids, phages and YACs.

Western blot: A technique for analyzing mixtures of proteins to show the presence, size and abundance of one particular type of protein. Similar to Southern or Northern blotting (see "Blotting"), except that (1) a protein mixture is electrophoresed in an acrylamide gel, and (2) the "probe" is an antibody which recognizes the protein of interest, followed by a radioactive secondary probe (such as 125I-protein A).

YAC: Yeast artificial chromosome. This is a vector for cloning very large fragments of DNA. Genomic DNA in fragments of 200-500 kb is linked to sequences which allow it to propagate in yeast as a mini-chromosome (including telomeres, a centromere and an ARS - an autonomous replication sequence). This technique is used to clone large genes and intergenic regions, and for chromosome walking.

Zinc finger: A protein structural motif common in DNA binding proteins. Four Cys residues are found for each "finger" and one finger can bind one zinc ion. A typical configuration is: CysXxxXxxCys--(intervening 12 or so aa's)--CysXxxXxxCys.
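The "typical configuration" in the zinc finger entry can be written as a regular expression over one-letter amino acid codes. The Python sketch below is only an illustration: it reads "12 or so" as anywhere from 10 to 14 residues (an assumption), the test sequence is made up, and a match says nothing about whether a real protein actually folds into a zinc finger.

```python
import re

# CysXxxXxxCys--(10 to 14 residues, an assumed window)--CysXxxXxxCys
ZINC_FINGER = re.compile(r"C..C.{10,14}C..C")

def find_zinc_fingers(protein: str):
    """Return (start, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in ZINC_FINGER.finditer(protein)]

# Invented toy sequence: two flanking residues on each side of one motif.
toy_protein = "MK" + "CAAC" + "G" * 12 + "CAAC" + "KK"
print(find_zinc_fingers(toy_protein))
```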




REFERENCE: A Quick and Dirty Reference to Terms Used in Molecular Biology

Dr. Robert H. Lyons, Director
University of Michigan DNA Sequencing Core
