One of many library books on protein analysis
Use Socrates to find printed or digital books or journals at Stanford. Digital journals available to the Stanford community are listed in Electronic Journals and Newspapers. To find journal articles on protein analysis, use one of the databases listed under Article Resources below. Under Web Resources, find organizations, government agencies, or web sites related to the topic. Can’t find what you need? Comments or questions? Contact Falconer Biology Library.
Article Resources
BIOBASE Knowledge Library (BKL), combining the PROTEOME and TRANSFAC suites of databases, offers rich, high quality content and integrated analysis tools spanning gene regulation and pathways to fully annotated genomes. These database products offer a combination of computational and experimental data. PROTEOME is a comprehensive knowledge resource for the complete known proteomes of selected species, including humans. TRANSFAC compiles data on transcription factors, their experimentally-proven binding sites, and regulated genes.
Also known as BIOSIS Previews, this database indexes the worldwide literature of research in the biological and biomedical sciences. It covers the entire field of life sciences, including original research reports and reviews in field, laboratory, clinical, experimental, and theoretical work. BIOSIS indexes journals, technical reports, meeting proceedings, United States patents, and books in biology, biomedicine, and related areas. The database indexes literature published from 1926 to the present. Over 500,000 journal articles and other documents from over 6,000 journals and other sources are indexed each year.
Developed by the National Center for Biotechnology Information (NCBI), PubMed is a free version of MEDLINE that offers links to GenBank records, other molecular sequence data, and other resources. MEDLINE, produced by the National Library of Medicine, is an index to journal articles in medicine, nursing, dentistry, veterinary medicine, the health care system, and the preclinical sciences, including both basic biomedical sciences and clinical practice. It indexes over 4,000 journals. The database contains citations from 1950 to the present with some older citations. An important feature of MEDLINE is the use of Medical Subject Headings (MeSH), a powerful tool for searching specific topics or for comprehensive general searches. MEDLINE is also available to Stanford users via EBSCO, ISI Web of Knowledge, NLM Gateway, OCLC, and Ovid.
The Web of Science service, available via ISI Web of Knowledge, includes three core component databases: the expanded version of Science Citation Index (SciSearch), Social Sciences Citation Index, and Arts & Humanities Citation Index. The Science Citation Index provides access to current and retrospective bibliographic information, author abstracts, and cited references found in the world's leading scholarly science and technical journals covering over 100 disciplines from 1900 to the present. An important feature of SciSearch is the ability to perform cited reference searching to find recent articles that cite an earlier work.
xSearch, developed through the Stanford University Libraries' partnership with Deep Web Technologies, provides Stanford researchers and students with a single search option for multiple online resources. Searches may be limited to specific databases, or all available sources may be searched simultaneously. Search results are merged into one relevance ranked list, and are clustered by topic, author, source publication, publisher, and date. Custom searches using any selection of available databases can be created and re-used. Users may also create alerts in order to be informed automatically of new items that match search criteria.
Web Resources
The Arabidopsis Information Resource offers genomic information, as well as analysis tools, links and downloads all about Arabidopsis. It is also searchable for information on genes, DNA, genetic markers, proteins and polymorphisms.
ArrayExpress is a public archive for functional genomics data compliant with MIAME and MINSEQE requirements in accordance with MGED recommendations. The Gene Expression Atlas uses a curated, re-annotated subset of data from the Archive to provide information about gene expression under various biological conditions.
The Alternative Splicing and Transcript Diversity (ASTD) database project, maintained by EMBL-EBI, is creating a database of alternative splice events and transcripts of genes from human, mouse and rat. Full length transcripts are generated with the aim of understanding the mechanism of alternative splicing on a genome-wide scale.
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.
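To make "regions of local similarity" concrete, the short Python sketch below scores the best local alignment between two sequences with Smith-Waterman dynamic programming. This is not how BLAST works internally (BLAST uses a faster heuristic search and also reports statistical significance); it is only a small illustration of the idea of local alignment. The function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score between a and b.

    Scoring values are illustrative assumptions, not BLAST defaults.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("TTACGTTT", "GGACGGG"))
```

The score rises only where the two sequences share a similar stretch, which is why unrelated flanking regions do not penalize the result.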
The Binding Database project aims to make experimental data on the noncovalent association of molecules in solution searchable via the WWW. The initial focus is on biomolecular systems, but data on host-guest and supramolecular systems are also important and will be included in time.
The CBS website contains an assortment of information including news, events, research groups, prediction servers, bioinformatics tools, datasets and publications. The prediction servers have information on both nucleotide and amino acid sequences.
CiteXplore combines literature searches of MEDLINE/PubMed and full-text databases such as PubMed Central and patents with text mining tools for biology. Search results are cross-referenced to EBI applications based on publication identifiers. Links to full-text versions are provided where available.
Cold Spring Harbor Protocols is an interdisciplinary digital journal providing a definitive source of research methods in cell, developmental and molecular biology, genetics, bioinformatics, protein science, computational biology, immunology, neuroscience and imaging. Each monthly issue details multiple essential methods—a mix of cutting-edge and well-established techniques. Newly commissioned protocols and unsolicited submissions are supplemented with articles based on Cold Spring Harbor Laboratory’s courses and manuals.
The Catalytic Site Atlas (CSA) is a database documenting enzyme active sites and catalytic residues in enzymes of 3D structure. The classification of catalytic residues includes only those residues thought to be directly involved in some aspect of the reaction catalyzed by an enzyme.
This encyclopaedia for cytokines is browsable and utilizes an alphabetical list. Also provided is structure information, as well as links to journals, symposiums and reviews.
The DIP database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. The data stored within the DIP database are curated manually by expert curators and automatically using computational approaches that utilize the knowledge about the protein-protein interaction networks extracted from the core subset of the DIP data. Individual registration is required.
The European Genome-phenome Archive (EGA), maintained by EMBL-EBI, is designed to be a repository for all types of genotype experiments, including case control, population, and family studies. The archive will include SNP and CNV genotypes from array-based methods and genotyping done with re-sequencing methods. These data may be either publicly available or limited access, depending on the design of the study.
The Entrez Cross-database Search Page lists, describes, and provides a link to each NCBI database. On this page, all NCBI databases can be searched simultaneously.
ENZYME is a repository of information relative to the nomenclature of enzymes. It is primarily based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and it describes each type of characterized enzyme for which an EC (Enzyme Commission) number has been provided.
The European Bioinformatics Institute is part of the European Molecular Biology Laboratory. Its site has a large number of searchable databases of nucleotide sequences, protein sequences and protein structures.
The European Molecular Biology Laboratory (EMBL) is dedicated to basic research in the molecular life sciences.
This proteomics server from the Swiss Institute of Bioinformatics includes PROSITE, ENZYME, and GermOnLine, as well as other databases. There are also links to tools and software packages that can be downloaded and used for proteomics and sequencing.
The ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB) is dedicated to the analysis of protein sequences and structures. ExPASy Translate is a tool that translates a DNA or RNA sequence to a protein sequence.
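As a rough illustration of what a translation tool does, the Python sketch below converts a DNA or RNA sequence into a protein sequence using the standard genetic code. It is a simplified sketch, not ExPASy's implementation: it reads only one forward frame starting at the first base, and the names `CODON_TABLE` and `translate` are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, stop_at_stop=True):
    """Translate a DNA or RNA sequence in its first forward reading frame.

    Unknown codons (for example ones containing N) become 'X'.
    A trailing incomplete codon is ignored.
    """
    seq = seq.upper().replace("U", "T")  # treat RNA as DNA
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break  # stop codon ends the protein
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

A fuller tool would also translate the other two forward frames and the three frames of the reverse complement.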
The Experimental Factor Ontology (EFO) is an application focused ontology modeling the experimental factors in ArrayExpress. The ontology has been developed to increase the richness of the annotations that are currently made in the ArrayExpress repository, to promote consistent annotation, to facilitate automatic annotation and to integrate external data.
FingerPRINTScan classifies sequences using familial definitions from the PRINTS database, allowing progress to be made with the identification of distant evolutionary relationships. The approach makes use of the contextual information inherent in a multiple-motif method, and has the power to identify hitherto unidentified relationships in mass genome data.
FSSP (families of structurally similar proteins) is a database of structural alignments of proteins in the Protein Data Bank (PDB) [1]. The database currently contains an extended structural family for each of 330 representative protein chains.
This website takes information from a variety of databases, including GenBank, LocusLink and PDB to provide ample data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. It also includes information about mitochondrial diseases and links.
The Human Protein Atlas shows expression and localization of proteins in a large variety of normal human tissues, cancer cells and cell lines with the aid of immunohistochemistry images and immunofluorescence confocal microscopy images.
Human Protein Reference Database (HPRD) integrates data that is deposited in Human Proteinpedia with the existing literature in a collection of curated information on individual proteins. Human Proteinpedia is a community portal for sharing and integration of human protein data. It allows research laboratories to contribute and maintain protein annotations. All the public data contributed to Human Proteinpedia can be queried, viewed and downloaded.
The Human Proteome Initiative (HPI) aims to annotate all known human protein sequences and their mammalian orthologs, according to the quality standards of UniProtKB/Swiss-Prot.
IMGT, the international ImMunoGeneTics information system, is an integrated knowledge resource specialized in the immunoglobulins (IG), T cell receptors (TR), major histocompatibility complex (MHC), immunoglobulin superfamily (IgSF), major histocompatibility complex superfamily (MhcSF) and related proteins of the immune system (RPI) of human and other vertebrate species.
The Integr8 web portal provides easy access to integrated information about deciphered genomes and their corresponding proteomes. Available data includes DNA sequences (from databases including the EMBL Nucleotide Sequence Database, Genome Reviews, and Ensembl); protein sequences (from databases including the UniProt Knowledgebase and IPI); statistical genome and proteome analysis (performed using InterPro, CluSTr, and GOA); and information about orthology, paralogy, and synteny.
IntEnz (Integrated relational Enzyme database) is a freely available resource focused on enzyme nomenclature. IntEnz is created by EMBL-EBI in collaboration with the Swiss Institute of Bioinformatics (SIB). This collaboration is responsible for the production of the ENZYME resource. IntEnz contains the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) on the nomenclature and classification of enzyme-catalysed reactions.
InterProScan Sequence Search allows you to query your sequence against InterPro.
The Immuno Polymorphism Database (IPD) was developed to provide a centralized system for the study of polymorphism in genes of the immune system. The IPD project was established by the HLA Informatics Group of the Anthony Nolan Research Institute in close collaboration with the European Bioinformatics Institute.
IPI, maintained by EMBL-EBI, provides a top level guide to the main databases that describe the proteomes of higher eukaryotic organisms.
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database of biological systems, consisting of genetic building blocks of genes and proteins (KEGG GENES), chemical building blocks of both endogenous and exogenous substances (KEGG LIGAND), molecular wiring diagrams of interaction and reaction networks (KEGG PATHWAY), and hierarchies and relationships of various biological objects (KEGG BRITE). KEGG provides a reference knowledge base for linking genomes to biological systems and also to environments by the processes of PATHWAY mapping and BRITE mapping.
In the LGIC Database you will find the nucleic acid and protein sequences of ligand-gated ion channel subunits. Multiple sequence alignments can be generated, and some phylogenetic studies of the superfamilies are provided. Finally, the atomic coordinates of subunits, or portion of subunits, are provided when available.
MaizeGDB, funded by USDA Agricultural Research Service, is the community database for biological information about the crop plant Zea mays ssp. mays. The following datatypes are accessible through this site: genetic, genomic, sequence, gene product, functional characterization, literature reference, and person/organization contact information.
MaxSprout is a fast database algorithm for generating protein backbone and side chain co-ordinates from a C(alpha) trace. The backbone is assembled from fragments taken from known structures. Side chain conformations are optimised in rotamer space using a rough potential energy function to avoid clashes.
MAST is a tool for searching biological sequence databases for sequences that contain one or more of a group of known motifs.
The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System is a unique resource that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in the absence of direct experimental evidence.
Phosphosite provides information about in vivo phosphorylation sites in both humans and mice. It allows users to view specific phosphorylation sites based on a specific protein sequence.
An important problem in sequence analysis is to find patterns matching sets or subsets of sequences. This tool allows the user to search for patterns conserved in sets of unaligned protein sequences. The user can specify what kind of patterns should be searched for, and how many sequences should match a pattern to be reported.
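The matching half of this problem can be sketched in a few lines of Python. The example below assumes patterns are written in PROSITE-style syntax (for example `N-{P}-[ST]-{P}`), converts a pattern to a regular expression, and reports which unaligned sequences contain it. It does not discover new patterns, which is the harder task the tool addresses, and the function names and sample sequences are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern into a Python regular expression."""
    regex = pattern.rstrip(".")
    regex = regex.replace("-", "")                       # element separators
    regex = regex.replace("x", ".")                      # x = any residue
    regex = regex.replace("{", "[^").replace("}", "]")   # {P} = anything but P
    regex = regex.replace("(", "{").replace(")", "}")    # x(2,4) = repeat 2 to 4 times
    regex = regex.replace("<", "^").replace(">", "$")    # sequence start / end anchors
    return regex


def sequences_matching(pattern, sequences):
    """Return the names of the sequences that contain the pattern."""
    compiled = re.compile(prosite_to_regex(pattern))
    return [name for name, seq in sequences.items() if compiled.search(seq)]


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    example = {"a": "MKNGSA", "b": "MKNPSA", "c": "AANATA"}
    hits = sequences_matching("N-{P}-[ST]-{P}", example)
    min_matches = 2  # report the pattern only if enough sequences match
    if len(hits) >= min_matches:
        print("Pattern found in:", hits)
```

The `min_matches` threshold mirrors the option of choosing how many sequences must match before a pattern is reported.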
The Predict Protein Server provides resources for both structure prediction and sequence analysis. It offers many databases to search, including multiple sequence alignments, functional motifs, composition-bias, protein domains and fold recognition. It also supplies predictions of various protein structures, such as secondary structure, solvent accessibility, transmembrane helix, globularity, coiled-coil regions, cysteine bonds and structural switching regions.
ProPhylER (Protein Phylogeny and Evolutionary Rates) quantifies evolutionary constraint to annotate functionally or structurally important regions in proteins and to predict the impact of coding polymorphisms. ProPhylER gives the researcher comprehensive constraint data on tens of thousands of eukaryotic proteins, represented by hundreds of thousands of individual sequences.
PDB is home to many 3-D macromolecular protein structures as well as nucleic acid structures. The database is searchable, and can provide information regarding each structure including who discovered it, the source, and the number of polymer chains and atoms.
PDBe is the EBI Protein Structure Database in Europe, a project for the collection, management and distribution of data about macromolecular structures, derived from the Protein Data Bank (PDB).
SCANSITE has three databases that can be utilized with protein sequence information: motif scan, database search, and sequence search. Some features listed are the ability to scan multiple proteins at once, the ability to search using multiple motifs, and the ability to search databases for two sequence patterns.
RADAR stands for Rapid Automatic Detection and Alignment of Repeats in protein sequences. Many large proteins have evolved by internal duplication, and many internal sequence repeats correspond to functional and structural units. RADAR uses an automatic algorithm to segment your query sequence into repeats. It identifies short composition-biased repeats as well as gapped approximate repeats and complex repeat architectures involving many different types of repeats in your query sequence.
The RESID Database of Protein Modifications is a comprehensive collection of annotations and structures for protein modifications including amino-terminal, carboxyl-terminal and peptide chain cross-link post-translational modifications.
Sequence Motif search will search through motif libraries online to see if the protein or nucleic acid sequence entered contains any motifs and identifies them.
This page, maintained by the European Bioinformatics Institute, lists various sequence database similarity search tools such as FASTA, BLAST, MPsrch and ScanPS. Interactive as well as email submissions are available for each of these services.
The Stanford bioinformatics resource provides workshops, consultations, and hardware and software access for various computers and systems for use in biomedical research.
The Stanford Genome Technology Center provides information on the cost-effectiveness of DNA sequencing, and about the genome of S. cerevisiae. The site also offers publications and project information.
The Stanford Microarray Database (SMD) is a research tool and archive that allows researchers to store, annotate, analyze and share data generated by microarray technology. SMD supports most major microarray platforms, is MIAME-supportive, and can export or import MAGE-ML. Registration is required and fees may apply.
SWISS-2DPAGE contains data on proteins identified on various 2-D PAGE and SDS-PAGE reference maps. You can locate these proteins on the 2-D PAGE maps or display the region of a 2-D PAGE map where one might expect to find a protein from UniProtKB/Swiss-Prot.
The SWISS-MODEL Repository is a database of annotated three-dimensional comparative protein structure models generated by the fully automated homology-modelling pipeline SWISS-MODEL.
This online data library is a great source for information on calcium binding proteins. Included about each protein are general information, sequence information, structural information and mutant information.
UniPathway provides a collection of manually curated metabolic pathways for the UniProtKB/Swiss-Prot knowledgebase.
The Universal Protein Resource (UniProt), maintained by EMBL-EBI, is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt Metagenomic and Environmental Sequences (UniMES) database is a repository specifically developed for metagenomic and environmental data.
World-2DPAGE Repository is a public standards-compliant repository for proteomics image data published in the literature.
The Zebrafish Information Network is a comprehensive database on zebrafish, including transgenics, wild-type lines, genes, gene expression, genetic maps and publications.
Encyclopedias
NCBI Entrez Resources
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.
Entrez searches the National Center for Biotechnology Information (NCBI) databases to provide integrated access to nucleotide and protein sequence data from over 160,000 species, along with three-dimensional protein structures, genomic mapping information, PubMed MEDLINE, and more. Sequence data are combined from various sources, including GenBank, EMBL, DDBJ, RefSeq, PIR-International, PRF, Swiss-Prot, and PDB. Entrez can be searched with a wide variety of text terms such as author name, journal name, gene or protein name, organism, unique identifier (e.g., accession number, sequence ID, PubMed ID), and other terms, depending on the database being searched.
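Fielded text searches like these can also be prepared programmatically. The Python sketch below builds a search URL for NCBI's E-utilities `esearch` service, which is not mentioned in this guide and is brought in here only as an illustration. The helper names are hypothetical, and the field names used (`Protein Name`, `Organism`) are examples; the fields that are valid depend on the database being searched. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Public E-utilities search endpoint (an assumption of this sketch; check NCBI's
# current documentation and usage policies before sending requests).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_entrez_query(terms):
    """Join (text, field) pairs into one fielded query string with AND."""
    return " AND ".join(f"{text}[{field}]" for text, field in terms)


def build_esearch_url(database, terms, max_results=20):
    """Build an esearch URL for the given database and fielded terms."""
    params = {
        "db": database,                     # e.g. "protein" or "pubmed"
        "term": build_entrez_query(terms),  # fielded query text
        "retmax": max_results,              # maximum number of IDs to return
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    url = build_esearch_url(
        "protein",
        [("insulin", "Protein Name"), ("Homo sapiens", "Organism")],
        max_results=5,
    )
    print(url)
```

Combining a free-text term with a field tag narrows the search to one part of the record, in the same way as choosing a field in the Entrez web interface.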
3D Domains are compact structural domains identified automatically in MMDB, Entrez's macromolecular three-dimensional structure database. 3D Domains are the units of comparison for structure neighbor calculations using the VAST algorithm. Links to VAST or 3D Domain neighbors display 3D Domains (and complete polypeptide chains) with similar 3D structures.
The BioSystems database contains records that group together molecules that interact in biological systems. One type of biosystem is a biological pathway, which can consist of interacting genes, proteins, and small molecules. Another type of biosystem is a disease, which can involve components such as genes, biomarkers, and drugs.
In collaboration with authors and publishers, the National Center for Biotechnology Information (NCBI) is adapting biomedical books for the web.
Cancer Chromosomes contains three cancer cytogenetic databases: the NCI Mitelman Database of Chromosome Aberrations in Cancer, the NCI Recurrent Chromosome Aberrations in Cancer, and the NCI and NCBI SKY/M-FISH & CGH Database. Karyotype, SKY/M-FISH, and CGH data can be searched simultaneously. Similarity searches demonstrate cytogenetic and clinical relatedness at varying levels of specificity.
CDD currently contains domains derived from two popular collections, Smart and Pfam, plus contributions from colleagues at NCBI, such as COG. The source databases also provide descriptions and links to citations. Since conserved domains correspond to compact structural units, CDs contain links to 3D-structure via Cn3D whenever possible.
The Entrez Cross-database Search Page lists, describes, and provides a link to each NCBI database. On this page, all NCBI databases can be searched simultaneously.
The database of Genotypes and Phenotypes (dbGaP) stores phenotype and genotype data, as well as the associations between them. Studies generating data for dbGaP will include genome-wide association studies, medical sequencing, and molecular diagnostic assays. Summaries of phenotype and genotype data as well as study documents and association analyses (when available) will be found on the public site. Authorized access may be required for downloading coded individual-level phenotypes, genotypes, and pedigrees.
The EST database contains all records found within the Expressed Sequence Tag (EST) division of GenBank. EST records contain first-pass single-read cDNA sequences and include no annotated biological features.
Gene organizes information about the characteristics and defining sequences of genes from species in Genome, RefSeq, and other model organisms.
The whole genomes of over 1000 viruses and over 100 microbes can be found in Entrez Genome. The genomes represent both completely sequenced organisms and those for which sequencing is in progress. All three main domains of life (bacteria, archaea, and eukaryota) are represented, as well as many viruses and organelles.
Genome Project is a collection of complete and incomplete large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals from which all projects in the database pertaining to that organism can be browsed and retrieved.
The GENSAT project aims to map the expression of genes in the central nervous system of the mouse, using both in situ hybridization and transgenic mouse techniques. The GENSAT database contains mouse brain images at several different developmental stages using both techniques, with a searchable set of gene expression annotations.
Comparable experimental sample sets assembled from the Gene Expression Omnibus (GEO) repository. Entrez GDS queries all GEO DataSet annotation, allowing identification of experiments of interest.
Individual gene expression and molecular abundance profiles assembled from the Gene Expression Omnibus (GEO) repository. Entrez GEO Profiles queries annotation and pre-computed profile characteristics, allowing identification of specific genes, and molecular abundance profiles of interest.
The GSS database contains all records found within the Genome Survey Sequence (GSS) division of GenBank. GSS records contain first-pass single-read genomic sequences and rarely include annotated biological features.
HomoloGene is an automated system for detecting homologs among the annotated genes of several completely sequenced eukaryotic genomes.
The Entrez Journals database searches for a journal and has links to records for that journal in the databases. The Journals database can be searched using the journal title, MEDLINE abbreviation, NLM ID, ISO abbreviation, or ISSN. The database includes the journals in all Entrez databases, e.g., PubMed, Nucleotide, Protein.
MeSH is NLM's controlled vocabulary used for indexing articles in PubMed. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Use the MeSH database to build a PubMed search strategy.
The Nucleotide database contains records for all Entrez Nucleotide sequences that are not found within the Expressed Sequence Tag (EST) or Genome Survey Sequence (GSS) divisions of GenBank. These include sequences from all remaining divisions of GenBank, NCBI Reference Sequences (RefSeqs), Whole Genome Shotgun (WGS) sequences, Third Party Annotation (TPA) sequences, and sequences imported from the Entrez Structure database.
Online Mendelian Inheritance in Animals (OMIA) is a database of genes, inherited disorders and traits in animal species (other than human and mouse) authored by Professor Frank Nicholas of the University of Sydney, Australia, with help from many people over the years. The database contains textual information and references, as well as links to other relevant records.
Online Mendelian Inheritance in Man (OMIM) is a catalog of human genes and genetic disorders, with links to literature references, sequence records, maps, and related databases. It is based on the book, Mendelian Inheritance in Man. The online version is updated daily. The OMIM FAQs provide additional information about the book and the online database.
Peptidome is a repository of tandem mass spectrometry peptide and protein identification data generated by the scientific community.
PopSet is a set of DNA sequences that have been collected to analyse the evolutionary relatedness of a population. The population could originate from different members of the same species, or from organisms from different species. They are submitted to GenBank via Sequin, often as a sequence alignment.
The Probe Database is a public registry of sequence-specific reagents designed for use in a wide variety of biomedical research applications, together with information on reagent availability, experimental protocols, probe effectiveness, and computed sequence similarities.
The Protein entries in the Entrez search and retrieval system have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq.
Protein Clusters is a collection of related protein sequences (clusters). Currently it consists of Reference Sequence proteins encoded by complete prokaryotic and chloroplast genomes and plasmids. This database contains both curated and non-curated clusters.
SNP is NCBI’s database of Single Nucleotide Polymorphisms.
SRA is a database of raw sequence data from sequencing instruments.
Structure: The Molecular Modeling Database (MMDB) contains 3-D macromolecular structures, including proteins and polynucleotides. MMDB contains over 20,000 structures and is linked to the rest of the NCBI databases, including sequences, bibliographic citations, taxonomic classifications, and sequence and structure neighbors.
UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.
UniSTS is an NCBI resource that reports information about markers, or Sequence Tagged Sites (STS). UniSTS integrates marker and mapping data from public resources including GenBank, RHdb, GDB, various human maps (Genethon genetic map, Marshfield genetic map, Whitehead RH map, Whitehead YAC map, Stanford RH map, NHGRI chr 7 physical map, WashU chrX physical map), and various mouse maps (Whitehead RH map, Whitehead YAC map, Jackson Laboratory's MGD map).
The Map Viewer provides special browsing capabilities for a subset of organisms in Entrez Genomes. Map Viewer allows you to view and search an organism's complete genome, display chromosome maps, and zoom into progressively greater levels of detail, down to the sequence data for a region of interest. The number and types of available maps vary by organism. If multiple maps are available for a chromosome, Map Viewer displays them aligned to each other based on shared marker and gene names, and, for the sequence maps, based on a common sequence coordinate system.
The NCBI Taxonomy database contains the names of all organisms that are represented in the genetic databases with at least one nucleotide or protein sequence.
PubChem BioAssay contains the results of biological activity screening from a variety of public sources. It provides searchable descriptions of each bioassay, including descriptions of the conditions and readouts specific to that screening procedure. PubChem BioAssay results are linked to PubChem Substance, and in turn to PubChem Compound, whenever chemical structures are known. Screening results may be browsed via a web interface and also downloaded for further cheminformatics analysis.
PubChem Compound contains chemical structure information drawn from a variety of public sources. Compounds may be searched by chemical properties and are pre-clustered into identity and similarity groups by structure comparison. Whenever possible, compounds are linked via PubChem Substance to information on their biological activities. Available links include PubMed citations, protein 3D structures and links to biological screening results available in PubChem BioAssay.
PubChem Substance contains descriptions of chemical samples, from a variety of public sources, and links to information on their biological activities. The description includes links to PubChem Compound in cases where the chemical structures of compounds in the sample are known. Links providing information on biological activity include links to PubMed citations, protein 3D structures, and to biological screening results available in PubChem BioAssay.
Developed by the National Center for Biotechnology Information (NCBI), PubMed is a free version of MEDLINE that offers links to GenBank records, other molecular sequence data, and other resources. MEDLINE, produced by the National Library of Medicine, is an index to journal articles in medicine, nursing, dentistry, veterinary medicine, the health care system, and the preclinical sciences, including both basic biomedical sciences and clinical practice. It indexes over 4,000 journals. The database contains citations from 1950 to the present with some older citations. An important feature of MEDLINE is the use of Medical Subject Headings (MeSH), a powerful tool for searching specific topics or for comprehensive general searches. MEDLINE is also available to Stanford users via EBSCO, ISI Web of Knowledge, NLM Gateway, OCLC, and Ovid.
PubMed Central (PMC) is the U.S. National Library of Medicine's digital archive of life sciences journal literature. Access to the full text of articles in PMC is free, except where a journal requires a subscription for access to recent articles.

