  • Sahana

Using KEGG Pathways for Computational Biology

You’ve decided on a disease to research and are ready to do some intense genetic investigation - at home.  

Computational biology and bioinformatics have become revolutionary fields for researchers in biology due to the wide range of functions that software analysis tools provide. From BLAST to Ensembl, biological research has evolved to allow us to investigate genetic mechanisms, all with the computational power of a laptop. 

But how would you figure out which genes are involved in a disease without a lab? Even harder is learning how certain genes, molecules, and proteins interact, and what those interactions mean for a disease. You could Google ‘your disease’ + ‘genes’ and sift through the 4 million Google Scholar results. Another option would be to use R or Python to analyze a database of genes implicated in your disease of focus.

Or you could take a faster, code-free route to learning about the biological mechanisms behind your disease: KEGG. KEGG can supplement bioinformatics code, but it can also be used without any programming. If you want a quick, curated, big-picture view of your disease, KEGG Pathway is a strong place to start.

KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases that can be used to interpret the role of genes in the body by displaying biological pathways for diseases, drugs, and chemicals. KEGG pathway maps are manually drawn, easy to read, and beginner-friendly. Each pathway map draws on experimental knowledge of metabolic function, organism behavior, and molecular interactions. KEGG objects are biological entities such as genes, proteins, small molecules, pathways, and viruses.

Once you’ve chosen your disease, visit KEGG Pathway Maps (BRITE) to conduct your analysis. KEGG BRITE is an ontology database that displays the functional hierarchies of the biological objects mentioned above and the relationships between them. Click the ‘Human Diseases’ dropdown and select the category your disease falls under. For example, tuberculosis falls under the category of an ‘infectious bacterial disease’.

Next, click on your disease of focus to be directed to the KEGG Pathway site. You should see a web-like network of small rectangles, each labeled with a biological object. KEGG Pathways are drawn using a standard map notation, so refer to the KEGG map notation guide to interpret the pathway(s) involved in your disease. Generally, multiple pathways are implicated in one disease, and each is indicated by a rounded box labeled with the name of the specific pathway. Another helpful feature of KEGG is that in maps of human + pathogen interactions, the cell structure is clearly marked: the cell membrane is shown as the outer boundary, and the nuclear membrane is drawn as a dotted line.


Caption: KEGG Pathways involved in Tuberculosis. Image by KEGG Pathway site, cited below.

Within tuberculosis, the NOD-like receptor signaling pathway is triggered by NamH, as indicated by the direction of the arrows. The pathway map without coloring is the original version, drawn manually with KEGG’s in-house software, KegSketch. Generally, though, the pathways you investigate will have computationally generated colors, which are explained in the help guide. If you’re interested in genes that are involved in your disease of focus, look for green boxes, which are hyperlinked to GENES entries in KEGG. Each node of your network, such as a box, is given a KEGG Orthology identifier, called a K number. In the NOD-like receptor signaling pathway, for example, I could select NOD2 and RIP2 to open each gene’s entry and investigate its function independently.
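If you do want to supplement the clicking with a little code, the same GENES entries can be pulled from KEGG’s REST service at rest.kegg.jp. The sketch below is one possible way to do it in Python, not an official client: the function names are my own, and the entry ID in the comment (hsa:64127, which I am assuming is human NOD2) should be checked against the KEGG search before you rely on it. It assumes the usual KEGG flat-file layout, where the field name sits in the first 12 columns and continuation lines are indented.

```python
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"


def kegg_get_url(entry_id):
    """Build the REST 'get' URL for one KEGG entry, e.g. 'hsa:64127'."""
    return f"{KEGG_REST}/get/{entry_id}"


def parse_flat_file(text):
    """Group the lines of a KEGG flat-file entry by field name.

    Assumes the field name occupies the first 12 columns and that
    continuation lines leave those columns blank. Indented sub-fields
    (such as AUTHORS under REFERENCE) are treated as their own keys.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if line.startswith("///"):  # end-of-entry marker
            break
        key = line[:12].strip()
        value = line[12:].strip()
        if key:
            current = key
            fields.setdefault(current, [])
        if current is not None and value:
            fields[current].append(value)
    return fields


def fetch_entry(entry_id):
    """Download one entry and parse it (needs an internet connection)."""
    with urlopen(kegg_get_url(entry_id)) as response:
        return parse_flat_file(response.read().decode("utf-8"))


# Hypothetical usage (ID assumed to be human NOD2; verify it first):
# entry = fetch_entry("hsa:64127")
# print(entry.get("ORTHOLOGY"))  # the K number line(s)
# print(entry.get("PATHWAY"))    # pathways this gene appears in
```

The parser is deliberately simple: it keeps every field as a list of raw lines, so you can decide afterwards how to split out K numbers or pathway IDs.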

If your network is large and you want to know how one specific gene influences the collection of pathways, or at least where this gene is located, use the ‘Search’ bar and type in the gene name. You will receive a numerical ID; enter this number into the ‘ID search’ tool. Click ‘Go’ and the gene should be highlighted on your network in a different color, usually red or pink. To highlight multiple genes in your pathway at once, look up the ID number of each gene separately, make a list, and then enter the whole list into ID Search.
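The look-up-then-list routine above can also be sketched with the KEGG REST service, which has a ‘find’ operation for searching by name and a ‘link’ operation for cross-references such as gene-to-pathway. This is a rough sketch under my own assumptions rather than a definitive recipe: the helper names are hypothetical, the gene IDs in the comments are assumed to be human NOD2 and RIPK2 (RIP2) and should be verified, and the parser assumes the responses are tab-delimited with one pair per line.

```python
from urllib.parse import quote
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"


def kegg_find_url(database, query):
    """URL that searches a KEGG database (e.g. 'hsa') for a keyword."""
    return f"{KEGG_REST}/find/{database}/{quote(query)}"


def kegg_link_url(target_db, source_ids):
    """URL that asks which target_db entries are linked to the given IDs."""
    return f"{KEGG_REST}/link/{target_db}/{'+'.join(source_ids)}"


def parse_pairs(text):
    """Split a tab-delimited response into (left, right) tuples."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        left, _, right = line.partition("\t")
        pairs.append((left, right))
    return pairs


def group_by_source(pairs):
    """Collect the right-hand values for each left-hand ID."""
    grouped = {}
    for source, target in pairs:
        grouped.setdefault(source, []).append(target)
    return grouped


def fetch_pairs(url):
    """Download a tab-delimited response (needs an internet connection)."""
    with urlopen(url) as response:
        return parse_pairs(response.read().decode("utf-8"))


# Hypothetical usage (IDs are assumptions; verify them with a search first):
# hits = fetch_pairs(kegg_find_url("hsa", "NOD2"))
# links = fetch_pairs(kegg_link_url("pathway", ["hsa:64127", "hsa:8767"]))
# print(group_by_source(links))
```

Grouping by gene makes it easy to see at a glance which pathways each gene on your list appears in, which is the same question the ID Search highlighting answers visually.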

KEGG can be used for more than just diseases, however. KEGG BRITE allows you to learn about metabolic pathways, cellular processes, drug development, and even organismal systems. Of course, there are other methods of exploring these biological processes, and there are more ways to use a KEGG Pathway than learning about biological networks. For more information on KEGG and BRITE, visit the KEGG website and the NCBI papers listed below.

