Tutorial

KnetMiner search interface

The search field of KnetMiner allows users to input any terms related to traits of interest. The terms can be high-level descriptions of a phenotypic trait (e.g. heat tolerance) or more specific terms such as biological processes and protein families (e.g. defence response to fungi or LRR). Search terms can be combined with "OR", "AND" and "NOT" statements, or put into double quotation marks for exact searches, e.g. "pathogen" AND "disease".
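If you prepare queries outside the browser before pasting them into the search box, the combination rules above can be sketched in a few lines of Python. This is only an illustration: the helper names quote_term and combine are hypothetical and not part of KnetMiner, and the sketch assumes that multi-word terms should be wrapped in double quotes to be searched as exact phrases.

```python
ALLOWED_OPERATORS = {"OR", "AND", "NOT"}  # the operators named in this tutorial


def quote_term(term: str) -> str:
    """Wrap a multi-word term in double quotes; leave single words untouched."""
    term = term.strip()
    already_quoted = len(term) >= 2 and term.startswith('"') and term.endswith('"')
    if " " in term and not already_quoted:
        return f'"{term}"'
    return term


def combine(terms, operator="OR"):
    """Join search terms with one operator, e.g. 'a OR b OR c'."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f" {operator} ".join(quote_term(t) for t in terms)


# Hypothetical usage: build two of the query styles shown in this tutorial.
print(combine(["seed germination", "seed dormancy"]))
print(combine(["pathogen", "disease"], operator="AND"))
```

The resulting strings can be pasted into the search box as they are; the live gene and document counts will then tell you whether the query is too broad or too narrow.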

Additionally, as the user types a query, the number of documents and genes related to the query is shown and updated in real time, at each keyboard event. This is only active once the query term is longer than 3 characters. It will 1) help the user to detect spelling mistakes, 2) give a hint if the query term is too general or too specific before the user executes the search, and 3) motivate the user to examine their query and explore different spelling, language, or more complex query statements ("OR", "AND" and "NOT").

Interested in a certificate of competency? See the bottom of this tutorial.

Refining search terms

A hint icon appears at the right-hand end of the search box to indicate that alternative search terms are available. Click the hint icon to open a tab-based query suggester; click it again to close it. The terms shown are derived from the underlying knowledge network. The query suggester helps users to refine their keywords by suggesting more specific or synonymous terms. For example, using the query suggester on the term "drought" suggests other terms such as "drought sensitivity" or "drought recovery". The query suggester allows adding, replacing or excluding the new terms. The real-time messaging updates directly when the keywords change, to indicate whether the new terms would lead to a different number of linked genes.



Searching with keyword, gene list and/or genome regions

You can search KnetMiner with keywords, a gene list, or genome regions (or a combination). KnetMiner will provide different types of responses based on the given inputs. The keyword search will search the whole genome, while the other two search modes will be restricted to the specified gene list or genomic regions, respectively.

The gene list search allows users to enter a list of gene names or accessions (one entry per line, up to a maximum of 100 gene IDs). The names/accession IDs must exactly match the gene names/IDs stored in the knowledge network; partial matches are not enabled. Tools like the Ensembl ID converter can be used to convert old gene IDs to those supported by KnetMiner.
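If your gene list comes from a spreadsheet or an analysis script, it can help to tidy it before pasting it into the Gene List box. The following Python sketch applies the two constraints stated above (one entry per line, at most 100 IDs). The function name prepare_gene_list is hypothetical, and the sketch assumes that IDs contain no spaces or commas; adjust that check if your gene names do.

```python
MAX_GENES = 100  # limit stated in this tutorial


def prepare_gene_list(raw: str) -> list[str]:
    """Return one cleaned ID per line, dropping blank lines and exact duplicates."""
    genes: list[str] = []
    seen: set[str] = set()
    for line in raw.splitlines():
        gene_id = line.strip()
        if not gene_id or gene_id in seen:
            continue
        # Assumption: an ID never contains whitespace or commas, so either one
        # suggests that several entries were placed on the same line.
        if "," in gene_id or any(ch.isspace() for ch in gene_id):
            raise ValueError(f"more than one entry on a line: {line!r}")
        seen.add(gene_id)
        genes.append(gene_id)
    if len(genes) > MAX_GENES:
        raise ValueError(f"{len(genes)} IDs supplied, but the limit is {MAX_GENES}")
    return genes


# Hypothetical usage: print the cleaned list, ready to paste into the Gene List box.
raw_text = "TRAESCS3D02G468400\n\n  TRAESCS3D02G468400  \n"
print("\n".join(prepare_gene_list(raw_text)))
```

Because matching is exact, the sketch deliberately does not change the case or format of the IDs themselves.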

The genome region mode restricts the search to genes that fall within the specified region. Entering the start and end position of a region will display the number of genes within those boundaries.

Since KnetMiner v3.0, entering search keywords is no longer mandatory. If you have a list of genes and no clue about what they do, simply paste your gene IDs/names into the Gene List box (without any keywords) and let KnetMiner provide a summary of all the information it has for your genes, their location, and enriched linked terms. You can then view their individual knowledge networks. Searching by this method will not rank genes; only paths from gene to trait and phenotype nodes will initially be shown in this instance. However, if you combine your gene list with keywords, KnetMiner will be able to rank your gene list based on relevance and highlight the most interesting paths of the knowledge network accordingly.
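To make the idea of a region-restricted search concrete, here is a minimal Python sketch of counting genes inside a region. It is not KnetMiner's implementation: the Gene record, the genes_in_region function and the example coordinates are all hypothetical, and the sketch assumes that a gene only counts when its whole span lies inside the region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    chromosome: str
    start: int  # first base of the gene
    end: int    # last base of the gene


def genes_in_region(genes, chromosome, start, end):
    """Return the genes on `chromosome` that lie entirely within [start, end]."""
    if start > end:
        raise ValueError("region start must not be greater than region end")
    return [
        g for g in genes
        if g.chromosome == chromosome and start <= g.start and g.end <= end
    ]


# Hypothetical usage with made-up coordinates.
example_genes = [
    Gene("geneA", "3D", 1_000, 2_000),
    Gene("geneB", "3D", 5_000, 9_000),
    Gene("geneC", "3A", 1_500, 1_800),
]
hits = genes_in_region(example_genes, "3D", 500, 6_000)
print(len(hits), [g.gene_id for g in hits])
```

A gene that only partly overlaps the region (geneB above) is excluded under this assumption; an overlap-based rule would be an equally reasonable choice.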

KnetMiner results views

The result of a search is a list of candidate genes along with supporting evidence. KnetMiner provides different views that help to explore the search results and drill into interesting candidate gene networks.



Gene View

The Gene View displays identified candidate genes in a table, sorted by KnetScore. The various node types (GO, TO, phenotype, pathway, gene, publication, etc.) matching the search terms are summarised in the legend (below the maximum number of genes dropdown). The legend is interactive and can be used as a filter. Clicking on one or multiple concepts (shapes) in the legend filters the table to genes with matching concepts in the EVIDENCE column, e.g. genes with pathway AND phenotype information. The concepts in the EVIDENCE column are expandable and provide a short description of the evidence. If the evidence is a publication, then the PubMed ID is shown and linked to PubMed.

Genes supplied by a user that are associated with the search terms are referred to as known targets, whereas those user genes that are not associated with any search term (nil evidence) are referred to as novel targets. A checkbox at the top of the Gene View table allows a user to select all known targets or novel targets instantly. Clicking on a single gene, or selecting several genes and clicking the "View Network" button below the table, opens the Network View.
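The distinction between known and novel targets comes down to whether a user gene has any evidence linked to the search terms. As a rough sketch, assuming you have noted the number of evidence items per gene (for example from the EVIDENCE column), the split could be expressed in Python as follows. The function name split_targets and the gene IDs are hypothetical.

```python
def split_targets(evidence_counts: dict[str, int]) -> tuple[list[str], list[str]]:
    """Split user genes into known targets (evidence > 0) and novel targets (0)."""
    known = [gene for gene, count in evidence_counts.items() if count > 0]
    novel = [gene for gene, count in evidence_counts.items() if count == 0]
    return known, novel


# Hypothetical usage with made-up gene IDs and evidence counts.
counts = {"geneA": 3, "geneB": 0, "geneC": 1}
known_targets, novel_targets = split_targets(counts)
print("known:", known_targets)
print("novel:", novel_targets)
```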



Map View

The Map View is a chromosome-based display. To the right of each chromosome, it shows all the genes related to the given search term(s). Colour coding is used to distinguish genes, with green for high scores, orange for medium, and red for low. SNPs are shown to the left as highlighted squares, colour coded according to the study shown in the SNP legend, relating to evidence found for the SNP. The user can also specify a region to search, so that only genes and SNPs within this region are shown. The view can be exported as a PNG, and the user can zoom in and out and move across it. The p-value (how significant the association of the gene is with the search term) can be altered via the settings option (cog icon). Right-clicking a SNP opens a pop-up box with further information, where the selected SNP/QTL can be hidden or displayed, as shown below. Genes can be selected in the Map View and opened in Network View by clicking the network icon at the top (far left).



Evidence View

Evidence View provides a table-based view of the node types (concepts) linked to the search results. The results are sorted according to query-relevance score. The number of genes which are linked to each concept in the knowledge network is also displayed (within the "USER_GENES" column). Clicking this value will bring you to the Network View, containing the selected concepts in the centre of the network with the shortest path which connects the evidence documents to the linked genes. When clicking on a node (concept) icon in the interactive legend, located above the table, the table results will be filtered by evidence type.



Network View

The Network View displays the knowledge networks of one or multiple genes selected from the previous views. The entry gene is displayed as a blue triangle with a double border. Each path from the entry gene to another node represents a relationship; initially, only the relationships most relevant to the search term are shown. The maximise button (far left) in the top menu renders the network in a maximised viewport. Clicking the binoculars shows the whole network, but this can slow the application down when too many concepts are loaded.

To view information regarding a concept (node), or its relationship (edge), hold the right mouse button on the node/edge and a wheel (context menu) will appear.

Click 'show info' to see an information table on the right of the network viewport. You can close this by clicking the 'X' button in the info-box. To reset the orientation (zoom) of the graph, click the reset button. The info button next to the 'CoSE layout' drop down menu can also be used to show the info box (you must then click a concept to show information for it). You can move concepts and edges to be more easily viewable.

You can also hide concepts, show their labels, or hide all concepts of a specific type by using the same wheel. Alternatively, use the interactive legend, where double-clicking a concept removes it and single-clicking adds it. On a touchscreen device, gently flick up on the legend concept to remove it and tap to add it. The numbers of concepts and relationships currently visible are shown below the interactive legend, with the total numbers in brackets next to them; these counts update as concepts are added or removed.


View this KnetWork interactively here


Saving a KnetWork

Saving a KnetWork is as simple as pressing the green Save Knetwork button on the top right of the Network View. Clicking it will prompt you to log in, or to create an account if you do not already have one. KnetSpace enables you to store, edit, and collaborate on your KnetWorks with other scientists. If you wish to sign in before saving a KnetWork, you can do so by clicking the Sign in button on the top right of the webpage, in the header.



Saving a KnetWork prompts a pop-up to appear at the top right corner of the webpage, indicating that the KnetWork has been saved to KnetSpace. To view it, either click on "View it in KnetSpace" before the pop-up disappears, or navigate to KnetSpace.

KnetSpace will permanently store your saved KnetWorks, so long as you are within the free or paid-tier limitations. If you ever downgrade from Pro to Free, your extra KnetWorks will become locked.

Navigating to KnetSpace presents you with all of your saved KnetWorks:



KnetSpace overview, showcasing all your existing KnetWorks.


Clicking on a KnetWork allows you to view a variety of metadata about the KnetWork, set the KnetWork to be private or public (useful for archiving interactive KnetWorks publicly for referencing purposes; also see how to cite KnetMiner), download the KnetWork as a PNG, and view the search parameters. You can also check for KnetWork updates using the "Check for updates" button, which will re-run your query in the most up-to-date version of our Knowledge Graph. In addition, you can rename the graph, edit the description, clone the graph, and even continue editing the graph by clicking on the node/edge-like icon, leftmost in the quick-access menu at the top right.



Specific KnetWork, in this case petal QTL using keywords "dormancy OR germination OR color OR flavon* OR proanthocyanidin".


Sharing a KnetWork in KnetSpace

  1. Click the KnetWork you want to share.
  2. Top right of the window, under your username, click 'Share'.
  3. Type in the username of the user you want to share the KnetWork with and click 'Add'.

Clicking "Shared KnetWorks" at the top left of the header displays all the KnetWorks shared with you.


KnetMiner Plant Use Case

This application case shows the utility of KnetMiner for the functional analysis of a transcriptomics (RNA-Seq) experiment in bread wheat (Triticum aestivum). Wheat is the third most-grown cereal crop in the world after maize and rice and has a hexaploid genome 5 times the size of the human genome.


The red colour of the grain is due to the presence of coloured compounds, called flavonoids, in the seed coat (bran). These flavonoids give wholemeal bread not only its colour, but also a slightly bitter taste which is disliked by many people. White-grained wheat varieties lack the red compounds of the seed coat and are milder in flavour. However, white grains are prone to pre-harvest sprouting (PHS), which causes the grain to germinate before harvest and results in a loss of grain quality. It has been known for some time that PHS is associated with grain colour and that the red pigmentation of wheat grain is controlled by R genes on the long arms of chromosomes 3A, 3B, and 3D. In the last decade, the genetic basis of the relationship between grain colour and PHS has been studied, and molecular characterisation showed that the R gene is a Myb-type transcription factor responsible for transcriptional activation of genes (CHS, CHI, F3H and DFR) in the flavonoid biosynthesis pathway. However, the link between the R (Myb) gene and PHS is still unclear.

Here we demonstrate the utility of KnetMiner for analysing candidate genes from reverse genetics or transcriptomics studies and answering questions such as:

  1. Do any of these genes contribute to the expression of trait A (e.g. grain colour)?
  2. Do any of these genes contribute to the expression of trait B (e.g. PHS trait)?
  3. Which biological processes and pathways are underlying these traits?
  4. Are there common genes or mechanisms that influence both traits?
  5. Which other processes and traits will be affected by loss-of-function mutants?

KnetMiner Practice Exercises

Exercise 1 - Choosing the right search terms

Seed dormancy and germination are the underlying developmental processes that activate or prevent pre-harvest sprouting in many grains and other seeds. The user can enter this knowledge as a list of keywords in the search box. The Query Suggester provides alternative synonyms or more specific keywords. It also highlights key concept types that match the keywords.



Task: Type the keyword dormancy into the search box. Try to replace it with a more specific keyword.

A: The Query Suggester shows that the keyword dormancy has BioProc, Trait, Protein, PlantOntologyTerm and Gene matches (note the various shapes along the left of the box). Click on the Trait category and pick a more specific trait keyword from the list by clicking the 'Replace' button. Alternatively, add the keyword with the '+' button, or remove it with the '-' button.


Exercise 2 – Exploratory analysis of genes supplied by the user

We're now going to explore a list of differentially expressed genes (red vs. white grain) in the context of grain colour and PHS traits. The Wheat KnetMiner has five example queries. We're going to use three of them:

Example 1 - Grain colour

  • color OR flavon* OR proanthocyanidin

Example 2 - Pre-harvest sprouting (PHS)

  • "seed germination" OR "seed dormancy"

Example 3 - Grain colour + PHS

  • dormancy OR germination OR color OR flavon* OR proanthocyanidin

Task: Clicking on 'Example 1' populates the search box with search terms and the Gene List with ids. Directly below the search box, in light grey, you can see two numbers. The first number indicates the nodes in the wheat knowledge network that match the search terms. The second number indicates the number of genes in the wheat genome that have direct or indirect links to these search terms. Press the search button.

Pro tip: to get even more out of your KnetWork, consolidate American and British words in the Keyword Search, like: "colour OR color".
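If you often search with the same keywords, adding both spellings can be automated. The Python sketch below is only an illustration of the tip above: the function name add_spelling_variants is hypothetical, the small list of word pairs is an example that you would extend yourself, and only single-word terms are handled.

```python
# Example American/British pairs; extend this list for your own keywords.
VARIANT_PAIRS = [("color", "colour"), ("flavor", "flavour"), ("fiber", "fibre")]


def add_spelling_variants(terms, pairs=VARIANT_PAIRS):
    """Return an OR query in which each term is followed by its other spelling."""
    lookup = {}
    for american, british in pairs:
        lookup[american] = british
        lookup[british] = american

    expanded = []
    for term in terms:
        # Keep the original term, then add its variant if one is known.
        for candidate in (term, lookup.get(term.lower())):
            if candidate and candidate not in expanded:
                expanded.append(candidate)
    return " OR ".join(expanded)


# Hypothetical usage with the Example 1 keywords.
print(add_spelling_variants(["color", "flavon*", "proanthocyanidin"]))
```

Terms without a known variant, including wildcard terms such as flavon*, are passed through unchanged.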



In your version of KnetMiner, the number of documents and genes might be higher than in the above screenshot, as we're constantly updating our Knowledge Graph.

In Gene View, you can use the interactive legend at the top by clicking on one or more symbols, e.g. pathways, phenotype, to retain genes with the selected evidence types and filter out the other genes.

  • Which genes are part of the flavonoid biosynthesis pathway?

Tip: try selecting several genes (tickboxes) then clicking on 'View Network'. You'll be able to traverse the Knowledge Graph to find the pathway.



User genes that were not associated with the search terms appear in Gene View with a "0" in the EVIDENCE column.

Open up Evidence View.

  • How many user provided genes have known links to Example 1 search terms (known targets) and how many genes have no obvious links (novel targets)?

Repeat these steps for Example 2.


Exercise 3 - Exploring gene knowledge networks

We're now going to select single or multiple genes and explore their gene-evidence networks, i.e. the information that links the genes with the search terms.

Task: Click on Example 3 (grain colour + PHS) and perform a search. In Gene View, focus on the MYB1 gene (TRAESCS3D02G468400). Check its evidence by clicking on individual shapes in the EVIDENCE column, or click on the accession number (TRAESCS3D02G468400) to open the full network.



  • Can you find the wheat genes (blue triangles with a double border) in the network and follow the edges (paths/arrows) to the Arabidopsis ortholog? Where is the ortholog relation coming from?

  • Find the evidence path linking the wheat TT2 gene to seed maturation and seed colour.

  • Find the evidence path linking the wheat TTG1 gene to seed coat color and flower color.

  • Which other traits can be affected by TT2 loss-of-function mutants?

Hint: Enable labels on TO terms that appear as a green pentagon.

A: The knowledge network of wheat MYB1/TT2 contains gene regulatory information, protein-protein interactions, phenotypic information in the form of mutant/genetic studies or text-mining, links to relevant ontology terms and publications, and similar information from Arabidopsis and other species. A more thorough exploration of the information (i.e. node and edge properties) captured in the MYB1/TT2 network tells the following detailed biological story:

MYB1/TT2 (R Myb) on chromosome 3D in wheat is predicted (p-value = 0.01) to regulate the transcriptional activation of MFT, according to data from the analysis of 850 RNA-Seq samples in wheat using GENIE3. The TT2 3B homeologue is not predicted to regulate MFT, and the TT2 3A homeologue is not annotated in the latest version of the wheat genome. MFT has recently been linked to grain germination ["Recent studies in both Arabidopsis and wheat have uncovered a new role of MOTHER OF FT AND TFL1 (MFT) in seed germination"] and seed dormancy ["Mapping analysis showed that MFT on chromosome 3A (MFT-3A) colocalized with the seed dormancy quantitative trait locus (QTL) QPhs.ocs-3A."]. The MFT ortholog in Arabidopsis has a 3' UTR variant that has been associated (p-value = 5.5 x 10^-5) with an increased germination rate after 56 days of dry storage.

To discover which other traits will be affected by MYB1/TT2 loss-of-function mutants, we can expand the initial MYB1/TT2 knowledge graph (click the genes icon in the interactive legend) to add all other genes that are regulated by, or interact with, MYB1/TT2. Other wheat genes regulated by MYB1/TT2 don't show any surprising phenotypes. However, the Arabidopsis MYB1/TT2 interacts with TTG1, a gene known to be involved in controlling root hair density and root hair length in Arabidopsis. Root hairs are tubular outgrowths from specific epidermal cells, which have important roles in nutrient and water absorption. This interesting clue enables the creation of a speculative hypothesis that pre-harvest sprouting could be caused by increased root hairs, and hence higher nutrient and water absorption, in MYB1/TT2 knock-outs. This is shown in the gene knowledge network of the MYB1/TT2 (R Myb) gene.



View your Knowledge Graph in fullscreen by clicking the Fullscreen button at the top left of your KnetWork.

View this KnetWork interactively here


Exercise 4 - Exploring KnetMiner Map View

Task: Click on Example 5 (grain growth) and perform a search. Go to Map View and explore the chromosome, genes, SNP and QTL information displayed for genes ADA2A and BPM5 and then visualize their knowledge network.
  1. Click on gene GLO5 on 2D and NAR1 on 5A.
  2. Click on the Network button (top left) and explore the network.
  3. Add more information by clicking on keys/icons in the interactive legend.



Tutorial Complete. Well done and thank you for your attention!

If you have any questions, feel free to reach out to us on hello@knetminer.com. We are always happy to help and we assure you that a human will always respond to your email.


Additional Content

Accessing data from outside the KnetMiner GUI

It is possible to access our data without using the KnetMiner search interface. To do your own analyses in a terminal or in languages such as R or Python, you'll need to make use of either our SPARQL or Neo4j databases.

To learn more, take a look at our Jupyter Notebooks on Using KnetMiner SPARQL or Using KnetMiner Neo4j and read about the examples in our blog post on The Power Of Standardised and FAIR Knowledge Graphs.


Earning a Certificate of Competency for Tutorial Completion


Want something to show your competency? Email hello@knetminer.com with proof of completing the tutorial (such as a list of correct answers, questions, and what you found interesting), as well as the email address you used to create your KnetMiner account, and we'll send you back a certificate to print or display on your LinkedIn profile. You do not need to have accessed the data from outside of the KnetMiner GUI/search interface.


Please use "Certificate of Competency Request" as the subject of your email.


Example certificate. Real version contains unique hash identifier.


Citation:

Hassani-Pak, K., Singh, A., Brandizi, M., Hearnshaw, J., Parsons, J. D., Amberkar, S., Phillips, A. L., Doonan, J. H. and Rawlings, C. (2021) KnetMiner: a comprehensive approach for supporting evidence-based gene discovery and complex trait analysis across species. Plant Biotechnol. J. https://doi.org/10.1111/pbi.13583

Alternatively, view how to cite KnetMiner here.


Copyright KnetMiner 2022