Querying
- Simple Search
- Advanced Search
- Probability
Results
- Human
- Mouse
FAQ

CIRCA Help

Welcome to the CIRCA help page. Here you will find all the information you need for a successful search.

Querying CircaDB

There are three modes of querying CircaDB: by gene symbol, simple, and advanced. The wildcard character "*" can be used in all three modes.

Simple Query Mode

This is the default mode. Simply enter a set of terms and the engine will query for each term separately, then join the results for all terms into the final result. In other words, a query for kinase inhibitor will return every entry whose annotation contains the word kinase together with every entry whose annotation contains inhibitor.

This is usually not what one would want returned from such a query. This is where the advanced query mode comes in.
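The union behavior of the simple mode can be illustrated with a minimal Python sketch. It is not CircaDB's implementation: the records, their IDs, and the function name `simple_search` are made up for illustration, and matching is done on whole lowercase words only.

```python
# Toy annotations standing in for CircaDB records (hypothetical data).
RECORDS = {
    1: "cyclin-dependent kinase inhibitor 1A",
    2: "mitogen-activated protein kinase 1",
    3: "tissue inhibitor of metalloproteinase 2",
    4: "period circadian regulator 2",
}


def simple_search(query, records):
    """Query each term separately, then union the per-term results."""
    hits = set()
    for term in query.lower().split():
        # One independent lookup per term; results are joined with OR.
        hits |= {
            rid for rid, text in records.items()
            if term in text.lower().split()
        }
    return sorted(hits)


print(simple_search("kinase inhibitor", RECORDS))  # [1, 2, 3]
```

Records 2 and 3 each match only one of the two words, yet both are returned, which is why a multi-word query in simple mode tends to be broader than intended.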

Advanced Query Mode

Once the "advanced query mode" check-box has been selected, the query strings are no longer simply split into individual queries. As the saying goes, "with great power comes great responsibility": you must now use the extended query syntax of the underlying Sphinx search engine.

Briefly, in advanced mode the query kinase inhibitor brings back only entries that match both kinase AND inhibitor. Instead of an implicit OR between terms, there is now an implicit AND between terms.

To bring back the OR behavior you would need to separate terms with a | (a vertical bar, or "pipe") character. E.g. kinase | inhibitor.

To bring back the specific phrase "kinase inhibitor", you would need to surround both words with quotes. E.g. "kinase inhibitor"

More examples

Query for entries containing the exact phrase "kinase inhibitor" that also mention "mitochondrial"
"kinase inhibitor" mitochondrial

Query for kinases, but not any entry with "inhibitor" in the record
kinase !inhibitor
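As a rough illustration of what these operators mean, the sketch below models each one as a set operation over toy records. It is only an analogy written in Python, not how Sphinx evaluates queries; the records and helper names (`with_term`, `with_phrase`) are hypothetical.

```python
# Toy annotations (hypothetical); all text is lowercase for simplicity.
RECORDS = {
    1: "cyclin-dependent kinase inhibitor 1a",
    2: "mitogen-activated protein kinase 1",
    3: "tissue inhibitor of metalloproteinase 2",
    4: "inhibitor of nuclear factor kappa-b kinase subunit beta",
}


def with_term(term):
    """IDs of records containing the term as a whole word."""
    return {rid for rid, text in RECORDS.items() if term in text.split()}


def with_phrase(phrase):
    """IDs of records containing the words adjacent and in order."""
    return {rid for rid, text in RECORDS.items() if phrase in text}


kinase = with_term("kinase")
inhibitor = with_term("inhibitor")

print(sorted(kinase & inhibitor))               # kinase inhibitor    -> [1, 4]
print(sorted(kinase | inhibitor))               # kinase | inhibitor  -> [1, 2, 3, 4]
print(sorted(kinase - inhibitor))               # kinase !inhibitor   -> [2]
print(sorted(with_phrase("kinase inhibitor")))  # "kinase inhibitor"  -> [1]
```

Record 4 shows the difference between the implicit AND and a quoted phrase: it contains both words, but not next to each other.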

Even more examples

Sphinx is a large and powerful system. You can review all of the various ways to use the extended syntax at the Sphinx extended query syntax help page.

CircaDB Query Fields

Having read the above, you may be wondering what fields you can use to query CircaDB. The annotation comes straight from the Affymetrix annotation files. We have pulled out certain columns from those files and indexed them as follows:

Field	Description
probeset_name	The Affymetrix probeset name
transcript_id	The transcript ID
representative_public_id	The representative public ID
unigene_id	Unigene ID
gene_symbol	NCBI gene symbol
gene_title	Full gene title
entrez_gene	The Entrez gene ID
swissprot	The SwissProt accession
refseq_protein_id	The RefSeq protein accession
refseq_transcript_id	The RefSeq NA sequence accession
target_description	The description from the Affymetrix annotation file
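Sphinx's extended syntax restricts a term to one field with the @field operator. The Python sketch below builds such query strings from the field names in the table above. It assumes that CircaDB's advanced mode passes the @field operator through to Sphinx unchanged, which this page does not state explicitly; the helper name `field_query` is hypothetical.

```python
# Field names copied from the table above.
INDEXED_FIELDS = {
    "probeset_name", "transcript_id", "representative_public_id",
    "unigene_id", "gene_symbol", "gene_title", "entrez_gene",
    "swissprot", "refseq_protein_id", "refseq_transcript_id",
    "target_description",
}


def field_query(field, terms):
    """Build a Sphinx-style field-restricted query string (hypothetical helper)."""
    if field not in INDEXED_FIELDS:
        raise ValueError(f"unknown field: {field}")
    return f"@{field} {terms}"


# Restrict a wildcard search to the gene symbol field.
print(field_query("gene_symbol", "per*"))  # @gene_symbol per*

# Restrict a phrase search to the full gene title.
print(field_query("gene_title", '"kinase inhibitor"'))
```

Checking the field name first catches typos before the query is sent, since a misspelled field would otherwise only show up as a search error or an empty result.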

Probability filter

The probability filter can be used to narrow the output to the most significant results. Results can be filtered by the p-values and q-values produced by the different algorithms. The value you enter is used as the upper cutoff.
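The effect of an upper cutoff can be sketched in a few lines of Python. The rows, gene names, and column names below are invented for illustration, and whether CircaDB treats the cutoff as inclusive is an assumption.

```python
# Hypothetical result rows; gene names and values are made up.
ROWS = [
    {"gene_symbol": "GENE_A", "p_value": 0.001, "q_value": 0.01},
    {"gene_symbol": "GENE_B", "p_value": 0.030, "q_value": 0.20},
    {"gene_symbol": "GENE_C", "p_value": 0.200, "q_value": 0.60},
]


def apply_cutoff(rows, column, cutoff):
    """Keep rows whose value in `column` does not exceed the cutoff.

    Assumption: the cutoff is inclusive (value <= cutoff).
    """
    return [row for row in rows if row[column] <= cutoff]


kept = apply_cutoff(ROWS, "p_value", 0.05)
print([row["gene_symbol"] for row in kept])  # ['GENE_A', 'GENE_B']

kept = apply_cutoff(ROWS, "q_value", 0.05)
print([row["gene_symbol"] for row in kept])  # ['GENE_A']
```

Filtering on the q-value is stricter here because it accounts for multiple testing, so fewer rows pass the same numeric cutoff.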

Understanding the results

Human

We have applied CYCLOPS (Anafi et al., PNAS 2017), an algorithm designed to reconstruct sample order in the absence of time-of-day information, to the public GTEx collection (GTEx Consortium, Nat. Genet. 2013) of 632 human donors contributing 4,292 RNA-seq samples from 13 distinct human tissue types. Additional sample information can be found at GTEx’s documentation page: https://gtexportal.org/home/documentationPage.

For each tissue that was CYCLOPS ordered, cosinor regression (modified, Anafi et al., PNAS 2017) was used to test whether individual genes are rhythmic. We only looked for rhythms with a period of 24 hours. Gene-level expression data were filtered to exclude any gene with a read count of zero (TPM = 0) in any sample. Following this, only the top 15,000 expressed genes by median TPM were considered for each tissue.
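The two filtering steps described above can be sketched in Python as follows. This is an illustration of the stated rules, not the pipeline actually used; the function name `filter_genes` and the toy expression values are hypothetical.

```python
from statistics import median


def filter_genes(tpm_by_gene, top_n=15000):
    """Apply the two filters described in the text to one tissue.

    tpm_by_gene maps a gene name to its TPM values across samples.
    """
    # Step 1: drop any gene with TPM = 0 in at least one sample.
    expressed = {
        gene: values for gene, values in tpm_by_gene.items()
        if all(v > 0 for v in values)
    }
    # Step 2: rank the remaining genes by median TPM, highest first.
    ranked = sorted(expressed, key=lambda g: median(expressed[g]), reverse=True)
    return ranked[:top_n]


# Toy data (hypothetical): four genes, three samples each.
toy = {
    "A": [5.0, 0.0, 7.0],    # removed: zero in one sample
    "B": [1.0, 2.0, 3.0],    # median 2
    "C": [10.0, 20.0, 30.0], # median 20
    "D": [4.0, 4.0, 4.0],    # median 4
}
print(filter_genes(toy, top_n=2))  # ['C', 'D']
```

Note that gene A is excluded even though its median is higher than that of B or D, because the zero-count filter is applied before ranking.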

Plots

Each point represents a single human donor. The x-axis is the CYCLOPS-estimated sample phase in radians (from 0 to 2π). The y-axis is the expression level (TPM) from RNA-seq. Phase has been adjusted so that time π represents E-box phase (i.e. time of peak expression of E-box target genes NR1D1, NR1D2, and PER3).
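One simple way to picture the phase adjustment is as a rotation of the circular phase axis. The Python sketch below shifts every phase so that a reference phase lands at π and wraps the result into the range 0 to 2π. It is not taken from CYCLOPS; the function name `align_phase` and the example values are hypothetical, and the reference phase is assumed to have been estimated beforehand from the E-box target genes.

```python
import math

TWO_PI = 2 * math.pi


def align_phase(phase, ebox_phase):
    """Rotate a phase (radians) so that ebox_phase maps to pi.

    The result is wrapped into the half-open interval [0, 2*pi).
    """
    return (phase - ebox_phase + math.pi) % TWO_PI


# Hypothetical raw sample phases and a hypothetical E-box reference phase.
raw_phases = [0.5, 2.0, 5.5]
ebox_phase = 2.0

for p in raw_phases:
    print(round(align_phase(p, ebox_phase), 3))
```

Because the shift is the same for every sample, the relative ordering of samples around the cycle is unchanged; only the location of the origin moves.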

P-value

The probability of seeing data at least this extreme if the null hypothesis (that the dataset is not rhythmic) is true.

FDR

FDR is the ratio of false positives to total genes discovered at a particular threshold. It is an adjustment to the P-value that controls the number of false discoveries when testing multiple hypotheses simultaneously (Benjamini & Hochberg, J. R. Stat. Soc. 1995).
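For readers who want to see how such an adjustment works, here is a minimal Python sketch of the standard Benjamini-Hochberg step-up procedure. It shows the textbook method under the assumption that the reported FDR values are BH-adjusted p-values; it is not taken from the CircaDB code, and the input values are made up.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four genes.
p = [0.01, 0.04, 0.03, 0.005]
print([round(q, 4) for q in benjamini_hochberg(p)])
```

Each p-value is scaled by the number of tests divided by its rank, so the smallest p-values receive the largest relative penalty, and the running minimum keeps a gene with a smaller p-value from ending up with a larger adjusted value.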

Frequently asked questions

How do I select more than one experiment?	"Ctrl+Right Click" lets you select multiple experiments ("Command+Right Click" for Mac users).
How do I request a data set be added to CircaDB?	Researchers can request that a particular data set be added by submitting an issue at our project page hosted on GitHub.
