
Updated 2021-01-19 by @samstudio8

Accessing data

Using public sources to access samples that have passed high-quality QC

Using CLIMB to access FASTA, BAM and metadata for samples that have passed basic QC

You need a CLIMB account to upload or access data. If you do not have one, see instructions on registering. To access the data, you will need to SSH into the CLIMB-COVID server.

The consensus FASTA and the metadata table are paired one-to-one: the sequence records in the FASTA and the rows in the table are in the same order. In addition, the table contains a fasta_header column that can be used to map rows to the records in the FASTA file, and each FASTA header ends with the numeric index (starting at 1) of the corresponding row in the metadata table. Note that the order is not guaranteed to be the same between runs of the pipeline (i.e., the FASTA will not be in the same order each time the inbound pipeline finishes).
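As a rough illustration of the pairing described above, the following Python sketch looks up the metadata row for each FASTA record using the numeric index at the end of the header. It is only a sketch under assumptions: the tab delimiter, the `|` separator before the index, the `note` column and the toy records are all made up for the example, and the toy `fasta_header` values are assumed to repeat the full header. Check the real files on CLIMB before relying on any of these details.

```python
import csv
import io
import re

# A run of digits at the very end of a header, taken to be the 1-based row index.
TRAILING_INDEX = re.compile(r"(\d+)$")


def fasta_headers(fasta_text):
    """Return the header lines (without the leading '>') in file order."""
    return [line[1:].strip() for line in fasta_text.splitlines() if line.startswith(">")]


def rows_by_header(fasta_text, table_text, delimiter="\t"):
    """Map each FASTA header to its metadata row via the trailing row index.

    The delimiter is an assumption; pass "," if the table is comma-separated.
    """
    rows = list(csv.DictReader(io.StringIO(table_text), delimiter=delimiter))
    paired = {}
    for header in fasta_headers(fasta_text):
        match = TRAILING_INDEX.search(header)
        if match is None:
            raise ValueError(f"no trailing row index in header: {header!r}")
        index = int(match.group(1))  # indices start at 1
        if not 1 <= index <= len(rows):
            raise ValueError(f"row index {index} is outside the table ({len(rows)} rows)")
        paired[header] = rows[index - 1]
    return paired


# Toy data only: the header layout and the 'note' column are hypothetical.
FASTA = ">toy/A|1\nACGT\n>toy/B|2\nTTGA\n"
TABLE = "fasta_header\tnote\ntoy/A|1\tfirst\ntoy/B|2\tsecond\n"

paired = rows_by_header(FASTA, TABLE)
for header, row in paired.items():
    print(header, row["note"])
```

Because the record order can change between pipeline runs, a lookup like this should only be applied to a FASTA and a metadata table that were produced by the same run.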

Note also that the merged consensus FASTA includes resequencing. That is, a biosample may have more than one genome in the consensus FASTA.
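If you need to spot biosamples with more than one genome, one option is to group the metadata rows by a sample identifier. The sketch below assumes the table has some column identifying the biosample; the column name `biosample` used here, the tab delimiter and the toy rows are hypothetical and stand in for whatever the real table provides.

```python
import csv
import io
from collections import defaultdict


def headers_by_biosample(table_text, sample_column, delimiter="\t"):
    """Group fasta_header values by the given (assumed) biosample column."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(table_text), delimiter=delimiter):
        groups[row[sample_column]].append(row["fasta_header"])
    return dict(groups)


# Toy data only: 'biosample' is a hypothetical column name.
TOY_TABLE = (
    "fasta_header\tbiosample\n"
    "toy/A-run1|1\tA\n"
    "toy/B-run1|2\tB\n"
    "toy/A-run2|3\tA\n"
)

groups = headers_by_biosample(TOY_TABLE, sample_column="biosample")

# Keep only biosamples that appear more than once, i.e. were resequenced.
resequenced = {sample: headers for sample, headers in groups.items() if len(headers) > 1}
print(resequenced)
```

How to choose between multiple genomes for the same biosample is an analysis decision that this sketch does not make.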

Accessing restricted metadata through the controlled Majora data view API

See accessing dataviews.

Published 2020-03-29. Updated 2021-01-19. Page maintainer @samstudio8.