The inbound distribution pipeline (called Elan) currently runs every day (including weekends).
The dataset as it stands after the Friday pipeline is used for weekly reporting.
The pipeline currently consists of the following events:
Time | Event | Description |
---|---|---|
1605 | End of day check 1 | Using the uploaded metadata, Majora generates a list of samples to check against the file system. |
1630 | End of day message 1 | The Majora bot announces in #inbound-distribution the number of new sequences for each site that can be linked to metadata. It will also list the number of sequences that appear to be missing uploaded metadata, or metadata that appears to be missing uploaded sequence. |
1705 | End of day check 2 | Using the uploaded metadata, Majora generates a list of samples to check against the file system. |
1730 | End of day message 2 | The Majora bot announces in #inbound-distribution the number of sequences pulled for each site, the week’s total and the new cumulative upload total. |
+1 day 0501 | Permissions check | A cron job runs chmod to ensure all the upload directories are readable by the pipeline. |
+1 day 0505 | Pre-pull | Using the uploaded metadata, Majora generates a list of samples to check against the file system. |
+1 day 0601 | Pipeline starts | If nothing horrible has gone wrong, the pipeline will start. |
~ | Pipeline ends | After a few hours, the Majora bot will announce to #inbound-distribution the number of sequences that made it through the pipeline and passed basic QC. |
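
To work out which steps are still to come at a given moment, the fixed-time rows of the table can be written down as data. The following is only an illustrative sketch: the `SCHEDULE` list and the `remaining_events` helper are hypothetical names and are not part of Elan or Majora.

```python
from datetime import time

# (day offset, clock time, event), copied from the table above.
# Day offset 0 is the upload day, 1 is the following morning.
SCHEDULE = [
    (0, time(16, 5), "End of day check 1"),
    (0, time(16, 30), "End of day message 1"),
    (0, time(17, 5), "End of day check 2"),
    (0, time(17, 30), "End of day message 2"),
    (1, time(5, 1), "Permissions check"),
    (1, time(5, 5), "Pre-pull"),
    (1, time(6, 1), "Pipeline starts"),
]


def remaining_events(day_offset, now):
    """Return the events scheduled strictly after (day_offset, now)."""
    return [event for d, t, event in SCHEDULE if (d, t) > (day_offset, now)]


# Example: at 05:03 on the following morning only the pre-pull
# and the pipeline start are still to come.
print(remaining_events(1, time(5, 3)))
```

The pipeline end is left out because the table gives no fixed time for it.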
Elan can process approximately 1000 samples an hour on a good day. Combined with around 90 minutes for “post-Elan publishing”, an average pipeline of 3000 samples should take around 5 hours to complete (ready for lunch). For PHA subscribed to Asklepian, processing time on a good day is around 90 minutes (mid afternoon).
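
The estimate above is simple arithmetic and can be reproduced as a rough sketch. The rate and publishing time are the "good day" figures quoted in the text. The function names are hypothetical, and the sketch assumes the pipeline starts at 0601 and that throughput is constant, which will not hold on a bad day.

```python
from datetime import date, datetime, time, timedelta

SAMPLES_PER_HOUR = 1000   # "on a good day", from the text
PUBLISHING_MINUTES = 90   # post-Elan publishing, from the text


def estimate_duration(n_samples):
    """Processing time at a constant rate plus the publishing step."""
    processing = timedelta(hours=n_samples / SAMPLES_PER_HOUR)
    return processing + timedelta(minutes=PUBLISHING_MINUTES)


def estimate_finish(n_samples, start=time(6, 1)):
    """Clock time at which a run starting at `start` would finish."""
    started = datetime.combine(date(2000, 1, 1), start)  # arbitrary date
    return (started + estimate_duration(n_samples)).time()


print(estimate_duration(3000))  # 3 h processing + 1.5 h publishing
print(estimate_finish(3000))
```

For 3000 samples this gives 4 hours 30 minutes, which is consistent with the "around 5 hours" quoted above.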
The GISAID pipeline runs every day and releases sequences on a 7-day time lag. The ENA BAM pipeline runs on Mondays and the ENA consensus pipeline runs on Fridays. All sites are automatically enrolled for ENA BAM uploads; however, you must opt in for GISAID uploads or ENA consensus uploads. Data uploaded over the weekend will miss the official reporting cut-off, but will be included in Monday’s outbound pipeline.
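
The weekly outbound schedule described above can be expressed as a small lookup. This is a sketch only: `outbound_pipelines` is a hypothetical helper, not part of the real tooling, and it ignores per-site opt-in status.

```python
from datetime import date


def outbound_pipelines(day):
    """Outbound pipelines scheduled to run on `day`, per the text above."""
    pipelines = ["GISAID"]        # runs every day
    if day.weekday() == 0:        # Monday
        pipelines.append("ENA BAM")
    if day.weekday() == 4:        # Friday
        pipelines.append("ENA consensus")
    return pipelines


# 2021-03-01 was a Monday.
print(outbound_pipelines(date(2021, 3, 1)))
```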
1600 and 0500. The pre-Elan message will inform you whether your gamble was a success or not. 0601.