22FAlama:
22 | Year |
FA | 2 character code representing the school: Forfar Academy |
lama | 4 character code indicating the variety: Lady Margaret |
CO | Combined Data |
BA | Banchory Academy |
BC | Beaulieu Convent School |
TA | St Thomas of Aquin's High School |
FA | Forfar Academy |
QA | Queen Anne High School |
SP | St Peter the Apostle High School |
SM | St Modan's High School |
MA | Morgan Academy |
cher | Cheerio |
coul | Coulmouny |
cove | Coverack beauty |
empr | Empress |
fofi | Forest Fire |
furn | Furness |
lama | Lady Margaret Boscawen |
lofy | Loch Fyne |
luci | Lucifer |
mihu | Minnie Hume |
morv | Morven |
orna | Ornatus |
poet | Poetarum |
prin | Princeps |
alba | Albatross |
tazz | Tezzeta |
seal | Sealing Wax |
swee | Sweetness |
duma | Dutch Master |
hiso | High Society |
reds | Redstart |
topo | Topolino |
carl | Carlton |
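As an illustration of the naming scheme above, a sample name can be split into its three parts with a few lines of Python. This is only a sketch: the function name `parse_sample_id` is hypothetical, the lookup tables are abbreviated copies of the lists above, and the year is kept as the two digits that appear in the name.

```python
import re

# Abbreviated copies of the school and variety code lists above.
SCHOOLS = {"FA": "Forfar Academy", "BC": "Beaulieu Convent School", "MA": "Morgan Academy"}
VARIETIES = {"lama": "Lady Margaret Boscawen", "luci": "Lucifer", "morv": "Morven"}

# Two digits (year), two upper-case letters (school), four lower-case letters (variety).
SAMPLE_ID = re.compile(r"^(\d{2})([A-Z]{2})([a-z]{4})$")

def parse_sample_id(sample_id):
    """Split a sample name such as '22FAlama' into year, school and variety."""
    match = SAMPLE_ID.match(sample_id)
    if match is None:
        raise ValueError(f"Unrecognised sample name: {sample_id!r}")
    year, school, variety = match.groups()
    return {
        "year": year,
        # Fall back to the raw code if it is not in the abbreviated tables.
        "school": SCHOOLS.get(school, school),
        "variety": VARIETIES.get(variety, variety),
    }

print(parse_sample_id("22FAlama"))
```

Codes that are missing from the abbreviated tables are returned unchanged, so the function can be extended simply by adding the remaining entries from the lists above.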
The data available from this site have been generated as part of a Royal Society (London) project (The Scottish Daffodil Project) to introduce 16-18 year old school pupils across Scotland to genome sequencing, molecular evolution and ecology.
The data are all freely available for teaching and research but are preliminary prior to formal publication by the Scottish Daffodil Project. The Scottish Daffodil Project is preparing publications that include: (a) the principles and teaching value of practical genome sequencing and genome analysis in schools and (b) the scientific interpretation of the genome data generated in the Project.
The expectation is that the Scottish Daffodil Project will submit articles within 18 months of the first release of raw data. If you wish to use the data in your own publications during this time, then please contact the Scottish Daffodil Project to ensure you have the best quality data to work from. Also, the following acknowledgment should be included: "These data were produced by the Scottish Daffodil Project in collaboration." We request that you notify the Scottish Daffodil Project upon publication so that this information can be included in the final annotation of the data and reporting on the Scottish Daffodil Project.
While still in waiting period status, the assembly and raw sequence reads should not be redistributed or repackaged without permission from the Scottish Daffodil Project. Once moved to unreserved status, the data are freely available for any subsequent use.
This project is a collaboration between Jon Hale (Head of Biology at Beaulieu Convent School, Jersey), the University of Dundee School of Life Sciences and the James Hutton Institute.
Schools are working in conjunction with STEM partners from the University of Dundee and the James Hutton Institute to sample various daffodil varieties and carry out DNA sequencing of the chloroplast genome using Oxford Nanopore MinION sequencing. The schools are then carrying out analysis of the resulting sequence to answer specific research questions.
The resulting sequencing data from all the schools are being collated on this site, along with additional sequences obtained by Beaulieu Convent School, Jersey. This enables the phylogeny of the various varieties to be inferred.
This project has been funded by The Royal Society, with additional funding from the Friends of Dundee University Botanic Garden to enable the University of Dundee Data Analysis Group to provide this centralised resource with consistent analysis processes.
We would be happy for anyone to reuse these data in their own projects, but please see our Data Usage Policy.
The sequences have all been processed using an automated workflow whose stages take the raw data produced by the MinION sequencer through to assembled, annotated sequences.
Basecalling is the process of converting the signal from the MinION sequencer into the DNA bases it represents. The instrument produces thousands of reads of varying length, up to 40-50 kb. Basecalling was carried out using the Oxford Nanopore Guppy software (version 6.1.2) with the appropriate high accuracy model for the flowcells and sequencing kits used (dna_r9.4.1_450bps_hac).
Every base of sequence generated is assigned a quality score, which measures how likely the base is to have been called correctly. Scores range from 0 to 40 on a logarithmic scale: a score of 10 represents a 1 in 10 chance that the base call is incorrect, 20 a 1 in 100 chance, and 30 a 1 in 1000 chance. Sequences are filtered using NanoFilt to remove those shorter than 300 bp and those with an average quality score below 10.
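The relationship between a quality score and the chance of an error, and the two filtering thresholds, can be sketched in Python as follows. This is an illustration rather than the NanoFilt implementation: the function names are hypothetical, and averaging via error probabilities is an assumption about how a per-read average might be taken.

```python
import math

MIN_LENGTH = 300        # bp, length threshold quoted in the text
MIN_MEAN_QUALITY = 10   # average quality threshold quoted in the text

def error_probability(q):
    """Chance that a base call with quality score q is wrong."""
    return 10 ** (-q / 10)

def mean_quality(scores):
    """Average quality of a read (assumption: average the error
    probabilities, then convert the result back to a score)."""
    mean_error = sum(error_probability(q) for q in scores) / len(scores)
    return -10 * math.log10(mean_error)

def passes_filter(sequence, scores):
    """Keep reads of at least MIN_LENGTH bases with a high enough average quality."""
    return len(sequence) >= MIN_LENGTH and mean_quality(scores) >= MIN_MEAN_QUALITY

for q in (10, 20, 30):
    print(q, error_probability(q))
```

Averaging the error probabilities rather than the scores themselves means that a few very poor bases pull the read average down more strongly than a simple mean of the scores would.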
A large proportion of the DNA in these samples is not daffodil chloroplast DNA, so to make the assembly process easier, contaminants are removed from the sequence data. Kraken 2.1.2 is also used to identify the origin of the sequences in each sample. The sequence reads are mapped to a known daffodil chloroplast sequence using winnowmap 2.03, and those which do not have similarity with this reference sequence are discarded.
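The per-sample statistics that follow include an N50 read length and a chloroplast coverage figure. The sketch below shows how such values are commonly calculated. The site does not state its exact formulas, so both functions are assumptions, as is the genome length of 160,000 bp used in the example (a round figure close to the assembled scaffold lengths reported below).

```python
def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # no reads

def estimated_coverage(total_bases, chloroplast_percent, genome_length):
    """Approximate depth: chloroplast bases divided by genome length (assumed formula)."""
    return total_bases * (chloroplast_percent / 100) / genome_length

print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
# 160,000 bp is an assumed chloroplast genome length, not a value from the site.
print(round(estimated_coverage(778_191_977, 4.5, 160_000)))
```

With these assumptions the second sample in the tables comes out at a little over 200-fold coverage, in the same range as the value reported for it.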
Total Sequenced Bases (bp) | 329,228,066 | Mean Quality | 10.3 |
Mean Read Length (bp) | 470.6 | N50 Read Length (bp) | 525.0 |
Proportion of Chloroplast Bases (%) | 0.62 | Chloroplast Coverage | 12 |
Assembled Contigs | 5 | Assembled Contig Length (bp) | 19047 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 120354 |
Total Sequenced Bases (bp) | 778,191,977 | Mean Quality | 10.2 |
Mean Read Length (bp) | 3,036.5 | N50 Read Length (bp) | 5,395.0 |
Proportion of Chloroplast Bases (%) | 4.5 | Chloroplast Coverage | 217 |
Assembled Contigs | 1 | Assembled Contig Length (bp) | 184643 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 159064 |
Total Sequenced Bases (bp) | 2,088,574,589 | Mean Quality | 10.6 |
Mean Read Length (bp) | 1,455.0 | N50 Read Length (bp) | 2,173.0 |
Proportion of Chloroplast Bases (%) | 2.3 | Chloroplast Coverage | 298 |
Assembled Contigs | 3029 | Assembled Contig Length (bp) | 21321774 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 151524 |
Total Sequenced Bases (bp) | 193,339,657 | Mean Quality | 10.4 |
Mean Read Length (bp) | 2,434.5 | N50 Read Length (bp) | 4,564.0 |
Proportion of Chloroplast Bases (%) | 2.5 | Chloroplast Coverage | 30 |
Assembled Contigs | 7 | Assembled Contig Length (bp) | 117365 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 137260 |
Total Sequenced Bases (bp) | 637,075,774 | Mean Quality | 10.4 |
Mean Read Length (bp) | 974.3 | N50 Read Length (bp) | 1,280.0 |
Proportion of Chloroplast Bases (%) | 2.8 | Chloroplast Coverage | 110 |
Assembled Contigs | 21 | Assembled Contig Length (bp) | 157859 |
Assembled Scaffolds | 2 | Assembled Scaffold Length (bp) | 159759 |
Total Sequenced Bases (bp) | 853,626,090 | Mean Quality | 10.6 |
Mean Read Length (bp) | 858.2 | N50 Read Length (bp) | 1,138.0 |
Proportion of Chloroplast Bases (%) | 10 | Chloroplast Coverage | 534 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 172528 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 159291 |
Total Sequenced Bases (bp) | 117,375,135 | Mean Quality | 10.1 |
Mean Read Length (bp) | 2,650.9 | N50 Read Length (bp) | 4,438.0 |
Proportion of Chloroplast Bases (%) | 5.4 | Chloroplast Coverage | 39 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 162909 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 159334 |
Total Sequenced Bases (bp) | 329,245,778 | Mean Quality | 10.1 |
Mean Read Length (bp) | 1,028.3 | N50 Read Length (bp) | 1,552.0 |
Proportion of Chloroplast Bases (%) | 0.82 | Chloroplast Coverage | 16 |
Assembled Contigs | 427 | Assembled Contig Length (bp) | 2918318 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 140215 |
Total Sequenced Bases (bp) | 18,026,495 | Mean Quality | 10.2 |
Mean Read Length (bp) | 423.2 | N50 Read Length (bp) | 466.0 |
Proportion of Chloroplast Bases (%) | 0.62 | Chloroplast Coverage | 0 |
Assembled Contigs | 0 | Assembled Contig Length (bp) | 0 |
Assembled Scaffolds | 0 | Assembled Scaffold Length (bp) | 0 |
Total Sequenced Bases (bp) | 1,844,402,915 | Mean Quality | 10.5 |
Mean Read Length (bp) | 2,408.6 | N50 Read Length (bp) | 3,782.0 |
Proportion of Chloroplast Bases (%) | 2.8 | Chloroplast Coverage | 327 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 150611 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 150759 |
Total Sequenced Bases (bp) | 148,406 | Mean Quality | 10.2 |
Mean Read Length (bp) | 1,613.1 | N50 Read Length (bp) | 2,265.0 |
Proportion of Chloroplast Bases (%) | 11 | Chloroplast Coverage | 0 |
Assembled Contigs | 0 | Assembled Contig Length (bp) | 0 |
Assembled Scaffolds | 0 | Assembled Scaffold Length (bp) | 0 |
Total Sequenced Bases (bp) | 941,653 | Mean Quality | 10.1 |
Mean Read Length (bp) | 973.8 | N50 Read Length (bp) | 1,556.0 |
Proportion of Chloroplast Bases (%) | 11 | Chloroplast Coverage | 0 |
Assembled Contigs | 0 | Assembled Contig Length (bp) | 0 |
Assembled Scaffolds | 0 | Assembled Scaffold Length (bp) | 0 |
Total Sequenced Bases (bp) | 689,139 | Mean Quality | 10.1 |
Mean Read Length (bp) | 1,027.0 | N50 Read Length (bp) | 1,512.0 |
Proportion of Chloroplast Bases (%) | 16 | Chloroplast Coverage | 0 |
Assembled Contigs | 0 | Assembled Contig Length (bp) | 0 |
Assembled Scaffolds | 0 | Assembled Scaffold Length (bp) | 0 |
Total Sequenced Bases (bp) | 154,483,778 | Mean Quality | 11.3 |
Mean Read Length (bp) | 1,906.5 | N50 Read Length (bp) | 3,539.0 |
Proportion of Chloroplast Bases (%) | 17 | Chloroplast Coverage | 165 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 164568 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 165506 |
Total Sequenced Bases (bp) | 16,703,346 | Mean Quality | 10.0 |
Mean Read Length (bp) | 2,058.6 | N50 Read Length (bp) | 4,184.0 |
Proportion of Chloroplast Bases (%) | 3.3 | Chloroplast Coverage | 3 |
Assembled Contigs | 4 | Assembled Contig Length (bp) | 45827 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 131294 |
Total Sequenced Bases (bp) | 660,329 | Mean Quality | 11.0 |
Mean Read Length (bp) | 2,934.8 | N50 Read Length (bp) | 6,953.0 |
Proportion of Chloroplast Bases (%) | 5 | Chloroplast Coverage | 0 |
Assembled Contigs | 0 | Assembled Contig Length (bp) | 0 |
Assembled Scaffolds | 0 | Assembled Scaffold Length (bp) | 0 |
Total Sequenced Bases (bp) | 1,293,024,147 | Mean Quality | 12.1 |
Mean Read Length (bp) | 1,072.8 | N50 Read Length (bp) | 1,558.0 |
Proportion of Chloroplast Bases (%) | 37 | Chloroplast Coverage | 2,982 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 149075 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 153893 |
Total Sequenced Bases (bp) | 43,051,165 | Mean Quality | 10.2 |
Mean Read Length (bp) | 1,531.3 | N50 Read Length (bp) | 2,428.0 |
Proportion of Chloroplast Bases (%) | 1.5 | Chloroplast Coverage | 3 |
Assembled Contigs | 10 | Assembled Contig Length (bp) | 35625 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 128305 |
Total Sequenced Bases (bp) | 276,232,033 | Mean Quality | 11.9 |
Mean Read Length (bp) | 751.6 | N50 Read Length (bp) | 1,024.0 |
Proportion of Chloroplast Bases (%) | 55 | Chloroplast Coverage | 954 |
Assembled Contigs | 10 | Assembled Contig Length (bp) | 184993 |
Assembled Scaffolds | 2 | Assembled Scaffold Length (bp) | 208919 |
Total Sequenced Bases (bp) | 3,793,871,703 | Mean Quality | 10.7 |
Mean Read Length (bp) | 2,344.6 | N50 Read Length (bp) | 4,325.0 |
Proportion of Chloroplast Bases (%) | 1.9 | Chloroplast Coverage | 451 |
Assembled Contigs | 1 | Assembled Contig Length (bp) | 161992 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 161992 |
Total Sequenced Bases (bp) | 637,388,752 | Mean Quality | 10.3 |
Mean Read Length (bp) | 1,562.3 | N50 Read Length (bp) | 2,449.0 |
Proportion of Chloroplast Bases (%) | 1.7 | Chloroplast Coverage | 65 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 127041 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 143789 |
Total Sequenced Bases (bp) | 11,975,706 | Mean Quality | 10.4 |
Mean Read Length (bp) | 1,217.8 | N50 Read Length (bp) | 2,056.0 |
Proportion of Chloroplast Bases (%) | 9 | Chloroplast Coverage | 6 |
Assembled Contigs | 12 | Assembled Contig Length (bp) | 70605 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 127240 |
Total Sequenced Bases (bp) | 172,644,669 | Mean Quality | 10.3 |
Mean Read Length (bp) | 2,731.9 | N50 Read Length (bp) | 4,175.0 |
Proportion of Chloroplast Bases (%) | 9.9 | Chloroplast Coverage | 106 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 143892 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 149951 |
Total Sequenced Bases (bp) | 13,958,953 | Mean Quality | 10.3 |
Mean Read Length (bp) | 2,510.2 | N50 Read Length (bp) | 5,812.0 |
Proportion of Chloroplast Bases (%) | 6.9 | Chloroplast Coverage | 6 |
Assembled Contigs | 6 | Assembled Contig Length (bp) | 105128 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 130314 |
Total Sequenced Bases (bp) | 4,063,677,562 | Mean Quality | 10.2 |
Mean Read Length (bp) | 3,396.6 | N50 Read Length (bp) | 5,846.0 |
Proportion of Chloroplast Bases (%) | 1.4 | Chloroplast Coverage | 353 |
Assembled Contigs | 2 | Assembled Contig Length (bp) | 159765 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 159865 |
Total Sequenced Bases (bp) | 4,591,656 | Mean Quality | 10.1 |
Mean Read Length (bp) | 912.5 | N50 Read Length (bp) | 1,442.0 |
Proportion of Chloroplast Bases (%) | 12 | Chloroplast Coverage | 3 |
Assembled Contigs | 6 | Assembled Contig Length (bp) | 43778 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 142227 |
Total Sequenced Bases (bp) | 558,705,420 | Mean Quality | 10.2 |
Mean Read Length (bp) | 740.5 | N50 Read Length (bp) | 1,020.0 |
Proportion of Chloroplast Bases (%) | 0.13 | Chloroplast Coverage | 4 |
Assembled Contigs | 6 | Assembled Contig Length (bp) | 44844 |
Assembled Scaffolds | 1 | Assembled Scaffold Length (bp) | 72568 |
Total Sequenced Bases (bp) | 4,398,963 | Mean Quality | 10.0 |
Mean Read Length (bp) | 910.8 | N50 Read Length (bp) | 1,295.0 |
Proportion of Chloroplast Bases (%) | 5.1 | Chloroplast Coverage | 1 |
Assembled Contigs | 0 | Assembled Contig Length (bp) | 0 |
Assembled Scaffolds | 0 | Assembled Scaffold Length (bp) | 0 |
Two multiple sequence alignments have been carried out. The first includes all sequences (including partial ones), while the second includes only sequences longer than 120 kb, which were used for the phylogenetic analysis.
The JalviewJS link below for the 'long' sequences will also load the phylogenetic tree in addition to the alignment.
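The length cut-off used to select sequences for the 'long' alignment can be sketched in Python as below. This is not the project's own code: the function names are hypothetical and the sequences are assumed to be in FASTA format.

```python
MIN_LENGTH = 120_000  # 120 kb cut-off quoted in the text

def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

def long_sequences(records, min_length=MIN_LENGTH):
    """Keep only records whose sequence is longer than min_length bases."""
    return [(name, seq) for name, seq in records if len(seq) > min_length]

example = [">short", "ACGT", "AC", ">long", "A" * 120_001]
print([name for name, _ in long_sequences(read_fasta(example))])
```

For a file on disk, an open file handle can be passed to `read_fasta` in place of the list of lines.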
Each identified gene within the assembled chloroplast genomes has been translated to a protein sequence, and then a multiple sequence alignment generated for each gene. Alignments were created using Muscle 5.1 with default parameters.
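Translating a gene into a protein sequence means reading the DNA three bases (one codon) at a time and looking up the amino acid for each codon. The sketch below illustrates the idea; it is not the tool used by the project, the function name `translate` is hypothetical, and it assumes the gene is supplied on the coding strand, in frame, with no introns.

```python
from itertools import product

# Standard codon assignments, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    # Ignore any incomplete codon at the end of the sequence.
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # X marks an unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("ATGGCTTAA"))
```

Aligning the translated proteins rather than the DNA makes changes that alter an amino acid easier to spot than changes that leave the protein unaltered.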