Differences

This shows you the differences between two versions of the page.

--- training:dudijon:galaxy [2021/12/14 18:12]
slegras [9.2 Run the workflow DNA-seq data analysis]
+++ training:dudijon:galaxy [2022/12/09 15:37] (current)
slegras
@@ Line 3: / Line 3: @@
 ^ Instructor ^ Stephanie Le Gras ^
 ^ Duration | 3.5 hours |
-^ Content | {{:training:dudijon:introgalaxy_2020_compressed.pdf|Description of the key features of Galaxy (Lecture)}} |
+^ Content | {{:training:dudijon:introgalaxy_2021_compressed.pdf|Description of the key features of Galaxy (Lecture)}} |
 ^ ::: | Practical session on basic features of Galaxy (Hands-on) |
 ^ Prerequisites | None |
@@ Line 27: / Line 27: @@
 ++++
-==== - Import data into Galaxy ====
+==== - Import files from your computer to Galaxy ====
-=== - Import files from your computer to Galaxy ===
-  - Download the two files **CRN-107_11-R1.fastq.gz** and **CRN-107_11-R2.fastq.gz** following this [[https://seafile.igbmc.fr/d/345d7581237d4295bf2c/|link]].
-  - Import them to your history called “DNA-seq data analysis”
-  * The genome is: Human (hg19)
+  - Download the file “**sample.bed.gz**” following this [[https://seafile.igbmc.fr/d/1adaad8f80394182a784/|link]]  and upload it to Galaxy.
-  * The format: <auto detect>
-++++ Answer |
-  - Click on "Upload Data"
-  - Drag and drop the two fastq files “**CRN-107_11-R1.fastq**” and "**CRN-107_11-R2.fastq**"
-  - Select/Enter Genome for both datasets as: hg19
-  - Click on Start
-{{:training:dudijon:03.1-LaunchDragAndDrop.png?|}}
-{{:training:dudijon:03.2-UploadFastqFiles.png?|}}
-++++
-==== - Remove a dataset ====
-=== - Import a file from your computer ===
-  - Download the file “**sample.bed.gz**” following this [[https://seafile.igbmc.fr/d/345d7581237d4295bf2c/|link]]  and upload it to Galaxy.
   * The genome is: Mouse (mm9)
   * The format is: bed
@@ Line 62: / Line 44: @@
 ++++
+==== - Remove a dataset ====
   - Remove the dataset **sample.bed** from your history by clicking on the button
   - You are told that your history is empty. Look at the size of your history
-    - Click on “**deleted**” at the top of the history panel (below the history name). Permanently remove the file from disk by clicking on “**Permanently remove it from disk**”.
+    - Click on “**deleted**” at the top of the history panel (below the history name). Permanently remove the file from disk by clicking on “**Supprimer définitivement du disque**” (“Permanently remove it from disk” in the English interface).
     - Click on “hide deleted”
 ==== - Running a tool ====
+  - Download the two files **CRN-107_11-R1.fastq.gz** and **CRN-107_11-R2.fastq.gz** following this [[https://seafile.igbmc.fr/d/1adaad8f80394182a784/|link]].
+  - Import them to your history called “DNA-seq data analysis”
+    * The genome is: Human (hg19)
+    * The format: <auto detect>
+++++ Answer |
+  - Click on "Upload Data"
+  - Drag and drop the two fastq files “**CRN-107_11-R1.fastq.gz**” and "**CRN-107_11-R2.fastq.gz**"
+  - Select/Enter Genome for both datasets as: hg19
+  - Click on Start
+{{:training:dudijon:03.1-LaunchDragAndDrop.png?|}}
+{{:training:dudijon:03.2-UploadFastqFiles.png?|}}
+++++
   - Use the tool “FastQC Read Quality reports” to run a quality analysis on the datasets “**CRN-107_11-R1.fastq**” and “**CRN-107_11-R2.fastq**”
     - Use default parameters.
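
To give a feel for what a read quality report summarises, here is a minimal Python sketch that computes the mean Phred quality of each read in a FASTQ text. It is not how FastQC works internally. It assumes Phred+33 quality encoding and strict 4-line records, and the function names and the two example reads are made up for illustration (they are not taken from the CRN-107_11 files).

```python
def phred33_scores(quality_line):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - 33 for ch in quality_line]


def mean_quality_per_read(fastq_text):
    """Return (read_name, mean Phred score) for each 4-line FASTQ record."""
    lines = [ln for ln in fastq_text.splitlines() if ln.strip()]
    results = []
    for i in range(0, len(lines) - 3, 4):
        header, _seq, _plus, qual = lines[i:i + 4]
        scores = phred33_scores(qual)
        # Drop the leading "@" of the header to get the read name.
        results.append((header[1:], sum(scores) / len(scores)))
    return results


# Hypothetical two-read example: "I" encodes Phred 40, "!" encodes Phred 0.
example = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n"
for name, mean_q in mean_quality_per_read(example):
    print(name, mean_q)
```

With the made-up input above, the first read averages 40 and the second averages 20, which is the kind of per-read value a quality report aggregates into plots.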
@@ Line 95: / Line 93: @@
   * CRN-107_11-R1.fastq
   * CRN-107_11-R2.fastq
-  * CaptureDesign_chr4.bed (download it from [[https://seafile.igbmc.fr/d/345d7581237d4295bf2c/|here]])
+  * CaptureDesign_chr4.bed (download it from [[https://seafile.igbmc.fr/d/1adaad8f80394182a784/|here]])
 Import missing files from the data library "**DNA-seq test datasets**"
@@ Line 123: / Line 121: @@
     - Limit analysis to regions in this BED dataset: CaptureDesign_chr4.bed
   - __SnpEff__ Variant effect and annotation
-    - Sequence changes (SNPs, MNPs, InDels): **output of GATK Haplotype Caller (VCF)**
+    - Sequence changes (SNPs, MNPs, InDels): **output of FreeBayes (VCF)**
     - Input format: VCF
     - Output format: VCF (only if input is VCF)
@@ Line 143: / Line 141: @@
 === - Rename the workflow "DNA-seq data analysis" ===
 ++++ Answer |
-{{:training:dudijon:04-manageworkflow.png?|}}
+{{:training:dudijon:05-editorrunworklow.png?|}}
-Now you can edit or run the workflow:
-{{:training:dudijon:05-editorrunworklow.png?|}}
 ++++
@@ Line 174: / Line 169: @@
   - __Samtools flagstat__ to compute mapping statistics (after BWA mem)
-  - __Filter__ to select aligned reads with a mapping quality >= 20 (after MarkDuplicates)
+  - __Filter SAM or BAM, output SAM or BAM__ to select aligned reads with a mapping quality >= 20 (after MarkDuplicates)
   - __Samtools flagstat__ to compute mapping statistics after removing reads with low mapping qualities (after Filter)
@@ Line 180: / Line 175: @@
   - __Flagstat__ tabulate descriptive stats for BAM dataset
     - BAM File to Convert: **output of BWA mem**
-  - __Filter__ BAM datasets on a variety of attributes
+  - __Filter SAM or BAM, output SAM or BAM__ files on FLAG MAPQ RG LN or by region
-    - BAM dataset(s) to filter: **output of Picard MarkDuplicates**
+    - SAM or BAM file to filter: **output of Picard MarkDuplicates**
-    - Select BAM property to filter on: mapQuality
+    - Minimum MAPQ quality score: **20**
-      - Filter on read mapping quality (phred scale): **>=20** (this exact expression, including ">="!)
   - __Flagstat__ tabulate descriptive stats for BAM dataset
     - BAM File to Convert: **output of Filter**
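
The effect of a minimum MAPQ of 20 can be sketched in plain Python on SAM text, where the mapping quality is the fifth tab-separated column of each alignment line. This is only an illustration of the filtering rule, not the Galaxy tool's implementation; the function name `filter_by_mapq` and the example records are hypothetical.

```python
def filter_by_mapq(sam_lines, min_mapq=20):
    """Keep header lines and alignments whose MAPQ (5th SAM column) is >= min_mapq."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines carry no MAPQ and are always kept.
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) >= min_mapq:
            kept.append(line)
    return kept


# Made-up records: r1 (MAPQ 60) and r3 (MAPQ 20) pass, r2 (MAPQ 19) is dropped.
sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr4\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr4\t200\t19\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t16\tchr4\t300\t20\t4M\t*\t0\t0\tACGT\tIIII",
]
passed = filter_by_mapq(sam)
names = [line.split("\t")[0] for line in passed if not line.startswith("@")]
print(names)
```

Note that the threshold is inclusive: a read with MAPQ exactly 20 is kept, matching the ">= 20" wording of the exercise.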
@@ Line 215: / Line 209: @@
 ++++ Answer |
-- 531417 = 30181
+- 530355 = 31243
 ++++
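
The left-hand operand of the subtraction is not visible in this excerpt, so it is not reconstructed here. As a sanity check only, the short Python sketch below verifies that the old and new answer lines are consistent with one and the same starting value; the variable names are illustrative.

```python
# Values copied from the old (-) and new (+) answer lines.
old_subtrahend, old_difference = 531417, 30181
new_subtrahend, new_difference = 530355, 31243

# If "x - subtrahend = difference", then x = subtrahend + difference.
old_total = old_subtrahend + old_difference
new_total = new_subtrahend + new_difference

print(old_total, new_total)
```

Both revisions imply the same starting value, which suggests that only the subtracted count changed between the two versions of the page.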