Differences

This shows you the differences between two versions of the page.

training:dudijon:galaxy [2021/12/14 18:05]
slegras [9.1 Import files]
training:dudijon:galaxy [2021/12/14 18:37]
slegras [Introduction to Galaxy]
Line 3: Line 3:
 ^ Instructor ^ Stephanie Le Gras ^
 ^ Duration | 3.5 hours |
-^ Content | {{:training:dudijon:introgalaxy_2020_compressed.pdf|Description of the key features of Galaxy (Lecture)}} |
+^ Content | {{:training:dudijon:introgalaxy_2021_compressed.pdf|Description of the key features of Galaxy (Lecture)}} |
 ^ ::: | Practical session on basic features of Galaxy (Hands-on) |
 ^ Prerequisites | None |
Line 27: Line 27:
 ++++

-==== - Import data into Galaxy ====
-=== - Import files from your computer to Galaxy ===
-  - Download the two files **CRN-107_11-R1.fastq.gz** and **CRN-107_11-R2.fastq.gz** following this [[https://seafile.igbmc.fr/d/345d7581237d4295bf2c/|link]].
-  - Import them to your history called “DNA-seq data analysis”
+==== - Import files from your computer to Galaxy ====
+
+
+

-  * The genome is: Human (hg19)
-  * The format: <auto detect>
-
-++++ Answer |
-  - Click on "Upload Data"
-  - Drag and drop the two fastq files “**CRN-107_11-R1.fastq**” and "**CRN-107_11-R2.fastq**"
-  - Select/Enter Genome for both datasets as: hg19
-  - Click on Start
-
-{{:training:dudijon:03.1-LaunchDragAndDrop.png?|}}
-{{:training:dudijon:03.2-UploadFastqFiles.png?|}}
-++++
-
-==== - Remove a dataset ====
-=== - Import a file from your computer ===
   - Download the file “**sample.bed.gz**” following this [[https://seafile.igbmc.fr/d/345d7581237d4295bf2c/|link]] and upload it to Galaxy.
   * The genome is: Mouse (mm9)
Line 62: Line 44:
 ++++

+==== - Remove a dataset ====
   - Remove the dataset **sample.bed** from your history by clicking on the button
   - You are told that your history is empty. Look at the size of your history
-    - Click on “**deleted**” in the top of the history panel (below the history name). Remove definitely the file from the disk by clicking on "**Permanently remove it from disk**”.
+    - Click on “**deleted**” in the top of the history panel (below the history name). Remove definitely the file from the disk by clicking on "**Supprimer définitivement du disque**”.
     - Click on “hide deleted”

 ==== - Running a tool ====
+  - Download the two files **CRN-107_11-R1.fastq.gz** and **CRN-107_11-R2.fastq.gz** following this [[https://seafile.igbmc.fr/d/345d7581237d4295bf2c/|link]].
+  - Import them to your history called “DNA-seq data analysis”
+    * The genome is: Human (hg19)
+    * The format: <auto detect>
+
+++++ Answer |
+  - Click on "Upload Data"
+  - Drag and drop the two fastq files “**CRN-107_11-R1.fastq.gz**” and "**CRN-107_11-R2.fastq.gz**"
+  - Select/Enter Genome for both datasets as: hg19
+  - Click on Start
+
+{{:training:dudijon:03.1-LaunchDragAndDrop.png?|}}
+{{:training:dudijon:03.2-UploadFastqFiles.png?|}}
+++++
+
   - Use the tool “FastQC Read Quality reports” to compute quality analysis on the datasets “**CRN-107_11-R1.fastq**” and "**CRN-107_11-R2.fastq**"
     - Use default parameters.
Line 123: Line 121:
     - Limit analysis to regions in this BED dataset: CaptureDesign_chr4.bed
   - __SnpEff__ Variant effect and annotation
-    - Sequence changes (SNPs, MNPs, InDels): **output of GATK Haplotype Caller (VCF)**
+    - Sequence changes (SNPs, MNPs, InDels): **output of FreeBayes (VCF)**
     - Input format: VCF
     - Output format: VCF (only if input is VCF)
Line 143: Line 141:
 === - Rename the workflow "DNA-seq data analysis" ===
 ++++ Answer |
-{{:training:dudijon:04-manageworkflow.png?|}}
+{{:training:dudijon:05-editorrunworklow.png?|}}

-Now your can edit or run the workflow:
-
-{{:training:dudijon:05-editorrunworklow.png?|}}
 ++++
  
Line 174: Line 169:
  
   - __Samtools flagstat__ to compute mapping statistics (after BWA mem)
-  - __Filter__ to select aligned reads with a mapping quality >= 20 (after MarkDuplicates)
+  - __Filter SAM or BAM, output SAM or BAM__ to select aligned reads with a mapping quality >= 20 (after MarkDuplicates)
   - __Samtools flagstat__ to compute mapping statistics after removing reads with low mapping qualities (after Filter)
  
Line 180: Line 175:
   - __Flagstat__ tabulate descriptive stats for BAM dataset
     - BAM File to Convert: **output of BWA mem**
-  - __Filter__ BAM datasets on a variety of attributes
-    - BAM dataset(s) to filter: **output of Picard MarkDuplicates**
-    - Select BAM property to filter on: mapQuality
-      - Filter on read mapping quality (phred scale): **>=20** (this exact expression, including ">="!)
+  - __Filter SAM or BAM, output SAM or BAM__ files on FLAG MAPQ RG LN or by region
+    - SAM or BAM file to filter: **output of Picard MarkDuplicates**
+    - Minimum MAPQ quality score: **20**
   - __Flagstat__ tabulate descriptive stats for BAM dataset
     - BAM File to Convert: **output of Filter**
Line 204: Line 198:
  
 === - Run the workflow DNA-seq data analysis ===
-  - Set the parameters.
-    - Choose the right files.
+
+{{:training:dudijon:RunWorkflow.png?700|}}
+
+  - Choose the right files
+  - Check the parameters

 {{:training:dudijon:12-runworkflow.png?700|}}
Line 212: Line 209:
  
 ++++ Answer |
-561598 - 531417 = 30181
+561598 - 530355 = 31243
 ++++
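The subtraction in the answer above can be checked with a few lines of Python. This sketch assumes the two numbers are read totals reported before and after the mapping-quality filter; the page excerpt does not state that explicitly, and the function name `reads_removed` is hypothetical.

```python
def reads_removed(total_before, total_after):
    """Return how many reads were dropped between two counts."""
    return total_before - total_after


# Counts taken from the revised answer; their interpretation as
# "before filter" and "after filter" totals is an assumption.
before = 561598
after = 530355

removed = reads_removed(before, after)
fraction = removed / before  # share of reads dropped

print(removed)
```

The `fraction` value gives the share of reads dropped, which is often easier to compare across samples than the raw count.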