...
...
...
...
...
ChIP-seq
...
Homework is due by July 17th, 2017. Please submit your homework documents here.
...
Practice Exercises
**NOTE: When working on O2, be sure to run this in `/n/scratch2/` rather than your home directory.**
# ChIP-Seq Analysis Workflow
1. Create a directory named with your eCommons ID within `/n/scratch2/`. Enter that directory and create a new directory called `HCFC1_chipseq`.
2. You have strong evidence that HCFC1 is the transcription co-factor that associates with your protein of interest. To confirm this hypothesis, you need to find binding regions for HCFC1 and see if they overlap with your current list of regions. The ENCODE project has ChIP-seq data for HCFC1 from the human liver cancer cell line HepG2, consisting of 32 bp single-end reads. We have downloaded this data and made it available for you on Orchestra. (NOTE: If you are interested in finding out more about the dataset you can find the ENCODE record here).
a. Set up a project directory structure within the `HCFC1_chipseq` directory as shown below and copy over the raw FASTQ files from `/n/groups/hbctraining/ngs-data-analysis-longcourse/chipseq/HCFC1` into the appropriate directory:
...
b. Create a shell script that uses positional parameters, takes a `.fastq` file as input, and performs the following:
- Run FASTQC on all FASTQ files.
- Align reads with Bowtie2 using the parameters we used in class. **NOTE: we are not trimming, so you will need to modify the Bowtie2 command. Use the lesson from class or the user manual to help.** For the Bowtie2 index, you will need to copy over or point to the hg19 index files from: `/n/groups/shared_databases/igenome/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/`
- Change alignment file format from SAM to BAM (can be done using samtools or sambamba)
- Sort the BAM file by read coordinate locations (can be done using sambamba or with samtools)
- Filter to keep only uniquely mapping reads (this will also remove any unmapped reads) using sambamba
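The steps listed so far could be laid out roughly as in the sketch below. It is an illustration of how positional parameters and `basename` can drive the file naming, not a reference solution. Several things in it are assumptions rather than part of the assignment: the function names `run` and `chipseq_commands`, the `DRY_RUN` and `CORES` variables, the `_aln`, `_aln_sorted`, and `_aln_filtered` file suffixes, passing the Bowtie2 index prefix as a second argument, and the optional third argument for an output directory. The Bowtie2 alignment-mode options from class are deliberately left out, and the filter expression (no `XS` tag, not unmapped) is one common way to approximate "uniquely mapping" with Bowtie2 output. By default the sketch only prints the commands it would run.

```bash
#!/bin/bash
# Sketch only: prints each command unless DRY_RUN=0 is set in the environment.

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical function holding the per-sample steps.
#   $1 = path to a .fastq file
#   $2 = Bowtie2 index prefix (assumption: supplied by the caller)
#   $3 = output directory (optional, defaults to the current directory)
chipseq_commands() {
    local fq="$1"
    local index="$2"
    local outdir="${3:-.}"
    local cores="${CORES:-2}"

    # Strip the directory and the .fastq extension to get a sample name.
    local base
    base=$(basename "$fq" .fastq)

    # Output file names (suffixes are illustrative choices).
    local sam="${outdir}/${base}_aln.sam"
    local bam="${outdir}/${base}_aln.bam"
    local sorted="${outdir}/${base}_aln_sorted.bam"
    local filtered="${outdir}/${base}_aln_filtered.bam"

    # Quality control on the input reads.
    run fastqc -o "$outdir" "$fq"

    # Alignment; add the alignment-mode options appropriate for untrimmed reads.
    run bowtie2 -p "$cores" -q -x "$index" -U "$fq" -S "$sam"

    # SAM to BAM conversion.
    run samtools view -h -S -b -o "$bam" "$sam"

    # Sort by genomic coordinate.
    run sambamba sort -t "$cores" -o "$sorted" "$bam"

    # Keep reads without an XS tag (no secondary alignment score) that are mapped.
    run sambamba view -h -t "$cores" -f bam \
        -F "[XS] == null and not unmapped" -o "$filtered" "$sorted"
}

# Only run when the script is given at least a FASTQ file and an index prefix.
if [ "$#" -ge 2 ]; then
    chipseq_commands "$1" "$2" "${3:-.}"
fi
```

Saved under a hypothetical name such as `chipseq_sketch.sh`, it could be tried with `bash chipseq_sketch.sh sample.fastq /path/to/index_prefix` to inspect the printed commands before setting `DRY_RUN=0` to execute them.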
...