ChIP-seq Practice Exercises
***NOTE: When working on O2, be sure to run this in `/n/scratch2/` rather than your home directory.***
# ChIP-seq Analysis Workflow
1. Create a directory on `/n/scratch2/` using your eCommons user ID as the directory name. Within that directory, create a new directory called `HCFC1_chipseq`.
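The directory setup in step 1 can be sketched as a small shell function. This is only an illustration, not part of the course materials: the function name `make_project_dir` and its two arguments (the scratch root and your user ID) are hypothetical, chosen so the same commands can be tried in any location before running them on the cluster.

```bash
# Hypothetical helper: create <scratch_root>/<user_id>/HCFC1_chipseq
# and print the path of the new project directory.
make_project_dir() {
    scratch_root="$1"   # e.g. /n/scratch2 on O2
    user_id="$2"        # your eCommons user ID
    # Strip one trailing slash from the root so the joined path has no "//"
    project_dir="${scratch_root%/}/${user_id}/HCFC1_chipseq"
    # -p creates missing parent directories and is a no-op if they already exist
    mkdir -p "$project_dir" && printf '%s\n' "$project_dir"
}
```

On the cluster this might be called as `make_project_dir /n/scratch2 <your_eCommons_ID>`, followed by `cd` into the printed path. Typing the two `mkdir` and `cd` commands by hand works just as well.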
2. You have strong evidence that HCFC1 is the transcription co-factor that associates with your protein of interest. To confirm this hypothesis, you need to find binding regions for HCFC1 and see if they overlap with your current list of regions. The ENCODE project has ChIP-seq data for HCFC1 from the human liver cancer cell line HepG2, consisting of 32 bp single-end reads. We have downloaded this data and made it available for you on O2. (NOTE: If you are interested in finding out more about the dataset you can find the ENCODE record here).
a. Set up a project directory structure within the `HCFC1_chipseq` directory as shown below and copy over the raw FASTQ files from `/n/groups/hbctraining/ngs-data-analysis-longcourse/chipseq/HCFC1` into the appropriate directory:
...