Introduction to Unix and Orchestra - October 2015

General Information

This hands-on workshop teaches basic concepts, skills and tools for working more effectively with data. Less time, less pain.

Instructor-led, hands-on tutorials will use genomics data to demonstrate how to use the Linux/UNIX command line interface to perform basic text manipulations and manage datasets effectively. In addition, participants will learn how to use a high-performance compute environment, the Orchestra compute cluster (HMS-RC), in the context of an RNA-Seq workflow. We will not teach any particular bioinformatics tool; instead, we focus on the foundational skills that allow you to conduct any analysis and to examine the output of a genomics pipeline.

We encourage participants to bring their own laptops and plan to participate actively. By the end of the workshop, learners should be able to manage and analyze data more effectively and to apply the tools and approaches directly to their ongoing research.

Who: The course is aimed at Harvard Medical School-affiliated researchers from the Harvard NeuroDiscovery Center or the Basic and Social Science Departments on the Quad. No prior computational experience is required, though we do expect familiarity with genomics concepts.

Where: Countway Library, Room L2-025, 10 Shattuck St, Boston, MA 02115. Get directions with Google Maps.

Requirements: Participants who bring their own laptop need to install a few specific software packages beforehand (listed under Setup below).

Contact: Please email hbctraining@hsph.harvard.edu for more information. For any Orchestra-specific questions, please email rchelp@hms.harvard.edu.

Etherpad: https://etherpad.net/p/HBC_intro_to_unix

We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code. 


Schedule

 

********************************* Day 1 (October 8th) Schedule *********************************

Time             Session                                          Instructor
9:00 - 9:30      Welcome and Introductions                        All
9:30 - 9:50      Introduction to workshop                         Radhika
9:50 - 10:30     Unix 1: Introduction to the shell                Radhika
10:30 - 10:45    Coffee
10:45 - 11:35    Unix 1: Introduction to the shell (continued)    Radhika
11:35 - 12:30    Unix 2: Searching and redirection                Mary
12:30 - 13:30    Lunch
13:30 - 14:30    Unix 3: "for" loop and shell scripts             Meeta
14:30 - 15:00    Unix 4: Permissions and environment variables    Radhika
15:00 - 15:15    Coffee
15:15 - 15:45    Introduction to High Performance Computing       Radhika
15:45 - 17:00    Introduction to Orchestra                        Kris Holton, HMS-RC


********************************* Day 2 (October 9th) Schedule *********************************

Time             Session                                    Instructor
9:00 - 9:10      HPC refresher                              Radhika
9:10 - 9:40      Project and data management                Meeta
9:40 - 10:30     Data QC                                    Mary
10:30 - 10:45    Coffee
10:45 - 11:25    Data QC (continued)                        Mary
11:25 - 12:30    RNA-Seq workflow                           Meeta
12:30 - 13:30    Lunch
13:30 - 14:00    RNA-Seq workflow (continued)               Meeta
14:00 - 15:00    Automating RNA-Seq workflow                Radhika
15:00 - 15:15    Coffee
15:15 - 16:15    Automating RNA-Seq workflow (continued)    Radhika
15:45 - 17:00    Resources and wrap-up                      Radhika

Setup

To participate in this workshop, you will need working copies of the software described below. Please install everything (or at least download the installers) before the workshop starts. We strongly encourage you to bring and use your own laptop, so that your tools are set up properly for an efficient workflow once you leave the workshop.

1. Create your own account on Orchestra, if you do not already have one. Please do this ASAP.

    • An HMS eCommons ID is required to create an account on Orchestra. If you are unsure whether you have an eCommons account or have forgotten your password, there is a self-service link on the eCommons website: https://ecommons.med.harvard.edu/.

    • To create your account on Orchestra:
        1. Go to: https://rc.hms.harvard.edu/#orchestra
        2. Click the red “Account Request” button. This opens a web form for requesting a user account.
        3. Fill out the required fields (Name, eCommons ID, HMS (or affiliated) email address, and the Organization/Department you belong to).

Once the account has been created, you will receive a confirmation email from HMS Research Computing.


2. We strongly recommend that you bring your own computer, although we will have computers available.

    If you do bring your own computer:

a) install the following programs

Mac users

1. Java
2. FileZilla
3. Integrative Genomics Viewer (IGV)
4. TextWrangler or Sublime Text

Windows users

1. PuTTY
2. Java
3. FileZilla
4. Integrative Genomics Viewer (IGV)
5. Notepad++

If you would like us to check your installations prior to class, please arrive at 8:30am on Thursday, and email us any questions about the installations.

b) be able to plug into the wired network (the Wi-Fi network is quite slow, so laptops without an Ethernet port, such as MacBook Airs, are not recommended!)

c) bring a power cord

 

Acknowledgements & Support

This workshop was sponsored by the HMS Tools and Technology Committee (TnT) and the Harvard NeuroDiscovery Center (HNDC).

The structure, objectives, and workflow of this workshop are derived from the Data Carpentry genomics workshop. Data Carpentry is supported by the Gordon and Betty Moore Foundation and a partnership of several NSF-funded BIO Centers (NESCent, iPlant, iDigBio, BEACON and SESYNC) and Software Carpentry. The structure and objectives of the curriculum, as well as the teaching style, are informed by Software Carpentry.

 


Copyright © 2024 The President and Fellows of Harvard College