Day 1-2 (07/11-12)

Day 1: Reviewed papers Chris sent, met with Chris and Eske on Zoom to discuss expectations, project outline on wiki page
Day 2: Set up FASRC account. Followed the "User Quick Start Guide" on the FASRC page
- Set up 2FA, FASRC VPN, terminal access, and watched a few data transfer videos
- Used rsync to copy Eske's target directory /n/holystore01/LABS/stubbs_lab/Lab/Auxtel_data/spectrum_data onto my computer
- Went into lab and met Eske, Ali, and Mark in person. Very cool people.
- Opened Jupyter notebook; looked up astropy documentation to open FITS files
- Successfully opened a FITS file to view its contents, retrieved and displayed the table data, displayed the file header information, and plotted the data

Day 3 (07/13)

In Jupyter notebook, looped over all the FITS files in '/spectrum_data' to print out the star name and date/time of each observation
Stored the file names in a record type grouped by star, so that each entry holds all files associated with one star; looks like there are 4 observed stars in the dataset Eske gave me.
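
A minimal sketch of the grouping step. It assumes the star name and observation time sit in the standard FITS header keywords OBJECT and DATE-OBS, which has not been checked against these files. To keep it runnable without the data, the headers are plain dictionaries with hypothetical file and star names; with astropy, each dictionary would come from the file's header instead.

```python
from collections import defaultdict


def group_files_by_star(headers):
    """Map each star name to its files, sorted by observation time.

    `headers` maps file name -> header-like dict. The OBJECT and DATE-OBS
    keys are an assumption of this sketch, not confirmed for the real data.
    """
    by_star = defaultdict(list)
    for filename, header in headers.items():
        star = header.get("OBJECT", "UNKNOWN")
        by_star[star].append((header.get("DATE-OBS", ""), filename))
    # Sort each star's entries by timestamp, then keep only the file names.
    return {star: [name for _, name in sorted(entries)]
            for star, entries in by_star.items()}


# Hypothetical headers for illustration only (not real observations).
example_headers = {
    "file_b.fits": {"OBJECT": "STAR_A", "DATE-OBS": "2022-06-29T02:00:00"},
    "file_a.fits": {"OBJECT": "STAR_A", "DATE-OBS": "2022-06-29T01:00:00"},
    "file_c.fits": {"OBJECT": "STAR_B", "DATE-OBS": "2022-06-30T01:30:00"},
}
grouped = group_files_by_star(example_headers)
for star, files in sorted(grouped.items()):
    print(star, files)
```

The number of keys in the returned dictionary is the number of distinct stars in the directory.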

Day 4 (07/14)

Among the data files Eske gave me, found files that are missing the table of equivalent widths:

spec_data_2022062800336.fits HD205905 2022-06-29

Plotted H2O and O2 equivalent widths against air masses for each star on each night (four stars on four nights based on data Eske gave me)
Noticed that some files do not have O2 data (may have variants like O2(Z) or O2(B), though)

First Impressions of Data:

The equivalent widths of H2O seem to have much more variability than those of O2
There may be some outliers in the data (particularly in the first two plots, which show negative equivalent widths); may need to check on those data points

Findings

Links to Notebook:

Github repo of project code: https://github.com/ariscjj/stubbs

In notebook: https://github.com/ariscjj/stubbs/blob/master/Extract_H20.ipynb (added Chris as a collaborator on the github)

07/19/2022

Met with Chris over Zoom, got a few tasks to do:
- ObsId failure mode notes
- Three-slide, 5-minute lightning intro to the other undergrads (Thurs)
- Plot H-alpha and H-beta: they should have no dependence on airmass; use for quality control
- Separate the O2 lines (B and Z)
- Perform five-sigma clipping (probably with SciPy or astropy) to clean the data; data trimming
  - Five-sigma clipping: remove outliers lying more than five standard deviations (STDs) from the rest of the data
  - If uncertainties in the equivalent widths are reported in the future, compare the reported uncertainties with the experimental STDs
- Add error bars to plots; the error bars represent the underlying Gaussian distribution
- Create a linear fit through the data
  - a * (airmass) + b * (airmass) ^ (1/2) + c
  - Create a polynomial fit
- Create list of questions for data reduction
  - Where are the uncertainties in equivalent widths?
  - Why are there different O2 lines for some plots and not others?
- Keep track of stars and airmass span
- Process more data; contact Chris when delta airmass is greater than 1

Finished adding the H-alpha and H-beta lines; abstracted the code to make extraction easier
Separated the O2 lines into the B, Z, and Y types and plotted them
Working on sigma_clipping
Working on masking the data and applying the mask to the x column as well

20 Jul 2022

Successfully implemented five sigma clipping for all molecules and plots for each star and each night
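
A minimal sketch of five-sigma clipping, written with the standard library only so it runs anywhere. The choice of the median as the center, the population standard deviation as the spread, and the iteration cap are assumptions of this sketch, not necessarily what the notebook does. The function returns a mask, so the same mask can be applied to the airmass (x) column as well. The numbers are synthetic.

```python
import statistics


def sigma_clip_mask(values, n_sigma=5.0, max_iters=5):
    """Return a list of booleans: True where the point is kept.

    Points further than n_sigma standard deviations from the median of the
    currently kept points are rejected; repeat until nothing changes.
    """
    keep = [True] * len(values)
    for _ in range(max_iters):
        kept = [v for v, k in zip(values, keep) if k]
        if len(kept) < 2:
            break
        center = statistics.median(kept)
        spread = statistics.pstdev(kept)
        new_keep = [k and abs(v - center) <= n_sigma * spread
                    for v, k in zip(values, keep)]
        if new_keep == keep:
            break
        keep = new_keep
    return keep


# Synthetic data: 30 well-behaved points plus one obvious outlier.
airmass = [1.0 + 0.05 * i for i in range(31)]
widths = [1.0 + 0.01 * (i % 5) for i in range(30)] + [100.0]

mask = sigma_clip_mask(widths)
# Apply the same mask to both columns so x and y stay aligned.
clean_airmass = [x for x, k in zip(airmass, mask) if k]
clean_widths = [y for y, k in zip(widths, mask) if k]
print(len(widths), "->", len(clean_widths))
```

Note that with only a handful of points a single outlier inflates the standard deviation enough that it can never sit five sigma out, so clipping per star per night may need a reasonable number of points to do anything. The astropy function astropy.stats.sigma_clip offers the same kind of clipping and returns a masked array.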

07/21/22

Successfully fit linear models to the data; can extract the fitted equations and the R^2 values
Managed to fit the data to a single half-order equation (a * x ^ (1/2) + b), but not to the combined form (a * x ^ (1/2) + b * x + c)
Attempting the combined fit; hitting some bugs when fitting the half-order and first-order terms together
- Bugs fixed; now have (a * x ^ (1/2) + b * x + c) fits of equivalent width against airmass for each type of molecule
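
A sketch of one way to get the (a * x ^ (1/2) + b * x + c) fit, not necessarily how the notebook does it. The model is linear in a, b, and c, so ordinary least squares is enough and no nonlinear optimizer is needed. This version solves the normal equations by hand with the standard library; the helper names are made up for the example and the data are synthetic.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_sqrt_linear(x, y):
    """Least-squares fit of y = a * sqrt(x) + b * x + c; returns (a, b, c, r2)."""
    basis = [[math.sqrt(xi), xi, 1.0] for xi in x]
    # Normal equations: (B^T B) p = B^T y, with p = (a, b, c).
    btb = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
           for i in range(3)]
    bty = [sum(row[i] * yi for row, yi in zip(basis, y)) for i in range(3)]
    a, b, c = solve_linear_system(btb, bty)
    model = [a * r[0] + b * r[1] + c for r in basis]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - mi) ** 2 for yi, mi in zip(y, model))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, c, 1.0 - ss_res / ss_tot


# Synthetic check: noiseless data built from known coefficients.
airmass = [1.0 + 0.1 * i for i in range(12)]
widths = [2.0 * math.sqrt(x) + 0.5 * x + 0.3 for x in airmass]
a, b, c, r2 = fit_sqrt_linear(airmass, widths)
print(a, b, c, r2)
```

Over a narrow airmass range, sqrt(x) and x are close to collinear, so a and b can trade off against each other and end up with large, correlated uncertainties on noisy data. numpy.linalg.lstsq or scipy.optimize.curve_fit would do the same fit with less code.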

Stubbs notes, July 22 2022

Not really enough detail here to tell where things stand.

near term goals:

construct a table that aggregates the airmass-dependent parameters a,b,c that come from the fits, with appropriate uncertainties, for all stars for all nights.

Then do consistency checks:

are the a and b values consistent for the oxygen line? Shouldn't depend on the star or the night
are the a and b values consistent with zero for all stars for all nights?
etc.
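
One possible way to phrase these consistency checks, as a sketch: take the fitted values of one parameter across stars and nights with their uncertainties, then compute the inverse-variance weighted mean and a chi-squared per degree of freedom, either about that mean (are the values consistent with each other?) or about zero (are they consistent with zero?). The function names and the numbers below are made up for illustration.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)


def chi2_per_dof(values, sigmas, reference, n_fitted):
    """Chi-squared about `reference`, divided by (N - n_fitted)."""
    chi2 = sum(((v - reference) / s) ** 2 for v, s in zip(values, sigmas))
    return chi2 / (len(values) - n_fitted)


# Made-up fit results for one parameter on four nights (not real data).
a_values = [0.10, 0.12, 0.09, 0.11]
a_sigmas = [0.02, 0.02, 0.02, 0.02]

mean_a, mean_a_err = weighted_mean(a_values, a_sigmas)
# One parameter (the mean) was estimated from the data, so n_fitted=1.
mutual = chi2_per_dof(a_values, a_sigmas, mean_a, n_fitted=1)
# Comparing against a fixed value of zero uses no fitted parameter.
versus_zero = chi2_per_dof(a_values, a_sigmas, 0.0, n_fitted=0)
print(mean_a, mean_a_err, mutual, versus_zero)
```

A value near 1 means the scatter matches the quoted uncertainties; a value much larger than 1 means the values disagree with the reference by more than the uncertainties allow.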

Identify instances of missing equivalent widths, to tell our friends who do these reductions.

Make a list of bad-data instances, so we can screen them out of our analysis.

stretch goals:

1)
Many of these stars have "known" spectra. They're called CALSPEC standards. See https://www.stsci.edu/hst/instrumentation/reference-data-for-calibration-and-tools/astronomical-catalogs/calspec

You could download the entire CALSPEC catalog, and generate ASCII files of wavelength, flux. Note they use strange units. Our spectra are basically photons per sec per nm of wavelength, but if CALSPEC uses energy units (likely ergs/nm/sec/cm^2) then we need to convert to photon spectra (likely multiply by lambda).
There is an overall vertical scaling we don't care about, only relative fluxes matter.
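
The "multiply by lambda" step follows from each photon carrying energy h*c/lambda, so photon flux is proportional to energy flux times wavelength. A minimal sketch, with a hypothetical function name, that drops the constant h*c and normalizes to the peak, since only relative fluxes matter here:

```python
def relative_photon_spectrum(wavelengths, energy_flux):
    """Convert an energy-flux spectrum to a relative photon-flux spectrum.

    Each photon carries energy h*c/lambda, so photon flux is proportional to
    energy_flux * lambda. The result is divided by its maximum because the
    overall vertical scaling is not needed.
    """
    photons = [f * lam for lam, f in zip(wavelengths, energy_flux)]
    peak = max(photons)
    return [p / peak for p in photons]


# A spectrum that is flat in energy units is not flat in photon units.
wavelengths_nm = [400.0, 600.0, 800.0]
flat_energy_flux = [1.0, 1.0, 1.0]
photon_spectrum = relative_photon_spectrum(wavelengths_nm, flat_energy_flux)
print(photon_spectrum)  # [0.5, 0.75, 1.0]
```

Because of the normalization, the wavelength unit (nm or Angstrom) does not change the result, as long as the flux is per unit wavelength on the same scale as ours.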

Our observed spectra are the product of (star spectrum)*(atmos)*(instrumental response).

If you spline the CALSPEC spectra onto same wavelength scale as our wavelength-calibrated spectra, fit our spectra as a function of airmass (the actual spectra, not the equivalent widths), extrapolate to zero airmass, and divide that by the known stellar spectrum we should recover the instrumental throughput function.
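
A rough sketch of the extrapolate-and-divide step, under a loudly stated assumption: the flux in each wavelength bin falls off exponentially with airmass, so ln(flux) is linear in airmass and the intercept gives the flux at zero airmass. The note above does not say which functional form to fit, and this one may not hold inside strong absorption bands. The reference spectrum is assumed to be already resampled onto our wavelength grid. All names and numbers are illustrative.

```python
import math


def extrapolate_to_zero_airmass(airmasses, fluxes):
    """Fit ln(flux) = intercept + slope * airmass; return flux at airmass 0.

    Assumes exponential attenuation with airmass (an assumption of this
    sketch, not something established for every wavelength bin).
    """
    logs = [math.log(f) for f in fluxes]
    n = len(airmasses)
    mean_x = sum(airmasses) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in airmasses)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(airmasses, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept)


def throughput(airmasses, observed, stellar):
    """Relative instrumental throughput for each wavelength bin.

    observed[i][j] is the flux in wavelength bin j of spectrum i;
    stellar[j] is the reference spectrum on the same wavelength grid.
    """
    result = []
    for j, star_flux in enumerate(stellar):
        column = [spectrum[j] for spectrum in observed]
        top_of_atmosphere = extrapolate_to_zero_airmass(airmasses, column)
        result.append(top_of_atmosphere / star_flux)
    return result


# Synthetic test: build spectra from a known response and recover it.
true_response = [0.2, 0.5]
stellar = [10.0, 4.0]
extinction = [0.3, 0.1]  # made-up attenuation per unit airmass
airmasses = [1.0, 1.5, 2.0]
observed = [[s * r * math.exp(-k * x)
             for s, r, k in zip(stellar, true_response, extinction)]
            for x in airmasses]
recovered = throughput(airmasses, observed, stellar)
print(recovered)
```

Only the shape of the recovered response is meaningful, since the overall scaling of both spectra is arbitrary.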

2)
Using the spectra we have, for wavelengths less than 700 nm pick some clean regions and compute flux in 3-4 different intervals to investigate continuum reduction.
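
A small sketch of the band-flux computation, using a trapezoid-rule integral over each interval. The three intervals below are placeholders, not vetted "clean" regions, and the spectrum is synthetic.

```python
def band_flux(wavelengths, flux, lo, hi):
    """Trapezoid-rule integral of flux over samples with lo <= wavelength <= hi.

    Assumes `wavelengths` is sorted in increasing order.
    """
    pts = [(w, f) for w, f in zip(wavelengths, flux) if lo <= w <= hi]
    return sum(0.5 * (f0 + f1) * (w1 - w0)
               for (w0, f0), (w1, f1) in zip(pts, pts[1:]))


# Placeholder intervals in nm, all below 700 nm (hypothetical choices).
bands = [(400.0, 450.0), (500.0, 550.0), (600.0, 650.0)]

# Synthetic flat spectrum sampled every 1 nm from 350 to 700 nm.
wavelengths = [float(w) for w in range(350, 701)]
flux = [2.0] * len(wavelengths)

band_fluxes = [band_flux(wavelengths, flux, lo, hi) for lo, hi in bands]
print(band_fluxes)  # [100.0, 100.0, 100.0]
```

Computing these band fluxes for every spectrum of a star and plotting them against airmass would show how the continuum in each interval changes through the night.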

22 Jul 2022

Met with Chris and Eske to discuss new tasks for the week (more details can be found in personal .md file "Meeting 0722")
Continued working on the "Data_Clipping.ipynb" notebook to create a table with a column of files and a column of their missing information (missing data or negative eq widths)
Created a dictionary that stores each file name with an array of error messages → converted it into a 2D array of file names and errors
Successfully created the table to output in Jupyter notebook, but having difficulties exporting an image of the data table, which would be nice to have. Tried the following:
from PIL import Image
import imgkit
import dataframe_image as dfi
which did not work even after I installed new packages
Here's what I have now:

Noted that I only have 187 files; I think Eske mentioned I should have access to a number of files in the 700 range. Will need to check with him on that.
Exported the data table as the "faulty_files.csv" file, which can be found below. I might not pursue the conversion of the pandas DataFrame to an image any further:
- faulty_files.csv
Cleaned up code (removed blocks of commented print statements, etc.)
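
For reference, a minimal sketch of the dictionary → 2D array → CSV path described in this entry, using only the standard library rather than pandas. The column names, error strings, and file names are hypothetical, and an in-memory buffer stands in for the real output file.

```python
import csv
import io


def errors_to_rows(file_errors):
    """Flatten {file name: [error messages]} into [file name, errors] rows.

    Files with no errors are skipped; multiple messages are joined with "; ".
    """
    return [[name, "; ".join(messages)]
            for name, messages in sorted(file_errors.items()) if messages]


def write_faulty_files(rows, handle):
    """Write a header line and the rows to an open text handle as CSV."""
    writer = csv.writer(handle)
    writer.writerow(["file", "errors"])  # hypothetical column names
    writer.writerows(rows)


# Hypothetical entries for illustration; the real dictionary is built
# in the notebook from the actual FITS files.
file_errors = {
    "example_1.fits": ["missing equivalent width table"],
    "example_2.fits": ["negative H2O equivalent width", "missing O2 data"],
    "example_3.fits": [],
}
rows = errors_to_rows(file_errors)

# Stand-in for: open("faulty_files.csv", "w", newline="")
buffer = io.StringIO()
write_faulty_files(rows, buffer)
print(buffer.getvalue())
```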

ObsId	failure mode	notes

Lab Notebook