Day 1-2 (07/11-12)
- Day 1: Reviewed papers Chris sent, met with Chris and Eske on Zoom to discuss expectations, project outline on wiki page
- Day 2: Set up FASRC account. Followed the "User Quick Start Guide" on FASRC page
- set up 2FA, FASRC VPN, terminal access, and watched a few data transfer videos
- Used rsync to load Eske's target directory /n/holystore01/LABS/stubbs_lab/Lab/Auxtel_data/spectrum_data onto computer
- Went into lab and met Eske, Ali, and Mark in person. Very cool people.
- Opened Jupyter notebook; looked up astropy documentation to open fits files
- Successfully opened fits file to view contents inside, retrieved and displayed table data, displayed file header information, and plotted data
Day 3 (07/13)
- In Jupyter notebook, filtered all the fits files in '/spectrum_data' to print out star name and date/time of observation
- Stored the file names in a record grouped by star, so each entry holds all the files associated with one star; looks like there are 4 observed stars in the dataset Eske gave me.
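A minimal sketch of the grouping step above, assuming the star name and observation date have already been read from each file's header. Only the first record comes from these notes; the other file names, the star name STAR_B, and the function name group_by_star are placeholders for illustration.

```python
from collections import defaultdict

# (filename, star, observation date) records. Only the first entry appears in
# these notes; the remaining entries are made-up placeholders.
records = [
    ("spec_data_2022062800336.fits", "HD205905", "2022-06-29"),
    ("example_a.fits", "HD205905", "2022-06-29"),
    ("example_b.fits", "STAR_B", "2022-06-30"),
]


def group_by_star(records):
    """Map each star name to the list of files in which it was observed."""
    files_by_star = defaultdict(list)
    for filename, star, _date in records:
        files_by_star[star].append(filename)
    return dict(files_by_star)


files_by_star = group_by_star(records)
for star, files in sorted(files_by_star.items()):
    print(star, len(files))
```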
Day 4 (07/14)
- Found data files, among those Eske gave me, that are missing the table of equivalent widths:
spec_data_2022062800336.fits HD205905 2022-06-29
- Plotted H2O and O2 equivalent widths against air masses for each star on each night (four stars on four nights based on data Eske gave me)
- Noticed that some files do not have O2 data (may have variants like O2(Z) or O2(B), though)
First Impressions of Data:
- The equivalent widths of H2O seem to vary much more than those of O2
- There may be some outliers in the data (particularly in the first two plots, which show negative equivalent widths); need to check on those data points
Links to Notebook:
Github repo of project code: https://github.com/ariscjj/stubbs
In notebook: https://github.com/ariscjj/stubbs/blob/master/Extract_H20.ipynb (added Chris as a collaborator on the github)
- Met with Chris over zoom, got a few tasks to do:
ObsId failure mode notes
Plot H-alpha and H-beta; these should have no dependence on airmass (for quality control)
Separate the O2 lines (B and Z)
Perform five-sigma clipping (probably with SciPy or astropy) to clean the data; data trimming
Five-sigma clipping: remove outliers that lie more than five standard deviations (STDs) from the mean of the data
If in future there are uncertainties in equivalent widths to be reported, compare reported with experimental STDs
Add error bars to plots, error bars represent underlying Gaussian distribution
create linear fit through data
a * (airmass) + b * \sqrt{airmass} + c
create a polynomial fit
Create list of questions for data reduction
Where are the uncertainties in equivalent widths?
Why are there different O2 lines for some plots and not others?
Keep track of stars and airmass span
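A rough sketch of the five-sigma clipping task from the list above, written with only the standard library so it runs anywhere. This is one possible reading of the task, not the notebook's actual code: it clips around the mean using the population standard deviation and repeats until nothing more is removed. The function name sigma_clip and the sample values are assumptions for illustration; in the notebook, astropy.stats.sigma_clip would be the more likely tool.

```python
import statistics


def sigma_clip(values, n_sigma=5.0, max_iter=10):
    """Iteratively drop points more than n_sigma standard deviations from the mean."""
    kept = list(values)
    for _ in range(max_iter):
        if len(kept) < 2:
            break
        center = statistics.fmean(kept)
        spread = statistics.pstdev(kept)  # population standard deviation
        if spread == 0:
            break  # all remaining points are identical
        survivors = [v for v in kept if abs(v - center) <= n_sigma * spread]
        if len(survivors) == len(kept):
            break  # converged: nothing was clipped this pass
        kept = survivors
    return kept


# Made-up equivalent widths: 50 well-behaved values plus one obvious outlier.
widths = [0.9, 1.1] * 25 + [100.0]
clipped = sigma_clip(widths)
print(len(widths), "->", len(clipped))
```

One thing to keep in mind: with the population standard deviation, no point in a sample of n values can sit more than sqrt(n - 1) standard deviations from the mean, so a five-sigma cut cannot remove anything from a set of 26 or fewer points. For a single star on a single night, a lower threshold or a median-based center may be needed.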
- successfully fit linear models to the data, can grab equations, and the R^2 value
- Managed to fit the data to a single equation of the 1/2 order (a * x ^ (1/2) + b) but not (a * x ^ (1/2) + b * x + c)
- Attempting the polynomial fit; hitting some bugs when fitting the 1/2-order and first-order terms together
- bugs fixed, have (a * x ^ (1/2) + b * x + c) functions for each type of molecule's equivalent widths against airmass
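Since a * x ^ (1/2) + b * x + c is linear in the coefficients a, b, and c, it can be fit with ordinary least squares. Below is a standard-library sketch under that assumption, solving the normal equations directly; it is not the code from the notebook. The function names and the synthetic airmass values are placeholders, and numpy.linalg.lstsq or scipy.optimize.curve_fit would do the same job more compactly.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_sqrt_linear(x, y):
    """Least-squares fit of y = a*sqrt(x) + b*x + c; returns (a, b, c)."""
    design = [[math.sqrt(xi), xi, 1.0] for xi in x]
    # Normal equations: (D^T D) p = D^T y
    dtd = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    dty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(3)]
    return tuple(solve_linear_system(dtd, dty))


def r_squared(x, y, params):
    """Coefficient of determination for the fitted model."""
    a, b, c = params
    predicted = [a * math.sqrt(xi) + b * xi + c for xi in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from known coefficients (not real measurements).
airmass = [1.0, 1.2, 1.5, 1.8, 2.0, 2.5]
width = [2.0 * math.sqrt(x) + 0.5 * x + 0.1 for x in airmass]
a, b, c = fit_sqrt_linear(airmass, width)
r2 = r_squared(airmass, width, (a, b, c))
print(a, b, c, r2)
```

Over a narrow airmass range the sqrt(x) and x columns are nearly collinear, so with noisy data the individual values of a and b can be poorly constrained even when the overall fit looks good.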
- Met with Chris and Eske to discuss new tasks for the week (more details can be found in personal .md file "Meeting 0722")
- Continue working on the "Data_Clipping.ipynb" notebook to try to create a table that has a column of files and their missing information (missing data or negative eq widths)
- Created a dictionary of files that stores each file name with an array of error messages → converted into a 2D array of file names and errors
- Successfully created the table to output in Jupyter notebook, but having difficulties exporting an image of the data table, which would be nice to have. Tried the following:
from PIL import Image
import imgkit
import dataframe_image as dfi
None of these worked, even after I installed the new packages. Here's what I have now:
- Noted that I only have 187 files; I think Eske mentioned I should have access to a number of files in the 700 range. Will need to check with him on that.
- Exported the data table as a "faulty_files.csv" file that can be found below. I might not pursue the conversion of the pandas data frame to an image any further:
- Cleaned up code (removed blocks of commented print statements, etc.)
- Can look more into pandas to make the data table more readable (Maybe group by errors, or something else)
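A small sketch of the faulty-file table workflow described above: a dictionary of file names and error messages, flattened into rows, written out as CSV, and regrouped by error. It uses only the standard library rather than pandas, and it writes to an in-memory buffer so that it has no side effects. The error messages, the placeholder file names, and the function names are assumptions; only the first file name comes from these notes.

```python
import csv
import io
from collections import defaultdict

# File name -> list of error messages. The first file is the one noted earlier
# as missing its equivalent-width table; the others are made-up placeholders.
faults = {
    "spec_data_2022062800336.fits": ["missing equivalent-width table"],
    "example_a.fits": ["negative equivalent width", "missing O2 data"],
    "example_b.fits": ["negative equivalent width"],
}


def to_rows(faults):
    """Flatten the dictionary into [filename, joined errors] rows, sorted by name."""
    return [[name, "; ".join(errors)] for name, errors in sorted(faults.items())]


def group_by_error(faults):
    """Invert the dictionary: error message -> sorted list of affected files."""
    files_by_error = defaultdict(list)
    for name, errors in faults.items():
        for error in errors:
            files_by_error[error].append(name)
    return {error: sorted(names) for error, names in files_by_error.items()}


def write_csv(rows, handle):
    """Write a header line followed by the table rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["file", "errors"])
    writer.writerows(rows)


buffer = io.StringIO()
write_csv(to_rows(faults), buffer)
csv_text = buffer.getvalue()
grouped = group_by_error(faults)
print(csv_text)
```

To write an actual file, the buffer can be swapped for open("faulty_files.csv", "w", newline=""). Grouping by error gives one line per failure mode, which may be easier to scan than one line per file.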