diff --git a/docs/index.rst b/docs/index.rst index 785b4397..28a301b3 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -77,3 +77,5 @@ Development Docs :maxdepth: 1 process/index + terms + diff --git a/docs/terms.rst b/docs/terms.rst new file mode 100644 index 00000000..815233e0 --- /dev/null +++ b/docs/terms.rst @@ -0,0 +1,488 @@ + + +Terminology +----------- + +As part of a workshop at NOIRLab, 13-16 Nov 2024, the attendees +extensively discussed terminology in an effort to generate a common +understanding of, and the nuances in, many of the terms bandied about +when discussing spectroscopy data and its reduction and analysis. The +following is a living document that stemmed from that original +discussion. + +.. note:: + + Below, relative terms that can have multiple meanings are + highlighted in italics. + + +2D Spectrum/Image +================= + +- Image = 2D Image +- Optical/IR, different for other wavelengths +- Something that is not necessarily pure “raw” data, but data that + exists prior to extraction +- Refers to e.g. the 2D CCD images (optionally pre-processed) from which + 1D spectra (flux vs. wavelength) are extracted + +.. KBW: I missed this discussion, so I don't know how we want to capture + "Alt" definitions here... Seems a bit awkward. + +Alt Spectrum 2D +=============== + +- Spectrum as a function of slit position? +- Fiber dispersed +- Lots of alt Spectra 2D +- Multiple spectral orders? +- Can get up to something like 5D for solar data: wavelength x space x + space x time x polarization +- Is there an IVOA definition that differs from both of the above? + +1D Spectrum +=========== + +- Flux versus spectral axis (wavelength/frequency). Could be calibrated but not necessarily. + +*Preprocessing* +=============== + +- (As related to spectral reduction): Things like flat-fielding and bias + subtraction that are done on a full CCD image before extractions are + performed. +- Alt term: instrument signature removal +- Meaning of this depends on context (“What do others do before I start?”) +- Typically done on raw/2d image +- Work done before mine. + +*Post-processing* +================= + +- Typically done on the extracted spectra? + +Extraction +========== + +- The process of converting raw spectrum data on 2D image into flux versus + spectral axis or pixel (i.e. Spectrum1D), not necessarily flux or spectral + calibration. + +Rectified ND spectrum +===================== + +- Non-dispersion axes means something like ra/dec, or polarization? +- Nearly always (maybe always?) Implies some amount of resampling +- There is one dimension that is ENTIRELY a spectral axis, all others are not + specified by calling it a “Rectified ND Spectrum”, although they often do have + some kind of meaning + +Calibrated 2D image +=================== + +- Not-resampled. +- Calibrated from raw pixel to wavelength. + +.. KBW: Not really sure this is terminology, but I've left it as written in the + workshop notes. + +Facilitating Data sharing, cross-matching, etc., standards via IVOA standards +============================================================================= +- Data models (but see below) +- COM +- ObsCore +- SSA +- Not widely adopted, but could be! + +*Workflow* +========== + +- More holistic than a pipeline. Often a superset of pipeline with additional + steps to facilitate data ingest, job orchestration, data + collection/coordination, data archiving, etc. +- Also analysis. + +*Pipeline* +========== + +- Organized code for **automatically** processing raw data into calibrated spectra and + - Optionally derived quantities like redshifts, line fits, ... +- More generically, the linking of several steps/jobs/codes/methods where + outputs of one feed as inputs to another. +- “Spectral reduction pipeline”? vs. “analysis pipeline” vs. “XYZ pipeline” +- SDSS and JWST pipelines are different, Dragons has three pipelines. DESI has + different pipelines depending on who you talk to. + +Classification +============== +- Identifying the type of object a spectrum represents: star, galaxy, QSO, ... +- Result of Model fitting (auto or by-eye). + +*Redshifting* +============= + +- Determining the redshift of a spectrum, which may or may not be independent of + classification. +- Sometimes also means “changing the redshift of the object to its redshift”, + e.g. converting the “wavelength/frequency” axis to rest-frame +- Related issues on GitHub: + - `#258 `__ + - `#455 `__ + - `#820 `__ + +Heliocentric / barycentric correct +================================== + +- Converting a spectrum to some rest frame in order to measure a radial velocity + for a nearby stellar source. +- Heliocentric is “the frame where the sun is at rest” +- Barycentric is “the frame where the barycenter of the solar system is at rest” +- (Some of this is very very precisely defined at the GR level) + +*Archive* +========= + +- A physical or virtual location from which processed data can be accessed. This + could include both PI/collaboration access and public access + + - Or unprocessed. + - And metadata needed to reduce raw data. + +- Raw data bundle (science+all cals needed to reduce it) +- Ideally would Supporting FAIR principles (Findable, Accessible, Interoperable, + Reusable) +- Noun or verb. + +Data Assembly +============= + +- A bundle of data prepared for collaboration access (to write papers, etc.) + that will eventually become a data release. +- Used internally by DESI, but deprecated. +- It becomes the data release at the DR date +- Sloan synonym: “Internal Product Launch” + +Data release +============ + +- A bundle of data specifically intended to be public +- Can be raw or not raw +- Somehow “pinned” data raw/reduced/analyzed with a particular version of + pipelines. +- Aspires to be frozen. +- Can be either a noun or a verb + +Open Development +================ + +- Developing software in a way that the community can see both how it has been + developed and why it was developed that way. +- Usually, but not absolutely necessarily, implies the community is also free to + contribute. +- Not necessarily open source. +- Repos are publicly visible, including issue tracker. + +Flux calibration +================ + +- Converting a 1D/2D spectrum from “counts” to astrophysical units of flux + density + +*Telluric Correction* +===================== + +- Removing the telluric (atmospheric) absorption bands from spectra +- Removing the *multiplicative* component of the sky - absorption +- But there was some disagreement over whether this includes sky + +*Sky subtraction* +================= + +- Removing the *additive* component of the sky/emission so all “photons” come + from the source +- But there was some disagreement over whether this is overlapping with a + Telluric correction + +IFU (Integral-Field Unit) +========================= + +- Covers a “contiguous” 2d field on the sky with spatial information along both + axes +- Fibers or similar are tightly bundled and contiguously cover a region on the + sky. Or an image slicer. Or a microlens array. +- May or may not be multi-object. +- IFS (Integral field Spectrograph) and IFU are sometimes distinguished where + IFS is the whole instrument but IFU is the head-unit that does the IF part + +MOS (Multi-Object Spectroscopy) +=============================== + +- Could be a fiber or a slit +- Multiple objects observed in the same exposure + +*Flux* +====== + +- Energy per time per area +- Also used as a shorthand for “the not spectral unit part of a 1D spectrum” + (would that be the “dependent variable”?) +- Oftentimes used to mean “flux density” +- `Spectrum1D + `__ + uses the attribute 'flux'. Should this be renamed to 'flux_density'? + + - The intent in specutils was to not agonize over this but just accept that + it's a shorthand astronomers use, and there wasn't a better name (“y”, + “data”, etc) + +Flux Density +============ + +- Flux per unit wavelength/energy/wavenumber, usually(?) in astrophysical units, + e.g., W/m^2/nm + +*Row-stacked spectra* +===================== + +- Collection of 1D spectra in a 2D array (image?), one spectrum per row. +- Shared spectral axis. +- This is the format of specutils.Spectrum1D when it's a “vector” spectrum1D + +*Data cube* +=========== + +- Spectral 3D matrix with 2 spatial dimensions and one spectral one. Product of + IFU data with contiguous sky coverage + + - Doesn't even have to be spectral, although in the spectral context it + usually is + +- Not always 3D (data hypercuboid??). +- “Multi dimensional data blob” + +*Spectral data cube* +==================== + +- At least one axis is a spectral axis but who knows about the rest! +- Hypercube. + +*[Spectral] Data format* +======================== + +- “Format” can mean data structure (i.e., in-memory, possibly bound to a + particular language, though it doesn't have to be - see Apache Arrow) +- “Format” can also mean a file format +- “Format” can also be something even more technical like “how the bytes in a + struct are packed“ + +Data Structures +=============== + +- Python Data structures, which are Python classes. + + - NDData/NDCube/SpectrumCollection, Spectrum1D etc. + - CCDData. Subclass of NDData + - AstroData - from DRAGONS (collection of NDData-like objects, mapped to a + file, plus metadata abstraction etc.) + - Lots of classes to represent spectra + - Link to issue about renaming Spectrum1D class in specutils. + - arrays + +*Data Model* +============ + +- In the SDSS/DESI sphere, this has a meaning that is known to differ from other + uses of the term. In SDSS/DESI this means a documentation product that + describes all of the files in a data release, both file formats and how they + are organized into a hierarchy of directories on disk. For example, + see the `desidatamodel `__. + +- IVOA data model is a formalized thing that follows a specific XML schema + + - Data model is abstract, implementation could potentially be different. + - In the IVOA it is not yet allowed to be anything other than XML although + there's a lot of interest in changing that + +- Which is different from SQL data models + + - The word 'schema' is sometimes used here, but that is also ambiguous even + within SQL flavors itself. + +*Spectroscopic search - Data discovery* +======================================= + +- Search for spectra from any/particular instrument based on position or other + known properties of the sources. If available, all the spectra will be listed. +- Example tool to do this: SPARCL (How-To Jupyter notebook available here) + +SSA = Simple Spectral Access [VO protocol] +========================================== + +- "Uniform interface to remotely discover and access one-dimensional spectra." + See `here `__. +- Not commonly (used in the US?). (Example of use Data Central) + +Reduction (of spectroscopic data) +================================= + +- Getting data from raw-off-the-instrument (or nearly so) to the point where + analysis can be done. +- The process of turning 2D spectral images to 1D spectra. Can be wavelength + calibrated, sky subtracted, flux calibrated, but intermediate products are + "reduced" compared to earlier steps of the process. +- MAYBE: Can potentially be done automatically without a human-in-the-loop? +- Required products vs optional products +- Removing instrument signature. +- "Reduction" is in the sense of reducing complexity, but it is often an + inflation of bytes (in radio it is a literal reduction, in optical usually + not) +- Astronomy specific word. +- Spectroscopic reduction: the process of going from raw data to science-ready + spectra + +Analysis (of spectroscopic data) +================================ + +- Taking scientific measurements or achieving scientific results from + already-reduced spectroscopic data +- Analysis does not depend on the instrument.Rem +- MAYBE: cannot be done automatically, requires a human to make some sort of + judgment +- Optional. +- Spectroscopic analysis: the process of going from science-ready spectra to + science + +Sky +=== + +- Model or observed sky background (really a foreground!) which was usually + subtracted from the observed spectrum + +*Stacking* +========== + +- Combination of multiple spectra in a prescribed way to increase quality (e.g., + S/N) +- in some contexts like numpy arrays and astropy.table.vstack, it can refer to + combining multiple objects / tables into a single object without coadding data. + +*Coadding* +========== + +- combining multiple spectra of the same object into a single spectrum, e.g. to + improve signal-to-noise or combine spectra across multiple wavelength ranges. +- Alternative term for “stacking” to disambiguate meanings + +Spectral fitting +================ + +- Modeling an observed spectrum with templates (possibly physically motivated) + and/or mathematical functions + +Digital Twins +============= + +- Realistic fake data that potentially adapts to new states of the system over + time. + +Trace fitting +============= + +- on a 2D CCD image from a multi-object spectrometer, typically wavelengths span + in 1 direction and fibers/objects in the other direction. “trace fitting” is + mapping the y vs. x of where the spectra actually go on the CCD image. + +- Different for slit-based and fiber-based spectrographs + + - slit-based: trace the *edges* of the slit spectrum along the spectral + direction + - fiber-based: trace the *center* of the fiber spectrum spatial point-spread + function + +- Spectral tracing + +Wavelength calibration +====================== + +- calibrating what true wavelength is represented by the observed photons on a + detector, e.g. what wavelength is row y of a detector? +- Possibly a 2d process +- The process of adding (the spectral part of) a WCS + +Visual Inspection (of spectra) +============================== + +- Humans looking at spectra and making decisions about what the “truth” is. +- Can include identifying the presence/absence of features (qualitative) or + assessing a quantitative fit (e.g., best-fit redshift value) + +*Spectral resolution* +===================== + +- Changes in spectral dispersion power as a function of wavelength due to + instrument + +- Resolving power vs Resolution/Dispersion + + - Resolving power is the ability to distinguish close features + - Dispersion is the change in wavelength/energy per pixel + +API +=== + +- Application Programming Interface + + - What makes a good API for spectroscopic software? + - What is needed for different aspects of spectroscopic software (e.g., + reduction vs. archive access)? + +*Spectral class* +================ + +- E.g., Spectrum1D +- In SDSS, 'class' is short for 'classification'. +- DESI uses SPECTYPE for spectral type (QSO, GALAXY, STAR) + +Package +======= + +- A software tool or collection of tools developed in the same “space” +- Has a specific meaning in a Python context that’s more specific, but can be + used more generally for multiple languages + +Spectral data visualization +=========================== + +- Tools and procedures to display reduced spectral data + +Spectral decomposition +====================== + +- a form of spectral fitting that identifies separate components that appear in + a spectrum; e.g, quasar + galaxy. +- Goes with spectral fitting. + +Processing Steps +================ + +- For DESI (largely inherited from SDSS usage): + + - pre-processing (of CCD images, bias, dark, pixel-flat fielding) + - extraction (getting counts vs. wavelength from 2D images) + - sky subtraction (subtracting the additive non-signal sky component) + - flux calibration (includes both instrument throughput and telluric + absorption multiplicative corrections) + - classification and redshift fitting (is it a galaxy, star, or quasar; at + what redshift?) + + +---- + +Mentioned but not defined +------------------------- + +- WCS & Database archive +- Cloud archiving +- Modular functions which can be used by other pipelines. +- Interactive Dashboard +