Target class

The heart & soul of the pySYD package

Introduction

At some point, every star will be processed as a pysyd.target.Target object. This is where the bulk of the scientific data analysis is done.

Imports
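
A minimal sketch of the imports this page assumes (the utils module provides the Parameters container referenced throughout):

    from pysyd import utils
    from pysyd.target import Target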

Usage
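
A hedged usage sketch: build the Parameters container, create a Target for a star, and run the full pipeline. The star name and the exact Parameters constructor usage are illustrative assumptions; see pysyd.utils.Parameters for the actual options.

    from pysyd import utils
    from pysyd.target import Target

    name = '1435467'                        # illustrative star ID (assumes local data exist)
    args = utils.Parameters(stars=[name])   # assumed constructor usage
    star = Target(name, args)               # loads in data for this target
    star.process_star()                     # runs the full pipeline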

pysyd.target API

class pysyd.target.Target(name, args)

Main pipeline target object

Deprecated since version 1.6.0: Target.ok will be removed in pySYD 6.0.0; it is replaced by new error handling that will instead raise exceptions or warnings

A new instance (or star) is created for each target that is processed. Instantiation copies the relevant, individual star dictionary (and the inherited constants) and will then load in data using the provided star name

Parameters
name : str

which target to load in and/or process

args : pysyd.utils.Parameters

container class of pysyd parameters

Attributes
params : Dict

copy of args.params[name] dictionary with pysyd parameters and options

check_numax(columns=['numax', 'dnu', 'snr'])

Check \(\nu_{\mathrm{max}}\)

Checks if there is an initial starting point or estimate for numax

Parameters
columns : List[str]

saved columns if the estimate_numax() function was run

Raises
utils.InputError

if an invalid value was provided as input for numax

utils.ProcessingError

if it still cannot find any estimate for numax

collapse_ed(n_trials=3)

Get ridges

Optimizes the large frequency separation by determining which spacing creates the “best” ridges (currently under development); think of a step echelle, but quicker and more hands-off

Attributes
x : numpy.ndarray

x-axis for the collapsed ED ~[0, \(2\times\Delta\nu\)]

y : numpy.ndarray

marginalized power along the y-axis (i.e. collapsed on to the x-axis)

Important

this routine still needs to be optimized; it currently does very little

collapsed_acf(n_trials=3, step=0.25, max_snr=100.0)

Collapsed ACF

Computes a collapsed autocorrelation function (ACF) using n different box sizes in n different trials (i.e. n_trials)

Parameters
n_trials : int

the number of trials to run

step : float

fractional step size to use for the collapsed ACF calculation

max_snr : float

the maximum signal-to-noise of the estimate (this is primarily for plot formatting)
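
A rough sketch of the collapsed-ACF idea under stated assumptions (the box-size ladder and the collapse metric below are illustrative, not pySYD's exact choices): slide a window of a given size across the spectrum, autocorrelate each windowed chunk, and record how much correlated power it contains.

    import numpy as np

    def collapsed_acf(freq, power, n_trials=3, step=0.25):
        """Illustrative collapsed ACF: one (centers, metric) pair per box size."""
        results = []
        full_range = freq[-1] - freq[0]
        for k in range(n_trials):
            box = full_range / (2.0 ** (n_trials - k))      # hypothetical box-size ladder
            starts = np.arange(freq[0], freq[-1] - box, box * step)
            centers, metric = [], []
            for s in starts:
                m = (freq >= s) & (freq < s + box)
                p = power[m] - np.mean(power[m])
                acf = np.correlate(p, p, mode='full')[p.size - 1:]
                centers.append(s + box / 2.0)
                metric.append(np.sum(np.abs(acf)) / acf.size)  # "collapsed" ACF power
            results.append((np.array(centers), np.array(metric)))
        return results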

compute_acf(fft=True, smooth_ps=2.5)

ACF

Compute the autocorrelation function (ACF) of the background-divided power spectrum (i.e. bg_corr), with an option to smooth the BCPS first

Parameters
fft : bool, default=True

if True, uses FFTs to compute the ACF, otherwise it will use numpy.correlate

smooth_ps : float, optional

convolve the background-corrected PS with a box filter of this width (\(\rm \mu Hz\))

Attributes
bgcorr_smooth : numpy.ndarray

smoothed background-corrected power spectrum if smooth_ps != 0 else copy of bg_corr

lag, auto : numpy.ndarray, numpy.ndarray

the autocorrelation of the “zoomed-in” power spectrum
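
A self-contained sketch of the two options named above: the FFT branch uses the Wiener-Khinchin relation with zero-padding, and the fallback uses numpy.correlate, as the docstring describes. Details (padding length, normalization) are assumptions.

    import numpy as np

    def autocorrelate(power, fft=True):
        """ACF of a (background-corrected) power spectrum; normalized so acf[0] = 1."""
        p = power - np.mean(power)
        if fft:
            # Wiener-Khinchin: ACF = inverse FFT of |FFT|^2 (zero-padded to avoid wraparound)
            n = int(2 ** np.ceil(np.log2(2 * p.size)))
            acf = np.fft.irfft(np.abs(np.fft.rfft(p, n=n)) ** 2)[:p.size]
        else:
            acf = np.correlate(p, p, mode='full')[p.size - 1:]
        return acf / acf[0]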

compute_spectrum(oversampling_factor=1, store=False)

Compute power spectrum

NEW function that calculates a power spectrum from time series data and normalizes it to spectral density according to Parseval's theorem

Parameters
oversampling_factor : int, default=1

the oversampling factor to use when computing the power spectrum

store : bool, default=False

if True, it will store the original data arrays for plotting purposes later

Yields
frequency, power : numpy.ndarray, numpy.ndarray

power spectrum computed from the input time series data (i.e. time & flux) using the astropy.timeseries.LombScargle module

Returns
frequency, power : numpy.ndarray, numpy.ndarray

the newly-computed and normalized power spectrum (in units of \(\rm \mu Hz\) vs. \(\rm ppm^{2} \mu Hz^{-1}\))

Important

If you are unsure if your power spectrum is in the proper units, we recommend using this new module to compute and normalize for you. This will ensure the accuracy of the results.
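
A sketch of a Parseval-normalized Lomb-Scargle spectrum under stated assumptions (time in days, flux in ppm); the rescaling step enforces that the integrated power equals the variance of the time series, which is what the normalization described above amounts to.

    import numpy as np
    from astropy.timeseries import LombScargle

    def compute_spectrum(time, flux, oversampling_factor=1):
        """Power spectrum in muHz vs ppm^2/muHz, normalized per Parseval (sketch)."""
        to_muhz = 1e6 / 86400.0                              # cycles/day -> muHz
        baseline = time[-1] - time[0]
        cadence = np.median(np.diff(time))
        resolution = (1.0 / baseline) / oversampling_factor * to_muhz
        nyquist = 0.5 / cadence * to_muhz
        frequency = np.arange(resolution, nyquist, resolution)
        power = LombScargle(time, flux).power(frequency / to_muhz)
        # Parseval: rescale so that sum(power) * df equals the time series variance
        power *= np.var(flux) / (np.sum(power) * resolution)
        return frequency, power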

correct_background(metric='bic')

Correct background

Corrects for the stellar background contribution in the power spectrum by both dividing and subtracting it out, saving a copy of each (i.e. the background-divided power spectrum bg_div to ID_BDPS.txt and the background-subtracted power spectrum bg_sub to ID_BSPS.txt). After this is done, a copy of the BDPS is saved to bg_corr and used for dnu calculations and the echelle diagram.

Parameters
metric : str, default='bic'

which metric to use (i.e. bic or aic) for model selection

Attributes
frequency, bg_div : numpy.ndarray, numpy.ndarray

background-divided power spectrum (BDPS -> higher S/N for echelle diagram)

frequency, bg_sub : numpy.ndarray, numpy.ndarray

background-subtracted power spectrum (BSPS -> preserves mode amplitudes)

frequency, bg_corr : numpy.ndarray, numpy.ndarray

background-corrected power spectrum, which is a copy of the BDPS
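
The two corrections reduce to one line each; a minimal sketch, assuming best_fit is the evaluated background model on the same frequency grid:

    import numpy as np

    def correct(power, best_fit):
        bg_div = power / best_fit      # background-divided (BDPS): higher S/N
        bg_sub = power - best_fit      # background-subtracted (BSPS): keeps amplitudes
        bg_corr = np.copy(bg_div)      # the copy carried forward by the pipeline
        return bg_div, bg_sub, bg_corr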

derive_parameters(mc_iter=1)

Derive parameters

Main function to derive the background and global asteroseismic parameters (including uncertainties when relevant), which does everything from finding the initial estimates to plotting/saving results

Parameters
mc_iter : int, default=1

the number of iterations to run

Methods
pysyd.target.Target.check_numax

first checks to see if there is a valid estimate or input value provided for numax

pysyd.target.Target.initial_parameters

if so, it will estimate the rest of the initial guesses required for the background and global fitting (primarily using solar scaling relations)

pysyd.target.Target.first_step

the first iteration determines the best-fit background model and global properties

pysyd.target.Target.get_samples

bootstrap uncertainties by attempting to recover the parameters from the first step

echelle_diagram(smooth_ech=None, nox=None, noy='0+0', hey=False, npb=10, nshift=0, clip_value=3.0)

Echelle diagram

Calculates everything required to plot an echelle diagram. Note: this does not currently have the get_ridges method attached (i.e. it does not optimize the spacing or run a step echelle)

Parameters
smooth_ech : float, default=None

value to smooth (i.e. convolve) ED by

nox : int, default=None

number of grid points in x-axis of echelle diagram

noy : str, default='0+0'

number of orders (y-axis) to plot in echelle diagram

npb : int, default=10

option to provide the number of points per bin as opposed to an arbitrary value (calculated from spacing and frequency resolution)

nshift : int, default=0

number of orders to shift echelle diagram (i.e. + is up, - is down)

hey : bool, default=False

plugin for Dan Hey’s echelle package (not currently implemented)

clip_value : float, default=3.0

clips any peaks higher than this multiple of the median value

Attributes
ed : numpy.ndarray

smoothed + summed 2d power for echelle diagram

extent : List[float]

bounding box for echelle diagram
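
A minimal sketch of the folding behind an echelle diagram, assuming a uniformly sampled background-corrected spectrum: reshape the power into rows of width dnu so that ridges of equal \(\ell\) line up vertically.

    import numpy as np

    def echelle(freq, power, dnu):
        """Fold the spectrum modulo dnu into a 2-D image (illustrative only)."""
        resolution = freq[1] - freq[0]
        npts = int(dnu / resolution)              # points per order (one row per order)
        n_orders = len(freq) // npts
        ed = power[: n_orders * npts].reshape(n_orders, npts)
        extent = [0.0, dnu, freq[0], freq[0] + n_orders * dnu]
        return ed, extent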

estimate_background(ind_width=20.0)

Background estimates

Estimates initial guesses for the stellar background contributions for both the red and white noise components

Parameters
ind_width : float

the independent average smoothing width (\(\rm \mu Hz\))

Attributes
bin_freq, bin_pow, bin_err : numpy.ndarray, numpy.ndarray, numpy.ndarray

binned power spectrum using the ind_width bin size

estimate_numax(binning=0.005, bin_mode='mean', smooth_width=20.0, ask=False)

Estimate numax

Automated routine to identify power excess due to solar-like oscillations and estimate an initial starting point for numax (\(\nu_{\mathrm{max}}\))

Parameters
binning : float

logarithmic binning width (i.e. evenly spaced in log space)

bin_mode : {'mean', 'median', 'gaussian'}

mode to use when binning

smooth_width : float

box filter width (in \(\rm \mu Hz\)) to smooth power spectrum

ask : bool

If True, it will ask which trial to use as the estimate for numax

Attributes
bin_freq, bin_pow : numpy.ndarray, numpy.ndarray

copy of the power spectrum (i.e. freq & pow) binned equally in logarithmic space

smooth_freq, smooth_pow : numpy.ndarray, numpy.ndarray

copy of the binned power spectrum (i.e. bin_freq & bin_pow) binned equally in linear space – yes, this is doubly binned intentionally

freq, interp_pow : numpy.ndarray, numpy.ndarray

the smoothed power spectrum (i.e. smooth_freq & smooth_pow) interpolated back to the original frequency array (also referred to as “crude background model”)

freq, bgcorr_pow : numpy.ndarray, numpy.ndarray

approximate background-corrected power spectrum computed by dividing the original PS (pow) by the interpolated PS (interp_pow)
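
A condensed sketch of the “crude background model” described above, with the double binning simplified to a single log-space binning plus a boxcar smooth (an assumption, not the exact pySYD sequence):

    import numpy as np

    def crude_background(freq, power, binning=0.005, smooth_width=20.0):
        """Log-bin, smooth, interpolate back, then divide out (illustrative)."""
        log_f = np.log10(freq)
        edges = np.arange(log_f[0], log_f[-1] + binning, binning)
        idx = np.digitize(log_f, edges)
        bin_freq = np.array([freq[idx == i].mean() for i in np.unique(idx)])
        bin_pow = np.array([power[idx == i].mean() for i in np.unique(idx)])
        width = max(int(smooth_width / np.median(np.diff(bin_freq))), 1)
        smooth_pow = np.convolve(bin_pow, np.ones(width) / width, mode='same')
        interp_pow = np.interp(freq, bin_freq, smooth_pow)
        return power / interp_pow          # approximate background-corrected PS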

estimate_parameters(estimate=True)

Estimate parameters

Calls all methods related to the first module

Parameters
estimate : bool, default=True

if numax is already known, this will automatically be skipped since it is not needed

first_step(background=True, globe=True)

First step

Processes a given target for the first step, which has extra steps for each of the two main parts of this method (i.e. background model and global fit):

  1. background model: the automated best-fit model selection is only performed in the first step, the results of which are saved for future purposes (including the background-corrected power spectrum)

  2. global fit: while the ACF is computed for every iteration, a mask is created in the first step to prevent the estimate for dnu from latching on to a different (i.e. incorrect) peak, since this is a multi-modal parameter space

Parameters
background : bool

run the automated background-fitting routine

globe : bool

perform the global asteroseismic analysis (only worth disabling if solely interested in the background model)

Methods
pysyd.target.Target.estimate_background

estimates the amplitudes/levels of both correlated and frequency-independent noise properties from the input power spectrum

pysyd.target.Target.model_background

automated best-fit background model selection that is a summed contribution of various white + red noise components

pysyd.target.Target.global_fit

after correcting for the best-fit background model, this derives the global asteroseismic parameters

fix_data(frequency, power, save=True, kep_corr=False, ech_mask=None, lower_ech=None, upper_ech=None)

Fix frequency domain data

Applies frequency-domain tools to power spectra to “fix” (i.e. manipulate) the data. If no available options are used, it will simply return copies of the original arrays

Parameters
save : bool

save all data products

kep_corr : bool

correct for known Kepler short-cadence artefacts

ech_mask : List[lower_ech, upper_ech]

corrects for mixed modes if not None

lower_ech : float

folded lower frequency limit (~[0,dnu])

upper_ech : float

folded upper frequency limit (~[0,dnu])

frequency, power : numpy.ndarray, numpy.ndarray

input power spectrum to be corrected

Returns
frequency, power : numpy.ndarray, numpy.ndarray

copy of the corrected power spectrum

frequency_spacing(n_peaks=10)

Estimate \(\Delta\nu\)

Estimates the large frequency separation (or \(\Delta\nu\)) by fitting a Gaussian to the peak of the ACF “cutout” using scipy.optimize.curve_fit.

Parameters
n_peaks : int, default=10

the number of peaks to identify in the ACF

Attributes
peaks_l, peaks_a : numpy.ndarray, numpy.ndarray

the n highest peaks (n_peaks) in the ACF

zoom_lag, zoom_auto : numpy.ndarray, numpy.ndarray

cutout from the ACF of the peak near dnu

Returns
converge : bool

returns False if a Gaussian could not be fit within the 1000 iterations

Raises
utils.ProcessingError

if a Gaussian could not be fit to the provided peak

Note

For the first step, a Gaussian weighting (centered on the expected value for dnu, or exp_dnu) is automatically computed and applied by the pipeline to prevent the fit from latching on to a peak that is a harmonic and not the actual spacing
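
A sketch of the weighted fit the Note describes, using scipy.optimize.curve_fit; the weighting width (a quarter of the expected spacing) and the initial guesses are illustrative assumptions.

    import numpy as np
    from scipy.optimize import curve_fit

    def gaussian(x, offset, amp, center, width):
        return offset + amp * np.exp(-(x - center) ** 2 / (2.0 * width ** 2))

    def fit_dnu(zoom_lag, zoom_auto, exp_dnu):
        """Fit a Gaussian to the ACF cutout near the expected spacing (sketch)."""
        # Gaussian weighting centered on the expected dnu to avoid harmonics
        weights = np.exp(-(zoom_lag - exp_dnu) ** 2 / (2.0 * (0.25 * exp_dnu) ** 2))
        p0 = [np.min(zoom_auto), np.max(zoom_auto), exp_dnu, 0.05 * exp_dnu]
        try:
            popt, _ = curve_fit(gaussian, zoom_lag, zoom_auto * weights, p0=p0, maxfev=1000)
        except RuntimeError:
            return None                     # i.e. converge = False
        return popt[2]                      # Gaussian center -> dnu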

get_background()

Get background

Attempts to recover the background model parameters in later iterations by using scipy.optimize.curve_fit with the same best-fit background model settings

Returns
converge : bool

returns False if background model fails to converge

get_epsilon(n_trials=3)

Get ridges

Optimizes the large frequency separation by determining which spacing creates the “best” ridges (currently under development); think of a step echelle, but quicker and more hands-off

Attributes
x : numpy.ndarray

x-axis for the collapsed ED ~[0, \(2\times\Delta\nu\)]

y : numpy.ndarray

marginalized power along the y-axis (i.e. collapsed on to the x-axis)

Important

this routine still needs to be optimized; it currently does very little

get_samples()

Get samples

Estimates uncertainties for parameters by randomizing the power spectrum and attempting to recover the same parameters by calling pysyd.target.Target.single_step

Attributes
frequency, power : numpy.ndarray, numpy.ndarray

copy of the critically-sampled power spectrum (i.e. freq_cs & pow_cs) after applying the mask~[lower_bg,upper_bg]

pbar : tqdm.tqdm, optional

optional progress bar used with verbose output when running multiple iterations

Note

all iterations except for the first step are applied to the critically-sampled power spectrum and not the oversampled power spectrum

Important

if the verbose option is enabled, the tqdm package is required
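
The standard way to randomize a power spectrum for this kind of bootstrap is to perturb it with chi-squared (2 d.o.f.) noise, which is the natural distribution of a power spectrum; a sketch of a single iteration, assuming that approach:

    import numpy as np

    def randomize(power, seed=None):
        """One bootstrap draw: chi^2 (2 d.o.f.) perturbation of the PS (sketch)."""
        rng = np.random.default_rng(seed)
        return power * rng.chisquare(2, power.size) / 2.0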

global_fit()

Global fit

Fits global asteroseismic parameters \(\nu_{\mathrm{max}}\) and \(\Delta\nu\), where the former is estimated two different ways.

Methods

pysyd.target.Target.numax_smooth

pysyd.target.Target.numax_gaussian

pysyd.target.Target.compute_acf

pysyd.target.Target.frequency_spacing

initial_estimates(lower_ex=1.0, upper_ex=8000.0, max_trials=6)

Initial estimates

Prepares data and parameters associated with the first module that identifies solar-like oscillations and estimates numax

Parameters
lower_ex : float

the lower frequency limit of the PS used to estimate numax

upper_ex : float

the upper frequency limit of the PS used to estimate numax

max_trials : int

(arbitrary) maximum number of “guesses” or trials to perform to estimate numax

Attributes
frequency, power : numpy.ndarray, numpy.ndarray

copy of the entire oversampled (or critically-sampled) power spectrum (i.e. freq_os & pow_os)

freq, pow : numpy.ndarray, numpy.ndarray

copy of the entire oversampled (or critically-sampled) power spectrum (i.e. freq_os & pow_os) after applying the mask~[lower_ex,upper_ex]

module : str, default='parameters'

which part of the pipeline is currently being used

initial_parameters(module='parameters', lower_bg=1.0, upper_bg=8000.0)

Initial guesses

Estimates initial guesses for background components (i.e. timescales and amplitudes) using solar scaling relations. This resets the power spectrum and has its own independent filter or bounds (via [lower_bg, upper_bg]) to use for this subroutine

Parameters
lower_bg : float

lower frequency limit of PS to use for the background fit

upper_bg : float

upper frequency limit of PS to use for the background fit

Attributes
frequency, power : numpy.ndarray, numpy.ndarray

copy of the entire oversampled (or critically-sampled) power spectrum (i.e. freq_os & pow_os)

frequency, random_pow : numpy.ndarray, numpy.ndarray

copy of the entire oversampled (or critically-sampled) power spectrum (i.e. freq_os & pow_os) after applying the mask~[lower_bg,upper_bg]

module : str

which part of the pipeline is currently being used

i : int

iteration number, starts at 0

Methods
pysyd.target.Target.solar_scaling

uses multiple solar scaling relations to determine accurate initial guesses for many of the derived parameters

Warning

This is typically sufficient for most stars but may need to be adjusted for more evolved stars!

load_file(path)

Load text file

Loads a light curve or a power spectrum from a basic 2xN text file and stores the data in the x (independent variable) and y (dependent variable) arrays, where N is the length of the series

Parameters
path : str

the file path of the data file

Returns
x, y : numpy.ndarray, numpy.ndarray

the independent and dependent variables, respectively
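
For a basic 2xN text file this is essentially one call; a sketch:

    import numpy as np

    def load_file(path):
        # columns 0 and 1 hold the independent and dependent variables
        x, y = np.loadtxt(path, usecols=(0, 1), unpack=True)
        return x, y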

load_power_spectrum(long=1000000)

Load power spectrum

Loads in the available power spectrum and computes relevant information; it also checks for time series data and will raise a warning if there is none, since it will then have to assume a critically-sampled power spectrum

Attributes
note : str, optional

verbose output

ps : bool

True if star ID has an available (or newly-computed) power spectrum

Yields
frequency, power : numpy.ndarray, numpy.ndarray

input power spectrum

freq_os, pow_os : numpy.ndarray, numpy.ndarray

copy of the oversampled power spectrum (i.e. frequency & power)

freq_cs, pow_cs : numpy.ndarray, numpy.ndarray

copy of the critically-sampled power spectrum (i.e. frequency & power) iff the oversampling_factor is provided, otherwise these arrays are just copies of freq_os & pow_os since this factor isn’t known and needs to be assumed

Raises
pysyd.utils.InputWarning

if no information or time series data is provided (i.e. has to assume the PS is critically-sampled)

load_time_series(save=True, stitch=False, oversampling_factor=None)

Load light curve

Loads in time series data and calculates relevant parameters like the cadence and nyquist frequency

Parameters
save : bool

save all data products

stitch : bool

“stitches” together time series data with large “gaps”

oversampling_factor : int, optional

oversampling factor of input power spectrum

Attributes
note : str, optional

verbose output

lc : bool

True if star ID has light curve data available

cadence : int

median cadence of time series data (\(\Delta t\))

nyquist : float

Nyquist frequency of the power spectrum (calculated from time series cadence)

baseline : float

total time series duration (\(\Delta T\))

tau_upper : float

upper limit of the granulation time scales, which is set by the total duration of the time series (divided in half)

Yields
time, flux : numpy.ndarray, numpy.ndarray

input time series data

frequency, power : numpy.ndarray, numpy.ndarray

newly-computed power spectrum from the time series arrays (i.e. time & flux)

freq_os, pow_os : numpy.ndarray, numpy.ndarray

copy of the oversampled power spectrum (i.e. frequency & power)

freq_cs, pow_cs : numpy.ndarray, numpy.ndarray

copy of the critically-sampled power spectrum (i.e. frequency & power)

Raises
pysyd.utils.InputWarning

if the oversampling factor provided is different from that computed from the time series data and power spectrum

pysyd.utils.InputError

if the oversampling factor calculated from the time series data and power spectrum is not an integer
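
A sketch of the derived quantities listed above, assuming the time array is in days (the usual convention for Kepler/TESS light curves):

    import numpy as np

    def time_series_info(time):
        """Cadence, Nyquist frequency, baseline, tau_upper (assumes time in days)."""
        cadence = int(round(np.median(np.diff(time)) * 86400.0))  # seconds
        nyquist = 10.0 ** 6 / (2.0 * cadence)                     # muHz
        baseline = (time[-1] - time[0]) * 86400.0                 # seconds
        tau_upper = baseline / 2.0        # upper granulation timescale limit
        return cadence, nyquist, baseline, tau_upper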

model_background(n_laws=None, fix_wn=False, basis='tau_sigma')

Model stellar background

If nothing is fixed, this method iterates through \(2 \cdot (n_{\mathrm{laws}}+1)\) models to determine the best-fit background model due to stellar granulation processes, which uses a solar scaling relation to estimate the number of Harvey-like component(s) (or n_laws)

Parameters
n_laws : int

specify number of Harvey-like components to use in background fit

fix_wn : bool

option to fix the white noise instead of it being an additional free parameter

basis : str

which basis to use for background fitting, e.g. {a,b} parametrization TODO: not yet operational

Returns
converge : bool

returns False if background model fails to converge

Raises
utils.ProcessingError

if this failed to converge on a single model during the first iteration
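
A sketch of the ingredients, under stated assumptions: a Harvey-like component (one common parametrization of granulation) and the two selection metrics. With n_laws components and white noise either free or fixed, there are \(2 \cdot (n_{\mathrm{laws}}+1)\) candidate models to score.

    import numpy as np

    def harvey(f, tau, sigma):
        """One Harvey-like granulation component (an assumed parametrization)."""
        return 4.0 * sigma ** 2 * tau / (1.0 + (2.0 * np.pi * f * tau) ** 2)

    def bic(n_params, n_points, chi2):
        # Bayesian information criterion (with -2 ln L ~ chi^2 for Gaussian errors)
        return n_params * np.log(n_points) + chi2

    def aic(n_params, chi2):
        return 2.0 * n_params + chi2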

numax_gaussian()

Gaussian \(\nu_{\mathrm{max}}\)

Estimate numax by fitting a Gaussian to the “zoomed-in” power spectrum (i.e. region_freq and region_pow) using scipy.optimize.curve_fit

Returns
converge : bool

returns False if background model fails to converge

Raises
utils.ProcessingError

if the Gaussian fit does not converge for the first step

numax_smooth(sm_par=None)

Smooth \(\nu_{\mathrm{max}}\)

Estimate numax by taking the peak of the smoothed power spectrum

Parameters
sm_par : float, optional

smoothing width for power spectrum calculated from solar scaling relation (typically ~1-4)

Attributes
frequency, pssm : numpy.ndarray, numpy.ndarray

smoothed power spectrum

frequency, pssm_bgcorr : numpy.ndarray, numpy.ndarray

smoothed background-subtracted power spectrum

region_freq, region_pow : numpy.ndarray, numpy.ndarray

oscillation region of the power spectrum (“zoomed in”) by applying the mask~[lower_ps,upper_ps]

numax_smoo : float

the ‘observed’ numax (i.e. the peak of the smoothed power spectrum)

dnu_smoo : float

the ‘expected’ dnu based on a scaling relation using the numax_smoo
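
A sketch of the smoothing step, assuming a boxcar whose width scales with sm_par times the expected spacing (the names exp_dnu and bg_corr follow the attributes used elsewhere on this page):

    import numpy as np

    def smooth_numax(frequency, bg_corr, exp_dnu, sm_par=2.0):
        """Peak of the heavily smoothed, background-corrected PS (sketch)."""
        resolution = frequency[1] - frequency[0]
        width = max(int(sm_par * exp_dnu / resolution), 1)   # boxcar width in bins
        pssm = np.convolve(bg_corr, np.ones(width) / width, mode='same')
        return frequency[np.argmax(pssm)]                    # numax_smoo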

optimize_ridges(n=50, res=0.01)

Get ridges

Optimizes the large frequency separation by determining which spacing creates the “best” ridges (currently under development); think of a step echelle, but quicker and more hands-off

Attributes
x : numpy.ndarray

x-axis for the collapsed ED ~[0, \(2\times\Delta\nu\)]

y : numpy.ndarray

marginalized power along the y-axis (i.e. collapsed on to the x-axis)

Important

this routine still needs to be optimized; it currently does very little

process_star()

Run pipeline

Processes a given star with pySYD

red_noise(box_filter=1.0, n_rms=20)

Estimate red noise

Estimates amplitudes of red noise components by using a smoothed version of the power spectrum with the power excess region masked out – which will take the mean of a specified number of points (via -nrms, default=20) for each Harvey-like component

Parameters
box_filter : float

the size of the 1D box smoothing filter

n_rms : int

number of data points to average over to estimate red noise amplitudes

Attributes
smooth_pow : numpy.ndarray

smoothed power spectrum after applying the box filter

remove_artefact(freq, pow, lcp=566.4233312035817, lf_lower=[240.0, 500.0], lf_upper=[380.0, 530.0], hf_lower=[4530.0, 5011.0, 5097.0, 5575.0, 7020.0, 7440.0, 7864.0], hf_upper=[4534.0, 5020.0, 5099.0, 5585.0, 7030.0, 7450.0, 7867.0])

Remove Kepler artefacts

Module to remove artefacts found in Kepler data by replacing known frequency ranges with simulated noise

Parameters
lcp : float

long-cadence period expressed in frequency units (\(\rm \mu Hz\)); the default value corresponds to the ~29.4 min Kepler long cadence

lf_lower : List[float]

lower limits of low-frequency artefacts

lf_upper : List[float]

upper limits of low-frequency artefacts

hf_lower : List[float]

lower limits of high-frequency artefacts

hf_upper : List[float]

upper limits of high-frequency artefacts

freq, pow : numpy.ndarray, numpy.ndarray

input data that needs to be corrected

Returns
frequency, power : numpy.ndarray, numpy.ndarray

copy of the corrected power spectrum

Note

Known Kepler artefacts include:
  1. long-cadence harmonics

  2. sharp, high-frequency artefacts (\(\rm >5000 \mu Hz\))

  3. low-frequency artefacts, \(\rm 250-400\ \mu Hz\) (mostly present in Q0 and Q3 data)
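
As an example of the first class of artefacts, the long-cadence harmonics fall at integer multiples of the lcp frequency, so the affected ranges can be generated rather than hard-coded; a sketch:

    import numpy as np

    def lc_harmonics(freq, lcp=566.4233312035817):
        """Centers of the long-cadence harmonic artefacts (sketch)."""
        return np.arange(1, int(freq.max() / lcp) + 1) * lcp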

show_results(show=False, verbose=False)

Show results

Parameters
show : bool, optional

show output figures and text

verbose : bool, optional

turn on verbose output

single_step()

Single step

Similar to the first step, this function calls the same methods but uses the selected best-fit background model from the first step to estimate the parameters

Attributes
converge : bool

removes any saved parameters if any fits did not converge (i.e. False)

Returns
converge : bool

returns True if all relevant fits converged

solar_scaling(numax=None, scaling='tau_sun_single', max_laws=3, ex_width=1.0, lower_ps=None, upper_ps=None)

Initial values

Using the initial starting value for \(\nu_{\mathrm{max}}\), estimates the rest of the parameters needed for both the background and global fits. Uses scaling relations from the Sun to:

  1. estimate the width of the region of oscillations using numax

  2. guess starting values for granulation time scales

Parameters
numax : float, default=None

provide initial value for numax to bypass the first module

scaling : str, default='tau_sun_single'

which solar scaling relation to use

max_laws : int, default=3

the maximum number of resolvable Harvey-like components

ex_width : float, default=1.0

fractional width to use for power excess centered on numax

lower_ps : float, default=None

lower bound of power excess to use for ACF [in \(\rm \mu Hz\)]

upper_ps : float, default=None

upper bound of power excess to use for ACF [in \(\rm \mu Hz\)]

Attributes
converge : bool, default=True

True if all fitting converges
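
A sketch of the flavor of scaling relation involved; the coefficients below follow a commonly used \(\Delta\nu\)-\(\nu_{\mathrm{max}}\) power law and solar reference values (\(\nu_{\mathrm{max},\odot} \approx 3090\ \rm \mu Hz\)), and are illustrative rather than pySYD's exact constants.

    def initial_guesses(numax, ex_width=1.0):
        """Illustrative solar-scaled starting values (coefficients are assumed)."""
        exp_dnu = 0.267 * numax ** 0.764              # dnu-numax power law (assumed form)
        width = ex_width * 1300.0 * (numax / 3090.0)  # excess width scaled from the Sun
        return exp_dnu, width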

stitch_data(gap=20)

Stitch light curve

For computational purposes, and for special cases where this does not affect the integrity of the results, this module ‘stitches’ a light curve together for time series data with large gaps. For stochastic p-mode oscillations, this is justified if the lifetimes of the modes are smaller than the gap.

Parameters
gap : int

how many consecutive missing cadences are considered a ‘gap’

Attributes
time : numpy.ndarray

original time series array to correct

new_time : numpy.ndarray

the corrected time series array

Raises
pysyd.utils.InputWarning

when using this method since it’s technically not a great thing to do

Warning

USE THIS WITH CAUTION. This is technically not a great thing to do for primarily two reasons:

  1. you lose phase information and

  2. can be problematic if mode lifetimes are shorter than gaps (i.e. more evolved stars)

Note

temporary solution for handling very long gaps in TESS data – still need to figure out a better way to handle this
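
A sketch of the stitching itself, assuming gap counts consecutive missing cadences as the docstring says: any jump larger than gap cadences is collapsed down to a single cadence.

    import numpy as np

    def stitch(time, gap=20):
        """Shift all later timestamps to close gaps larger than `gap` cadences."""
        cadence = np.median(np.diff(time))
        new_time = np.copy(time)
        for i in range(1, len(new_time)):
            dt = new_time[i] - new_time[i - 1]
            if dt > gap * cadence:
                new_time[i:] -= dt - cadence    # close the gap to one cadence
        return new_time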

white_noise()

Estimate white noise

Estimate the white noise level by taking the mean of the last 10% of the power spectrum
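
This is essentially a one-liner; a sketch:

    import numpy as np

    def white_noise(power):
        # mean power in the highest-frequency 10% of the spectrum
        return np.mean(power[int(0.9 * len(power)):])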

whiten_mixed(freq, pow, dnu=None, lower_ech=None, upper_ech=None, notching=False)

Whiten mixed modes

Module to help reduce the effects of mixed modes by simulating random white noise in place of the \(\ell=1\) modes for subgiants with mixed modes, in order to better constrain the large frequency separation

Parameters
dnu : float, default=None

the so-called large frequency separation to fold the PS to

lower_ech : float, default=None

lower frequency limit of mask to “whiten”

upper_ech : float, default=None

upper frequency limit of mask to “whiten”

notching : bool, default=False

if True, uses notching instead of generating white noise

freq, pow : numpy.ndarray, numpy.ndarray

input data that needs to be corrected

folded_freq : numpy.ndarray

frequency array modulo dnu (i.e. folded to the large separation, \(\Delta\nu\))

Returns
frequency, power : numpy.ndarray, numpy.ndarray

copy of the corrected power spectrum
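
A sketch of both branches, assuming chi-squared (2 d.o.f.) draws for the simulated power (the natural distribution of a power spectrum) and the minimum power as the notching floor:

    import numpy as np

    def whiten(freq, power, dnu, lower_ech, upper_ech, notching=False):
        """Replace power in a folded-frequency band with noise or a floor (sketch)."""
        folded = freq % dnu
        mask = (folded >= lower_ech) & (folded <= upper_ech)
        out = np.copy(power)
        if notching:
            out[mask] = np.min(power)          # notch down to the noise floor
        else:
            rng = np.random.default_rng()
            level = np.mean(power[~mask])
            out[mask] = rng.chisquare(2, mask.sum()) * level / 2.0
        return freq, out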