polyfemos.back.seismic.lumberjack

Functions for reading and parsing data from state of health and data files.

The main public functions return a list of lists of timestamps and values.

copyright

2019, University of Oulu, Sodankyla Geophysical Observatory

license

GNU Lesser General Public License v3.0 or later (https://spdx.org/licenses/LGPL-3.0-or-later.html)

Public Functions

polyfemos.back.seismic.lumberjack.centaur_mseed(path='', scale=<function <lambda>>, **kwargs)[source]
Parameters
  • path (str) – path to miniseed file

  • scale (func, optional) – defaults to identity function, scaling function applied to data values

Return type

list or None

Returns

list of lists containing timestamp (as UTCDateTime instance) and data value.

polyfemos.back.seismic.lumberjack.data_coverage(paths=[], starttime=None, endtime=None, scale=<function <lambda>>, invalid_value=0.0)[source]

Calculates the data coverage percentage between starttime and endtime.

See Scanner and analyze_parsed_data() for more information.

Parameters
  • paths (list) – list of file paths (as strings) to the data files to be scanned

  • starttime (UTCDateTime) –

  • endtime (UTCDateTime) –

  • scale (func, optional) – defaults to identity function, scaling function applied to data values

  • invalid_value (float, optional) – defaults to 0.0

Return type

list or None

Returns

list of lists containing timestamp (as UTCDateTime instance), data value, and additional z value as string.
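The idea of a coverage percentage can be illustrated with a minimal sketch. It is not the implementation of data_coverage(), which relies on Scanner and analyze_parsed_data(). The function name coverage_percentage and the use of plain datetime objects in place of UTCDateTime are assumptions made for illustration, and the segments are assumed not to overlap.

```python
from datetime import datetime, timedelta


def coverage_percentage(segments, starttime, endtime):
    """Percentage of [starttime, endtime] covered by non-overlapping segments.

    Hypothetical helper, not part of lumberjack.
    """
    window = (endtime - starttime).total_seconds()
    if window <= 0:
        return 0.0
    covered = 0.0
    for seg_start, seg_end in segments:
        # Clip each segment to the requested window before summing.
        clipped_start = max(seg_start, starttime)
        clipped_end = min(seg_end, endtime)
        if clipped_end > clipped_start:
            covered += (clipped_end - clipped_start).total_seconds()
    return 100.0 * covered / window


day = datetime(2019, 1, 1)
segments = [
    (day, day + timedelta(hours=6)),
    (day + timedelta(hours=12), day + timedelta(hours=18)),
]
print(coverage_percentage(segments, day, day + timedelta(days=1)))  # 50.0
```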

polyfemos.back.seismic.lumberjack.data_mseed(path='', scale=<function <lambda>>, **kwargs)[source]

TODO

Parameters
  • path

  • scale

Return type

Returns

polyfemos.back.seismic.lumberjack.data_timing_quality(path='', average_calc_length=1, scale=<function <lambda>>, **kwargs)[source]

Reads timing quality values from miniseed file using get_record_information() function.

Parameters
  • path (str) – path to miniseed file

  • average_calc_length (int, optional) – defaults to 1; a value greater than one yields a slightly smoothed timing quality curve, since each returned datapoint is the average over average_calc_length values.

  • scale (func, optional) – defaults to identity function, scaling function applied to data values

Return type

list or None

Returns

list of lists containing timestamp (as UTCDateTime instance) and data value.
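The smoothing effect of average_calc_length can be sketched as block averaging. The sketch below is an assumption about the general technique, not the code of data_timing_quality(). The name block_average is hypothetical, plain numbers stand in for UTCDateTime timestamps, and keeping the first timestamp of each block is an arbitrary choice.

```python
def block_average(points, average_calc_length=1):
    """Average consecutive groups of [timestamp, value] pairs.

    Hypothetical helper, not part of lumberjack.
    """
    averaged = []
    for i in range(0, len(points), average_calc_length):
        block = points[i:i + average_calc_length]
        mean_value = sum(value for _, value in block) / len(block)
        # Assumption: the averaged point keeps the timestamp of the
        # first sample in its block.
        averaged.append([block[0][0], mean_value])
    return averaged


points = [[0, 100.0], [1, 90.0], [2, 80.0], [3, 100.0]]
print(block_average(points, 2))  # [[0, 95.0], [2, 90.0]]
```

With average_calc_length=1 the input is returned unchanged, which matches the documented default of no smoothing.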

polyfemos.back.seismic.lumberjack.data_timing_quality_obspy_daily(path='', key='', scale=<function <lambda>>, **kwargs)[source]

Extract timing quality flags from given miniseed file. One value per file, which means one value per day.

Parameters
  • path (str) – path to miniseed file

  • key (str) – key available in the timing_quality entry of the result of get_flags().

  • scale (func, optional) – defaults to identity function, scaling function applied to data values

Return type

list or None

Returns

list of lists containing timestamp (as UTCDateTime instance) and data value.

polyfemos.back.seismic.lumberjack.earthdata_mseed(path='', scale=<function <lambda>>, **kwargs)[source]
Parameters
  • path (str) – path to miniseed file

  • scale (func, optional) – defaults to identity function, scaling function applied to data values

Return type

list or None

Returns

list of lists containing timestamp (as UTCDateTime instance) and data value.

polyfemos.back.seismic.lumberjack.get_earthdata_stream(path='')[source]

A function used with Earth Data state of health miniseed files. A parameter-specific scaling function is applied to the raw data.

Parameters

path (str) – path to miniseed file

Return type

Stream or None

Returns

seismic data stream

polyfemos.back.seismic.lumberjack.get_tibs(alertfunc, data)[source]

The alertfunc is applied to the data, starting from the first value of the data iterable.

Parameters
  • alertfunc (func) – A function taking (usually) a numerical value as an argument and returning a boolean value. Returned value is True if the ‘threshold is broken’.

  • data (iterable) – An iterable containing the data values

Return type

bool, bool

Returns

tib is True if the first value of the data breaks the threshold; thbb is True if any value of the data breaks the threshold. A NaN value may be returned if all data points are NaN or the alerts are not valid.
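The relationship between the two returned flags can be sketched as follows. This is an illustration under assumptions, not the source of get_tibs(): the name sketch_get_tibs and the example alert function too_hot are hypothetical, and returning a pair of NaN values for all-NaN input is one possible reading of the description above.

```python
import math


def sketch_get_tibs(alertfunc, data):
    """Return (tib, thbb) for the given data; hypothetical stand-in."""
    values = list(data)
    all_nan = all(isinstance(v, float) and math.isnan(v) for v in values)
    if not values or all_nan:
        # Assumption: NaN signals that no valid alert state exists.
        return math.nan, math.nan
    tib = bool(alertfunc(values[0]))           # first value breaks the threshold
    thbb = any(alertfunc(v) for v in values)   # any value breaks the threshold
    return tib, thbb


def too_hot(value):
    """Example alert function: True when the threshold is broken."""
    return value > 40.0


print(sketch_get_tibs(too_hot, [35.0, 42.5, 38.0]))  # (False, True)
```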

polyfemos.back.seismic.lumberjack.process_logs(flags, stations, network_code, station_code, starttime, endtime, classes)[source]

Gathers data from various sources, parses the data and writes ‘*.stf’, ‘*.alert’ and ‘*.csv’ files.

Processes data for stations with given network_code and station_code.

Parameters
  • flags (dict) – Flag variables from Interpreter

  • stations (Stations) – The station information collected from ‘*.conf’ files.

  • network_code (str) – Network name as a string, e.g. “FN”

  • station_code (str) – Name of the station as a string, e.g. “MSF”

  • starttime (Ordinal) –

  • endtime (Ordinal) –

  • classes (list) – list of strings containing all parameter classes which are included in the station’s state of health processing.

polyfemos.back.seismic.lumberjack.stream_to_xy_data(st, scale=<function <lambda>>, **kwargs)[source]
Parameters
  • st (Stream) – seismic data stream

  • scale (func, optional) – defaults to identity function, scaling function applied to data values

Return type

list or None

Returns

list of lists containing timestamp (as UTCDateTime instance) and data value.
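How a scale function shapes the returned [timestamp, value] pairs can be shown with a small sketch that avoids any seismology library. SimpleTrace and traces_to_xy are hypothetical stand-ins for a Stream and for stream_to_xy_data(). Timestamps are plain floats rather than UTCDateTime instances, and returning None for empty input is an assumption based on the documented return type.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SimpleTrace:
    """Hypothetical stand-in for one trace of a seismic Stream."""
    starttime: float        # seconds since some reference time
    delta: float            # sample spacing in seconds
    data: Sequence[float]   # raw sample values


def traces_to_xy(traces, scale=lambda x: x):
    """Pair every sample time with its scaled value."""
    if not traces:
        # Assumption: empty input maps to None.
        return None
    xy = []
    for trace in traces:
        for i, raw in enumerate(trace.data):
            xy.append([trace.starttime + i * trace.delta, scale(raw)])
    return xy


traces = [SimpleTrace(starttime=0.0, delta=0.5, data=[10, 20, 30])]
print(traces_to_xy(traces, scale=lambda x: x / 10))
# [[0.0, 1.0], [0.5, 2.0], [1.0, 3.0]]
```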

Private Functions

polyfemos.back.seismic.lumberjack._centaur_mseed(pathfunc, times, flags, funckwargs={}, **kwargs)[source]

Generator calling function centaur_mseed().

[*] see the documentation of _data_coverage().

Parameters
  • pathfunc (func) – [*]

  • times (list) – [*]

  • flags (dict) – [*]

  • funckwargs (dict, optional) – defaults to empty dict, keyword arguments for centaur_mseed()

Returns

generator yielding UTCDateTime and list of lists of timestamps and data values

polyfemos.back.seismic.lumberjack._data_coverage(pathfunc, times, flags, funckwargs={}, key='DCL', invalid_value=0.0)[source]

The documentation of this function also covers the other private generator functions of lumberjack.

All of these generator functions are decorated with _none2invalid(), which means that all None data values are replaced with invalid_value.

Parameters
  • pathfunc (func) – A function taking julian date and year as keyword arguments and returning path as a string

  • times (list) – a list of UTCDateTime instances

  • flags (dict) – Flag variables from Interpreter

  • funckwargs (dict, optional) – defaults to empty dict, keyword arguments for data_coverage()

  • key (str, optional) – “DCD” (for data coverage from the start of the day) or “DCL” (for data coverage from the program start, i.e. realtimeness)

  • invalid_value (float, optional) – defaults to 0.0, i.e. no data available

Returns

generator yielding UTCDateTime and list of lists of timestamps and data values, in this case, the data contains only one value.
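A pathfunc of the kind described in the parameter list could look like the sketch below. The keyword names julday and year, the function name example_pathfunc, and the directory layout are all assumptions made for illustration; the real names and paths come from the station configuration.

```python
from datetime import date


def example_pathfunc(julday, year):
    """Hypothetical pathfunc: build a file path from day of year and year."""
    # The directory layout and file name pattern are invented for this example.
    return "/data/FN/MSF/{year}/FN.MSF.{year}.{julday:03d}.mseed".format(
        year=year, julday=julday
    )


day = date(2019, 2, 1)
print(example_pathfunc(julday=day.timetuple().tm_yday, year=day.year))
# /data/FN/MSF/2019/FN.MSF.2019.032.mseed
```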

polyfemos.back.seismic.lumberjack._data_mseed(pathfunc, times, flags, funckwargs={}, **kwargs)[source]

TODO

See the documentation of _data_coverage().

Parameters
  • pathfunc

  • times

  • flags

  • funckwargs

Return type

Returns

polyfemos.back.seismic.lumberjack._data_timestamp_error(pathfunc, times, flags, funckwargs={}, **kwargs)[source]

Calculates the time difference (in seconds) between the current program start time and the timestamp of the last datapoint in the data file. The difference is positive if the datapoint’s timestamp is the earlier of the two.

[*] see the documentation of _data_coverage().

Parameters
  • pathfunc (func) – [*]

  • times (list) – [*]

  • flags (dict) – [*]

  • funckwargs (dict, optional) – defaults to empty dict
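The sign convention of the timestamp error can be checked with a short sketch. The function name timestamp_error_seconds is hypothetical, and datetime objects stand in for UTCDateTime instances.

```python
from datetime import datetime


def timestamp_error_seconds(program_start, last_timestamp):
    """Seconds from the last datapoint to the program start.

    Positive when the datapoint is earlier than the program start.
    Hypothetical helper, not part of lumberjack.
    """
    return (program_start - last_timestamp).total_seconds()


program_start = datetime(2019, 6, 1, 12, 0, 0)
last_point = datetime(2019, 6, 1, 11, 58, 30)
print(timestamp_error_seconds(program_start, last_point))  # 90.0
```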

polyfemos.back.seismic.lumberjack._data_timing_quality(pathfunc, times, flags, funckwargs={}, **kwargs)[source]

Generator calling function data_timing_quality().

[*] see the documentation of _data_coverage().

Parameters
  • pathfunc (func) – [*]

  • times (list) – [*]

  • flags (dict) – [*], adds the average_calc_length flag to funckwargs

  • funckwargs (dict, optional) – defaults to empty dict, keyword arguments for data_timing_quality()

Returns

generator yielding UTCDateTime and list of lists of timestamps and data values

polyfemos.back.seismic.lumberjack._data_timing_quality_obspy_daily(pathfunc, times, flags, funckwargs={}, key='', **kwargs)[source]

Generator calling function data_timing_quality_obspy_daily().

[*] see the documentation of _data_coverage().

Parameters
Returns

generator yielding UTCDateTime and list of lists of timestamps and data values

polyfemos.back.seismic.lumberjack._earthdata_log(pathfunc, times, flags, funckwargs={}, key='', **kwargs)[source]

Generator calling function get_data().

[*] see the documentation of _data_coverage().

Parameters
  • pathfunc (func) – [*]

  • times (list) – [*]

  • flags (dict) – [*]

  • funckwargs (dict, optional) – defaults to empty dict, keyword arguments for get_data()

  • key (str) – Adds the key to funckwargs, see edlogreader for available keys

Returns

generator yielding UTCDateTime and list of lists of timestamps and data values

polyfemos.back.seismic.lumberjack._earthdata_mseed(pathfunc, times, flags, funckwargs={}, **kwargs)[source]

Generator calling function earthdata_mseed().

[*] see the documentation of _data_coverage().

Parameters
  • pathfunc (func) – [*]

  • times (list) – [*]

  • flags (dict) – [*]

  • funckwargs (dict, optional) – defaults to empty dict, keyword arguments for earthdata_mseed()

Returns

generator yielding UTCDateTime and list of lists of timestamps and data values

polyfemos.back.seismic.lumberjack._get_data(station, par, times, flags)[source]

Format of the values yielded by the returned generator

(day1,
    [
        [timestamp11, value11],
        [timestamp12, value12],
        ...
    ],
),
(day2,
    [
        [timestamp21, value21],
        [timestamp22, value22],
        ...
    ],
),
...
Parameters
Return type

generator

Returns

A generator yielding, for each day, a timestamp and that day’s data (timestamps and datapoints).
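Consuming a generator with the format shown above might look like the following sketch. The generator example_generator and the helper flatten are hypothetical, and ISO date strings stand in for the UTCDateTime timestamps the real generator yields.

```python
def example_generator():
    """Hypothetical generator mimicking the documented output format."""
    yield ("2019-01-01", [["2019-01-01T00:00:00", 1.0],
                          ["2019-01-01T12:00:00", 2.0]])
    yield ("2019-01-02", [["2019-01-02T00:00:00", 3.0]])


def flatten(day_data):
    """Collect the [timestamp, value] pairs of every day into one list."""
    points = []
    for _day, data in day_data:
        points.extend(data)
    return points


all_points = flatten(example_generator())
print(len(all_points))  # 3
```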

polyfemos.back.seismic.lumberjack._get_tibs(par, data, alertfile)[source]
Parameters
Return type

bool, bool

Returns

see get_tibs() for more info

polyfemos.back.seismic.lumberjack._get_tibs_dc(alertfunc, data, alertfile, parname='')[source]

Since the _data_coverage() function returns only one data point each time it is called, the ‘threshold has been broken’ (thbb) value is not meaningful on its own. This function is used to read earlier tib and thbb values and to set the thbb value accordingly.

Parameters
  • alertfunc (func) – see get_tibs() for more info

  • data (iterable) – An iterable containing the data values

  • alertfile (str) – A path to ‘*.alert’ file, the ‘threshold has been broken’ value is read from the file if it exists.

  • parname (str) – the name of the parameter with a code ‘DCD’ or ‘DCL’. The thbb value is searched from the alertfile using the parname.

Return type

bool, bool

Returns

see get_tibs() for more info

polyfemos.back.seismic.lumberjack._none2invalid(generator)[source]

Decorates private lumberjack functions. Replaces None values with function specific invalid values. The default invalid value is defined in function getNaN().

Parameters

generator (GeneratorType) –

Return type

GeneratorType

Returns

decorated generator
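A decorator of this kind could be sketched as below. This is not the source of _none2invalid(). The name none_to_invalid is hypothetical, math.nan stands in for the value provided by getNaN(), and two details are assumptions: None values are replaced inside each [timestamp, value] pair, and a day whose data is None is treated as empty.

```python
import functools
import math


def none_to_invalid(generator_func):
    """Hypothetical decorator in the spirit of _none2invalid()."""
    @functools.wraps(generator_func)
    def wrapper(*args, invalid_value=math.nan, **kwargs):
        for day, data in generator_func(*args, **kwargs):
            # Assumption: a day with no data list at all is treated as empty.
            cleaned = [
                [timestamp, invalid_value if value is None else value]
                for timestamp, value in (data or [])
            ]
            yield day, cleaned
    return wrapper


@none_to_invalid
def days():
    """Example generator yielding one day with a missing value."""
    yield "day1", [["t1", 1.0], ["t2", None]]
    yield "day2", None


print(list(days(invalid_value=0.0)))
# [('day1', [['t1', 1.0], ['t2', 0.0]]), ('day2', [])]
```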

polyfemos.back.seismic.lumberjack._station_and_times(stations, network_code, station_code, starttime, endtime)[source]

Selects the appropriate Station instances for the time interval from starttime to endtime.

The network code and station code are the same for all returned Station instances, but other values and parameters of the station may change during the timespan. At the moment, only day accuracy is supported when selecting stations.

Parameters
  • stations (Stations) –

  • network_code (str) – Network code as a string, e.g. “FN”

  • station_code (str) – Code of the station as a string, e.g. “MSF”

  • starttime (Ordinal) –

  • endtime (Ordinal) –

Returns

generator yielding Station instances and list of times when the station is valid.