ctdam.proc package

Subpackages

Submodules

ctdam.proc.module module

class ctdam.proc.module.Module[source]

Bases: ABC

An interface to implement new processing modules against.

Performs and unifies the work needed to streamline processing of Sea-Bird CTD data. Implementing classes should only override the transformation method, which does the actual altering of the data. All other organizational overhead should be covered by this interface. This includes writing the .cnv output with correct handling of the metadata header.
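The division of labour described above can be sketched roughly as follows. This is not the package's code: `SketchModule`, `Offset`, and the use of a plain list instead of a CnvFile are stand-ins chosen so that the example runs on its own.

```python
from abc import ABC, abstractmethod


class SketchModule(ABC):
    """Stand-in for the interface: owns the workflow, delegates the data change."""

    def __call__(self, rows: list[float]) -> list[float]:
        self.rows = rows                   # stand-in for loading a file
        self.rows = self.transformation()  # the only subclass-specific step
        return self.rows                   # stand-in for writing the output

    @abstractmethod
    def transformation(self) -> list[float]:
        """Alter the data; implemented by each processing module."""


class Offset(SketchModule):
    """Hypothetical module that adds a constant offset to every value."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def transformation(self) -> list[float]:
        return [value + self.offset for value in self.rows]


result = Offset(0.5)([1.0, 2.0])
```

Because `transformation` is abstract, the base class itself cannot be instantiated; only subclasses that supply it can.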

add_processing_metadata()[source]

Parses the module processing information into cnv-compliant metadata lines.

These take the form {MODULE_NAME}_{KEY} = {VALUE} for every key-value pair inside the given dictionary with the module's processing info.
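As an illustration of that line format, a minimal sketch might look like this. The function name `format_processing_metadata`, the module name `example_module`, and the keys are assumptions made for the example, not taken from the package.

```python
def format_processing_metadata(module_name: str, info: dict) -> list[str]:
    """Return one '{MODULE_NAME}_{KEY} = {VALUE}' line per key-value pair."""
    return [f"{module_name}_{key} = {value}" for key, value in info.items()]


# Hypothetical module name and processing info.
lines = format_processing_metadata("example_module", {"date": "2024-01-01", "std": 2})
```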

load_file(file_path)[source]

Loads the target file's information into a CnvFile instance.

Parameters:

file_path (Path) – Path to the target file.

Return type:

CnvFile

abstractmethod to_cnv()[source]

class ctdam.proc.module.ArrayModule[source]

Bases: Module

abstractmethod transformation()[source]

Return type:

bool

get_array()[source]

Return type:

ndarray

get_data_size()[source]

Return type:

int

create_flag_array_if_missing()[source]

handle_new_flags(new_flag_array)[source]

check_whether_working_on_binned_data()[source]

to_cnv()[source]

class ctdam.proc.module.DataFrameModule[source]

Bases: Module

abstractmethod transformation()[source]

The actual data transformation on the CTD data.

Needs to be implemented by the implementing classes.

Return type:

DataFrame

to_cnv(additional_data_columns=[], custom_data_columns=None)[source]

Writes the internal CnvFile instance to disk.

Uses the CnvFile’s output parser for the writing and gathers the different pieces of information it needs.

Parameters:
  • additional_data_columns (list[str]) –

    A list of columns to write in addition to the ones inside the original dataframe.

    (Default value = [])

  • custom_data_columns (list | None) –

    A list of columns that will be used exclusively to select the data items for the output .cnv.

    (Default value = None)
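One plausible reading of how the two parameters interact is sketched below. The precedence (a custom list replaces everything else) and the helper name `select_output_columns` are assumptions, not confirmed behaviour of to_cnv.

```python
from typing import Optional


def select_output_columns(
    original: list[str],
    additional_data_columns: Optional[list[str]] = None,
    custom_data_columns: Optional[list[str]] = None,
) -> list[str]:
    """Pick the column names to write, under the assumed precedence."""
    if custom_data_columns is not None:
        # Assumed: a custom list is used exclusively.
        return list(custom_data_columns)
    # Otherwise the original columns come first, followed by the extras.
    return list(original) + list(additional_data_columns or [])
```

The sketch uses None rather than [] as the default so that no list object is shared between calls.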

to_csv()[source]

Writes the dataframe as .csv to disk.

exception ctdam.proc.module.MissingParameterError(step_name, parameter_name)[source]

Bases: Exception

A custom error to throw when necessary parameters are missing from the input .cnv file.

ctdam.proc.procedure module

class ctdam.proc.procedure.Procedure(configuration, seabird_exe_directory=None, available_hex_converters=['datcnv', 'hex2py'], auto_run=True, procedure_fingerprint_directory=None, file_type_dir=None, plot=False, verbose=False, timeout=60)[source]

Bases: object

Runs a series of processing steps in sequence on one or more CTD data source files.

It can use Sea-Bird's own processing modules as well as custom ones. Custom modules can be standalone Windows executables or pure Python code. The input data can be .hex files, .cnv files, or pandas DataFrames. The input and all module and extra information are stored in a dict that is usually generated by the Configuration class of the settings module, which reads a TOML config.

Parameters:
  • configuration (dict | Configuration) – The information necessary to run a processing procedure.

  • seabird_exe_directory (Path | str | None) – The path to the directory in which the Sea-Bird exes reside. Usually not necessary, as this class knows the default install path.

  • available_hex_converters (list[str]) – A list of the known hex converters.

  • auto_run (bool) – Whether to autopilot the whole procedure.

  • procedure_fingerprint_directory (Path | str | None) – A path to a directory in which the fingerprints are stored. If none is given, this option is considered to be turned off.

  • file_type_dir (Path | str | None) – A path to a directory in which the individual Sea-Bird file types are sorted into their respective directories. If none is given, this option is considered to be turned off.

  • verbose (bool) – Sets whether the Sea-Bird modules are run silently or not.

  • timeout (int) – The time in seconds after which individual processing steps will be killed automatically.

Returns:

  • In auto_run mode, a .cnv file or an instance of CnvFile, depending on the file_type parameter inside of the configuration.

  • Otherwise, an instance that has collected and evaluated the information necessary to run a processing procedure on one or more target files.

run(file='')[source]

Runs the procedure on the given file, or on the one inside of the config if none is given.

A ‘run’ consists of the application of all the given modules to a given file. It is the structure that can be represented by a fingerprint file.

Parameters:

file (CnvFile | Path | str) –

The input file.

(Default value = ‘’)

Return type:

CTDData

load_config()[source]

Performs a thorough input and format check of the processing configuration, which stems either from a .toml config file or from a self-built dictionary. Checks for the presence of certain keys and then, depending on their importance, either fails or sets default values.

check_config_entry(key, default_value)[source]

Handles configuration file entries.
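The fail-or-default behaviour described for the configuration check could be sketched like this. The sentinel, the stand-in exception `IncompleteConfigSketch`, and the example keys are all hypothetical; the real method's handling may differ.

```python
class IncompleteConfigSketch(Exception):
    """Stand-in for an error raised on a missing mandatory entry."""


_REQUIRED = object()  # sentinel: no default exists, so the key is mandatory


def check_entry(config: dict, key: str, default_value=_REQUIRED):
    """Return config[key], fill in a default, or fail if the key is mandatory."""
    if key in config:
        return config[key]
    if default_value is _REQUIRED:
        raise IncompleteConfigSketch(f"missing mandatory entry: {key}")
    config[key] = default_value
    return default_value


# Hypothetical keys, chosen only for the example.
config = {"modules": {}}
output_dir = check_entry(config, "output_dir", ".")
```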

is_seabird_module(module)[source]

Answers whether the module in question is a Sea-Bird module.

Does so by checking for the presence of the key ‘psa’. All modules that are meant to run as standalone executables should follow this principle and set their config file under the psa key.

Parameters:

module (dict) – The specific module parameters.

Return type:

bool
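Reduced to its core, the check described above amounts to a key lookup. The sketch below is a free function rather than the method itself, and the example module dictionaries and file names are hypothetical.

```python
def is_seabird_module(module: dict) -> bool:
    """True if the module parameters carry a 'psa' config file entry."""
    return "psa" in module


# Hypothetical module parameter dictionaries.
executable_step = {"psa": "AlignCTD.psa"}
python_step = {"window_size": 5}
```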

create_seabird_module(module_info)[source]

Return type:

ProcessingModule

create_seabird_step(module, input_path, output_name=None)[source]

Return type:

ProcessingStep

convert(hex_path, hex_converter)[source]

Handles the conversion of a .hex file to a .cnv file.

At the moment, this is simply done by using DatCnv, so the general Sea-Bird module pipeline could be used instead. The method is kept separate to stay compatible with a future in which other hex converters may exist.

Parameters:
  • hex_path (Path) – The path to the target hex file.

  • hex_converter (dict) – The module parameters for the conversion.

Return type:

CTDData

new_file_path(file=PosixPath('.'))[source]

Creates the new output file path.

Takes the file type directory or the given output directory and joins it with the given output name.

Parameters:

file (Path) – The current path to the target file.

Return type:

Path
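A minimal sketch of that join with pathlib follows. The parameter names and the rule that the file type directory takes precedence when it is set are assumptions; the actual method reads these values from the instance.

```python
from pathlib import Path
from typing import Optional


def new_file_path(
    output_name: str,
    output_dir: Path,
    file_type_dir: Optional[Path] = None,
) -> Path:
    """Join the chosen base directory with the output name."""
    # Assumed precedence: the file type directory wins when it is set.
    base = file_type_dir if file_type_dir is not None else output_dir
    return base / output_name


plain = new_file_path("cast01.cnv", Path("out"))
typed = new_file_path("cast01.cnv", Path("out"), Path("types"))
```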

go()[source]

Performs the processing on all target files.

This is the ‘main’ method of the procedure. All previous methods prepare data for this method, which then transforms the input files into the wanted format. Its main purpose is the coordination of the two different forms of processing modules: standalone executables with config files, mainly Sea-Bird processing modules, and Python-internal classes that implement the Module interface. The main difficulty lies in switching from one form to the other, which, for example, requires parsing a file into a CnvFile object or writing the object back out.

The output is controlled by self.output_type and is either a cnv file at a target path or a CnvFile object.

Return type:

CTDData
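The switching between the two module forms can be pictured with the following sketch, which only plans the steps instead of running them. The step dictionaries, the `name` key, and the wording of the plan entries are hypothetical; only the ‘psa’ key is taken from the description of is_seabird_module.

```python
def plan_run(steps: list[dict]) -> list[str]:
    """List the actions needed, including conversions when the form changes."""
    plan = []
    on_disk = True  # assumed: processing starts from a file on disk
    for step in steps:
        needs_file = "psa" in step  # executable modules work on files
        if needs_file and not on_disk:
            plan.append("write CnvFile to .cnv")
            on_disk = True
        elif not needs_file and on_disk:
            plan.append("parse .cnv into CnvFile")
            on_disk = False
        plan.append(f"run {step['name']}")
    return plan


# Hypothetical sequence: executable, Python module, executable.
plan = plan_run([
    {"name": "datcnv", "psa": "DatCnv.psa"},
    {"name": "custom_filter"},
    {"name": "binavg", "psa": "BinAvg.psa"},
])
```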

procedure_fingerprint()[source]

Handles the creation of individual processing procedure fingerprints.

A fingerprint is a ‘receipt’ of one invocation of this class on one or more target files or data. It serves as an easy-to-understand record of exactly what has been done to the data to retrieve the given result. Fingerprints are at the same time plain configuration files, which allows processing procedures to be re-run easily. Two things distinguish them from the usual configuration files: the exact target file list, which can be omitted upon invocation, and a timestamp that prefixes the source file name and adheres to ISO 8601.

Return type:

Configuration | None
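How a timestamp-prefixed fingerprint name might be built is sketched below. The underscore separator, the .toml extension, and the choice of the ISO 8601 basic format (which avoids colons in file names) are assumptions; the package may format the name differently.

```python
from datetime import datetime


def fingerprint_name(source_name: str, when: datetime) -> str:
    """Prefix the source file name with an ISO 8601 basic-format timestamp."""
    stamp = when.strftime("%Y%m%dT%H%M%S")
    return f"{stamp}_{source_name}.toml"


# Hypothetical source name and a fixed time, so the result is reproducible.
name = fingerprint_name("cast01", datetime(2024, 3, 5, 14, 7, 9))
```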

handle_fill_type_dir_creation()[source]

ctdam.proc.settings module

class ctdam.proc.settings.Configuration(path, data=None)[source]

Bases: UserDict

The internal representation of the .toml configuration files that store the ctdam.proc information.

Allows these config files to be used much like a basic Python dictionary.

write(path_to_write=None)[source]

Writes the ctdam.proc information to a file.

Parameters:

path_to_write

The path to write the file to. If none given, the input path will be used.

(Default value = None)

modify(key, value)[source]

Allows the access and modification of nested data points.

Parameters:
  • key

  • value
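Nested modification on a UserDict subclass could look roughly like this. The dotted key syntax, the class name `NestedConfig`, and the example configuration are assumptions; the documentation does not state how nested keys are addressed.

```python
from collections import UserDict


class NestedConfig(UserDict):
    """Stand-in for a dictionary-like configuration object."""

    def modify(self, key: str, value) -> None:
        """Set a value at a dotted path, creating missing levels on the way."""
        *parents, leaf = key.split(".")
        node = self.data
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value


# Hypothetical configuration content.
config = NestedConfig({"modules": {"alignctd": {"psa": "old.psa"}}})
config.modify("modules.alignctd.psa", "new.psa")
config.modify("output.type", "cnv")
```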

exception ctdam.proc.settings.IncompleteConfigFile(message)[source]

Bases: Exception

An exception to indicate malformed configuration files.

ctdam.proc.utils module

ctdam.proc.utils.default_seabird_exe_path()[source]

Creates a platform-dependent default path to the Sea-Bird exes.

Return type:

Path

ctdam.proc.utils.is_directly_measured_value(parameter)[source]

Returns whether a parameter has been measured via a sensor or is calculated.

Return type:

bool

ctdam.proc.utils.get_alignment_delay_and_correlation_values(processing_info)[source]

Finds the two numerical values in the processing output produced by the custom alignment tool. These are extracted separately for each sensor and sorted into a list[tuple] structure.

Return type:

list

ctdam.proc.utils.fill_file_type_dir(file_type_dir, file, copy=True)[source]

Copies the target input and output files into individual type directories.

A ‘file type directory’ is a directory that is meant to collect all the files with the same file extension that accumulate over multiple processing runs. For typical Sea-Bird processing you usually end up with something like this:

root-dir
  • hex

  • cnv

  • XMLCON

  • btl

  • bl

  • hdr

Parameters:
  • file (Path)

  • copy (bool) – (Default value = True)
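The sorting by file extension can be sketched with pathlib and shutil as follows. Naming each subdirectory after the extension without its dot, and moving the file when copy is False, are assumptions for the example; `fill_type_dir` is a stand-in, not the package function.

```python
import shutil
import tempfile
from pathlib import Path


def fill_type_dir(file_type_dir: Path, file: Path, copy: bool = True) -> Path:
    """Place a file in a subdirectory named after its extension."""
    target_dir = file_type_dir / file.suffix.lstrip(".")
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / file.name
    if copy:
        shutil.copy2(file, target)
    else:
        shutil.move(str(file), str(target))  # assumed meaning of copy=False
    return target


# Try it on a throwaway file inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "cast01.cnv"
    source.write_text("example")
    copied = fill_type_dir(root / "sorted", source)
    copied_ok = copied.exists() and source.exists()
    relative = copied.relative_to(root).as_posix()
```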

exception ctdam.proc.utils.BinnedDataError(file_name, step_name)[source]

Bases: Exception

A custom error to throw when binned data has been detected.

Module contents