Document field descriptions for NTSPortal
The following tables contain descriptions for all fields in NTSPortal tables. The dot-notation is used to refer to nested fields.
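As an illustration of the dot-notation, the sketch below resolves a dotted field name against a nested document. The helper `get_field` and the sample values are hypothetical and not part of NTSPortal; only the field names (`mz`, `rtt.method`, `ms2.mz`) come from the tables on this page. It assumes nested fields are stored as lists of objects.

```python
from typing import Any


def get_field(doc: dict, dotted: str) -> Any:
    """Resolve a dot-notation field name against a nested document."""
    current: Any = doc
    for key in dotted.split("."):
        if isinstance(current, list):
            # A nested field may hold several entries; collect the key from each.
            current = [item[key] for item in current]
        else:
            current = current[key]
    return current


# Hypothetical feature document; the values are made up for illustration.
feature = {
    "mz": 120.0556,
    "rtt": [{"method": "bfg_nts_rp1", "rt": 5.2, "predicted": False}],
    "ms2": [{"mz": 65.04, "int": 1.0}, {"mz": 92.05, "int": 0.4}],
}

print(get_field(feature, "mz"))          # 120.0556
print(get_field(feature, "rtt.method"))  # ['bfg_nts_rp1']
print(get_field(feature, "ms2.mz"))      # [65.04, 92.05]
```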
feature tables
These tables hold processing results (dbas and nts processing). They are a list of non-target “features” or chromatographic peaks with associated metadata.
| field | type | description | unit | example |
|---|---|---|---|---|
| adduct | keyword | ESI adduct form | | [M+H]+ |
| area | integer | chromatographic peak area | | |
| area_internal_standard | integer | chromatographic peak area of internal standard | | |
| area_relative_to_internal_standard | double (runtime) | area/area_internal_standard | | |
| batchname | keyword (runtime) | Leading directories of path field | | /beegfs/nts/ntsportal/msrawfiles/unit_tests |
| cas | keyword | chemical abstracts service registry number | | 95-14-7 |
| chrom_method | keyword | chromatographic method code | | bfg_nts_rp1 |
| comment | text | free text or array of text blocks | | |
| comp_group | keyword | compound usage or classification(s) | | Industrial_process |
| compound_annotation | nested | | | |
| compound_annotation.adduct | keyword | ESI adduct form | | [M+H]+ |
| compound_annotation.cas | keyword | chemical abstracts service registry number | | 95-14-7 |
| compound_annotation.formula | keyword | molecular formula | | C7H4Cl2O2 |
| compound_annotation.inchi | keyword | international chemical identifier | | InChI=1S/C7H5NS/c1-2-4-7-6(3-1)8-5-9-7/h1-5H |
| compound_annotation.inchikey | keyword | hash of international chemical identifier | | ILLQTAPXYCRJRB-UHFFFAOYSA-N |
| compound_annotation.isotopologue | keyword | isotopic difference to monoisotopic | | 37Cl |
| compound_annotation.mz_diff_lib | float | m/z difference to library | mDa | |
| compound_annotation.name | keyword | compound name | | |
| compound_annotation.rt_diff_lib | half_float | retention time difference to library | min | |
| compound_annotation.score_ms2_match | short | Dot product matching score out of 1000 | | 452 |
| concentration | half_float | estimated concentration | ng/L | |
| csl_experiment_id | integer | CSL experiment ID for spectral match | | 26 |
| data_source | keyword | institution responsible for data processing | | bfg |
| date_import | date | ingest time | | |
| date_measurement | date | instrumental analysis time | | |
| duration | keyword | Composite sample duration (ISO 8601); grab sample: P0 | | P1Y |
| eic | nested | | | |
| eic.int | scaled_float | intensity | | |
| eic.time | scaled_float | chromatographic run time | s | |
| esi_ion_spec | keyword (runtime) | Concatenation of name, pol, adduct, isotopologue | | 1,3-Dicyclohexylurea_pos_[M+H]+_monoisotopic |
| feature_table_alias | keyword | Alias to use for feature table | | ntsp25.3_feature_bfg |
| filename | keyword (runtime) | Basename of path field | | KO_06_1_pos.mzXML |
| formula | keyword | molecular formula | | C7H4Cl2O2 |
| gkz | integer | Gewässerkennzahl (German water body code number) of sampling location | | 26 |
| inchi | keyword | international chemical identifier | | InChI=1S/C7H5NS/c1-2-4-7-6(3-1)8-5-9-7/h1-5H |
| inchikey | keyword | hash of international chemical identifier | | ILLQTAPXYCRJRB-UHFFFAOYSA-N |
| instrument | text | instrument type code | | LC-ESI-QTOF TripleTOF 6600 SCIEX |
| instrument_name | keyword | Instrument name | | wasser_tof |
| intensity | integer | chromatographic peak height at apex | | |
| intensity_internal_standard | integer | peak intensity (height at apex) of internal standard | | |
| intensity_relative_to_internal_standard | double (runtime) | intensity/intensity_internal_standard | | |
| internal_standard | keyword | internal standard compound name | | Bezafibrate-d6 |
| isotopologue | keyword | isotopic difference to monoisotopic | | 37Cl |
| km | half_float | river kilometer of sampling location | | |
| library_mz | float | mass-to-charge ratio in library | Da | |
| library_rt | half_float | library retention time at peak apex | min | |
| licence | keyword | open data licence for data | | dl-de/by-2-0 |
| loc | geo_point | sampling coordinates (WGS84) | | |
| matrix | keyword | sampling matrix code | | spm |
| ms1 | nested | | | |
| ms1.int | scaled_float | centroid intensity, relative to feature intensity | | |
| ms1.mz | float | mass-to-charge ratio | Da | |
| ms2 | nested | | | |
| ms2.int | scaled_float | centroid intensity, relative to max | | |
| ms2.mz | float | mass-to-charge ratio | Da | |
| multi_hit_id | keyword | ID of duplicate features from multiple annotations | | |
| mz | float | mass-to-charge ratio | Da | |
| mz_diff_lib | float | m/z difference to library | mDa | |
| name | keyword | compound name as written in library | | |
| path | keyword | absolute path to measurement file (mzXML, mzML) | | |
| pol | keyword | 3-letter ESI polarity mode | | pos |
| river | keyword | river name of sampling location (local spelling) | | rhein |
| rt | half_float | chromatographic retention time at peak apex | min | |
| rt_diff_lib | half_float | retention time difference to library | min | |
| rtt | nested | | | |
| rtt.method | keyword | chromatographic method code | | bfg_nts_rp1 |
| rtt.predicted | boolean | is predicted (non-experimental) retention time | | |
| rtt.rt | half_float | chromatographic retention time at peak apex | min | |
| sample_source | keyword | institute responsible for sampling | | bful/lfulg saxony |
| score_ms2_match | short | Dot product matching score out of 1000 | | 452 |
| start | date | sampling time (start time for composite) | | |
| station | keyword | sampling location code | | rhein_ko_l |
| tag | keyword | labels to assist searching | | |
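The fields marked `(runtime)` are described as derived from other fields. The following minimal sketch shows how such values could be computed from a stored document, based only on the descriptions in the table. The function `runtime_fields` is hypothetical, the underscore separator in `esi_ion_spec` is inferred from the table's example value, and the sample document combines example values from separate rows (the full path and the peak areas are assumptions).

```python
import posixpath


def runtime_fields(doc: dict) -> dict:
    """Derive the runtime fields of a feature document from its stored fields."""
    path = doc["path"]
    return {
        # Leading directories of the path field
        "batchname": posixpath.dirname(path),
        # Basename of the path field
        "filename": posixpath.basename(path),
        # Concatenation of name, pol, adduct and isotopologue
        "esi_ion_spec": "_".join(
            [doc["name"], doc["pol"], doc["adduct"], doc["isotopologue"]]
        ),
        "area_relative_to_internal_standard": doc["area"]
        / doc["area_internal_standard"],
        "intensity_relative_to_internal_standard": doc["intensity"]
        / doc["intensity_internal_standard"],
    }


# Hypothetical document: numeric values are invented for illustration.
example = {
    "path": "/beegfs/nts/ntsportal/msrawfiles/unit_tests/KO_06_1_pos.mzXML",
    "name": "1,3-Dicyclohexylurea",
    "pol": "pos",
    "adduct": "[M+H]+",
    "isotopologue": "monoisotopic",
    "area": 5000,
    "area_internal_standard": 20000,
    "intensity": 300,
    "intensity_internal_standard": 1200,
}

for key, value in runtime_fields(example).items():
    print(key, "=", value)
```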
msrawfiles tables
These tables hold metadata and processing settings for each raw MS measurement file (mzXML file) in NTSPortal.
| field | type | description | unit | example |
|---|---|---|---|---|
| batchname | keyword (runtime) | Leading directories of path field | | /beegfs/nts/ntsportal/msrawfiles/unit_tests |
| blank | boolean | Is sample a blank? | | |
| blank_regex | keyword | Regex. for blank code in filename | | |
| chrom_method | keyword | chromatographic method code | | bfg_nts_rp1 |
| comment | text | free text or array of text blocks | | |
| csl_instruments_allowed | keyword | Allowed instruments for library spectra | | |
| data_source | keyword | institution responsible for data processing | | bfg |
| date_import | date | DEPRECATED, use date_ingest | | |
| date_ingest | date | ingest time | | |
| date_measurement | date | instrumental analysis time | | |
| dbas_area_threshold | integer | Minimum area of EIC peak | | |
| dbas_blank_int_factor | integer | Blank correction max. inten. ratio (sample/blank) | | |
| dbas_blank_regex | keyword | DEPRECATED, use blank_regex | | |
| dbas_date_format | keyword | DEPRECATED | | ymd |
| dbas_date_regex | keyword | DEPRECATED | | Des_()_ |
| dbas_fp | keyword | Compounds removed from peaklist (false positives) | | |
| dbas_instr | keyword | DEPRECATED, use csl_instruments_allowed | | |
| dbas_is_name | keyword | DEPRECATED, use internal_standard | | Bezafibrate-d6 |
| dbas_is_table | keyword | Path to int. std. table (CSV) | | |
| dbas_minimum_detections | integer | Minimum MS2 matches in batch (otherwise FP) | | |
| dbas_mztolu | float | m/z tol. for library matching (1st round/rough) | Da | |
| dbas_mztolu_fine | float | m/z tol. for library matching | Da | |
| dbas_ndp_m | integer | Peak inten. weighting factor for dot product | | |
| dbas_ndp_n | integer | m/z weighting factor for dot product | | |
| dbas_ndp_threshold | integer | Min. dot product MS2 match to library (0-1000) | | |
| dbas_replicate_regex | keyword | DEPRECATED | | ()[123] |
| dbas_rtTolReinteg | float | Ret. time tol. for reintegration (gap filling) | | |
| dbas_rttolm | float | Ret. time tol. for library match | min | |
| dbas_station_regex | keyword | Sampling location code from filename | | ^(.*) |
| duration | keyword | Composite sample duration (ISO 8601); grab sample: P0 | | P1Y |
| feature_table_alias | keyword | Alias to use for feature table | | ntsp25.3_feature_bfg |
| filename | keyword (runtime) | Basename of path field | | KO_06_1_pos.mzXML |
| filesize | float | Meas. file size | MB | |
| gkz | integer | Gewässerkennzahl (German water body code number) of sampling location | | 26 |
| instrument | keyword | Instrument type code | | LC-ESI-QTOF TripleTOF 6600 SCIEX |
| instrument_name | keyword | Instrument name | | wasser_tof |
| internal_standard | keyword | Internal standard used in processing | | Bezafibrate-d6 |
| km | float | river kilometer of sampling location | | |
| licence | keyword | open data licence for data | | dl-de/by-2-0 |
| loc | geo_point | Sampling coordinates (WGS84) | | |
| matrix | keyword | Sampling matrix code | | spm |
| nts_alig_delta_mz | float | Alignment m/z tol. within batch | mDa | |
| nts_alig_delta_rt | float | Alignment ret. time tol. within batch | s | |
| nts_alig_filter_min_features | byte | Min. detections per replicate set | | |
| nts_alig_filter_num_consecutive | byte | Min. consecutive detections | | |
| nts_alig_filter_type | keyword | Alig. tab. filter type | | consecutive/replicate/min_features |
| nts_annotation_ce_max | short | Max. collision energy of library spectra | | |
| nts_annotation_ce_min | short | Min. collision energy of library spectra | | |
| nts_annotation_ces_max | short | Max. collision energy spread of library spectra | | |
| nts_annotation_ces_min | short | Min. collision energy spread of library spectra | | |
| nts_annotation_int_cutoff | float | Rel. inten. cutoff of data MS2 when matching | | |
| nts_annotation_ms2_mz_tol | float | | mDa | |
| nts_annotation_mz_tol | float | Min. m/z match to library | mDa | |
| nts_annotation_rt_offset | float | | min | |
| nts_annotation_rt_tol | float | Min. ret. time match to library | min | |
| nts_annotation_threshold_dp_score | short | Min. dot product MS2 match to library (0-1000) | | |
| nts_blank_correction_factor | float | Alig. tab. filter max. inten. ratio (sample/blank) | | |
| nts_componen2_correlation | float | Future use | | |
| nts_componen2_frac_shape | float | Future use | | |
| nts_componen2_mz_tol | float | Future use | mDa | |
| nts_componen2_rt_tol | float | Future use | s | |
| nts_componentization_dynamic_tolerance | boolean | Compute componentization tol. using 13C peak | | |
| nts_componentization_ppm | float | Known components m/z diff., ppm tol. | | |
| nts_componentization_rt_tol | float | Peak apex ret. time tol. | s | |
| nts_componentization_rt_tol_l | float | Peak ret. time at half-height left, tol. | s | |
| nts_componentization_rt_tol_r | float | Peak ret. time at half-height right, tol. | s | |
| nts_componentization_rt_tol_sum | float | Peak three point ret. time sum, tol. | s | |
| nts_eic_extraction_width | float | Sample tab. opt. EIC m/z width | mDa | |
| nts_int_threshold | float | Peak-picking lower EIC peak apex intensity limit | | |
| nts_max_num_peaks | float | Peak-picking max. subpeaks/noise within EIC peak | | |
| nts_mz_max | float | Peak-picking upper m/z limit | Da | |
| nts_mz_min | float | Peak-picking lower m/z limit | Da | |
| nts_mz_step | float | Peak-picking EIC binning width | Da | |
| nts_peak_noise_scans | float | Peak-picking scans around peak to detect noise | | |
| nts_peak_width_max | float | Peak-picking max. EIC peak width at baseline | s | |
| nts_peak_width_min | float | Peak-picking min. EIC peak width at baseline | s | |
| nts_precursor_mz_tol | float | Peak-picking m/z tol. to find MS2 precursor | ppm | |
| nts_rt_max | float | Peak-picking upper ret. time limit | min | |
| nts_rt_min | float | Peak-picking lower ret. time limit | min | |
| nts_sn | float | Peak-picking lower S/N limit | | |
| path | keyword | Absolute path to measurement file (mzXML, mzML) | | |
| pol | keyword | 3-letter ESI polarity mode | | pos |
| replicate_regex | keyword | Regular expression to group files into replicates | | ()[123] |
| river | keyword | river name of sampling location (local spelling) | | rhein |
| sample_source | keyword | institute responsible for sampling | | bful/lfulg saxony |
| spectral_library_path | keyword | Path to spectral library (CSL) SQLite file | | |
| start | date | Start of sampling in UTC | | |
| start_date_format | keyword | Format of date in filename | | ymd |
| start_date_regex | keyword | Regex. for date in filename, date in brackets | | Des_()_ |
| station | keyword | Sampling location code | | rhein_ko_l |
| tag | keyword | Labels to assist searching | | |
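Several settings above refer to a dot product MS2 match on a 0-1000 scale with weighting factors for peak intensity and m/z. This page does not give the formula, so the sketch below uses one common formulation as an assumption: a cosine similarity over weights `intensity**m * mz**n`, scaled to 1000. The function name, the fragment pairing strategy, the default values and the sample spectra are all hypothetical; `m`, `n` and `threshold` are only meant to mirror the roles of `dbas_ndp_m`, `dbas_ndp_n` and `dbas_ndp_threshold`.

```python
import math


def weighted_dot_product(spec_a, spec_b, m=1.0, n=0.0, mz_tol=0.005):
    """Score two centroided MS2 spectra, given as lists of (mz, intensity), from 0 to 1000."""
    # Weight each fragment: intensity to the power m times m/z to the power n.
    wa = [(inten ** m) * (mz ** n) for mz, inten in spec_a]
    wb = [(inten ** m) * (mz ** n) for mz, inten in spec_b]

    # Pair each fragment in A with the closest unused fragment in B within mz_tol.
    used = set()
    cross = 0.0
    for (mz_a, _), w_a in zip(spec_a, wa):
        best = None
        for j, (mz_b, _) in enumerate(spec_b):
            if j in used:
                continue
            diff = abs(mz_a - mz_b)
            if diff <= mz_tol and (best is None or diff < best[0]):
                best = (diff, j)
        if best is not None:
            used.add(best[1])
            cross += w_a * wb[best[1]]

    norm = math.sqrt(sum(w * w for w in wa)) * math.sqrt(sum(w * w for w in wb))
    if norm == 0:
        return 0
    return round(1000 * cross / norm)


# Made-up spectra for illustration only.
library = [(65.04, 1.0), (92.05, 0.4)]
sample = [(65.04, 1.0), (92.05, 0.4)]
threshold = 500  # plays the role of dbas_ndp_threshold; the value is an assumption

score = weighted_dot_product(sample, library)
print(score, score >= threshold)
```

Identical spectra score 1000 and spectra without any shared fragment score 0, which matches the 0-1000 range stated in the table.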
analysis_dbas tables
These tables hold results of summary statistics for feature tables (currently only for dbas processing).
| field | type | description | unit | example |
|---|---|---|---|---|
| lm | float | Slope of regression (linear model) rel. area, time | | |
| matrix | keyword | Sampling matrix code | | spm |
| name | keyword | compound name(s) | | |
| pol | keyword | 3-letter ESI polarity mode | | pos |
| station | keyword | sampling location code | | rhein_ko_l |
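The `lm` field is described as the slope of a linear model of relative area against time. A minimal sketch of an ordinary least-squares slope is given below. The function name and the sample data are hypothetical, and the time unit is an assumption since it is not stated on this page.

```python
def regression_slope(times, rel_areas):
    """Ordinary least-squares slope of rel_areas regressed on times."""
    n = len(times)
    if n != len(rel_areas) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_a = sum(rel_areas) / n
    # Sum of squared deviations of time and cross-deviations with relative area
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, rel_areas))
    if sxx == 0:
        raise ValueError("all time points are identical")
    return sxy / sxx


# Made-up series: time in years since the first sample (assumed unit),
# relative area as in area_relative_to_internal_standard.
years = [0, 1, 2, 3]
relative_area = [0.1, 0.3, 0.5, 0.7]
print(regression_slope(years, relative_area))
```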
spectral_library tables
These documents hold a copy of the Collective Spectral Library (CSL).
| field | type | description | unit | example |
|---|---|---|---|---|
| adduct | keyword | ESI adduct form | | [M+H]+ |
| cas | keyword | Chemical abstracts service registry number | | 95-14-7 |
| ce | float | Collision energy | | |
| ce_unit | keyword | Collision energy units | | eV |
| ces | float | Collision energy spread | | |
| collision_type | keyword | Type of fragmentation for this spectrum | | |
| comment | text | free text or array of text blocks | | |
| comp_group | keyword | compound usage or classification(s) | | Industrial_process |
| csl_experiment_id | integer | CSL experiment ID for spectral match | | 26 |
| data_source | keyword | institution responsible for data processing | | bfg |
| esi_ion_spec | keyword (runtime) | Concatenation of name, pol, adduct, isotopologue | | 1,3-Dicyclohexylurea_pos_[M+H]+_monoisotopic |
| formula | keyword | molecular formula | | C7H4Cl2O2 |
| inchi | keyword | international chemical identifier | | InChI=1S/C7H5NS/c1-2-4-7-6(3-1)8-5-9-7/h1-5H |
| inchikey | keyword | hash of international chemical identifier | | ILLQTAPXYCRJRB-UHFFFAOYSA-N |
| instrument | keyword | Instrument type code | | LC-ESI-QTOF TripleTOF 6600 SCIEX |
| ionisation | keyword | Type of LC-MS interface used | | |
| isotopologue | keyword | Isotopic difference to monoisotopic | | 37Cl |
| licence | keyword | open data licence for data | | CC BY 4.0 |
| ms2 | nested | | | |
| ms2.int | scaled_float | centroid intensity | | |
| ms2.mz | float | mass-to-charge ratio | Da | |
| mw | float | DEPRECATED | | |
| mz | float | mass-to-charge ratio | Da | |
| name | keyword | compound name | | |
| pol | keyword | 3-letter ESI polarity mode | | pos |
| rtt | nested | | | |
| rtt.doi | keyword | DOI for chromatographic method | | 10.1016/j.chroma.2015.11.014 |
| rtt.method | keyword | Chromatographic method code | | bfg_nts_rp1 |
| rtt.predicted | boolean | Is the retention time modelled? | | |
| rtt.rt | half_float | Chromatographic retention time at peak apex | min | |
| smiles | keyword | SMILES line notation for compound structure | | |
| tag | keyword | labels to assist searching | | |
| time_added | date | Time of entry into CSL | | |
nondetect_dbas tables
These documents hold non-detects by file, i.e., which compounds of the CSL were not detected in a measurement file.
| field | type | description | unit | example |
|---|---|---|---|---|
| chrom_method | keyword | chromatographic method code | | bfg_nts_rp1 |
| comment | text | free text or array of text blocks | | |
| data_source | keyword | institution responsible for data processing | | bfg |
| date_import | date | ingest time | | |
| date_measurement | date | instrumental analysis time | | |
| dbas_alias_name | keyword | index alias name | | ntsp_dbas_bfg |
| duration | scaled_float | duration of composite sampling, grab sample: 0 | day | |
| gkz | integer | Gewässerkennzahl (German water body code number) of sampling location | | 26 |
| instrument | text | instrument type code | | LC-ESI-QTOF TripleTOF 6600 SCIEX |
| km | half_float | river kilometer of sampling location | | |
| licence | keyword | open data licence for data | | dl-de/by-2-0 |
| loc | geo_point | sampling coordinates (WGS84) | | |
| matrix | keyword | sampling matrix code | | spm |
| name | keyword | compound name(s) | | |
| path | keyword | absolute path to measurement file (mzXML, mzML) | | |
| pol | keyword | 3-letter ESI polarity mode | | pos |
| river | keyword | river name of sampling location (local spelling) | | rhein |
| sample_source | keyword | institute responsible for sampling | | bful/lfulg saxony |
| start | date | sampling time (start time for composite) | | |
| station | keyword | sampling location code | | rhein_ko_l |
| tag | keyword | labels to assist searching | | |
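Conceptually, the non-detects for one measurement file are the CSL compounds that do not appear among that file's detections. The sketch below illustrates this as a set difference. The function `non_detects` and the placeholder compound names are hypothetical, and how NTSPortal actually builds these documents is not described on this page.

```python
def non_detects(library_names, detected_names):
    """Return, sorted, the library compounds that were not detected in a file."""
    detected = set(detected_names)
    return sorted(name for name in set(library_names) if name not in detected)


# Placeholder names for illustration only.
csl_compounds = ["Compound A", "Compound B", "Compound C"]
detected_in_file = ["Compound B"]

missing = non_detects(csl_compounds, detected_in_file)
print(missing)  # ['Compound A', 'Compound C']

# One possible document shape, using field names from the table above.
doc = {"path": "KO_06_1_pos.mzXML", "pol": "pos", "name": missing}
print(doc)
```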