Configuration Setup JSON
During the sensor and reference setup methods, a record of the setup configuration is saved locally to a setup.json file. This file tells the ingestion module how the recorded data should be converted into the sensortoolkit Data Formatting Scheme (SDFS). It is passed to the subroutine sensortoolkit.sensor_ingest.standard_ingest(), which imports the recorded dataset and converts headers and date/time-like columns to SDFS formatting.
Sensor setup.json
Setup.json files for air sensors are generated by running the sensortoolkit.AirSensor.sensor_setup() method and contain information about recorded sensor datasets that is used by the standard ingestion module. Because sensors often record data with different formatting and header naming schemes, these files assist in converting data from its original format into the SDFS scheme for parameter names and date/time formatting.
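As a rough illustration of that conversion, the sketch below derives a header-renaming map and a list of columns to drop from a col_headers mapping shaped like the one in the example file in this section. The helper name build_header_map is hypothetical, and this is not the toolkit's own ingestion code.

```python
def build_header_map(col_headers):
    """Return (rename, drop) derived from a setup.json col_headers mapping.

    Hypothetical helper: rename maps original header names to SDFS names,
    drop lists the original headers flagged with "drop": true.
    """
    rename, drop = {}, []
    for column in col_headers.values():
        for original_name, info in column.items():
            if info.get("drop"):
                drop.append(original_name)
            elif info.get("sdfs_param"):
                rename[original_name] = info["sdfs_param"]
    return rename, drop


# Abbreviated entries mirroring the structure of a sensor setup.json
col_headers = {
    "col_idx_0": {"Time": {"sdfs_param": "DateTime",
                           "header_class": "datetime", "drop": False}},
    "col_idx_1": {"NO2 (ppb)": {"sdfs_param": "NO2",
                                "header_class": "parameter", "drop": False}},
    "col_idx_7": {"Inlet": {"sdfs_param": "",
                            "header_class": "parameter", "drop": True}},
}

rename, drop = build_header_map(col_headers)
print(rename)  # {'Time': 'DateTime', 'NO2 (ppb)': 'NO2'}
print(drop)    # ['Inlet']
```

A mapping like this could then be handed to a tabular library's rename and drop operations; how the toolkit itself applies the file may differ.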
The sensor setup.json file is named [sensor_name]_setup.json, where [sensor_name] is the name assigned to the sensor via sensor.name. This file is located within the user's project directory at the following relative path:
\Data and Figures\sensor_data\[sensor_name]\[sensor_name]_setup.json
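The naming rule above can be sketched with pathlib. The helper sensor_setup_path and the project directory used here are hypothetical; the folder names are taken from the relative path as written above.

```python
from pathlib import PureWindowsPath


def sensor_setup_path(project_path, sensor_name):
    """Build the expected location of [sensor_name]_setup.json (hypothetical helper)."""
    return (PureWindowsPath(project_path) / "Data and Figures" / "sensor_data"
            / sensor_name / f"{sensor_name}_setup.json")


# "C:/projects/toucan_evaluation" is a made-up project directory
setup_path = sensor_setup_path("C:/projects/toucan_evaluation", "Toco_Toucan")
print(setup_path.name)  # Toco_Toucan_setup.json
```

PureWindowsPath is used so the sketch produces Windows-style separators on any operating system; it only builds the path and does not check that the file exists.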
{
"path": "C:/Users/.../Documents/toucan_evaluation",
"data_rel_path": "/data/sensor_data/Toco_Toucan/raw_data",
"data_type": "sensor",
"file_extension": ".csv",
"header_iloc": 5,
"data_row_idx": null,
"sdfs_header_names": [
"NO2",
"O3",
"PM25",
"Temp",
"RH",
"DP"
],
"col_headers": {
"col_idx_0": {
"Time": {
"sdfs_param": "DateTime",
"in_file_list_idx": [
0,
1,
2
],
"header_class": "datetime",
"drop": false,
"dt_format": "%Y/%m/%d %H:%M:%S",
"dt_timezone": "EST"
}
},
"col_idx_1": {
"NO2 (ppb)": {
"sdfs_param": "NO2",
"in_file_list_idx": [
0,
1,
2
],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
},
"col_idx_2": {
"O3 (ppb)": {
"sdfs_param": "O3",
"in_file_list_idx": [
0,
1,
2
],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
},
"col_idx_3": {
"PM2.5 (\u00b5g/m\u00b3)": {
"sdfs_param": "PM25",
"in_file_list_idx": [
0,
1,
2
],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
},
"col_idx_4": {
"TEMP (\u00b0C)": {
"sdfs_param": "Temp",
"in_file_list_idx": [
0,
1,
2
],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
},
"col_idx_5": {
"RH (%)": {
"sdfs_param": "RH",
"in_file_list_idx": [
0,
1,
2
],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
},
"col_idx_6": {
"DP (\u00b0C)": {
"sdfs_param": "DP",
"in_file_list_idx": [
0,
1,
2
],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
},
"col_idx_7": {
"Inlet": {
"sdfs_param": "",
"in_file_list_idx": [
0,
1,
2
],
"header_class": "parameter",
"drop": true
}
}
},
"name": "Toco_Toucan",
"dataset_kwargs": {
"name": "Toco_Toucan"
},
"_dataset_selection": "files",
"file_list": [
"C:/Users/.../Documents/toucan_evaluation\\data\\sensor_data\\Toco_Toucan\\raw_data\\toco_toucan_RT01_raw.csv",
"C:/Users/.../Documents/toucan_evaluation\\data\\sensor_data\\Toco_Toucan\\raw_data\\toco_toucan_RT02_raw.csv",
"C:/Users/.../Documents/toucan_evaluation\\data\\sensor_data\\Toco_Toucan\\raw_data\\toco_toucan_RT03_raw.csv"
],
"encoding_predictions": {},
"serials": {
"1": "RT01",
"2": "RT02",
"3": "RT03"
},
"number_of_sensors": 3
}
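The dt_format and dt_timezone entries describe how timestamps in the recorded files are written. A minimal sketch of how such an entry could be applied with the standard library follows. The helper parse_timestamp, the FIXED_OFFSETS lookup, and the sample timestamp are hypothetical, and converting the result to UTC is an assumption of this sketch rather than a statement of what the toolkit does.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup of fixed UTC offsets for timezone labels; only EST is covered
FIXED_OFFSETS = {"EST": timezone(timedelta(hours=-5), "EST")}


def parse_timestamp(text, dt_format, dt_timezone):
    """Parse a recorded timestamp and return it as a timezone-aware UTC datetime."""
    naive = datetime.strptime(text, dt_format)
    local = naive.replace(tzinfo=FIXED_OFFSETS[dt_timezone])
    return local.astimezone(timezone.utc)


# Values copied from the "Time" entry of the example file; the timestamp is made up
entry = {"dt_format": "%Y/%m/%d %H:%M:%S", "dt_timezone": "EST"}
stamp = parse_timestamp("2019/08/01 07:30:00",
                        entry["dt_format"], entry["dt_timezone"])
print(stamp.isoformat())  # 2019-08-01T12:30:00+00:00
```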
Reference setup.json
The reference setup.json file is named reference_setup.json and is located within the user's project directory at the following relative path:
\Data and Figures\reference_data\[data_type]\[site_name]_[site_id]\reference_setup.json
where [data_type] is the name of the reference data source (e.g., ‘airnowtech’, ‘local’), [site_name] is the name of the monitoring site with spaces replaced by ‘_’, and [site_id] is the AQS site ID (if applicable).
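The path rule above can be sketched as follows. The helper reference_setup_path and the project directory are hypothetical. Removing dashes from the site ID mirrors the fmt_site_aqs entry in the example file in this section and is an assumption about the general rule; the case where no site ID applies is not handled.

```python
from pathlib import PureWindowsPath


def reference_setup_path(project_path, data_type, site_name, site_id):
    """Build the expected location of reference_setup.json (hypothetical helper)."""
    fmt_name = site_name.replace(" ", "_")  # spaces replaced by '_'
    fmt_id = site_id.replace("-", "")       # assumed: dashes removed from the AQS ID
    subfolder = f"{fmt_name}_{fmt_id}"
    return (PureWindowsPath(project_path) / "Data and Figures" / "reference_data"
            / data_type / subfolder / "reference_setup.json")


# "C:/projects/sensortoolkit_testing" is a made-up project directory
ref_path = reference_setup_path("C:/projects/sensortoolkit_testing",
                                "local", "Burdens Creek", "37-063-0099")
print(ref_path.parent.name)  # Burdens_Creek_370630099
```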
Below is an example reference_setup.json for a reference monitor dataset corresponding to EPA’s RTP campus ambient monitoring site for air sensor testing. The sensor and reference setup.json files share many attributes; however, the reference file also includes attributes specific to the reference monitor or monitoring site (such as agency, site_name, site_aqs, and the per-parameter unit and method entries) that are important for creating a processed (SDFS-formatted) version of the reference dataset.
{
"path": "C:\\Users\\...\\Documents\\sensortoolkit_testing",
"data_rel_path": "/data/reference_data/local/raw/Burdens_Creek_370630099/",
"data_type": "reference",
"file_extension": ".csv",
"header_iloc": 2,
"data_row_idx": null,
"sdfs_header_names": [
"PM25",
"PM10"
],
"col_headers": {
"col_idx_0": {
"Date & Time": {
"sdfs_param": "DateTime",
"in_file_list_idx": [0, 1],
"header_class": "datetime",
"drop": false,
"dt_format": "%-m/%-d/%Y %-I:%M %p",
"dt_timezone": "EST"
}
},
"col_idx_1": {
"Grimm PM2.5": {
"sdfs_param": "",
"in_file_list_idx": [0, 1],
"header_class": "parameter",
"drop": true
}
},
"col_idx_2": {
"Grimm PM10": {
"sdfs_param": "",
"in_file_list_idx": [0, 1],
"header_class": "parameter",
"drop": true
}
},
"col_idx_3": {
"T640_2_PM25": {
"sdfs_param": "PM25",
"in_file_list_idx": [0, 1],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
},
"col_idx_4": {
"T640_2_PM10": {
"sdfs_param": "PM10",
"in_file_list_idx": [0, 1],
"unit_transform": null,
"header_class": "parameter",
"drop": false
}
}
},
"dataset_kwargs": {
"ref_data_source": "local",
"site_name": "Burdens_Creek",
"site_aqs": "370630099"
},
"agency": "OAQPS",
"site_name": "Burdens Creek",
"site_aqs": "37-063-0099",
"site_lat": "35.88",
"site_lon": "-78.87",
"fmt_site_name": "Burdens_Creek",
"fmt_site_aqs": "370630099",
"ref_data_subfolder": "Burdens_Creek_370630099",
"_dataset_selection": "files",
"file_list": [
"C:\\Users\\...\\Documents\\sensortoolkit_testing\\data\\reference_data\\local\\raw\\Burdens_Creek_370630099\\min_201908_PM.csv",
"C:\\Users\\...\\Documents\\sensortoolkit_testing\\data\\reference_data\\local\\raw\\Burdens_Creek_370630099\\min_201909_PM.csv"
],
"PM25_Unit": "Micrograms/cubic meter (LC)",
"PM25_Param_Code": "Micrograms/cubic meter (LC)",
"PM25_Method_Code": 238,
"PM25_Method": "Teledyne T640X at 16.67 LPM",
"PM25_Method_POC": "1",
"PM10_Unit": "Micrograms/cubic meter (LC)",
"PM10_Param_Code": "Micrograms/cubic meter (LC)",
"PM10_Method_Code": 239,
"PM10_Method": "Teledyne API T640X at 16.67 LPM",
"PM10_Method_POC": "1"
}
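The per-parameter entries at the end of the reference file share a [parameter]_ prefix. One way to collect them by parameter is sketched below; the helper group_parameter_attributes is hypothetical, and the dictionary holds only a few entries copied from the example above.

```python
def group_parameter_attributes(setup, parameters):
    """Group keys such as PM25_Method_Code under their parameter name."""
    grouped = {}
    for param in parameters:
        prefix = f"{param}_"
        grouped[param] = {key[len(prefix):]: value
                          for key, value in setup.items()
                          if key.startswith(prefix)}
    return grouped


# Abbreviated copy of the reference setup entries shown above
setup = {
    "sdfs_header_names": ["PM25", "PM10"],
    "PM25_Method_Code": 238,
    "PM25_Method_POC": "1",
    "PM10_Method_Code": 239,
    "PM10_Method_POC": "1",
}

grouped = group_parameter_attributes(setup, setup["sdfs_header_names"])
print(grouped["PM25"])  # {'Method_Code': 238, 'Method_POC': '1'}
```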