weatherDB
station
This module has a class for every type of station, e.g. StationN. One object represents one station with one parameter and can be used to get the corresponding time series. There is also a GroupStation class that groups the three parameters precipitation, temperature and evapotranspiration together for one station.
StationN
- class weatherDB.station.StationN(id, **kwargs)
Bases: weatherDB.station.StationNBase
A class to work with and download 10-minute precipitation data for one station.
- __init__(id, **kwargs)
Create a Station object.
- Parameters
id (int) – The stations ID.
_skip_meta_check (bool, optional) – Should the check whether the station is in the database meta file be skipped? Be careful when skipping this check, because it can lead to problems. The option exists for computational reasons, because it makes the initialization faster. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.
- Raises
NotImplementedError – _description_
- update_horizon(skip_if_exist=True)
Update the horizon angle (Horizontabschirmung) in the meta table.
Get new values from the raster and put in the table.
- Parameters
skip_if_exist (bool, optional) – Skip updating the value if there is already a value in the meta table. The default is True.
- Returns
The horizon angle in degrees (Horizontabschirmung).
- Return type
float
- update_richter_class(skip_if_exist=True)
Update the richter class in the meta table.
Get new values from the raster and put in the table.
- Parameters
skip_if_exist (bool, optional) – Skip updating the value if there is already a value in the meta table. The default is True
- Returns
The richter class name.
- Return type
str
- richter_correct(period=(None, None), **kwargs)
Do the richter correction on the filled data for the given period.
- Parameters
period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).
- Raises
Exception – If no richter class was found for this station.
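The handling of None in the period argument can be illustrated with a minimal sketch. The helper name resolve_period and the available argument are hypothetical and not part of weatherDB; the sketch only shows the rule stated above, that a None bound is replaced by the minimal or maximal possible timestamp.

```python
from datetime import datetime
from typing import Optional, Tuple

Period = Tuple[Optional[datetime], Optional[datetime]]


def resolve_period(period: Period,
                   available: Tuple[datetime, datetime]) -> Tuple[datetime, datetime]:
    """Replace None bounds with the bounds of the available data.

    `available` is a hypothetical (first, last) timestamp pair of the stored data.
    """
    start, end = period
    first, last = available
    # A missing start means "from the earliest possible timestamp".
    if start is None:
        start = first
    # A missing end means "until the latest possible timestamp".
    if end is None:
        end = last
    return start, end
```

With available data from 2000-01-01 to 2020-12-31, the period (None, None) resolves to the full range, and (datetime(2010, 1, 1), None) keeps its start and takes the last available timestamp as its end.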
- corr(period=(None, None))
- last_imp_richter_correct(_last_imp_period=None)
Do the richter correction of the last import.
- Parameters
_last_imp_period (_type_, optional) – The overall period of the last import. This is only for internal use by the StationsN method, so that the period does not have to be computed over and over again. The default is None.
- last_imp_corr(_last_imp_period=None)
A wrapper for last_imp_richter_correct().
- fillup(period=(None, None), **kwargs)
Fill up missing data with measurements from nearby stations.
- get_corr(period=(None, None))
- get_qn(period=(None, None))
- get_richter_class(update_if_fails=True)
Get the richter class for this station.
Provide the data from the meta table.
- Parameters
update_if_fails (bool, optional) – Should the richter class get updated if no exposition class is found in the meta table? If False and no exposition class was found, None is returned. The default is True.
- Returns
The corresponding richter exposition class.
- Return type
string
- get_horizon()
Get the value for the horizon angle. (Horizontabschirmung)
This value is defined by Richter (1995) as the mean horizon angle in the west direction: H' = 0.15·H(S-SW) + 0.35·H(SW-W) + 0.35·H(W-NW) + 0.15·H(NW-N)
- Returns
The mean western horizon angle
- Return type
float or None
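The weighting in the Richter (1995) definition above can be written as a small function. This is only a sketch of the formula. The function name and argument names are hypothetical, and how weatherDB derives the four sector angles from the raster is not shown here.

```python
def mean_western_horizon(h_s_sw: float, h_sw_w: float,
                         h_w_nw: float, h_nw_n: float) -> float:
    """Weighted mean horizon angle in degrees.

    Each argument is the horizon angle of one sector:
    S-SW, SW-W, W-NW and NW-N.
    """
    # The two western sectors get weight 0.35, the outer sectors 0.15.
    return 0.15 * h_s_sw + 0.35 * h_sw_w + 0.35 * h_w_nw + 0.15 * h_nw_n
```

The weights sum to 1, so four equal sector angles give that same angle back.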
StationT
- class weatherDB.station.StationT(id, **kwargs)
Bases: weatherDB.station.StationTETBase
A class to work with and download temperature data for one station.
- __init__(id, **kwargs)
Create a Station object.
- Parameters
id (int) – The stations ID.
_skip_meta_check (bool, optional) – Should the check whether the station is in the database meta file be skipped? Be careful when skipping this check, because it can lead to problems. The option exists for computational reasons, because it makes the initialization faster. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.
- Raises
NotImplementedError – _description_
- get_multi_annual()
Get the multi annual value(s) for this station.
- Returns
The corresponding multi annual value. For T and ET the yearly value is returned. For N the winter and summer half-yearly sums are returned as a tuple.
- Return type
list or number
- get_adj(period=(None, None))
Get the adjusted time series.
The time series is adjusted to the multi annual mean, so the overall mean of the given period is the same as the multi annual mean.
- Parameters
period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).
agg_to (str or None, optional) – Aggregate to a given time span. Can be anything smaller than the maximum time span of the saved data. If a time period smaller than that of the saved data is given, then the maximum possible time period is returned. For T and ET it can be “month” or “year”. For N it can also be “hour”. If None, then the maximum time period is taken. The default is None.
- Returns
A timeserie with the adjusted data.
- Return type
pandas.DataFrame
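The adjustment described for get_adj can be pictured with a minimal sketch. It assumes an additive offset, which is only one way to make the period mean equal the multi annual mean; whether weatherDB uses an offset, a factor or something else is not stated here. The function name is hypothetical and plain lists stand in for the DataFrame.

```python
from typing import List, Optional


def adjust_to_multi_annual(values: List[Optional[float]],
                           multi_annual_mean: float) -> List[Optional[float]]:
    """Shift a series so that its mean equals the multi annual mean.

    Missing values (None) are ignored for the mean and kept as None.
    """
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to adjust if the series holds no values.
        return list(values)
    # Assumption: an additive offset; the library may adjust differently.
    offset = multi_annual_mean - sum(present) / len(present)
    return [None if v is None else v + offset for v in values]
```

For example, the series [8.0, 10.0, 12.0] with a multi annual mean of 9.0 becomes [7.0, 9.0, 11.0], whose mean is 9.0.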
StationET
- class weatherDB.station.StationET(id, **kwargs)
Bases: weatherDB.station.StationTETBase
A class to work with and download potential evapotranspiration (VPGB) data for one station.
- __init__(id, **kwargs)
Create a Station object.
- Parameters
id (int) – The stations ID.
_skip_meta_check (bool, optional) – Should the check whether the station is in the database meta file be skipped? Be careful when skipping this check, because it can lead to problems. The option exists for computational reasons, because it makes the initialization faster. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.
- Raises
NotImplementedError – _description_
- get_adj(period=(None, None))
Get the adjusted time series.
The time series is adjusted to the multi annual mean, so the overall mean of the given period is the same as the multi annual mean.
- Parameters
period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).
agg_to (str or None, optional) – Aggregate to a given time span. Can be anything smaller than the maximum time span of the saved data. If a time period smaller than that of the saved data is given, then the maximum possible time period is returned. For T and ET it can be “month” or “year”. For N it can also be “hour”. If None, then the maximum time period is taken. The default is None.
- Returns
A timeserie with the adjusted data.
- Return type
pandas.DataFrame
StationND
- class weatherDB.station.StationND(id, **kwargs)
Bases: weatherDB.station.StationNBase, weatherDB.station.StationCanVirtualBase
A class to work with and download daily precipitation data for one station.
These station data are only downloaded to do some quality checks on the 10-minute data. Therefore, no special quality check or Richter correction is done on this data. If you want daily precipitation data, it is better to use the 10-minute station class (StationN) and aggregate to daily values.
- property quality_check
(!) Disallowed inherited
- property last_imp_quality_check
(!) Disallowed inherited
- property get_corr
(!) Disallowed inherited
- property get_adj
(!) Disallowed inherited
- property get_qc
(!) Disallowed inherited
- __init__(id, **kwargs)
Create a Station object.
- Parameters
id (int) – The stations ID.
_skip_meta_check (bool, optional) – Should the check whether the station is in the database meta file be skipped? Be careful when skipping this check, because it can lead to problems. The option exists for computational reasons, because it makes the initialization faster. It is used by the stations classes, because they only initialize objects that are in the meta table. The default is False.
- Raises
NotImplementedError – _description_
GroupStation
- class weatherDB.station.GroupStation(id, error_if_missing=True, **kwargs)
Bases: object
A class to group all possible parameters of one station.
So if you want to create the input files for a simulation, where you need T, ET and N, use this class to download the data for one station.
- __init__(id, error_if_missing=True, **kwargs)
- get_available_paras(short=False)
Get the possible parameters for this station.
- Parameters
short (bool, optional) – Should the short names of the parameters be returned? The default is False.
- Returns
A list of the long parameter names that are possible for this station to get.
- Return type
list of str
- get_filled_period(kind='best', from_meta=True)
Get the combined filled period for all 3 stations.
This is the maximum possible timerange for these stations.
- Parameters
kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.
from_meta (bool, optional) – Should the period be from the meta table? If False: the period is returned from the timeserie. In this case this function is only a wrapper for .get_period_meta. The default is True.
- Returns
The maximum filled period for the 3 parameters for this station.
- Return type
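How three per-parameter periods might combine into one can be sketched as follows. The sketch assumes that the combined period is the range covered by all stations, that is, the latest start and the earliest end. That reading of "maximum possible timerange" is an assumption, and the function name is hypothetical.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

Span = Tuple[date, date]


def combine_periods(periods: Iterable[Span]) -> Optional[Span]:
    """Return the period covered by every given period, or None if there is none."""
    periods = list(periods)
    if not periods:
        raise ValueError("At least one period is needed.")
    # The common range starts at the latest start and ends at the earliest end.
    start = max(p[0] for p in periods)
    end = min(p[1] for p in periods)
    if start > end:
        return None
    return start, end
```

If N is filled for 1995 to 2020, T for 1990 to 2021 and ET for 1991 to 2019, the combined period under this assumption is 1995 to 2019.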
- get_df(period=(None, None), kind='best', paras='all', agg_to='day')
Get a DataFrame with the corresponding data.
- Parameters
period (TimestampPeriod or (tuple or list of datetime.datetime or None), optional) – The minimum and maximum Timestamp for which to get the timeseries. If None is given, the maximum or minimal possible Timestamp is taken. The default is (None, None).
kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.
agg_to (str, optional) – The aggregation level to which the time series should be aggregated. The minimum aggregation for temperature and ET is daily, and for precipitation it is 10 minutes. If a smaller aggregation is selected, the minimum possible aggregation for the respective parameter is returned. So if 10 minutes is selected, then precipitation is returned in 10 minutes and T and ET as daily. The default is “day”.
- Returns
A DataFrame with the timeseries for this station and the given period.
- Return type
pandas.DataFrame
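The fallback rule for agg_to can be sketched in a few lines. The level names are the ones mentioned in this reference, while the lookup tables and the function name are hypothetical and only illustrate the rule that each parameter falls back to its own minimum aggregation.

```python
# Aggregation levels from finest to coarsest, as named in this reference.
AGG_ORDER = ["10 min", "hour", "day", "month", "year"]

# Minimum aggregation per parameter: 10 minutes for N, daily for T and ET.
MIN_AGG = {"n": "10 min", "t": "day", "et": "day"}


def effective_agg(para: str, agg_to: str) -> str:
    """Return the aggregation level that is actually possible for a parameter."""
    requested = AGG_ORDER.index(agg_to)
    minimum = AGG_ORDER.index(MIN_AGG[para])
    # A request finer than the stored resolution falls back to the minimum.
    return AGG_ORDER[max(requested, minimum)]
```

Requesting “10 min” therefore yields “10 min” for N, but “day” for T and ET.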
- classmethod get_meta_explanation(infos='all')
Get the explanations of the available meta fields.
- Parameters
infos (list or string, optional) – The infos you wish to get an explanation for. If “all”, then all the available information is returned. The default is “all”.
- Returns
a pandas Series with the information names as index and the explanation as values.
- Return type
pd.Series
- get_meta(paras='all', **kwargs)
Get the meta information for every parameter of this station.
- Parameters
paras (list of str or str, optional) – Give the parameters for which to get the meta information. Can be “n”, “t”, “et” or “all”. If “all”, then every available station parameter is returned. The default is “all”
kwargs (dict, optional) – The optional keyword arguments are handed to the get_meta methods of the single Station objects. Can be e.g. “info”.
- Returns
A dict with the information. There is one subdict per parameter. If only one parameter is asked for, then there is no subdict, but only a single value.
- Return type
dict
- get_geom()
- get_name()
- create_roger_ts(dir, period=(None, None), kind='best', r_r0=1)
Create the time series files for roger as csv.
This is only a wrapper function for create_ts with some standard settings.
- Parameters
dir (pathlib like object or zipfile.ZipFile) – The directory or ZipFile to store the time series in. If a ZipFile is given, a folder with the station's ID is added to the file path.
period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)
kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.
r_r0 (int or float, list of int or float or None, optional) – Should the ET time series contain a column with R/R0? If None, then no column is added. If int or float, then an R/R0 column is appended with this number as standard value. If list of int or float, then the list should have the same length as the ET time series and is appended to the time series. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET time series. The default is 1.
- Raises
Warning – If there are NAs in the timeseries or the period got changed.
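The meaning of kind=“best” can be sketched as a small lookup. The kind names come from the parameter description above. The function name, the parameter codes as arguments and the ValueError for invalid kinds are assumptions made for illustration.

```python
def resolve_kind(para: str, kind: str = "best") -> str:
    """Translate "best" into a concrete data kind for one parameter.

    `para` is one of "n", "t" or "et".
    """
    if kind == "best":
        # Precipitation has the Richter corrected series as its best kind.
        return "corr" if para == "n" else "filled"
    valid = {"raw", "qc", "filled", "adj"}
    if para == "n":
        # "qn" and "corr" are only valid for precipitation.
        valid |= {"qn", "corr"}
    if kind not in valid:
        raise ValueError(f"kind={kind!r} is not valid for parameter {para!r}")
    return kind
```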
- create_ts(dir, period=(None, None), kind='best', agg_to='10 min', r_r0=None, split_date=False)
Create the time series files as csv.
- Parameters
dir (pathlib like object or zipfile.ZipFile) – The directory or ZipFile to store the time series in. If a ZipFile is given, a folder with the station's ID is added to the file path.
period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)
kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.
agg_to (str, optional) – The aggregation level to which the time series should be aggregated. The minimum aggregation for temperature and ET is daily, and for precipitation it is 10 minutes. If a smaller aggregation is selected, the minimum possible aggregation for the respective parameter is returned. So if 10 minutes is selected, then precipitation is returned in 10 minutes and T and ET as daily. The default is “10 min”.
r_r0 (int or float or None or pd.Series or list, optional) – Should the ET time series contain a column with R/R0? If None, then no column is added. If int, then an R/R0 column is appended with this number as standard value. If list of int or float, then the list should have the same length as the ET time series and is appended to the time series. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET time series. The default is None.
split_date (bool, optional) – Should the timestamp be split into parts, so one column for year, one for month etc.? If False, the timestamp is saved in one column as a string.
- Raises
Warning – If there are NAs in the timeseries or the period got changed.
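The r_r0 options can be sketched with plain lists. The function name and the n_rows argument are hypothetical. The pd.Series case, which is joined on the timestamp index, is left out to keep the sketch free of dependencies, and the ValueError for a wrong list length is an assumption.

```python
from typing import List, Optional, Sequence, Union

Number = Union[int, float]


def build_r_r0_column(r_r0: Union[None, Number, Sequence[Number]],
                      n_rows: int) -> Optional[List[Number]]:
    """Build the values of an R/R0 column for an ET series with n_rows rows."""
    if r_r0 is None:
        # None means that no R/R0 column is added at all.
        return None
    if isinstance(r_r0, (int, float)):
        # A single number is used as the standard value for every row.
        return [r_r0] * n_rows
    values = list(r_r0)
    if len(values) != n_rows:
        raise ValueError("The r_r0 list must be as long as the ET time series.")
    return values
```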
stations
This module has grouping classes for all the stations of one parameter. E.g. StationsN groups all the available precipitation stations. These classes can be used to do actions on all the stations.
StationsN
- class weatherDB.stations.StationsN
Bases: weatherDB.stations.StationsBase
A class to work with and download 10-minute precipitation data for several stations.
- update_richter_class(stids='all')
Update the Richter exposition class.
Get the value from the raster, compare with the richter categories and save to the database.
- Parameters
stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.
- Raises
ValueError – If the given stids (Station_IDs) are not all valid.
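The stids convention used by the stations classes can be sketched as follows. The function name and the valid_stids argument are hypothetical; only the rule itself comes from the description above, namely “all” or a list of station IDs, with a ValueError if not all IDs are valid.

```python
from typing import Iterable, List, Union


def resolve_stids(stids: Union[str, Iterable[int]],
                  valid_stids: Iterable[int]) -> List[int]:
    """Return the station IDs to work on, checked against the valid IDs."""
    valid = set(valid_stids)
    if isinstance(stids, str):
        if stids == "all":
            return sorted(valid)
        raise ValueError(f"stids must be 'all' or a list of IDs, got {stids!r}")
    stids = list(stids)
    invalid = sorted(set(stids) - valid)
    if invalid:
        # Mirror the documented behaviour: fail if any ID is unknown.
        raise ValueError(f"The given stids are not all valid: {invalid}")
    return stids
```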
- richter_correct(stids='all')
Richter correct the filled data.
- Parameters
stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.
- Raises
ValueError – If the given stids (Station_IDs) are not all valid.
- last_imp_corr(stids='all')
Richter correct the filled data for the last imported period.
- Parameters
stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.
- Raises
ValueError – If the given stids (Station_IDs) are not all valid.
StationsT
- class weatherDB.stations.StationsT
Bases: weatherDB.stations.StationsTETBase
A class to work with and download temperature data for several stations.
StationsET
- class weatherDB.stations.StationsET
Bases: weatherDB.stations.StationsTETBase
A class to work with and download potential evapotranspiration (VPGB) data for several stations.
StationsND
- class weatherDB.stations.StationsND
Bases: weatherDB.stations.StationsBase
A class to work with and download daily precipitation data for several stations.
These station data are only downloaded to do some quality checks on the 10-minute data. Therefore, no special quality check or Richter correction is done on this data. If you want daily precipitation data, it is better to use the 10-minute station class (StationN) and aggregate to daily values.
GroupStations
- class weatherDB.stations.GroupStations
Bases: object
A class to group all possible parameters of all the stations.
- __init__()
- get_valid_stids()
- classmethod get_meta_explanation(infos='all')
Get the explanations of the available meta fields.
- Parameters
infos (list or string, optional) – The infos you wish to get an explanation for. If “all”, then all the available information is returned. The default is “all”.
- Returns
a pandas Series with the information names as index and the explanation as values.
- Return type
pd.Series
- get_meta(paras='all', stids='all', **kwargs)
Get the meta Dataframe from the Database.
- Parameters
paras (list or str, optional) – The parameters for which to get the information. If “all” then all the available parameters are requested. The default is “all”.
stids (string or list of int, optional) – The Stations to return the meta information for. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.
**kwargs (dict, optional) – The keyword arguments are passed to the station.GroupStation().get_meta method. From there they are passed to the get_meta method of the single station. Can be e.g. “infos”.
- Returns
dict of pandas.DataFrame or geopandas.GeoDataFrame, or pandas.DataFrame or geopandas.GeoDataFrame – The meta DataFrame. If several parameters are asked for, then a dict with an entry per parameter is returned.
- Raises
ValueError – If the given stids (Station_IDs) are not all valid.
ValueError – If the given paras are not all valid.
- get_para_stations(paras='all')
Get a list with all the multi parameter stations as stations.Station{parameter}-objects.
- Parameters
paras (list or str, optional) – The parameters for which to get the objects. If “all” then all the available parameters are requested. The default is “all”.
- Returns
returns a list with the corresponding station objects.
- Return type
Station-object
- Raises
ValueError – If the given stids (Station_IDs) are not all valid.
- get_group_stations(stids='all', **kwargs)
Get a list with all the stations as station.GroupStation-objects.
- Parameters
stids (string or list of int, optional) – The Stations to return. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.
**kwargs (optional) – The keyword arguments are handed to the creation of the single GroupStation objects. Can be e.g. “error_if_missing”.
- Returns
returns a list with the corresponding station objects.
- Return type
Station-object
- Raises
ValueError – If the given stids (Station_IDs) are not all valid.
- create_ts(dir, period=(None, None), kind='best', stids='all', agg_to='10 min', r_r0=None, split_date=False)
Download and create the weather tables as csv files.
- Parameters
dir (path-like object) – The directory where to save the tables. If the directory is a ZipFile, then the output will get zipped into this.
period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)
kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.
stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.
agg_to (str, optional) – The aggregation level to which the time series should be aggregated. The minimum aggregation for temperature and ET is daily, and for precipitation it is 10 minutes. If a smaller aggregation is selected, the minimum possible aggregation for the respective parameter is returned. So if 10 minutes is selected, then precipitation is returned in 10 minutes and T and ET as daily. The default is “10 min”.
r_r0 (int or float or None or pd.Series or list, optional) – Should the ET time series contain a column with R/R0? If None, then no column is added. If int, then an R/R0 column is appended with this number as standard value. If list of int or float, then the list should have the same length as the ET time series and is appended to the time series. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET time series. The default is None.
split_date (bool, optional) – Should the timestamp be split into parts, so one column for year, one for month etc.? If False, the timestamp is saved in one column as a string.
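The effect of split_date on one row can be sketched like this. The column names and the string format are assumptions for illustration only; the actual layout of the csv files written by create_ts is not specified here.

```python
from datetime import datetime
from typing import Dict, Union


def timestamp_columns(ts: datetime,
                      split_date: bool = False) -> Dict[str, Union[int, str]]:
    """Return the timestamp column(s) of one csv row.

    Column names are hypothetical and only illustrate the two layouts.
    """
    if split_date:
        # One column per part of the timestamp.
        return {"year": ts.year, "month": ts.month, "day": ts.day,
                "hour": ts.hour, "minute": ts.minute}
    # A single column holding the timestamp as a string.
    return {"timestamp": ts.isoformat(sep=" ")}
```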
- create_roger_ts(dir, period=(None, None), stids='all', kind='best', r_r0=1)
Create the time series files for roger as csv.
This is only a wrapper function for create_ts with some standard settings.
- Parameters
dir (pathlib like object or zipfile.ZipFile) – The directory or Zipfile to store the timeseries in. If a zipfile is given a folder with the stations ID is added to the filepath.
period (TimestampPeriod like object, optional) – The period for which to get the timeseries. If (None, None) is entered, then the maximal possible period is computed. The default is (None, None)
stids (string or list of int, optional) – The Stations for which to compute. Can either be “all”, for all possible stations or a list with the Station IDs. The default is “all”.
kind (str) – The data kind to look for filled period. Must be a column in the timeseries DB. Must be one of “raw”, “qc”, “filled”, “adj”. If “best” is given, then depending on the parameter of the station the best kind is selected. For Precipitation this is “corr” and for the other this is “filled”. For the precipitation also “qn” and “corr” are valid.
r_r0 (int or float or None or pd.Series or list, optional) – Should the ET time series contain a column with R/R0? If None, then no column is added. If int, then an R/R0 column is appended with this number as standard value. If list of int or float, then the list should have the same length as the ET time series and is appended to the time series. If pd.Series, then the index should be a timestamp index. The series is then joined to the ET time series. The default is 1.
- Raises
Warning – If there are NAs in the timeseries or the period got changed.
broker
This submodule has only one class Broker. This one is used to do actions on all the stations together. Mainly only used for updating the DB.
Broker
- class weatherDB.broker.Broker
Bases: object
A class to manage and update the database.
Can get used to update all the stations and parameters at once.
This class is only working with super user privileges.
- __init__()
- update_raw(only_new=True, paras=['n_d', 'n', 't', 'et'])
Update the raw data from the DWD-CDC server to the database.
- Parameters
only_new (bool, optional) – Get only the files that are not yet in the database? If False all the available files are loaded again. The default is True.
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].
- update_meta(paras=['n_d', 'n', 't', 'et'])
Update the meta file from the CDC Server to the Database.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].
- update_ma(paras=['n_d', 'n', 't', 'et'])
Update the multi-annual data from raster to table.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].
- update_period_meta(paras=['n_d', 'n', 't', 'et'])
Update the periods in the meta table.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].
- quality_check(paras=['n', 't', 'et'], with_fillup_nd=True)
Do the quality check on the stations raw data.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n”, “t”, “et”]. The default is [“n”, “t”, “et”].
with_fillup_nd (bool, optional) – Should the daily precipitation data get filled up if the 10 minute precipitation data gets quality checked. The default is True.
- last_imp_quality_check(paras=['n', 't', 'et'], with_fillup_nd=True)
Quality check the last imported data.
Also fills up the daily precipitation data if the 10 minute precipitation data should get quality checked.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n”, “t”, “et”]. The default is [“n”, “t”, “et”].
with_fillup_nd (bool, optional) – Should the daily precipitation data get filled up if the 10 minute precipitation data gets quality checked. The default is True.
- fillup(paras=['n', 't', 'et'])
Fillup the timeseries.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n”, “t”, “et”].
- last_imp_fillup(paras=['n', 't', 'et'])
Fillup the last imported data.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n”, “t”, “et”].
- richter_correct()
Richter correct all of the precipitation data.
- last_imp_corr()
Richter correct the last imported precipitation data.
- update_db(paras=['n_d', 'n', 't', 'et'])
The regular Update of the database.
Downloads new data. Quality checks the newly imported data. Fills up the newly imported data.
- Parameters
paras (list of str, optional) – The parameters for which to do the actions. Can be one, some or all of [“n_d”, “n”, “t”, “et”]. The default is [“n_d”, “n”, “t”, “et”].
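The order of steps described for update_db can be sketched with the method names documented in this class. This is not the implementation of update_db. The function run_regular_update and the RecordingBroker stand-in are hypothetical, and the stand-in only exists so that the sketch runs without a database.

```python
class RecordingBroker:
    """Hypothetical stand-in that records method calls instead of touching a DB."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any unknown attribute becomes a method that just records its call.
        def method(**kwargs):
            self.calls.append((name, kwargs))
        return method


def run_regular_update(broker) -> None:
    """Call the documented Broker methods in the order described for update_db."""
    broker.update_raw(only_new=True)     # download new data
    broker.last_imp_quality_check()      # quality check the newly imported data
    broker.last_imp_fillup()             # fill up the newly imported data


recorder = RecordingBroker()
run_regular_update(recorder)
print([name for name, _ in recorder.calls])
```

With a real weatherDB.broker.Broker instance the same function would trigger the actual database actions, which requires super user privileges as noted above.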
- initiate_db()
Initiate the Database.
Downloads all the data from the CDC server for the first time. Updates the multi-annual data and the richter-class for all the stations. Quality checks and fills up the timeseries.